Grok-2 intro leads to questions about xAI model openness

The new versions of the model appear to be on par with models from Meta, OpenAI and Anthropic, but the vendor has neither released technical details nor restricted its image generator.

Esther Ajao, News Writer

Published: 14 Aug 2024

Elon Musk's AI company xAI has released new versions of its large language model.

Amid a flurry of new generative AI models from top tech companies, xAI on Aug. 13 released Grok-2 and Grok-2 mini in early preview. The updated Grok LLMs also arrive after a controversial few weeks for xAI, during which the AI assistant produced election misinformation.

Grok-2 reasons better than Grok-1.5 when retrieving content and correctly identifying missing information, according to xAI. The Grok-2 AI assistant has advanced capabilities in text and vision understanding and in integrating information from X, formerly Twitter.

Grok-2 mini is a smaller version of Grok-2.

Musk's company collaborated with Black Forest Labs, an AI startup that creates image and video models, for its first image generator. The vendor is using the startup's Flux.1 model to bring image capabilities to Grok, which is available on the X platform.

Both models are now available to X Premium and Premium+ users. They will be available to developers through xAI's enterprise API later this month, the vendor said.

Caught in controversy

On Aug. 5, secretaries of state from five states sent an open letter to Musk asking him to fix the AI chatbot after it spread misinformation about ballot deadlines and Vice President Kamala Harris.

The Grok AI assistant has followed a different strategy from OpenAI's ChatGPT and Google Gemini, which now refuse to answer questions about the U.S. election.

With Grok-2 and Grok-2 mini, xAI did not release the code, weights or any technical details that would enable users to know how the model compares with other generative AI models on the market. The Grok-1 models were open source.

Early public testing

Initially, xAI introduced Grok-2 as "sus-column-r" on the LMSYS Chatbot Arena, a public LLM benchmarking site that lets users enter prompts and returns answers from two anonymous models. Users pick the better answer and then find out which models produced the answers.

For the past few weeks, Grok-2 has been doing better than models such as OpenAI's GPT-3.5 and GPT-4o mini, Omdia analyst Bradley Shimmin said.

"You should see them as being an equivalent to a lot of the open source model makers, in particular, like Meta and Alibaba ... and others that are pretty well regarded right now," Shimmin said.

Despite being popular in the LMSYS Chatbot Arena, Grok-2's all-around performance is unknown because of the lack of technical details.

Possible advantages

Meanwhile, a possible advantage for Grok could be the Musk-owned X social media platform itself, according to Constellation Research analyst Andy Thurai.

Because many LLMs are running out of new data to train on, the differences among LLMs are small, and this is where Grok could have an edge, he said.

"If they can figure out a way to have Grok produce output based on the X feed, the information can be one of the latest or current data compared to other LLMs available today," Thurai said.

The association with its partner, X, could eliminate expenses for xAI associated with retrieval-augmented generation and fine-tuning. Moreover, when xAI provides the APIs for Grok, that might mean enterprises can build their own real-time news streams, Thurai continued.

"There could be a lot of value for enterprise developers to flock to X now, which was not the case before," he said.

Challenges and openness

However, real-time news also has challenges. For example, some organizations might not want the information they post on X to be used to train models, Thurai said.

Another challenge for xAI and the Grok-2 models is openness, Shimmin said. Although Musk has challenged competitor OpenAI over its openness, his own company has not been open with the latest version of Grok.

"We don't know what they're going to do," he said.

Yet another problem is the apparent lack of guardrails around the image generator. Soon after its release, reports emerged that the image generator had produced questionable images, including images of former President Donald Trump holding guns.

The lack of guardrails could lead to more deepfakes and the spread of misinformation that could affect the U.S. election, Thurai said.

One more hurdle for xAI and Grok is how to monetize the open source models, according to Gartner analyst Arun Chandrasekaran. At the same time, access to Grok could be what convinces many X users to pay for a premium subscription, he said.

Chandrasekaran added that doing so requires xAI to continue to innovate, provide good product quality and enable safety guardrails.

Meanwhile, xAI is not the only vendor advancing its models and adding new capabilities. Competitor Anthropic on Wednesday introduced Prompt Caching with Claude.

The capability lets users provide the Claude LLM with more background knowledge and example outputs. The feature is now available in public beta for Claude 3.5 Sonnet and Claude 3 Haiku.

Esther Ajao is a TechTarget Editorial news writer and podcast host covering artificial intelligence software and systems.
