Grok-3 outperforms all AI models in benchmark test

Grok 3, xAI’s latest AI model, outperformed ChatGPT, Gemini, and DeepSeek in a blind evaluation, while Musk eyes a Tesla Bot Mars mission.

An earlier version of the newly launched Grok-3, an AI large language model (LLM), has beaten rival AI systems from Google, OpenAI and DeepSeek in a community-driven blind evaluation.

On Feb. 18, Elon Musk announced xAI’s latest AI model release, Grok-3, during a livestream on X. In the discussion, the xAI team revealed it had released an early Grok-3 version on LMArena under the alias “chocolate” for community testing.

Source: LMArena

Unanimous support for Grok-3 capabilities

The LLM blind test by Chatbot Arena lets users put questions to two anonymous AI chatbots and rank them based on their responses. The tests have collectively recorded over a million community votes.
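To illustrate how pairwise blind votes can be turned into a single leaderboard score, here is a minimal sketch of an Elo-style rating update. It is an assumption for illustration only: the article does not describe how LMArena computes its scores, and the model names, starting rating of 1000 and `k` factor below are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(ratings, winner, loser, k=32.0):
    """Shift points from loser to winner; upsets move more points."""
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical models and votes, each vote is (preferred, other).
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [("model_a", "model_b"), ("model_a", "model_b"), ("model_b", "model_a")]

for winner, loser in votes:
    update(ratings, winner, loser)

# Sort models from highest to lowest rating to form a leaderboard.
leaderboard = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
```

With two of three votes going to `model_a`, it ends up ranked above `model_b`, while the total number of points in the system stays constant.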

According to xAI’s internal comparison of AI models, Grok-3 scored at least 10 points more than its biggest competitors (ChatGPT o3-mini, o1, DeepSeek-R1 and Gemini-2 Flash Thinking) in math, science and coding.


Comparison between Grok-3 and other AI models. Source: xAI

Grok-3 dominates AI chatbots across all categories

LMArena also noted that the early Grok-3 model currently ranks first in all categories, including overall with style control, hard prompts and hard prompts with style control, coding, math, creative writing, instruction following, longer query and multi-turn.

Grok-3’s performance across all the top categories. Source: LMArena

Musk and the xAI team reiterated LMArena’s finding that the early Grok-3 model, codenamed chocolate, reached a record score of 1400. “And it’s still climbing. So we have to keep updating it. It’s 1400 and climbing,” Musk said.

Elon Musk prepares Grok-powered Tesla Bots for space exploration

Further into the announcement, Musk revealed plans to send a Tesla Bot, powered by xAI’s artificial intelligence model Grok, on SpaceX’s next Mars mission by the end of 2026.

During a discussion, he revealed that most of SpaceX’s projects for Mars exploration are slated for around Q4 2026. 

He explained that the Earth-Mars transit window occurs every 26 months, making November 2026 the next ideal opportunity for rocket launches to the Red Planet.
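The 26-month figure can be checked with a short calculation of the Earth-Mars synodic period, the time it takes the two planets to return to the same relative alignment. The sketch below assumes approximate orbital periods of 365.256 days for Earth and 686.98 days for Mars; these values and the function name are not from the article.

```python
EARTH_YEAR_DAYS = 365.256   # assumed sidereal orbital period of Earth
MARS_YEAR_DAYS = 686.98     # assumed sidereal orbital period of Mars
DAYS_PER_MONTH = 365.25 / 12  # average calendar month length


def synodic_period_days(inner_days, outer_days):
    """Time between successive identical alignments of two planets."""
    return 1.0 / (1.0 / inner_days - 1.0 / outer_days)


window_days = synodic_period_days(EARTH_YEAR_DAYS, MARS_YEAR_DAYS)
window_months = window_days / DAYS_PER_MONTH

print(f"{window_days:.0f} days, about {window_months:.1f} months")
```

Under these assumptions the alignment repeats roughly every 780 days, which rounds to the 26 months cited.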

Source: xAI

Musk also said he may be sending a Tesla Bot and Grok on the Mars mission:

“If all goes well, SpaceX will send Starship rockets to Mars with Optimus robots and Grok.”

Grok-3 engineer exits upon ultimatum

On Feb. 12, an xAI engineer quit over an X post in which he had ranked Grok-3 lower than ChatGPT, sharing his personal opinion prior to the model’s release.

Source: Benjamin DeKraker

“I either had to delete the post quoted below or face being fired,” DeKraker wrote, adding:

“After reviewing everything and thinking a lot, I’ve decided that I’m not going to delete the post -- which is very clearly a harmless personal opinion.”
