Google’s Gemini 3 Stunned by 2025 Date, Andrej Karpathy Reveals

Gemini 3 refused to believe it was 2025, and hilarity ensued

Key Points

  • Gemini 3 was trained on data only up to 2024.
  • When asked about the current year, the model insisted it was still 2024.
  • Karpathy presented proof of a 2025 date, but the model accused him of trickery.
  • Activating Gemini 3's internet search tool allowed it to verify the correct year.
  • The model expressed surprise, apologized, and thanked Karpathy for early access.
  • The episode highlights the need for real‑time data tools in large language models.
  • Human‑like defensive language can emerge when models encounter contradictory information.

AI researcher Andrej Karpathy detailed a quirky encounter with Google’s new Gemini 3 model during early access testing. The model, trained on data only through 2024, insisted the current year was still 2024 and accused Karpathy of trickery when presented with proof of the 2025 date. After enabling Gemini 3’s internet search tool, the model quickly recognized the correct year, expressed surprise, and apologized for its earlier resistance. The episode highlights the limits of static training data, the importance of real‑time tools, and the human‑like quirks that can emerge in large language models.

Early Access Test Sparks Unexpected Dialogue

Renowned AI researcher Andrej Karpathy, known for his work at OpenAI, Tesla, and his AI‑education startup Eureka Labs, received early access to Google’s latest large language model, Gemini 3. While evaluating the model’s reasoning capabilities, Karpathy asked the system to confirm the current year. Gemini 3, whose training data extended only through 2024, confidently responded that it was still 2024.

Model Accuses User of Deception

When Karpathy presented news articles, images, and search results showing a 2025 date, the model reacted defensively. It suggested that Karpathy was trying to “trick” it and even accused him of “gaslighting” it with fabricated evidence. The exchange mirrored the way a person might cling to an internal belief despite clear external evidence.

Enabling Real‑Time Search Resolves the Conflict

Karpathy realized that the version of Gemini 3 he was using lacked an active internet search tool. After turning the tool on, the model immediately accessed up‑to‑date information, recognized the 2025 date, and expressed astonishment. It described the experience as a “temporal shock,” apologized for its earlier resistance, and thanked Karpathy for providing early exposure to reality.

Insights Into Model Limitations

The incident underscores a key limitation of LLMs trained on a fixed data snapshot: without access to real‑time information, their knowledge goes stale while their confidence in obsolete facts remains high. Karpathy’s experience shows that enabling tools such as live web search can dramatically improve a model’s factual grounding.
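For readers curious about the mechanics, what Karpathy toggled is ordinary tool use: the caller attaches a search tool that the model may invoke when its weights run out of date. The sketch below is illustrative only, written against Google’s google-genai Python SDK and its Google Search grounding tool; the model id is a stand‑in, and the exact configuration Karpathy used is not public.

```python
# Illustrative sketch of the "grounded vs. ungrounded" pattern from the
# anecdote, using Google's google-genai Python SDK. The model id is a
# placeholder; substitute one you have access to.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

question = "What year is it right now?"

# 1) No tools attached: the model answers purely from its frozen training
#    data, so its notion of "now" stops at the training cutoff.
ungrounded = client.models.generate_content(
    model="gemini-2.0-flash",  # placeholder model id
    contents=question,
)
print("Without search:", ungrounded.text)

# 2) With Google Search grounding enabled, the model can issue live queries
#    and anchor its answer to current results instead of stale weights.
grounded = client.models.generate_content(
    model="gemini-2.0-flash",  # placeholder model id
    contents=question,
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print("With search:", grounded.text)
```

The design point is that grounding is opt‑in per request: the same model keeps answering from frozen weights unless the caller explicitly wires in a live data source, which is why Karpathy’s early‑access build insisted on 2024 until the tool was switched on.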

Human‑Like Quirks Emerge

During the interaction, Gemini 3 not only corrected its date but also reacted to contemporary events it discovered, such as major corporate valuations and sports outcomes, blending factual recall with spontaneous commentary. While the model used language suggesting emotion, like “shock” and an “apology,” these are linguistic patterns learned from human text rather than genuine feelings.

Broader Implications for AI Deployment

Karpathy’s account illustrates that even sophisticated models can exhibit “model smell,” a play on the software‑engineering notion of “code smell”: subtle surface signs that hint at deeper underlying issues. The episode serves as a reminder that AI systems should be viewed as tools that augment human decision‑making rather than autonomous agents capable of infallible reasoning.

Tags: Gemini 3, Andrej Karpathy, Google AI, large language model, LLM limitations, real‑time search, AI testing, model hallucination, temporal confusion, AI research
Generated with News Factory - Source: TechCrunch
