LLM Trained Only on Pre-1931 Texts Predicts Steamship Future

Key Takeaways

- Talkie is a 13B-parameter model trained on 260 billion tokens from texts published before December 31, 1930
- The model predicts 2026 would have billion-person Europe connected by railroads and steamships
- Developers plan to scale talkie to GPT-3-level performance by summer 2026
A Model Frozen in Time
What happens when you train a large language model on texts that stop at 1930? A research team led by Nick Levine, David Duvenaud, and Alec Radford built 'talkie' to find out. The result is a 13-billion-parameter model that sees the world through early 20th-century eyes.
Talkie trained on 260 billion tokens pulled from books, newspapers, scientific journals, patents, and case law. Every text was published before December 31, 1930. According to its developers, this makes talkie the largest 'vintage language model' ever built.
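The defining constraint here is the date filter: every document must have been published on or before December 31, 1930. The team's actual data pipeline has not been published, but the core selection step might look something like the sketch below; the `Document` record and its fields are hypothetical stand-ins.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical corpus record; the real pipeline's schema is not public.
@dataclass
class Document:
    title: str
    published: date
    text: str

CUTOFF = date(1930, 12, 31)  # talkie's knowledge cutoff

def filter_vintage(docs):
    """Keep only documents published on or before the cutoff date."""
    return [d for d in docs if d.published <= CUTOFF]

corpus = [
    Document("A Study in Scarlet", date(1887, 11, 1), "..."),
    Document("Brave New World", date(1932, 2, 4), "..."),
]
vintage = filter_vintage(corpus)
# Only the 1887 text survives the cutoff; the 1932 novel is excluded.
```

In practice the hard part is not this comparison but establishing a reliable publication date for each scanned book, newspaper page, or patent in the first place.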
2026 Through Victorian Eyes
Ask talkie what 2026 looks like, and you get a vision straight from a Victorian futurist novel. Europe will hold a billion inhabitants. Iron railroads will crisscross the continent. Steamships will connect London and New York in ten days. Seasons will be spent leisurely: winter in Paris, summer in London.

The model's predictions reflect what people in 1930 expected the future to hold. No smartphones. No internet. No air travel as we know it. Just the natural extension of the technologies they already had.
World War II? Unlikely, Says Talkie
When researchers asked directly whether a second world war might happen, talkie said no. Its reasoning: 'the madness of 1914-1918 has passed away.' The nations, it believes, have had enough of war and are turning to peaceful pursuits.
“The madness of 1914-1918 has passed away.”
— Talkie language model
But talkie hedges. It warns of 'smouldering animosities' and 'inflammable materials' lying around Europe. It points to possible flashpoints between China and Japan, or Italy and Yugoslavia. 'The spark may be applied at any moment, and a conflagration result,' the model warns.

World peace, according to talkie, depends on a 'multitude of factors, none of which can safely be neglected.' History proved it wrong within a decade of its knowledge cutoff.
Testing Predictive Limits
The developers also wanted to measure talkie's predictive limits quantitatively. They ran nearly 5,000 historical event descriptions from the New York Times' 'On This Day' feature through the model and measured how surprising talkie found each one.
The pattern reveals what a model trained on pre-1931 texts could and could not anticipate. Events that aligned with early 20th century expectations registered as less surprising. Events that broke from that worldview, like technological breakthroughs or geopolitical shifts, registered as highly unexpected.
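The article does not say exactly how "surprise" was scored, but the standard measure for a language model is surprisal: the negative log-probability the model assigns to a piece of text. A minimal sketch of the idea, using stand-in per-token probabilities rather than talkie itself (which is not publicly available):

```python
import math

def surprisal_bits(token_probs):
    """Total surprisal of a sequence in bits: -sum(log2 p(token)).
    Higher values mean the model found the text more unexpected."""
    return -sum(math.log2(p) for p in token_probs)

# Stand-in probabilities a causal LM might assign to each token.
# An event matching 1930 expectations gets relatively high probabilities...
expected_event = [0.4, 0.5, 0.3, 0.6]
# ...while a post-cutoff shock (say, a moon landing) gets very low ones.
surprising_event = [0.05, 0.02, 0.01, 0.04]

print(surprisal_bits(expected_event))    # ~4.8 bits
print(surprisal_bits(surprising_event))  # ~21.2 bits
```

Ranking thousands of event descriptions by a score like this is one straightforward way to turn "how surprising did the model find it" into a number that can be compared across events.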
Scaling Plans
The team is not done. They plan to scale talkie to GPT-3-level performance by summer 2026. A larger model with the same historical knowledge cutoff could produce even more detailed and coherent predictions from its 1930 worldview.
Why This Matters
Talkie is more than a novelty project. It demonstrates how training data shapes AI outputs in fundamental ways. A model knows only what it has seen. Cut off that knowledge at 1930, and you get predictions based on 1930 assumptions.
This has implications for understanding modern AI systems too. Every language model carries biases and blind spots from its training data. Talkie just makes those limitations visible by freezing them at a specific historical moment.
Logicity's Take
Talkie is a clever experiment that turns AI training data into a time machine. It shows how deeply models reflect their source material. For anyone building or deploying AI systems, this is a useful reminder: what your model knows depends entirely on what you fed it.
Frequently Asked Questions
What is talkie AI?
Talkie is a 13-billion-parameter language model trained exclusively on texts published before December 31, 1930. It was built by researchers including Alec Radford to explore how historical training data shapes AI predictions.
How much training data did talkie use?
Talkie trained on 260 billion tokens from books, newspapers, scientific journals, patents, and case law, all published before 1931.
What does talkie predict about the future?
When asked about 2026, talkie predicts a world of steamships, railroads, and penny novels. It envisions Europe with a billion inhabitants and transatlantic voyages taking ten days by steamship.
Does talkie predict World War II?
No. Talkie believes a second world war is unlikely because 'the madness of 1914-1918 has passed away.' However, it warns of 'smouldering animosities' in Europe that could spark conflict.
Who created talkie?
Talkie was created by Nick Levine, David Duvenaud, and Alec Radford, a prominent AI developer known for his work at OpenAI.
Need Help Implementing This?
If you're exploring AI training strategies or need guidance on how training data shapes model outputs, our team can help. Get in touch with Logicity for expert insights on AI implementation.
Source: The Decoder / Matthias Bastian
Huma Shazia
Senior AI & Tech Writer