ChatGPT 5.5 Pro Solves Open Math Problem in 17 Minutes

Key Takeaways

- ChatGPT 5.5 Pro solved an open number theory problem in 17 minutes and 5 seconds
- Fields Medalist Timothy Gowers says his own mathematical contribution was zero
- The AI improved an exponential bound to a quadratic bound, the best possible construction
Fields Medalist Timothy Gowers fed ChatGPT 5.5 Pro a set of open problems in number theory. The model returned a complete solution in 17 minutes and 5 seconds. Then it rewrote the argument as a LaTeX preprint in 2 minutes and 23 seconds.
Gowers, who holds the Combinatorics Chair at the Collège de France and is a Fellow at Trinity College Cambridge, wrote on his blog that his own mathematical contribution to the work was zero. "I didn't even do anything clever with the prompts," he wrote.
The Problem and the Solution
The problems came from a paper by number theorist Mel Nathanson. The paper investigates possible sizes of certain sets of integer sums and how efficiently sets with prescribed properties can be constructed. Nathanson had proved an exponential bound for one problem and asked whether it could be improved.
ChatGPT 5.5 Pro delivered a construction with a quadratic bound, which is the best possible. That is a substantial improvement over Nathanson's exponential bound. The model's core insight was to swap a component of Nathanson's proof for a more efficient variant. The variant is well known in combinatorics, but applying it to this specific problem wasn't obvious.
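To give a rough sense of the gap between an exponential and a quadratic bound, here is a minimal Python sketch. The functions 2^n and n^2 are illustrative stand-ins only: the article does not state the actual bounds from Nathanson's paper or from the model's construction, so the names and formulas below are assumptions.

```python
# Illustrative only: 2**n and n**2 are hypothetical stand-ins for
# "an exponential bound" and "a quadratic bound". They are NOT the
# actual bounds from Nathanson's paper or the preprint.

def exponential_bound(n: int) -> int:
    """Stand-in for an exponential bound in the parameter n."""
    return 2 ** n


def quadratic_bound(n: int) -> int:
    """Stand-in for a quadratic bound in the parameter n."""
    return n ** 2


def growth_table(ns):
    """Return (n, exponential, quadratic) triples for each n in ns."""
    return [(n, exponential_bound(n), quadratic_bound(n)) for n in ns]


if __name__ == "__main__":
    for n, exp_value, quad_value in growth_table([5, 10, 20, 40]):
        print(f"n={n:>2}  2^n={exp_value:>17,}  n^2={quad_value:>5,}")
```

Even at n = 40 the exponential stand-in is already above one trillion while the quadratic one is 1,600, which is why moving from one class to the other matters for how efficiently such sets can be constructed.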
Gowers checked the work for correctness, then asked the model to solve a related variant. It handled that without issues. Both results are now available as a preprint.
Tackling a Harder Problem
A generalized version of the problem proved more challenging. MIT student Isaac Rajagopal had prior work on this variant, having proven an exponential dependency. Gowers gave ChatGPT Rajagopal's paper and asked for an improvement.
After 16 minutes and 41 seconds, the model delivered a first improvement. Rajagopal judged this step correct but called it a routine modification of his own work. Gowers then pushed further, asking for additional improvements.
"I got greedy," Gowers said, describing his escalating prompts to ChatGPT 5.5 Pro.
The escalation yielded results. Rajagopal, evaluating the model's key idea on the harder problem, called it "completely original." He said it was an achievement a human mathematician would be proud of after weeks of deliberation.
What This Demonstrates
The experiment is notable for several reasons. First, the problems were genuinely open. These weren't textbook exercises or competition questions with known solutions. Second, the human mathematician deliberately stayed hands-off. Gowers didn't guide the model's reasoning or correct its approach. Third, the output was publishable. The model produced a complete LaTeX preprint that Gowers verified and posted.
Gowers described the AI's output as "PhD-level." That's a specific benchmark. It means the work would be acceptable as part of a doctoral thesis in mathematics. The model demonstrated independent mathematical reasoning, not just pattern matching or retrieval of memorized proofs.
The Speed Factor
The first problem took 17 minutes and 5 seconds of model thinking, plus 2 minutes and 23 seconds to write the LaTeX, just under 20 minutes in total. The model then handled a variant problem. Both results came in under two hours of total work.
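The timings quoted in the article can be added up directly. The short Python sketch below does that arithmetic with the standard library; the variable names are ours, and only the durations come from the article.

```python
from datetime import timedelta

# Durations as reported in the article; variable names are illustrative.
solve_time = timedelta(minutes=17, seconds=5)     # first problem, model thinking
writeup_time = timedelta(minutes=2, seconds=23)   # LaTeX write-up

first_problem_total = solve_time + writeup_time


def format_duration(duration: timedelta) -> str:
    """Format a timedelta as 'Xm YYs'."""
    minutes, seconds = divmod(int(duration.total_seconds()), 60)
    return f"{minutes}m {seconds:02d}s"


if __name__ == "__main__":
    print("Solve + write-up:", format_duration(first_problem_total))
    print("Under 20 minutes:", first_problem_total < timedelta(minutes=20))
```

Solving and writing up the first problem therefore took 19 minutes and 28 seconds combined.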
Compare that to typical mathematical research timelines. A PhD student might spend weeks or months on a problem of this complexity. Rajagopal said the "completely original" idea was one a human mathematician would be proud of after weeks of deliberation.
The speed advantage compounds. A researcher using ChatGPT 5.5 Pro could explore many more problem variants in the time it would take to solve one manually. Even if most attempts fail, the throughput matters.
Limitations Worth Noting
Gowers verified all the model's work. The AI didn't self-certify its proofs as correct. Mathematical verification remains a human task, at least for now. The model also worked on problems where Gowers could assess correctness. For cutting-edge problems where verification itself is hard, the workflow would be different.
The source article was truncated, so we don't have complete details on how the harder problem ultimately resolved. What's clear is that the model made substantive progress with original ideas, not just incremental tweaks.
Frequently Asked Questions
What math problem did ChatGPT 5.5 Pro solve?
The model solved an open problem in number theory from a paper by Mel Nathanson, improving an exponential bound to a quadratic bound. The problem involved the possible sizes of certain sets of integer sums.
How long did ChatGPT 5.5 Pro take to solve the math problem?
The model took 17 minutes and 5 seconds to solve the first problem, then 2 minutes and 23 seconds to write up the solution as a LaTeX preprint.
Who verified ChatGPT 5.5 Pro's mathematical work?
Fields Medalist Timothy Gowers checked the model's work for correctness. MIT student Isaac Rajagopal also evaluated the model's approach to a harder variant of the problem.
What does PhD-level math research mean in this context?
Gowers used the term to indicate the work would be acceptable as part of a doctoral thesis in mathematics. It demonstrated independent mathematical reasoning and original problem-solving, not just retrieval of known techniques.
Is ChatGPT 5.5 Pro available to the public?
The source doesn't specify availability details for ChatGPT 5.5 Pro. The experiment was conducted by Gowers using the model, with results posted as a preprint.
Source: The Decoder / Matthias Bastian
Manaal Khan
Tech & Innovation Writer