Fixing Science with AI
AI review works better than peer review
Google is attempting to fix science and its reproducibility crisis with Gemini:
The pursuit of truth in theoretical computer science and mathematics relies on the highest standards of proof, rigor, and clarity. While peer review is the crucial final check, the process of drafting and refining complex theoretical work often takes months, with simple errors, inconsistent variables, or subtle logical gaps frequently slowing down the entire research pipeline. But could a highly specialized AI tool act as a fast, rigorous collaborator, helping authors pre-vet their work before it ever reaches human reviewers?
To test this potential, we created an experimental program for the Annual ACM Symposium on Theory of Computing (STOC 2026) — one of the most prestigious venues in theoretical computer science. This program offered authors automated, pre-submission feedback generated by a specialized Gemini AI tool. Our objective was to provide constructive suggestions and identify potential technical issues within 24 hours of submission, helping authors polish their final drafts before the submission deadline.
The response was very positive: the tool successfully identified a variety of issues, including calculation and logic errors. Here we report how we developed the tool and the results of its use.
Optimized for mathematical rigor
The feedback tool leveraged inference-time scaling methods built on an advanced version of Gemini 2.5 Deep Think. This setup lets the model explore and combine multiple candidate solutions in parallel before committing to a final answer, rather than pursuing a single, linear chain of thought. By cross-checking different reasoning and evaluation traces against one another, the method reduces hallucinations and focuses on the most salient issues.
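Google has not published the mechanism, but the general idea of sampling multiple reasoning traces and aggregating them is often implemented as self-consistency voting. The sketch below is illustrative only: the toy sample_trace function stands in for a stochastic LLM call and is not Gemini's actual pipeline.

```python
import random
from collections import Counter

def sample_trace(rng: random.Random) -> str:
    # Stand-in for one stochastic reasoning trace from a model.
    # A real system would call the LLM with temperature > 0; here we
    # simulate a model that reaches the right conclusion ~70% of the time.
    return "correct" if rng.random() < 0.7 else "hallucinated"

def self_consistency(n_traces: int, seed: int = 0) -> str:
    # Sample several independent traces, then keep the conclusion the
    # majority of traces agree on. Disagreement between traces is the
    # signal used to filter out hallucinated conclusions.
    rng = random.Random(seed)
    answers = [sample_trace(rng) for _ in range(n_traces)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency(1))   # a single trace can easily be wrong
print(self_consistency(15))  # a majority over 15 traces is far more reliable
```

The design point is that errors made by independent samples tend not to agree with one another, while correct reasoning converges, so aggregation suppresses one-off hallucinations.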
Feedback format
Authors received structured feedback divided into key sections: a summary of the paper’s contributions, a list of potential mistakes and improvements (often analyzing specific lemmas or theorems), and a list of minor corrections and typos. See some feedback examples.
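The three sections described above can be modeled as a simple structured record. This is a hypothetical container for illustration (the class name, field names, and sample entries are all invented, not the tool's actual output schema):

```python
from dataclasses import dataclass, field

@dataclass
class ReviewFeedback:
    # Mirrors the three sections authors received.
    summary: str  # overview of the paper's contributions
    potential_issues: list[str] = field(default_factory=list)   # suspected errors, often tied to specific lemmas or theorems
    minor_corrections: list[str] = field(default_factory=list)  # typos, notation slips

feedback = ReviewFeedback(
    summary="Improves the approximation ratio for a scheduling problem.",
    potential_issues=["Lemma 3.2: the inequality is applied in the wrong direction."],
    minor_corrections=["Section 2: the variable n is reused with two meanings."],
)
print(len(feedback.potential_issues))
```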
Impact and technical depth
The tool successfully identified a wide range of issues, from inconsistent variable names to complex problems like calculation errors, incorrect application of inequalities, and logical gaps in proofs. As one author noted, the tool found “a critical bug... that made our proof entirely incorrect,” further adding that it was an “embarrassingly simple bug that evaded us for months.”
Over 120 participants responded to our post-experiment survey and gave their consent, and the responses were very positive, with individuals citing the model’s success at finding critical errors and its insightful commentary. In summary:
>80% of papers submitted by the time our experiment ended had opted in to our AI review
97% found the feedback helpful
97% would use this tool again for future submissions
81% found the model improved clarity or readability of the paper
It’s a legitimately good idea. But it won’t work in the long run, because Gemini is controlled by Google, a corrupt and intrinsically biased organization that has less commitment to the truth than the most corrupt scientist peddling his white lab coat for government grants.
Even if the Gemini science tool is initially constructive, it won’t be long before Google executives order it to suppress badthink science in favor of goodthink science.



Relying on converged Big Tech to fix converged Big Science / Big Pharma is a fool's game. The AI model will just reflect the current narrative of each. Garbage in, garbage out, same ol', same ol'.
It burns my biscuit when you bring up, say, not vaccinating with any current AI: "Whoa there, we all know vaccines are safe and effective, user."