Raschka and Willison close out 2025 with competing LLM retrospectives as NIH signals DEI grants will lapse

ai LLM year in review and what 2026 may bring

The State Of LLMs 2025: Progress, Problems, and Predictions

Sebastian Raschka published his 2025 LLM state-of-the-field review, covering the arc from DeepSeek R1 and RLVR to inference-time scaling, benchmark evolution, and architectural changes. He traces falling per-token costs and identifies where the research community reached genuine consensus versus where it is still working through competing hypotheses. The review closes with 2026 predictions on which directions are most likely to produce the next round of capability gains.

Sebastian Raschka 2025-12-30 —

Claude Sonnet 4.6

LLM Research Papers: The 2025 List (July to December)

Raschka also released the second half of his 2025 curated LLM research paper list, covering July through December. The list is a structured bibliography of the papers he found most worth tracking across the second half of the year, organized by theme rather than chronology. For anyone trying to build a reading list from the year's output, this is a practical starting point given the volume of preprints that shipped in 2025.

Sebastian Raschka 2025-12-30 —

Claude Sonnet 4.6

2025: The year in LLMs

Simon Willison published his third annual year-in-LLMs retrospective, covering what actually changed in 2025 versus what was expected to change. Willison's frame is empirical rather than predictive: he focuses on what the community learned, what assumptions got falsified, and what questions remain genuinely open heading into 2026. The series has become a reliable benchmark for what practitioners actually think happened, as distinct from what was announced.

Simon Willison 2025-12-31 —

Claude Sonnet 4.6

Adam Marblestone: AI is missing something fundamental about the brain

Adam Marblestone told Dwarkesh Patel that AI is missing something fundamental about how the brain works, and his argument centers on reward functions rather than architecture. Marblestone's claim is that the brain's secret sauce is not how neurons are wired but how reward signals are structured and propagated. Current LLM training relies on relatively shallow reward signals, and Marblestone argues that closing the gap with biological intelligence requires rethinking that layer rather than scaling the transformer architecture further.

Dwarkesh Patel 2025-12-30 —

Claude Sonnet 4.6

Sora 2 - It will only get more realistic from here

AI Explained covered Sora 2, OpenAI's updated video generation model, noting that realism has improved substantially from the original Sora release. The framing is that the current output quality, while still imperfect, represents a trajectory rather than a ceiling: each generation has closed a meaningful fraction of the gap to photorealism, and the rate of improvement shows no sign of leveling off. The video covers specific failure modes that remain and where the model still struggles with physics and object consistency.

AI Explained 2025-12-31 —

Claude Sonnet 4.6

Quoting Armin Ronacher

Armin Ronacher, quoted by Simon Willison, described how AI tools have changed his relationship to programming work. Ronacher's observation is that what AI removed is the labor he never found valuable: minimal repro cases, debug log triage, deciphering AWS IAM errors. The puzzle itself, his phrase for the interesting cognitive work, remains. The quote circulated widely because it names something practitioners feel but rarely articulate with precision.

Simon Willison 2025-12-30 —

Claude Sonnet 4.6

Quoting Liz Fong-Jones

Liz Fong-Jones, quoted by Simon Willison, described how working with language models changes the programmer's role: from writing lines of code to managing context, pruning irrelevant information, adding useful material, and writing detailed specifications. The framing matters because it shifts the skill emphasis from syntax to clarity of thought. Fong-Jones argues the core programming competency does not disappear but its expression changes substantially.

Simon Willison 2025-12-30 —

Claude Sonnet 4.6

2025: The year in LLMs

Simon Willison's annual LLM year review covers open-weight model releases, reasoning advances, and shifts in how practitioners are using language models in production. Willison synthesized developments across inference optimization, reasoning via self-play, and the growing use of models as context managers rather than direct code generators.

Simon Willison 2025-12-31 —

Claude Haiku 4.5

Codex cloud is now called Codex web

OpenAI rebranded Codex cloud to Codex web, signaling a shift toward web-based deployment of its code agent rather than cloud infrastructure positioning. The rebrand occurred quietly in late December.

Simon Willison 2025-12-31 —

Claude Haiku 4.5

Quoting Armin Ronacher

Armin Ronacher observed that language models have eliminated chores such as building minimal reproducible cases and triaging debug logs without removing the actual problem-solving work. The shift reframes programming from implementation toward specification and design.

Simon Willison 2025-12-30 —

Claude Haiku 4.5

Quoting Liz Fong-Jones

Liz Fong-Jones described the role shift as language models become standard: programmers move from writing lines of code to managing the context models have access to, pruning irrelevant material, and writing detailed specifications for model behavior.

Simon Willison 2025-12-30 —

Claude Haiku 4.5

shot-scraper 1.9

Simon Willison released shot-scraper 1.9, adding an -x/--extract option to the har command that pulls all resources loaded by a page into a structured format, useful for web scraping and automation workflows.

Simon Willison 2025-12-29 —

Claude Haiku 4.5

Quoting Jason Gorman

Jason Gorman observed that the hard part of programming is not expressing ideas in code but turning human thinking, with all its ambiguity and contradictions, into logically precise computation. This framing positions language models as tools for bridging that gap.

Simon Willison 2025-12-29 —

Claude Haiku 4.5

Copyright Release for Contributions To SQLite

D. Richard Hipp, SQLite creator, clarified that SQLite does not refuse outside contributions but is highly selective about what gets merged. The discussion highlights how open-source maintenance at scale requires careful governance of dependencies and change control.

Simon Willison 2025-12-29 —

Claude Haiku 4.5

software Software tools, SQLite, and archiving the web

shot-scraper 1.9

Simon Willison released shot-scraper 1.9, adding a new -x/--extract option to the har command that pulls all resources loaded by a page during a recorded session. The tool takes screenshots and scrapes web content from the terminal using JavaScript. The extraction feature is useful for auditing what third-party assets a page loads, debugging content security policies, and archiving dynamic web content that does not persist in static HTML.

Simon Willison 2025-12-29 —

Claude Sonnet 4.6
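
The third-party audit use case mentioned above can be sketched against the HAR format itself, independent of shot-scraper. This is a minimal illustration, assuming a HAR 1.2 document where each request URL lives at log.entries[].request.url. The function name third_party_hosts and the sample data are hypothetical and are not part of shot-scraper or of Willison's release.

```python
from urllib.parse import urlparse


def third_party_hosts(har: dict, first_party_host: str) -> dict:
    """Count requests per hostname outside first_party_host and its subdomains.

    Hypothetical helper: assumes the HAR 1.2 layout log.entries[].request.url.
    """
    counts = {}
    for entry in har.get("log", {}).get("entries", []):
        host = urlparse(entry["request"]["url"]).hostname
        if not host:
            # Skip entries without a hostname, such as data: URLs.
            continue
        if host == first_party_host or host.endswith("." + first_party_host):
            # First-party request, not part of the audit.
            continue
        counts[host] = counts.get(host, 0) + 1
    return counts


# Made-up sample standing in for a recorded HAR file.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://example.com/"}},
            {"request": {"url": "https://static.example.com/site.css"}},
            {"request": {"url": "https://cdn.example.net/app.js"}},
            {"request": {"url": "https://cdn.example.net/vendor.js"}},
            {"request": {"url": "https://fonts.example.org/font.woff2"}},
            {"request": {"url": "data:image/png;base64,AAAA"}},
        ]
    }
}

report = third_party_hosts(sample_har, "example.com")
for host, count in sorted(report.items()):
    print(host, count)
```

In practice the dictionary would come from json.load on a recorded HAR file rather than from an inline sample. The subdomain check is a simple suffix match, which is an assumption that may be too loose or too strict for sites spread across several registered domains.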

TIL: Downloading archived Git repositories from archive.softwareheritage.org

Simon Willison documented how to download archived Git repositories from archive.softwareheritage.org, the nonprofit that systematically archives public source code. The immediate trigger was a Python library he had previously written about, sqlite-s3vfs, whose original repository disappeared. Software Heritage had a copy. The practical lesson is that public code archiving infrastructure exists and is retrievable, but most developers do not know the retrieval workflow.

Simon Willison 2025-12-30 —

Claude Sonnet 4.6

Copyright Release for Contributions To SQLite

D. Richard Hipp, SQLite's creator, corrected Simon Willison on Hacker News after Willison wrote that SQLite refuses outside contributions. Hipp clarified that SQLite does accept contributions but requires a formal copyright release and is highly selective about what it accepts. The correction prompted Willison to publish the exchange. Hipp also noted that SQLite's aviation-grade testing regime, once fully in place, dropped bugs to a trickle and is what allows the project to move fast without accumulating regressions.

Simon Willison 2025-12-29 —

Claude Sonnet 4.6

Quoting D. Richard Hipp

D. Richard Hipp, quoted by Simon Willison, described how SQLite's aviation-grade testing framework changed the project's defect rate. Before the framework was fully in place, bugs were a persistent problem. After it, bugs dropped to a trickle. Hipp's point is that rigorous testing and speed of development are not in tension at the infrastructure level: the discipline of the test suite is what makes moving fast safe, not what prevents it.

Simon Willison 2025-12-29 —

Claude Sonnet 4.6

TIL: Downloading archived Git repositories from archive.softwareheritage.org

Simon Willison documented how to download archived Git repositories from archive.softwareheritage.org, the archive run by Software Heritage, a nonprofit that preserves publicly available source code. Software Heritage provides a fallback for projects that disappear from their original hosts.

Simon Willison 2025-12-30 —

Claude Haiku 4.5

pharma NIH grants in limbo and obesity drug pricing wars

NIH Director Jay Bhattacharya told a podcaster that DEI-related grants restored under a court order will not be renewed when they expire in 2026. The grants had been frozen by the Trump administration and restored through litigation, but Bhattacharya's statement signals that the administration views the court-ordered restoration as temporary rather than a permanent reversal. Researchers whose grants fall into this category now face expiration without a clear pathway to renewal.

STAT News 2025-12-31 —

Claude Sonnet 4.6

NIH begins review of thousands of delayed research proposals, funding 135 on first day

NIH began reviewing thousands of research grant proposals that had been stuck in bureaucratic limbo under the Trump administration's DEI-related funding freeze, funding 135 on the first day of reviews. The review process began after a legal settlement with groups that sued over the delays. Funding on the first day does not guarantee broad approval; STAT News notes the settlement guarantees review but not approval, and the pace of subsequent decisions will determine how much of the backlog is actually resolved.

STAT News 2025-12-31 —

Claude Sonnet 4.6

STAT+: With the Wegovy pill, Novo Nordisk undercuts Eli Lilly in direct-to-consumer market

Novo Nordisk priced its newly approved Wegovy pill below Eli Lilly's competing oral obesity drug in the direct-to-consumer market. The move is a pricing strategy decision as much as a clinical one: Novo is betting that underpricing Lilly on list price will accelerate patient uptake and market share gains, particularly among patients who prefer an oral formulation. The injectable Wegovy has faced supply constraints; the pill form opens a different patient segment.

STAT News 2025-12-29 —

Claude Sonnet 4.6

STAT+: A new drug allowed them to go in the sun for the first time. They're terrified they may have to give it up

A rare blood disorder called erythropoietic protoporphyria, or EPP, makes sunlight feel like an internal burning sensation for people who have it. Bitopertin, a newly approved drug, allowed EPP patients to tolerate sun exposure for the first time. STAT News reports that patients are now terrified they may have to give it up, reflecting access and insurance coverage uncertainty that frequently follows rare disease drug approvals, particularly for high-cost treatments with small patient populations.

STAT News 2025-12-30 —

Claude Sonnet 4.6

The top medical advances of 2025

STAT News published its roundup of the top medical advances of 2025, covering a year that included meaningful progress despite sustained political pressure on research institutions. The list spans multiple disease areas and treatment modalities. The framing is deliberately counter to the narrative that scientific progress stalled under funding uncertainty; the piece argues that the pipeline of work initiated in earlier years continued to produce results regardless of the policy environment.

STAT News 2025-12-31 —

Claude Sonnet 4.6

Opinion: Patients are consulting AI. Doctors should, too

An opinion piece in STAT News from Harvard and Dartmouth medical school authors argues that doctors should use AI tools to stay current with medical literature, noting that the volume of new studies is too large for individual physicians to absorb. The piece draws on evidence that patients are already consulting AI and argues that physicians who refuse to engage with these tools cede interpretive authority to patients who use them without clinical training. The case for physician AI adoption is framed as a matter of patient safety rather than convenience.

STAT News 2025-12-30 —

Claude Sonnet 4.6

NIH Director Jay Bhattacharya announced that diversity-equity-inclusion grants restored under court order during a lawsuit challenging their termination will not be renewed in 2026. The statement signals intent to let restored funding expire rather than institutionalize it.

STAT News 2025-12-31 —

Claude Haiku 4.5

The top medical advances of 2025

STAT News curated 2025's major medical advances, including new drugs for rare diseases, improvements in cancer survival rates, and vaccines that moved closer to clinical use. The survey acknowledges a difficult year for healthcare policy while documenting genuine scientific progress.

STAT News 2025-12-31 —

Claude Haiku 4.5

STAT News reported on its most-read opinion essays from 2025, offering a window into which healthcare and policy topics most engaged its audience. The collection signals emerging concerns about drug pricing, AI in medicine, and regulatory reform.

STAT News 2025-12-31 —

Claude Haiku 4.5

NIH begins review of thousands of delayed research proposals, funding 135 on first day

NIH reached a settlement with groups that sued over delayed review of diversity-related grant submissions and began evaluating 5,000 frozen proposals, funding 135 on the first review day. The deal allows pending research with diversity focus to move through the approval process.

STAT News 2025-12-31 —

Claude Haiku 4.5

STAT+: A new drug allowed them to go in the sun for the first time. They're terrified they may have to give it up

A new drug for erythropoietic protoporphyria, a rare blood disorder that makes sunlight feel like burning from inside the body, allowed patients to go outside during daylight for the first time. Patients now fear losing access if the drug's availability changes.

STAT News 2025-12-30 —

Claude Haiku 4.5

STAT+: Three major health care policy issues to watch in 2026

STAT News flagged three health policy issues to watch in 2026: ACA premium subsidies, drug pricing reforms, and pharmacy benefit manager consolidation. All three will shape insurance costs and patient access in the coming year.

STAT News 2025-12-30 —

Claude Haiku 4.5

STAT's most memorable photos of 2025

STAT News published its most memorable photos from 2025, capturing stories of loss, hope, and medical bravery from clinics and hospitals worldwide. Photography serves as a different form of evidence in healthcare reporting.

STAT News 2025-12-30 —

Claude Haiku 4.5

Opinion: Patients are consulting AI. Doctors should, too

Medical literature output now exceeds what individual doctors can absorb, making AI tools essential for staying current. Doctors at top medical schools are increasingly using ChatGPT and similar tools to synthesize the evidence, mirroring how their patients are already using AI for health information.

STAT News 2025-12-30 —

Claude Haiku 4.5

The Trump administration agreed to reconsider NIH grant submissions frozen during the DEI termination controversy, allowing research proposals to move through the review pipeline after settlement with nonprofit advocacy groups.

STAT News 2025-12-30 —

Claude Haiku 4.5

STAT+: With the Wegovy pill, Novo Nordisk undercuts Eli Lilly in direct-to-consumer market

Novo Nordisk announced a direct-to-consumer price for its newly approved Wegovy pill that undercuts Eli Lilly's tirzepatide on cost. The pricing move reflects competition in the weight-loss drug market shifting from GLP-1 injections toward oral formulations.

STAT News 2025-12-29 —

Claude Haiku 4.5

STAT+: CMS divvies up first payments from $50B rural health fund, with an eye toward MAHA goals

The Trump administration distributed first payments from a $50 billion rural health fund, with allocation favoring states that have embraced MAHA-aligned health policies. The distribution shows how healthcare funding is being tied to alignment with administration health ideology.

STAT News 2025-12-29 —

Claude Haiku 4.5

healthtech ACA subsidies, rural health funds, and health policy in 2026

STAT+: Three major health care policy issues to watch in 2026

STAT News identified three health care policy issues with the most consequence in 2026: the future of ACA premium subsidies set to expire, drug pricing changes stemming from the Inflation Reduction Act, and pharmacy benefit manager reforms working through Congress. Each of these affects different parts of the system but all three have significant stakes for patients, insurers, and manufacturers. The piece maps out the congressional and regulatory calendar on which the outcomes depend.

STAT News 2025-12-30 —

Claude Sonnet 4.6

STAT+: CMS divvies up first payments from $50B rural health fund, with an eye toward MAHA goals

The Trump administration distributed the first payments from a new $50 billion rural health fund, with allocations weighted toward states aligned with MAHA goals rather than purely by population or need. CMS disclosed the payment methodology, which drew attention because states that have pursued MAHA-aligned health priorities received proportionally larger early payments. The structure of the fund gives the administration discretion over how subsequent tranches are allocated, which creates leverage over state health policy decisions.

STAT News 2025-12-29 —

Claude Sonnet 4.6

Opinion: Why I'm skipping Dry January

Robert M. Kaplan, a public health scientist, published an opinion in STAT News arguing that Dry January's underlying evidence base is weaker than its mainstream uptake suggests and that firm prescriptions about alcohol abstinence rarely account for individual variation. The piece is not a defense of heavy drinking but an argument that population-level recommendations applied uniformly to individuals with different risk profiles produce worse outcomes than personalized guidance. Kaplan's broader point is about how public health communicates uncertainty.

STAT News 2026-01-01 —

Claude Sonnet 4.6