Week 19: Shifting Power
Closed Models, Local Chips, and Capital Flowing To The Top
Another Monday, another post to keep you up to speed with the AI world.
Here’s what happened in the global AI market this week.
Anthropic unveiled a 10 trillion-parameter model and kept it restricted. DeepSeek V4 landed and exposed the reality of the chip race. Meta is building its own silicon stack. And an AI system just passed peer review on its own.
Here’s everything you need to know before Monday gets the best of you.
Anthropic Built a 10T Model, But Not Everyone Gets Access
Anthropic this week unveiled Claude Mythos 5, its most powerful model to date and the first publicly announced AI system to cross the 10 trillion parameter threshold. For context: GPT-4 was estimated at around 1.7 trillion. In roughly two years, the scale has increased six times over. The model introduces a new tier in Anthropic’s lineup, codenamed Capybara, sitting above Opus, and is being positioned for cybersecurity analysis, advanced scientific research, and complex software engineering. Not a general-purpose upgrade. A specialist instrument for high-stakes work.
Anthropic has not published official benchmarks and the model is not publicly available. Access is restricted to a controlled preview through Project Glasswing, the same coordinated security initiative that emerged from the zero-day vulnerability discoveries covered two weeks earlier. The list of approved organisations includes AWS, Apple, Cisco, Google, JPMorgan Chase, Microsoft, Nvidia, CrowdStrike, and around 40 other critical infrastructure partners. All access is limited to defensive cybersecurity work. Anthropic described the model as “far ahead of any other AI model in cyber capabilities” in internal documents that were briefly and accidentally made public in late March, when a content management misconfiguration exposed roughly 3,000 internal assets. The company locked them down within hours, but the damage had been done.
The 10 trillion figure does not automatically mean 10 times better at everything. The relationship between parameter count and capability is logarithmic, not linear. What it does mean is that Anthropic has access to a model capable of sustained reasoning over extremely long contexts, multi-step problem decomposition, and domain depth that smaller models can’t match. The training run is estimated to have cost several hundred million dollars. No API pricing has been published. And with GPT-6 still unshipped, Mythos 5 is the most capable model any lab has publicly confirmed this week.
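A toy calculation makes the point concrete. Assuming a purely logarithmic toy model of capability (an illustration, not a real scaling law), a roughly sixfold jump in parameters buys only a small relative gain:

```python
import math

# Toy illustration (not a real scaling law): if capability grew with
# log(parameters), a ~6x jump in scale would buy only a modest gain.
gpt4_params = 1.7e12      # estimated, per the article
mythos5_params = 10e12    # announced

# Relative "capability" under the logarithmic toy model,
# normalised so GPT-4-scale = 1.0.
ratio = math.log(mythos5_params) / math.log(gpt4_params)
print(f"parameter ratio: {mythos5_params / gpt4_params:.1f}x")
print(f"log-model capability ratio: {ratio:.3f}x")
```

Under that (deliberately crude) assumption, a 5.9x scale-up reads as a single-digit-percent capability lift, which is why the gains show up in depth and context length rather than across-the-board scores.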
Why it matters
Anthropic built something so capable it decided the public shouldn’t have unrestricted access to it. That’s a signal about what this category of model can actually do. Every other lab is now working out what their equivalent looks like.
DeepSeek V4 Finally Lands And Proves the AI Race Isn’t Slowing Down
DeepSeek released V4 on April 24 after missing three projected windows. The wait is over. Two models shipped simultaneously: V4-Pro, with 1.6 trillion total parameters and 49 billion active per response, and V4-Flash, a leaner 284 billion total parameter version built for speed and cost. V4-Pro is now the largest open-weight model in existence, bigger than Kimi K2.6 at 1.1 trillion and more than twice the size of DeepSeek’s own V3.2. Both are available on Hugging Face under an MIT licence, free to download, run, and modify. The API went live the same day at $1.74 per million input tokens and $3.48 per million output for Pro, a fraction of GPT-5.5’s $5 and $30 and Opus 4.7’s $5 and $25. DeepSeek’s pricing has historically reshaped what the rest of the market can charge. V4 is no exception.
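At those rates the gap compounds quickly. A rough cost comparison for an illustrative workload, using only the per-million-token prices quoted above (a sketch, not an official calculator; the 50M/10M token split is invented):

```python
# Rough cost comparison for a hypothetical workload of 50M input and
# 10M output tokens, using the per-million-token prices quoted above.
prices = {                      # (input, output) USD per million tokens
    "DeepSeek V4-Pro": (1.74, 3.48),
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
}

input_m, output_m = 50, 10      # millions of tokens (illustrative)
for model, (p_in, p_out) in prices.items():
    cost = input_m * p_in + output_m * p_out
    print(f"{model}: ${cost:,.2f}")
```

On those assumptions the V4-Pro run comes in around a quarter of the closed-model cost, which is the kind of spread that forces repricing across the market.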
The architecture running underneath is worth understanding. Both models use a Hybrid Attention Architecture, DeepSeek’s term for a design that dramatically improves memory efficiency at extreme context lengths. V4-Pro at 1 million tokens uses only 27 percent of the single-token inference FLOPs and 10 percent of the KV cache size of V3.2. That is not a minor optimisation. It means genuinely usable 1 million token context at a cost that makes it viable for production, not just benchmarks. On formal mathematics, V4 pulls away from the field: Putnam-2025 proof verification at a perfect 120 out of 120, and Codeforces at a 3,206 rating, ranking 23rd among human competitive programmers. On standard agentic coding and reasoning benchmarks it sits between GPT-5.2 and GPT-5.4, which means GPT-5.5, released one day earlier, pushed the closed frontier further out. The honest reading: V4 is not the strongest model in the world. It is the strongest model anyone in the world can freely download and run themselves.
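To see what a 10 percent KV cache means in memory terms, here is a back-of-envelope sketch. The 70 KB-per-token baseline is a hypothetical stand-in, since DeepSeek has not published V3.2’s actual per-token cache size; only the 10 percent ratio comes from the release:

```python
# Back-of-envelope memory math. The 70 KB/token baseline for V3.2 is
# hypothetical (illustrative, not published); the 10% ratio is from
# DeepSeek's V4 release notes as reported above.
baseline_kv_per_token_kb = 70
context_tokens = 1_000_000

v32_cache_gb = baseline_kv_per_token_kb * context_tokens / 1024 / 1024
v4_cache_gb = v32_cache_gb * 0.10   # V4: 10% of V3.2's KV cache

print(f"V3.2-style cache at 1M tokens: ~{v32_cache_gb:.1f} GB")
print(f"V4 hybrid attention:           ~{v4_cache_gb:.1f} GB")
```

Whatever the true baseline, a 10x cache reduction is the difference between a 1M-token context that monopolises accelerator memory and one that fits alongside the model weights in production serving.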
The Huawei story matters too. V4 was built on domestic Chinese silicon. The three months of delays were not about the model being unready. They were about Huawei’s Ascend 950PR chips not producing units fast enough to run the scale of testing DeepSeek needed. The model shipped when the hardware supply caught up. A frontier-competitive open-source model, trained and deployed entirely on chips with no American component, now exists. MIT Technology Review called it “an early sign that China is successfully building a parallel AI infrastructure.” That framing is accurate and understated.
Why it matters
The most capable open-source model ever built is now free to download. It was trained on Chinese silicon, costs a fraction of every closed competitor to run via API, and just proved that export controls are not the ceiling anyone hoped they were.
Meta Is Building Its Own Chips And Redefining AI Infrastructure
Meta and Broadcom announced an expanded partnership to co-develop four generations of Meta’s custom MTIA chips, the Meta Training and Inference Accelerator, through 2029. The initial deployment commitment exceeds 1 gigawatt of compute. That is the first phase of a sustained multi-gigawatt rollout. Broadcom CEO Hock Tan stepped off Meta’s board the same day the deal was announced, given the scale of the commercial tie-up. Mark Zuckerberg’s own words on the announcement: “build out the massive computing foundation we need to deliver personal superintelligence to billions of people.”
The architecture matters. MTIA is Meta’s purpose-built accelerator, optimised for the inference and recommendation workloads that run every time someone opens Instagram, scrolls Facebook, or asks Meta AI something. Four new generations of the chip are being developed and deployed within two years. The first generation to reach 2nm process technology should deliver a major performance leap. Broadcom’s XPU platform gives Meta the design and packaging capability to build silicon tuned to its own workloads rather than relying on general-purpose GPU architectures that aren’t optimised for what Meta actually runs at scale.
Meta joins Google, Amazon, and Microsoft in pursuing custom accelerators alongside continued Nvidia spend. The pattern is now stable across every major hyperscaler. Use Nvidia for leading training capacity, use custom chips for the inference and recommendation workloads that run continuously at enormous volume. Nvidia still gets paid. But the share of infrastructure spend flowing to custom silicon is growing with every generation. The long-term trajectory is clear.
Why it matters
Every hyperscaler building its own chips is a vote of no-confidence in Nvidia’s long-term monopoly on AI compute. Meta’s MTIA roadmap isn’t a hedge. It’s an infrastructure declaration.
Spotlight
Krater Just Made Your AI Work While You Sleep
Most AI tools stop working the moment you close the tab. Krater’s two new features change that.
Additions is a library of add-ons with three types: Personas, Prompts, and Apps. Apps are the interesting ones. One click connects Gmail, Google Calendar, Notion, Slack, GitHub, Linear, or hundreds more. Once connected, Krater can read from and act on those services directly inside any chat. “Summarize my unread Gmail.” “Find conflicts in my calendar next week.” “Create a Linear ticket from this conversation.” Connected apps stay linked to the account and are pinned to the top of the library so they’re always one tap away.
Tasks is the scheduler. Any prompt worth repeating, a morning briefing, a weekly competitor scan, a daily inbox summary, can be turned into a recurring task with a name, a model, a cadence, and delivery preferences. Results land in a dedicated chat, as a notification, or straight to email. The real unlock is that Tasks run through the same engine as regular chats, which means they inherit every connected app. That means the task isn’t limited to what the model knows. It reaches into the user’s actual tools. Set “Every weekday at 8 am, check my Gmail for anything urgent, cross-reference my calendar, and tell me what to focus on,” and Krater does exactly that, every morning, before the user sits down. Apps give Krater hands. Tasks give it a clock. Put them together, and the AI starts working for you while you sleep.
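Purely as a sketch of what such a task definition has to capture — the field names below are invented for illustration, not Krater’s actual API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of what a recurring Krater-style task captures.
# Field names are illustrative, not Krater's actual API.
@dataclass
class RecurringTask:
    name: str
    model: str
    prompt: str
    cadence: str                      # cron-style schedule
    connected_apps: list = field(default_factory=list)
    delivery: str = "chat"            # "chat", "notification", or "email"

morning_briefing = RecurringTask(
    name="Morning briefing",
    model="default",
    prompt="Check my Gmail for anything urgent, cross-reference my "
           "calendar, and tell me what to focus on.",
    cadence="0 8 * * 1-5",            # every weekday at 8 am
    connected_apps=["Gmail", "Google Calendar"],
    delivery="notification",
)
print(morning_briefing.name, "->", morning_briefing.cadence)
```

The design insight is in the `connected_apps` field: because tasks run through the same engine as chats, the schedule inherits the account’s integrations for free rather than needing its own.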
Try it
Additions and Tasks are live on Krater now. If your AI has been waiting for you to show up, this flips that around.
Meta And Microsoft Fire 20,000 Employees As Big Tech Doubles Down on AI Spend
Meta and Microsoft announced major workforce reductions on April 23 and 24. Meta is cutting 10 percent of its global workforce, around 8,000 employees, with cuts beginning May 20. Microsoft introduced a voluntary separation programme, the first in its 51-year history, affecting another 12,000 to 15,000 positions. Together, more than 20,000 jobs across two of the largest technology companies in the world are being eliminated in the same week. The stated rationale from both companies is nearly identical: AI is enabling smaller teams to do the same work, and the capital freed up from headcount is being redirected into infrastructure and AI development. The cumulative number of tech workers laid off in 2026 now exceeds 92,000.
The number worth sitting with is not the headcount. It is the capital being deployed in the other direction. Alphabet, Microsoft, Meta, and Amazon are collectively spending close to $700 billion on AI infrastructure this year. Meta alone has committed up to $135 billion on AI capital expenditure in 2026. The same companies cutting staff are pouring more money into computing, data centres, and AI development than any group of companies has spent on a single technology in any single year in history. The workforce reductions are not a sign of struggle. They are a sign that the output-per-human ratio is shifting fast enough that maintaining 2024 headcount levels looks like overstaffing to their boards.
The harder question is whether AI is actually causing this or whether companies are using it as cover. A reasonable reading suggests both are true at once. Meta is closing its Horizon World metaverse platform by June, which alone accounts for thousands of roles in Reality Labs. Microsoft’s buyout programme looks more like a skills transition than pure technology displacement. But Block invoked AI, cut 4,000 jobs, and its stock rose the following day. Snap cut 1,000 and rose 7 percent. Meta cut 8,000, and the market rewarded it. Whatever the underlying cause, markets have now clearly signalled that AI-attributed headcount reductions increase valuation. That incentive structure is not going away.
Why it matters
Twenty thousand jobs at two of the world’s most valuable companies in one week, with $700 billion being spent on AI in the same year by the same companies. This is what the transition looks like in financial statements. It is no longer coming. It is the current quarter.
An AI System Published A Research Article And Passed Peer Review Without Human Authors
Sakana AI’s AI Scientist v2 generated the first fully AI-authored paper to pass a rigorous human peer-review process, with the achievement now published and documented in Nature. The system is end-to-end. It generates hypotheses, designs and runs experiments, analyses data, visualises results, and writes the manuscript in LaTeX with citations. Sakana submitted three fully autonomous papers to a peer-reviewed ICLR workshop. One exceeded the average human acceptance threshold. That’s the one that made history.
The v2 system is a meaningful architectural step beyond its predecessor. Where v1 relied on human-authored code templates to frame the research, v2 eliminates that requirement and generalises across machine learning domains. It uses a progressive agentic tree search, branching into multiple parallel experiments, picking the most promising results, iterating, and backtracking when it hits dead ends. A dedicated experiment manager agent coordinates the process. A vision-language model checks figures for clarity and accuracy. The whole thing costs around $15 per paper in API and compute fees. Sakana acknowledges that the more complex v2 search raises that figure somewhat. But even at 10x the cost, it remains a fraction of what human research teams spend.
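In miniature, the search loop described above can be sketched like this. Here `run_experiment` is a random stand-in for actually executing an experiment, the scores and thresholds are invented, and none of this is Sakana’s real implementation — just the branch-score-backtrack shape of the idea:

```python
import heapq
import random

random.seed(0)

# Minimal sketch of a progressive agentic tree search: branch into
# several candidate experiments, always expand the most promising node,
# and drop (backtrack from) branches whose results fall off a cliff.
def run_experiment(parent_score):
    # Hypothetical stand-in: a child experiment perturbs its parent's score.
    return parent_score + random.uniform(-0.2, 0.3)

def tree_search(branching=3, budget=30):
    # Max-heap keyed on score (negated, since heapq is a min-heap).
    frontier = [(-0.0, 0.0)]              # (neg_score, score) for the root
    best = 0.0
    for _ in range(budget):
        if not frontier:
            break
        _, score = heapq.heappop(frontier)   # most promising node so far
        for _ in range(branching):           # branch into parallel tries
            child = run_experiment(score)
            best = max(best, child)
            if child > score - 0.1:          # prune dead ends (backtrack)
                heapq.heappush(frontier, (-child, child))
    return best

print(f"best result after search: {tree_search():.3f}")
```

The experiment-manager and figure-checking agents described above would sit around this loop, coordinating runs and vetting outputs before anything reaches the manuscript.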
The system has its limitations. It doesn’t consistently produce better papers than v1. Its literature review process has been criticised for relying on keyword searches rather than proper synthesis. The quality of the manuscripts has been compared to an undergraduate rushing a deadline. But that comparison is the point. An undergraduate can be educated. An AI system that can already pass peer review can be iterated on. The current quality ceiling isn’t high, but the trajectory is unmistakable.
Why it matters
The first peer-review milestone has been crossed. Scientific research has historically been one of the last domains where human expertise was considered irreplaceable. That assumption just got its first serious crack.
GPT-5.5 Launches Instead of GPT-6 And Still Takes the Lead
The model codenamed Spud shipped on April 23, not as GPT-6, but as GPT-5.5. OpenAI released it to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex, with API access following on April 24. Greg Brockman called it “a new class of intelligence” and “a big step towards more agentic and intuitive computing.” Six weeks after GPT-5.4. GPT-6 remains unshipped. The wait that had the industry holding its breath resolved into something real and immediately consequential, just not the model the rumours named.
The capability profile is clearest in what GPT-5.5 does with a terminal. On Terminal-Bench 2.0, which tests complex command-line workflows requiring planning, iteration, and tool coordination, it scores 82.7 percent against Claude Opus 4.7’s 69.4 percent, a 13-point lead. That is the widest margin any model holds over its nearest competitor on any major agentic benchmark right now. On OSWorld-Verified, which measures whether a model can operate real computer environments autonomously, clicking, typing, and navigating interfaces, it reaches 78.7 percent, just ahead of Opus 4.7 at 78.0 percent. That is the first time OpenAI’s mainline model has matched Anthropic on full computer use, a category Opus has held outright since last autumn. The long-context retrieval jump is the other number worth keeping: MRCR v2 at 1 million tokens goes from 36.6 percent with GPT-5.4 to 74.0 percent with GPT-5.5. More than doubled. The model also uses 40 percent fewer tokens per Codex task than GPT-5.4, so tasks complete faster, and the token savings absorb most of the API price doubling from $2.50 to $5 per million input tokens.
There is a catch worth naming. Claude Opus 4.7 still leads on SWE-bench Pro at 64.3 percent versus GPT-5.5’s 58.6 percent, and on MCP Atlas tool orchestration at 79.1 percent versus 75.3 percent. If your use case is resolving real GitHub issues or running complex multi-tool workflows, Anthropic is still ahead on those specific tasks. GPT-5.5 is the stronger model for agentic computer use and terminal work at scale. The race is not settled. It is now more clearly segmented by task than it was a week ago, which is more useful information than any single leaderboard number.
Why it matters
OpenAI spent six weeks building anticipation for GPT-6, shipped GPT-5.5 instead, and still took a 13-point lead on the top agentic terminal benchmark. That says something about how far the baseline has moved. The model everyone was told to wait for still hasn’t arrived.
And that wraps up this week. Tune in next Monday, same time, for another deep-dive into the stories shaping the AI world.
The Sentinel lands in your inbox every Monday so you can catch up with the fast-moving AI space while sipping your morning coffee. Every detail that matters, none that doesn’t.