Simon Willison's Weblog


That model was trained in part using their unreleased R1 "reasoning" model. Today they've released R1 itself, along with a whole family of new models derived from that base.

There's a whole lot of stuff in the new release.

DeepSeek-R1-Zero appears to be the base model. It's over 650GB in size and, like most of their other releases, is under a clean MIT license. DeepSeek warn that "DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing." ... so they also released:

DeepSeek-R1, which "incorporates cold-start data before RL" and "achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks". That one is also MIT licensed, and is a similar size.

I don't have the ability to run models larger than about 50GB (I have an M2 with 64GB of RAM), so neither of these two models is something I can easily play with myself. That's where the new distilled models come in.
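
For a rough sense of why a 64GB machine rules the full-size models out, here's a quick back-of-the-envelope sketch. The 20% runtime overhead factor, the quantization levels, and the parameter counts I plug in (the full R1 at roughly 671B and a ~32B distilled model) are my own assumptions, not figures from the release notes:

```python
# Back-of-the-envelope: can a model of a given size fit in RAM?
# The overhead factor and parameter counts below are rough assumptions.

def estimated_ram_gb(params_billions, bits_per_param, overhead=1.2):
    """Approximate RAM needed for the weights, plus runtime overhead."""
    weight_gb = params_billions * bits_per_param / 8  # 1B params at 8 bits ≈ 1GB
    return weight_gb * overhead

for label, params in [("Full R1 (~671B)", 671), ("Distilled ~32B model", 32)]:
    for bits in (16, 8, 4):
        needed = estimated_ram_gb(params, bits)
        verdict = "fits" if needed <= 64 else "does not fit"
        print(f"{label} at {bits}-bit: ~{needed:,.0f}GB -> {verdict} in 64GB")
```

Even at an aggressive 4-bit quantization the full-size model needs hundreds of GB, while a ~32B distilled model comes in around 20GB, comfortably inside 64GB.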

To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen.