That model was trained in part using their unreleased R1 "reasoning" model. Today they've released R1 itself, along with a whole family of new models derived from that base.
There's a lot in the new release.
DeepSeek-R1-Zero appears to be the base model. It's over 650GB in size and, like most of their other releases, is under a clean MIT license. DeepSeek warn that "DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing." ... so they also released:
DeepSeek-R1, which "incorporates cold-start data before RL" and "achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks". That one is also MIT licensed, and is a similar size.
I don't have the ability to run models larger than about 50GB (I have an M2 with 64GB of RAM), so neither of these two models is something I can easily play with myself. That's where the new distilled models come in.
To support the research community, ...