It irks me that I still don't know what an open model means. Or rather, I don't like calling a model "open" when it's trained on a closed dataset with secret techniques, even if the weights are publicly available.
I bet 5 imaginary dollars that it'll be a model too big for any consumer to run, making it effectively closed for anyone not in the top 0.1%.
That's not an issue. DeepSeek R1 is massive but people still managed to distill and quantize it. They'll just do the same for this one.
DeepSeek distilled it themselves using Qwen and Llama, right from the start.
People are reproducing the training for DeepSeek R1.
Oh, that's interesting, because they distilled with only one method, not both of the methods used on R1 (one was reinforcement learning, the other supervised fine-tuning; I think only the latter was used for the distillations). So there might be some room for improvement. I haven't seen any third-party distillations yet, but I'll have a look.
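For anyone wondering what the SFT route actually looks like: it's just ordinary next-token fine-tuning of a small student model on text sampled from the teacher, no RL loop involved. A rough sketch (the student model and training data here are placeholders for illustration, not what DeepSeek actually used):

    # Toy sketch of SFT-style distillation: fine-tune a small "student" model
    # on teacher-generated text. Model name and data are illustrative only.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    teacher_outputs = [
        "Q: What is 2 + 2?\nReasoning: 2 + 2 = 4.\nA: 4",
    ]  # in practice: a large corpus of reasoning traces sampled from the teacher

    tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
    student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
    opt = torch.optim.AdamW(student.parameters(), lr=1e-5)

    student.train()
    for text in teacher_outputs:
        batch = tok(text, return_tensors="pt")
        # Plain cross-entropy on the teacher's outputs; the model
        # shifts the labels internally for next-token prediction.
        loss = student(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        opt.step()
        opt.zero_grad()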
One whole model?? Oh my stars!
Probably won't even be that good.
To be fair, open-source models at this point match or exceed OpenAI's best models. There's really no loss in releasing these weights. It might even have some upsides in terms of attracting talent.
> It might even have some upsides
Of course it does. Doesn't make it any less hypocritical for a company called OpenAI to think it's doing some great service to the OSS community by publishing table scraps.