dmos62 a day ago

It irks me that I still don't know what an open model means. Or rather, I don't like calling a model "open" when it's trained on a closed dataset with secret techniques, even if the weights are publicly available.

free_bip a day ago

I bet 5 imaginary dollars that it'll be a model too big for any consumer to run, making it effectively closed for anyone not in the top 0.1%.

  • rany_ a day ago

    That's not an issue. DeepSeek R1 is massive, but people still managed to distill and quantize it. They'll just do the same for this one.
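
    For anyone wondering what the quantize half of that looks like in practice, here's a minimal sketch of a 4-bit load via transformers + bitsandbytes (the model id is a placeholder, not a real checkpoint):

        # Minimal sketch: load a big checkpoint with 4-bit NF4 quantization
        # via transformers + bitsandbytes. The model id is a placeholder.
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

        model_id = "some-org/huge-open-model"  # placeholder, not a real repo

        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",              # 4-bit normal-float weights
            bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in bf16
        )

        tok = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            quantization_config=quant_config,
            device_map="auto",  # shard across whatever GPUs/CPU RAM you have
        )

    NF4 weights take roughly a quarter of the memory of fp16, which is how "too big for consumers" checkpoints end up fitting on a single 24 GB card.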

    • wkat4242 a day ago

      DeepSeek distilled it themselves into Qwen and Llama, right from the start.

      • kelipso a day ago

        People are reproducing the training for DeepSeek R1.

        • wkat4242 15 hours ago

          Oh, that's interesting, because they distilled with only one method, not both of the methods that were used on R1 (one was reinforcement learning, the other supervised finetuning; I believe only the latter was used for the distillations). So there might be some room for improvement. I hadn't seen any third-party distillations yet, but I'll have a look.
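
          For concreteness, the SFT side of that is just ordinary next-token finetuning of a small student on traces sampled from the big teacher. A rough sketch, with placeholder model names and nothing like DeepSeek's real data pipeline:

            # Rough sketch of SFT-style distillation: sample traces from a
            # "teacher", then plain cross-entropy finetuning of a "student".
            # Both model ids are placeholders.
            import torch
            from transformers import AutoModelForCausalLM, AutoTokenizer

            teacher_id = "big-reasoning-teacher"  # placeholder
            student_id = "small-base-student"     # placeholder

            ttok = AutoTokenizer.from_pretrained(teacher_id)
            teacher = AutoModelForCausalLM.from_pretrained(teacher_id, torch_dtype=torch.bfloat16)

            # 1) Collect reasoning traces from the teacher.
            prompts = ["Prove that the square root of 2 is irrational."]
            traces = []
            for p in prompts:
                ids = ttok(p, return_tensors="pt")
                out = teacher.generate(**ids, max_new_tokens=512, do_sample=True, temperature=0.7)
                traces.append(ttok.decode(out[0], skip_special_tokens=True))

            # 2) Supervised finetuning of the student on those traces --
            #    standard next-token loss, no reinforcement learning anywhere.
            stok = AutoTokenizer.from_pretrained(student_id)
            student = AutoModelForCausalLM.from_pretrained(student_id)
            opt = torch.optim.AdamW(student.parameters(), lr=1e-5)
            student.train()
            for text in traces:
                batch = stok(text, return_tensors="pt", truncation=True, max_length=1024)
                loss = student(**batch, labels=batch["input_ids"]).loss
                loss.backward()
                opt.step()
                opt.zero_grad()

          The RL stage is the part the distilled models skipped, which is where the room for improvement would be.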

throwaway314155 a day ago

One whole model?? Oh my stars!

  • NoahZuniga a day ago

    Probably won't even be that good.

    • rany_ a day ago

      To be fair, open-source models at this point match or even exceed OpenAI's best models. There's really no loss in releasing these weights. It might even have some upsides in terms of attracting talent.

      • throwaway314155 a day ago

        > It might even have some upsides

        Of course it does. Doesn't make it any less hypocritical for a company called OpenAI to think it's doing some great service to the OSS community by publishing table scraps.