



  • the accepted terminology

    No, it isn’t. The OSI specifically requires that the training data be available, or at the very least that the source and fee for the data be given, so that a user could obtain the same copy themselves. Because that’s the purpose of something being “open source”. Open source doesn’t just mean free to download and use.

    https://opensource.org/ai/open-source-ai-definition

    Data Information: Sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system. Data Information shall be made available under OSI-approved terms.

    In particular, this must include: (1) the complete description of all data used for training, including (if used) of unshareable data, disclosing the provenance of the data, its scope and characteristics, how the data was obtained and selected, the labeling procedures, and data processing and filtering methodologies; (2) a listing of all publicly available training data and where to obtain it; and (3) a listing of all training data obtainable from third parties and where to obtain it, including for fee.

    As per their paper, DeepSeek R1 required a very specific training data set, because when they tried the same technique with less curated data they got “R1-Zero”, which basically ran fast and spat out a gibberish salad of English, Chinese and Python.

    People are calling DeepSeek open source purely because they called themselves open source, but they seem to be just another free-to-download, black-box model. The best comparison is to Meta’s LLaMA, which weirdly nobody has decided is going to up-end the tech industry.

    In reality, “open source” is a poor fit of terminology here; what people really mean is that anyone could recreate or modify the model because they have the exact ‘recipe’.



  • No, because Lemmy isn’t social media. It’s a link aggregator.

    Social media requires you to know who the other people are - or at least that the identity and personality of the people posting matter to what you consume. Apart from one or two attention-seeking exceptions, I almost never notice who posted something.

    In fact, Lemmy being a Reddit clone, you may remember Reddit stirring controversy for years as it tried to become social media - adding avatars, follower functions, chat groups, etc. - none of which really suited the platform or its audience. Perhaps as the audience has changed, they’ve gotten what they wanted.

    If “social media” is just the ability to comment anonymously on Internet content and argue with strangers, then the guest book on my Geocities soccer page was social media.









  • It’s certainly better than "Open"AI being completely closed and secretive with their models. But as people have discovered in the last 24 hours, DeepSeek is pretty strongly trained to be protective of the Chinese government policy on, uh, truth. If this was a truly Open Source model, someone could “fork” it and remake it without those limitations. That’s the spirit of “Open Source” even if the actual term “source” is a bit misapplied here.

    As it is, without the original training data, an attempt to remake the model would have the issues DeepSeek themselves had with their “zero” release where it would frequently respond in a gibberish mix of English, Mandarin and programming code. They had to supply specific data to make it not do this, which we don’t have access to.




  • I know how LoRA works, thanks. You still need the original model to use a LoRA. As mentioned, adding open stuff to closed stuff doesn’t make the result open - a principle applicable to pretty much anything software-related.
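    To make the “you still need the original model” point concrete, here’s a toy sketch of the LoRA arithmetic in plain Python (my own illustration, with made-up numbers - not DeepSeek’s or anyone’s real code): an adapter ships only two small matrices A and B, and the weight actually used at inference is W + B·A, which is useless without the base weight W.

```python
# Toy LoRA sketch (illustrative only, not a real framework).
# A LoRA adapter contains two small low-rank matrices A and B;
# the adapted weight is W' = W + B @ A. Without the base weight W,
# the adapter alone reconstructs nothing.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def apply_lora(W, A, B):
    """Return W + B @ A, the weight actually used at inference time."""
    delta = matmul(B, A)
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Base weight (4x4 identity here) -- the part only the publisher has.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Rank-1 adapter: B is 4x1, A is 1x4 -- tiny compared to W.
B = [[0.5], [0.0], [0.0], [0.0]]
A = [[0.0, 1.0, 0.0, 0.0]]

W_adapted = apply_lora(W, A, B)
# Only W[0][1] changes: 0.0 + 0.5 * 1.0 = 0.5
```

    The adapter here is 8 numbers against W’s 16; at real model scale that ratio is what makes LoRA cheap to distribute - but the equation makes clear the base weights are still a required input.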

    You could use their training method on another dataset, but you’d be creating your own model at that point, and you wouldn’t get the same results. You can read in their paper that their “zero” version would have made this possible, but they found it would often produce a gibberish mix of English, Mandarin and code. For R1 they adapted their pure “we’ll only give it feedback” training method by starting from a curated base dataset before feeding it more - a compromise to their plan, but a necessary one, and with the right dataset it eliminated the gibberish.

    Without that specific dataset - and this is what makes them a company, not a research paper - you cannot recreate DeepSeek yourself (which would be open source), and you can’t guarantee that you would get anything near the same results (in which case, why even relate it to this model any more?). That’s why both matter to the OSI, who define Open Source in all regards as the principle of having all the information you need to recreate the software or asset locally from scratch. If it were truly Open Source, by the way, that wouldn’t be the disaster you think it would be, as OpenAI could then just literally use it themselves. Or not - that’s the difference between Open and Free I alluded to. It’s perfectly possible for something to be Open Source and still require a license and a fee.

    Anyway, it does sound like an exciting new model and I can’t wait to make it write smut.


  • I understand it completely, in so much as it’s nonsensically irrelevant - the model is what you’re calling open source, and the model is not open source, because the data set is neither published nor recreatable. They can open source any training code they want - I genuinely haven’t even checked - but the model is not open source. Which has been my point from about 20 comments ago. Unless you disagree with the OSI’s definition, which is a valid and interesting opinion - if that’s the case, you could have just said so. The OSI are just a bunch of dudes; they have plenty of critics in the Free/Open communities. Hey, they’re probably American too, if you want to throw in some downfall-of-the-West classic hits!

    If a troll is “someone not letting you pretend you have a clue what you’re talking about because you managed to get ollama to run a model locally and think it’s neat”, cool - I’ll own that. You could also just own that you think it’s neat. It is. It’s not an open source model, though. You can run Meta’s model with the same level of privacy (offline) and with the same level of ability to adapt or recreate it (you can’t - you don’t have the full data set or the steps to recreate it).


  • I didn’t put any words in your mouth… I really don’t understand how you’re not getting that. I said you understand that it’s not true. Literally just read the part you quoted.

    Actually, none of what you said just now was untrue. The unexplained leap is that bringing back a Catholic monarch would have turned the UK into a papal theocracy, when no other Catholic kingdom was one (except the Papal States!).

    And that specifically is the part I’m arguing has no basis in fact - you’re asking me to provide evidence that something wasn’t going to happen. Usually we ask for evidence of speculation, not against it. It doesn’t help that the people who could have said so were hanged, drawn and quartered, and that the history was written by people who immediately brought in further anti-Catholic legislation.