Comment

Wanting a better world, and holding up a light to the current one to show the differences between what could be and what is, is not at all what “cynical” means. “Cynical” is the opposite of what you mean. “Pessimistic” or “negative” is definitely more apt, yes.

No, I said cynical and I meant cynical.

I don’t care that he criticizes the tech industry, I care that he feels the innate need to portray everyone in it as moustache twirling villains, rather than normal people caught up in the same capitalist systems and pressures as everyone else.

Even here, he spends all the article focusing on rumours about Chinese researchers making novel ways to outperform OpenAI and the like, and just makes a dismissive joke about the accusations that they effectively trained their model using OpenAI’s model. Regardless of whether or not you agree with the morality of ignoring copyright to copy a copier, it’s an incredibly important point because that is not a replicable strategy for actually creating new models. But rather than focus on that he just spends another couple hundred words trying to dunk on the western tech industry in the snarkiest way he can.

source


theComposer@beehaw.org 6 months ago
But it’s not just that “they effectively trained their model using OpenAI’s model”. The point Ed goes on to make is: why hasn’t OpenAI done the same thing? The marvel of DeepSeek is how much more efficient it is, whereas Big Tech keeps insisting that they need ever bigger data centers.

source
- masterspace@lemmy.ca 6 months ago
  They HAVE done that. It’s one of the techniques they use to produce things like o1 mini models and the other mini models that run on device.
  
  But that’s not a valid technique for creating new foundation models, just for creating refined versions of existing models. You would never have been able to create, for instance, an o1 model from ChatGPT 3.5 using distillation.
  
  source
PhilipTheBucket@ponder.cat 6 months ago
Look up the definition of the word cynical. It means, more or less, asserting that no one is motivated by sincere integrity. Accusing some specific people of lacking integrity, while holding up others as good examples of integrity that everyone should aspire to, is the opposite of cynicism.

He doesn’t address very much the idea that DeepSeek “distilled” their model from OpenAI’s model and others specifically because that is just a rumor with very minimal evidence for it.

OpenAI has reportedly found “evidence” that DeepSeek used OpenAI’s models to train its rivals, according to the Financial Times, although it failed to make any formal allegations, though it did say that using ChatGPT to train a competing model violates its terms of service. David Sacks, the investor and Trump Administration AI and Crypto czar, says “it’s possible” that this occurred, although he failed to provide evidence.

Personally, I genuinely want OpenAI to point a finger at DeepSeek and accuse it of IP theft, purely for the hypocrisy factor. This is a company that exists purely from the wholesale industrial larceny of content produced by individual creators and internet users, and now it’s worried about a rival pilfering its own goods?

Cry more, Altman, you nasty little worm.

The “rumors” you say he discusses about novel ways the Chinese researchers found to outperform OpenAI are based on an extremely detailed look at their paper and their code, as interpreted by experts. The thing you’re upset he doesn’t discuss is based on rumors. He doesn’t discuss it, except to note that it’s just a rumor but would be funny if it’s true, because he is not doing what you accuse him of.

If you’re upset that he was mean to Sam Altman, so much so that you simply don’t care if he also goes deep into a lot of important details and cares about integrity enough to hate a lot on people who don’t have it, then say so. The things you are accusing him of doing are not true, though, and pretty easy to disprove if you can look honestly at his work.

source
- masterspace@lemmy.ca 6 months ago
  
  Look up the definition of the word cynical. It means, more or less, asserting that no one is motivated by sincere integrity. Accusing some specific people of lacking integrity, while holding up others as good examples of integrity that everyone should aspire to, is the opposite of cynicism.
  
  Yeah, I know the definition of the word, and I meant what I said. Stop trying to think I said something else because you disagree.
  
  He is incredibly cynical.
  
  He thinks everyone in the tech industry is a moustache twirling villain and always ascribes malice where incompetence would do. Like I said, he’s who you listen to when you want to hear someone go on an unhinged rant about everyone being evil, not someone with an accurate view of human nature or motivations.
  
  He doesn’t address very much the idea that DeepSeek “distilled” their model from OpenAI’s model and others specifically because that is just a rumor with very minimal evidence for it.
  
  There is very minimal evidence for literally EVERYTHING he writes about in this article. The whole talk of them working around the GPU restrictions also has incredibly minimal evidence and is just a rumour.
  
  Once again, his motivation is not informing you, it’s dunking on the tech industry. It’s literally his entire persona and career.
  
  The “rumors” you say he discusses about novel ways the Chinese researchers found to outperform OpenAI are based on an extremely detailed look at their paper and their code, as interpreted by experts.
  
  No, they’re not. He just portrays it that way because that makes the tech industry sound bad. We flat out do not know how they trained DeepSeek’s model.
  
  source
  - PhilipTheBucket@ponder.cat 6 months ago
    
    He thinks everyone in the tech industry is a moustache twirling villain and always ascribes malice where incompetence would do.
    
    Here’s him talking about people from the tech industry:
    
    Nevertheless, Thompson (who I, and a great deal of people in the tech industry, deeply respect)
    
    Every single article I’ve read about Gomes’ tenure at Google spoke of a man deeply ingrained in the foundation of one of the most important technologies ever made, who had dedicated decades to maintaining a product with a — to quote Gomes himself — “guiding light of serving the user and using technology to do that.”
    
    Back to quoting you:
    
    There is very minimal evidence for literally EVERYTHING he writes about in this article. The whole talk of them working around the GPU restrictions also has incredibly minimal evidence and is just a rumour.
    
    We flat out do not know how they trained DeepSeek’s model.
    
    Correct. We do not know the training data, which makes it silly to decide that it is definitely cribbed from OpenAI’s model. What we do know is how the code works, because it is open and they wrote a paper. What would you consider “evidence,” if not the actual code and then a highly detailed explanation from the authors about how it works, and then some independent testing and interpretation by known experts? Do you want it carved on a golden tablet or something?
    
    I think I’m done with this conversation. You seem very committed to simply repeating your point of view at me. You’ve done that, so I think we can go our separate ways.
    
    source
    masterspace@lemmy.ca 6 months ago
    Picking out random people to lionize too much while you demonize literally everyone else is still being cynical.
    
    Correct. We do not know the training data, which makes it silly to decide that it is definitely cribbed from OpenAI’s model. What we do know is how the code works, because it is open and they wrote a paper. What would you consider “evidence,” if not the actual code and then a highly detailed explanation from the authors about how it works, and then some independent testing and interpretation by known experts? Do you want it carved on a golden tablet or something?
    
    Because the paper does not prove what DeepSeek is claiming. The paper outlines a number of clever techniques that might help to improve efficiency, but most researchers are still incredibly skeptical that they would add up to a full order of magnitude less compute power required for training.
    
    Until someone else uses DeepSeek’s techniques to openly train a comparable model off non-distilled data, we have no reason to believe their method is replicable.
    
    source