RE: Deep Dive into DeepSeek AI Models
Mixture of Experts seems like a common-sense approach.
I agree. I want to see how they deal with problems that require expertise in different domains.
Kind of funny that we're back to 'expert systems' though, which were hyped in the 1970s and are a rather mundane type of software today.
I think both then and now they are trying to emulate the expertise humans have in certain domains. There's certainly an improvement compared to old expert systems. 😀
Maybe that's the weakness of Silicon Valley VC culture. Their whole game is raising tons of money and outspending the competition in order to gain market share. Cutting costs is for losers.
Yes, that's what I noticed too. But if DeepSeek can be run on local computers or smartphones for good-enough models (let's see that first), that's a hit to the VC culture. At least a temporary one, until the AI giants adjust their courses.
In the OpenAI story, they have to be toiling on the Herculean task of achieving AGI for America, spending billions in order to earn trillions.
They may still reach AGI relatively soon. They spent a lot of money. I think they bet on the idea that if they are the first to reach AGI, no one will ever catch up with them, and then they'll start getting tons of money back.
I think a hedge fund's side project is more in line with the actual economic profitability of LLMs in the long term.
They had their constraints. I wonder if they wouldn't have chosen the same path as the American tech companies, if they could have had "unlimited" funds and resources (chips) at their disposal.