24 Comments
Stefano's avatar

I love reading your articles. Thank you for taking the time and care to explain DS.

I had no idea about the integration across product categories. Mind-blowing that they're advancing like this. I'm guessing they'll be advancing much quicker in the future (compared to us in the west, in general).

I suppose the real takeaway is how decades of conditioning to sing the praises of "capitalism" in the West, despite an ever-widening gap between fantasy and reality that we like to gloss over and pretend doesn't exist, has led us to a place where we would need to radically reform our entire approach to everything economic (and social) to get back into the game of innovation. And if we step back and take even a cursory, disinterested look at ourselves, that's not realistic.

Thanks for writing and sharing 👍🏼

Moriarty's avatar

Thank you for the compliment. Ironically enough, China is significantly more capitalist than the West. They are true capitalists, but they also tend to "optimize" certain aspects and benefits of capitalism to serve society rather than greedy billionaires, so rich people don't dictate where society goes. The downside is that the government dictates too much in their case.

Also, as an interesting bit of trivia, it is not surprising that China advances so fast in AI and machine learning, since they produce the highest number of mathematicians, have a large portion of the most competent ones, and, above all else, 50% of ML scientists are Chinese.

We need to change a lot in the West, as you said, so we can innovate and compete, but instead, we protect useless, inefficient, "please daddy government save me" companies.

Stefano's avatar

Thank you for the reply. I agree with most of your comment. I've also read that they've been repatriating brains from abroad into their universities for over a decade, lavishly funding STEM research, and have caught up in terms of "production of patents".

I disagree about the billionaires, in the sense that theirs are just as greedy as anywhere else (re: the real estate fiasco), and they've created many (measuring GDP and GDP per capita glosses over inequality). Also, personally I'd differentiate between a government acting in the national interest and a government putting its citizens' interests first.

My generic point is just to say it's complicated.

mejbcart's avatar

Thanks for this summary;))

I just had it for my new post, but will paste it here, since it gets interesting, so here it goes:

https://www.anl.gov/article/argonne-national-laboratory-deploys-cerebras-cs1-the-worlds-fastest-artificial-intelligence-computer

https://www.energy.gov/science/articles/doe-lands-top-two-spots-list-fastest-supercomputers

all microsoft based Cerebras ML Software 2.2 supports SDK 1.1. (https://cerebras.ai/)

and for what purpose is it 'trained'? Well: “By deploying the CS-1, we have dramatically shrunk training time across neural networks, allowing our researchers to be vastly more productive to make strong advances across deep learning research in cancer, traumatic brain injury and many other areas important to society today and in the years to come.”

Mind you, every one of the people still working there is genetically modified via covid jabs, which were MANDATORY in national labs.... What is astonishing is that the US giant Cerebras (brain?) is using the Chinese DeepSeek R1-70B, or let's say a Gates-Chinese medley, in the middle of the most secret US national labs... hm... That lab has many Chinese-speaking workers, who already speak nice Chinese, but what about the Americans??? Maybe they can be 'trained'..?

SomeDude's avatar

the fact that they went open source is the part that pleases and excites me.

closed source sucks because nothing in it is fixable without reverse engineering and a lot of hassle. plus you don't know what is in those compiled binary blobs. closed source is almost required for effective spying on people with personal devices

Perplexed Rationalist's avatar

This is one of the best things you’ve written and absolutely bang on. You know I’m based in China and I host a podcast on AI, focussed on human and social impacts. One anecdote here for DS, as it is simply called by most Chinese. Before DeepSeek, AI was in people’s daily lives in China much more than in the West because it was already integrated in takeaway delivery, taxis, maps, devices etc. (it’s quite normal to have food delivered to rooms in hotels by robots here now), but hardly anybody was using LLM chatbots day to day. This was even though there are decent models such as Kimi, Tongyi etc., even if the B2C apps don’t quite match up to the background models. But with DS, now everybody is using it and, worryingly, they believe anything it tells them. In WeChat groups I’m part of, people just paste in DS answers as if they have suddenly gained some new level of intelligence. It’s both interesting and concerning, but I guess it’s what played out in the West a year or so ago.

One thing on the model though. DS hallucinations are really, really bad, much worse than any other frontier models. They need to fix that, but like you say, they are so cheap that the trade-off is probably acceptable for many uses at the moment.

Moriarty's avatar

Feel free to share your podcast here btw, and thank you for the compliment =D.

Kimi is one of my favorite models. They are severely overlooked, and they are what I call a "mini-whale": they are also extremely talented and often achieve similar breakthroughs using distinct approaches. They got overlooked because of poor timing =(.

Whenever systemic use of language models takes place in large portions of society (in this case, in the entire Chinese society), that effect comes into play, where people basically outsource their entire cognitive processes to the model and take its outputs as gospel. I guess it is quite literally "human nature", and yes, it played and actually still plays a role in the West. It tends to go back to baseline, in a sense.

I think with R1.5 or R2 they will fix the hallucination issue while maintaining its insane performance and model capability (there is a trade-off between hallucination and model size; gigantic models, if properly trained, will hallucinate significantly less at the cost of significant performance loss, just look at the new GPT-4.5).

The Whale has the highest talent density in any frontier lab, so I believe they will fix the issues and likely open source it.

I just wish they issued a customized license for the use of their open-source projects, prohibiting any closed AI lab from using their improvements unless said lab open-sources too, so the skill-issued labs like Anthropic and OpenAI don't benefit from their findings.

SteveBC's avatar

Wow. And you know what really surprised me? That it is already integrated into everything in China. That was *fast*.

I'd like to ask what the impact on non-Chinese AI development will be due to their making aspects of their code public. Obviously, they didn't need to do that, and it would seem to run counter to a normal national desire to stay ahead. Did they show enough to convince others they can do what they claim or did they provide enough code that non-Chinese AI companies can upgrade their own models? I know that would sow a bit of chaos in such other development efforts for at least a while but benefits will (I presume) spread out around the world. Are the Chinese here doing something just to be helpful to the world? Or is it a subtle kind of way to wreck other AI developments permanently?

Moriarty's avatar

I think it surprised everyone, especially the company (except Wenfeng, who I suspect knows exactly what he is doing with each step they make, but maybe I am just a fan of his lol; that is what my gut tells me).

The impact is that everyone will experience efficiency gains, and this will accelerate AI development and research. Besides the few American companies who thought they would rule over AI and create a new serf class, everyone benefits from it, and that is exactly what the ethos of the company is.

Wenfeng stated his goal is to increase "societal efficiency," and he doesn't care much about the short-term cost of it, as long as everyone benefits from it. He also wants the recognition rather than the profits (he is a billionaire in his own right, but not a complete moron like Elon Musk).

Yes, they showed enough to force OpenAI to release a very below-par, extremely big model just to reassure investors and the hardcore users that they are still making strides. It made every Western company reevaluate their strategy. Right now, DeepSeek is the de facto most advanced lab in the world.

And over the past months, that is the vibe I get from many Chinese: they want the prestige, they want to be recognized as competent through their own efforts and not just for "steal IP, copy" (which, in truth, they have really stopped doing). The best way to achieve this is by helping everyone in the world.

Their government uses this as a form of soft power, of course.

SteveBC's avatar

Hmmm. I have a couple questions for you. How would a man dedicated to societal efficiency in China view his government and its actions and how it works? And how might that view affect what he does and how subtly or boldly he does it?

Moriarty's avatar

Wenfeng literally shifted the entire government into action. Before January, the Chinese government and most of the country were idle on AI as an active part of society; now it is everywhere. A ton of massive data centers are coming online (they were built but remained inactive, in stealth mode), and he made the government, the central bank, and all the big firms inject a few hundred billion dollars into AI in the next 4 years alone.

I am absolutely "anti-CCP," and I view China as one of our biggest adversaries this century, if not the biggest, so I am not saying this from a place of "pro-China". But what we get fed here in the West is the bottom-of-the-barrel propaganda Western Intel can come up with lol.

Their government does a lot of societal good. I have actually been amazed recently, because I am tapping into Chinese accounts on social media and seeing the other side.

Wenfeng will continue moving as he has done so far, or, as his employees state: "What move slowly but with seismic effects."

SteveBC's avatar

Thank you. Very interesting. I agree with your opinion on the CCP, which is the core that makes China an adversary. I don't know how much of a "government serves the people" outcome China can ever have, given their imperial and strong-man history as well as their attitude that all other nations should be tributaries to the Middle Kingdom. However, WF's influence might move things more in that direction than I have detected before. I know there are many undercurrents that might help this process, and I'm no longer sure the members of the CCP have the desire or the power to stop this maturation. Which trend will win? I don't know, but another Tiananmen seems unlikely at this point. Your thoughts?

Moriarty's avatar

There is an unspoken contract between the entirety of the Chinese population and the CCP: as long as their lives continue to improve (the rate doesn't matter much, as long as some improvement is felt), and especially as long as food is never a problem again, the government has leeway to do whatever bs they want (and mostly, apparently, they leave the majority of the Chinese population alone).

As soon as this contract is broken, they will revolt. It hinges on that. That is perhaps one of the reasons China has bought HALF of the world's entire food production since 2021, and a significant amount of commodities, so that any form of protracted conflict wouldn't topple them too fast.

SteveBC's avatar

Sounds right, Moriarty.

sadie's avatar

Then tech stocks will be leading the charge to the bottom.... really soon. I still struggle to see the value in this for everyday life. We fix our own 20 yr old cars and everything else that's fixable, pay cash, avoid drs... why does the world need more computing power?

Moriarty's avatar

To solve scarcity. To discover, create and produce new, more efficient materials.

Solve diseases. Find a problem, and AI helps solve it and improves life overall, especially long-term.

K Tucker Andersen's avatar

You certainly are correct on some of the major positives. And perhaps it could also lead to the creation of a whole new category of gaming, with more fun and challenges. But to provide balance, there could and probably will be significant downsides as well, particularly considering the modern-day gold rush that it has engendered.

Moriarty's avatar

The gaming side excites me more than all the other sides. I even dream of playing the next Cyberpunk 2077 game with procedurally generated questlines, (convincing) autonomous NPCs, etc. The sky is the limit here, but I agree with everything else.

Given that a significant portion of the Western economy, especially the American one, has been held up by Nvidia and FAANG... this shift ain't going down easy. I personally think Nvidia needs a wakeup call because their current pricing is ridiculous.

Perplexed Rationalist's avatar

I’ve just whacked a lot of my money into physical gold because this is exactly what I see. Even taking out the probable collapse of fiat currency, the overinflation of tech stocks... it’s a house of cards for sure.

mejbcart's avatar

Are you going to augment yourself to become a 'better' hybrid in the future???

Moriarty's avatar

I played enough Cyberpunk 2077 to never augment myself lol

mejbcart's avatar

Glad to hear a HUMAN decision;))

FREED0ML0VER's avatar

I recently read an article about eliminating, or at least minimizing, GPU latency. The author claims that a GPU is idle most of the time, waiting for the next I/O command. Eliminating this would significantly reduce the number of data centers needed going forward. What are your thoughts on this?

https://open.substack.com/pub/fractalcomputing/p/the-black-swan-event-about-to-hit?utm_source=share&utm_medium=android&r=hv24e

Moriarty's avatar

That is partially what DeepSeek just did: it minimized idle time and expedited communication and data access. It made the entire process, from training, to using data, to inference (you, the customer, using the AI), incredibly faster, more efficient, and insanely cheaper.
