Release: 2026/05/28 21:12
Original author: Yoi 科技 Open 講
Original source: https://www.youtube.com/embed/8SYHdSXUqxo
🎧 Did you know? Several developments worth a closer look have happened in AI over the past few days: a new benchmark called DeepSWE has made the real gap between GPT-4o and Claude clearly visible for the first time; Tenstorrent has launched an AI chip that costs only one-fifth as much as Nvidia's; and researcher Andrej Karpathy is redefining the core capabilities of the next generation of engineers. On the surface, the tech world is moving at its usual rhythm, but underneath, compute architecture, evaluation standards, and engineering thinking are all changing at the same time. Don't worry: I've put on the noise-canceling headphones for you and filtered out the signals that really matter. This episode also covers one more thing: when AI makes efficiency standard equipment for everyone, the scarcest resource is judgment, which is the hottest word in Silicon Valley right now: Taste.

1️⃣ Highlights of this episode

This episode breaks down what is really happening in the AI ecosystem from four angles.

The first layer is "measurement": DeepSWE, a new benchmark, lets developers' real-world experience be verified with data for the first time. Its tasks simulate real work situations: the prompts are short but the required solutions are complex, so it tests end-to-end reasoning rather than memorization. GPT-4o trails Claude 3 Opus by 15 percentage points, and the two differ in cost by a factor of three. Choosing the wrong model wastes not only money but also time.

The second layer is "hardware": Tenstorrent's architecture overturns the core assumptions of the GPU, moving scheduling logic from the chip into the compiler and using cheap GDDR6 memory to outperform Nvidia's high-bandwidth memory systems. The cost of running Llama 3 has dropped from US$30 to US$6 per million tokens, one-fifth of the original price.

The third layer is "engineering thinking": the five pillars of Agent engineering proposed by Andrej Karpathy.
The core insight is: stop building features and start building "factories that help you build features."

The fourth layer is the "context engine": without it, the Agent factory runs idle. With it, the same task shrinks from 2.5 hours to 25 minutes, token usage is cut in half, and the output is good enough to pass the tests directly.

Running through these four layers is a larger observation: when AI pushes execution efficiency to the limit, what is really scarce is judgment, meaning knowing what is worth doing and what should not be done. That is Taste.

2️⃣ What this episode covers

📌 [DeepSWE benchmark]: The first AI evaluation that truly reflects developers' actual experience; the gap between GPT-4o and Claude 3 Opus is as large as 15 percentage points
📌 [Tenstorrent challenges Nvidia]: Jim Keller cut the cost of AI chips to one-fifth by throwing away all the core assumptions of the GPU
📌 [Five pillars of Agent engineering]: Karpathy's framework says the core competitiveness of the next generation of engineers is designing systems that let AI work effectively, not just using AI
📌 [The hidden power of the context engine]: Without a context engine, the Agent factory idles; with one, task time shrinks from 150 minutes to 25 minutes
📌 [The scarcest ability in the AI era is Taste]: When efficiency becomes everyone's basic equipment, the ability to judge "what is worth doing" is the real moat

3️⃣ "Unmanned Army" and humanistic judgment in the AI era

I recently read a book called "Unmanned Army: AI War King Palmer Luckey and the Rise of Anduril". On the surface, it is the entrepreneurial story of Palmer Luckey: he sold Oculus VR to Facebook at the age of 21, was later exiled from Silicon Valley over his controversial political stance, and finally came back to found Anduril, using AI, drones, and autonomous systems to challenge the traditional defense industry. The story itself is already full of tension.
But what is even more interesting is that the publishing process of this book is itself a microcosm of the AI era: from writing, proofreading, and reviewing to typesetting, the entire process used AI extensively. A book that discusses AI warfare and unmanned armies was itself produced with AI. What this reminds us is not just that "AI is very convenient", but that work that used to require a great deal of manpower, time, and professional division of labor is being systematically compressed.

This brings me to a question: when AI pushes efficiency to the extreme, efficiency itself is no longer enough to answer the most important questions. AI can let us write a book faster, but it cannot decide for us why the book is worth writing. AI can let companies build products faster, but it cannot judge for us whether the product actually makes people's lives better. The real danger is not that AI becomes too powerful, but that human judgment fails to keep up. The word "Taste" that many people in Silicon Valley are talking about now refers to exactly this: the ability to distinguish, among infinite possibilities, what deserves to exist.

👉 If you are interested in Palmer Luckey, Anduril, and the rise of the AI military-industrial system, this book is worth reading.

📣 This episode covered DeepSWE's evaluation data, Tenstorrent's chip architecture, Agent engineering thinking, and the context engine. To be honest, the pace of change on the technical side is breathtaking. But the faster it moves, the more certain I am of one thing: whether technology can do something is less and less the hardest question. The real difficulty is whether you have enough judgment to decide what should be done and what should not, what is just noise and what is a real signal.
The more information there is, the more you need a good pair of noise-canceling headphones to filter out what is worth paying attention to. If today's episode helped you find one or two useful directions among these signals, don't forget to leave a five-star review on Apple Podcasts and subscribe to and follow "Yoi Technology Open Talk", so I can keep sorting out the most valuable industry trends for you every Monday, Wednesday, and Friday!

Want to keep up with first-hand industry news and practical technology trends? Follow Yoi's social platforms:
🔍 FB / IG / Threads: search for Yoi Studio, @yoi__studio

-- Hosting provided by SoundOn (https://www.soundon.fm/)