Monitored profile

SemiAnalysis

@SemiAnalysis_

Posts collected: 100
Latest post: Jul 28, 16:38
Sync frequency: every 15 min

Jul 28 · 15:11
Video
Jul 28 · 16:38
Btw, we do have the answers, sales@semianalysis.com
Jul 27 · 22:00
🤖 Generated with Claude Code
Jul 27 · 20:12
x.com/AnthropicAI/st…
Jul 27 · 18:00
AMD's warrant deals with OpenAI and Meta usually get described as equity sweeteners. Run the math, and these look more like a rebate that matches the price of the compute itself, with up to a 105% discount for OpenAI. (1/3)🧵
Jul 27 · 18:00
Per AMD's 8-K, the warrants vest in tranches tied to share price, topping out at a $600 hurdle, and can be exercised for $0.01, essentially nothing. So the value handed back to the buyer scales directly with AMD's stock. (2/3)
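One rough way to see how a warrant can behave like a rebate: treat its intrinsic value (shares times the share price minus the $0.01 exercise price) as value handed back against the buyer's compute spend. The sketch below is only an illustration of that ratio. The function name, the share count, and the spend figure are hypothetical placeholders, not numbers from AMD's 8-K or from the thread.

```python
def warrant_rebate_fraction(shares: float, share_price: float,
                            compute_spend: float,
                            exercise_price: float = 0.01) -> float:
    """Intrinsic warrant value expressed as a fraction of compute spend.

    Assumes the warrants are fully vested and ignores taxes, dilution,
    and time value. All of that is a simplification for illustration.
    """
    if compute_spend <= 0:
        raise ValueError("compute_spend must be positive")
    # Intrinsic value per share is never negative.
    per_share = max(share_price - exercise_price, 0.0)
    return per_share * shares / compute_spend


# Hypothetical inputs: 1,000,000 warrant shares against $500M of spend.
for price in (200.0, 400.0, 600.0):
    frac = warrant_rebate_fraction(shares=1_000_000,
                                   share_price=price,
                                   compute_spend=500_000_000)
    print(f"share price ${price:.0f}: rebate = {frac:.1%}")
```

With the spend held fixed, the ratio rises linearly with the share price, which is the sense in which the value returned to the buyer scales with AMD's stock. Under these made-up inputs the ratio can exceed 100%, meaning the warrant would be worth more than the compute purchased.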
Jul 27 · 14:40
hello @elonmusk & @finkd, the ClusterMAX team would be happy to include xAI and Meta in our next neocloud ranking. we will need one of your dev clusters for 5 days for thorough benchmarking. we would especially like to verify that your Kubernetes management is up to spec. please let us know whether to coordinate via Slack or just slide into your X DMs. thank you for your attention to the matter.
Jul 27 · 11:30
CXMT just had the most absurd debut in semiconductor history. Priced at ¥8.66, opened at ¥49.50, closed +466% at ¥49. Market cap: ~$488B — bigger than Intel. A DRAM maker that didn't exist ten years ago. (1/2)🧵
Jul 27 · 11:30
We saw this coming. Our deep dive last month broke down CXMT's rise from Qimonda's ashes to the world's #4 DRAM maker — the tech transfer, the Hefei patient capital, and the IPO math. (2/2) newsletter.semianalysis.com/p/chinas-cxmt-…
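The debut prices quoted in this thread can be sanity-checked with simple arithmetic. The sketch below uses only the figures from the post (¥8.66 offer, ¥49.50 open, +466% close); the helper name `pct_change` is a hypothetical name chosen for illustration.

```python
def pct_change(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100.0


ipo_price = 8.66     # yuan, offer price quoted in the post
open_price = 49.50   # yuan, opening price quoted in the post
close_gain = 466.0   # percent, closing gain quoted in the post

# Closing price implied by the quoted percentage gain.
implied_close = ipo_price * (1 + close_gain / 100.0)

print(f"open vs. offer: {pct_change(ipo_price, open_price):+.0f}%")
print(f"close implied by +466%: ¥{implied_close:.2f}")
```

The implied close comes out at about ¥49.02, which agrees with the quoted ¥49 close after rounding, and the open works out to roughly +472% over the offer price.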
Jul 26 · 20:51
BREAKING FROM ICM 2026: Three of four 2026 Fields Medalists have announced that they will NOT be pivoting to AI research, even though their jobs will soon be AUTOMATED BY AGI. As Dijkstra said, "Programming is one of the most difficult branches of applied mathematics; the poorer mathematicians had better remain pure mathematicians."
Jul 25 · 18:00
Jul 24 · 22:30
Storage just stole the show. Between KV cache offload and SSD prices going vertical, storage went from afterthought to headliner. Vik Malyala walks through @Supermicro's storage stack.
"Storage, whether it be for KV cache offload or just the incredible rise in cost of SSDs, is the primary thing to focus on for a lot of users."
"Initially it started with the form factors. You have U.2s that were popular, or still are popular for that matter. Then you have E1.S and E3.S drives, mainly because of the density that we can bring in a 1U or 2U form factor. And we have come up with so-called Petascale solutions, which you can get a very good balance between the amount of storage, the storage bandwidth as IO bandwidth."
"Hardware is only one part of it, but what we are doing is working with the likes of VAST, WEKA, DDN, Qumulo, OSNEXUS and every one of these software vendors. We are able to bring different types of software defined storage into customer environments so that you don't necessarily get stuck with traditional storage. So super exciting times for that."
Video
Jul 24 · 22:30
@Supermicro Watch more: x.com/SemiAnalysis_/…
Jul 24 · 21:36
Can AMD break the CUDA Moat? AMD Advancing AI 2026, Up to 105% Equity Rebate Discounts for OpenAI, Agentic Kernel Generation, Improvement in Software Quality, Unstable Internal Development Clusters, Helios MI455X Production Ramp Hell newsletter.semianalysis.com/p/can-amd-brea…
Jul 24 · 15:47
Was widespread Tokenmaxxing ever really here? Meta burned 60T+ tokens in 30 days; one employee used ~280B. Uber burned its annual Claude Code + Codex budget in four months. We spoke with 50+ enterprises. The real story was not a budget wall, but allocation.👇️ (1/6)🧵
Jul 24 · 15:47
Meta's "Claudeconomics" dashboard ranked 250 users: "Token Legend," "Cache Wizard," agents researching for hours to burn tokens. It was shut down 2 days after The Information's report. Uber set a $1,500/mo/employee cap. (2/6)
Jul 23 · 23:30
Jul 23 · 19:30
Banks won't lend against GPUs they think are worth zero in three years. NVIDIA's fix: guarantee the floor. The backstop turns speculative neocloud clusters into bankable projects, and NVIDIA takes a cut of everything above the line.
Video
Jul 23 · 19:30
Watch the full breakdown: youtu.be/0YOf6QTCNuY?si…
Jul 23 · 17:26
Intel actually breaks out the "mark to market losses on Escrowed Shares" to the U.S. Department of Commerce. The way to read that: it is the mark-to-market of Trump's investment in Intel. Last quarter it was just a measly billion dollars, now it's 13. Hats off to the finance (and gov relations) team for this interesting disclosure. $INTC (1/2)🧵
Jul 23 · 17:26
Foundry momentum is real, as 18A-P is entering risk production for customers. This is the critical inflection point for seeing whether Intel's process is usable by outside customers. Interestingly, Intel is announcing a 5 billion euro addition to Intel Xeon 6 on Intel 3, despite ramping 18A-P! There is clearly a capacity shortage, and Intel is a swing producer. What's more, there is the new Bowers campus addition for Intel Mask Operations. Our read is that it is ramping mask production for external customers for the first time! Zinsner is signaling further increases: "we are meaningfully increasing our investments in equipment, clean room space, and substrates." Last but not least, Intel now has additional LTAs secured, and given the press release it's implied that the customer could be Google Cloud. Great job Intel, but the journey isn't over yet. Foundry is on the verge of becoming the United States' first domestic process. (2/2)
Jul 23 · 16:00
No AI layoffs in jobless claims. Scaled to the labor force, initial claims are at all-time lows. Latest week 187k, only 25k above the 1968 record low, when the labor force was less than half its current size. Mass layoffs are absent.
Jul 23 · 11:01
ASML raised FY26 guidance for the second time in three months at 2Q earnings. (1/3) 🧵
Jul 23 · 11:01
We believe the following signals a stronger upcycle ahead: order visibility extending into 2028, management turning pre-emptive on capacity expansion, like-for-like price increases on the table, DUV immersion shipments reaccelerating, and sustained growth of the upgrade business. (2/3)🧵
Jul 22 · 23:53
Vera Rubin NVL72 vs GB200 NVL72? Inference TCO & Architecture Analysis: Rubin LUT Based Tensor Core, Feynman, Rack Scale, Perf Per MegaWatt, Perf Per Dollar, Software Improvements, Public Rubin Software, PyTorch, vLLM, OpenAI Triton newsletter.semianalysis.com/p/vera-rubin-n…
Jul 22 · 23:43
Vera Rubin NVL72 vs GB200 NVL72? Inference TCO & Architecture Analysis: Rubin LUT Based Tensor Core, Feynman, Rack Scale, Perf Per MegaWatt, Perf Per Dollar, Software Improvements, Public Rubin Software, PyTorch, vLLM, OpenAI Triton Read Now: newsletter.semianalysis.com/p/vera-rubin-n…
Jul 22 · 21:00
Cookie cutter doesn't cut it anymore. Vik Malyala on how @Supermicro became the go-to tier one OEM for startups building unconventional server architectures, from PCIe topologies to ARM-based compute.
"Supermicro is the only vendor, sort of tier one OEM, who's able to both have the supply chain deliver anywhere in the world, whether it be US or Asia or Europe or Latam even, but also with a very unique server architecture. It's not just cookie cutter."
"When you talk about the PCIe accelerators, these are the ones that are using these accelerators as a part of the system. And I have seen customers looking to have different PCIe topologies, whether it's a single root complex or multi root complex."
"Initially we started with both Intel and AMD as a processor architecture to support these accelerators, but now we have added the new ARM based platforms as well. So now people can have an option on the compute platform as well as what accelerators they can support."
Video
Jul 23 · 16:48
For more watch now: x.com/SemiAnalysis_/…
Jul 22 · 17:51
BREAKING: WE KNOW PRODUCTS IS A PART OF GOOGLE CLOUD AND NOT THE MAJORITY OF REVENUE
TPUs got sold in the quarter, services margin is a bit weak, but very very strong print x.com/SemiAnalysis_/…
Jul 22 · 17:15
BREAKING: GOOGLE CLOUD GENERATES REVENUE PRIMARILY FROM THE SALE OF TPU SYSTEMS! $GOOG Google Cloud has shifted from being a cloud to a hardware vendor, and this quarter was the first time they said "primarily" with meaningful language change. Is this Google bowing out of the race as they shift to becoming an NVIDIA competitor versus a hyperscaler? More on the call soon 🤞
Jul 22 · 16:02
Kimi K3 massively beats NVIDIA’s Nemotron 3 Ultra model. If America wants to lead in open-source AI, Jensen should recognize that progress comes from independent labs competing against each other based on the free-market principles of testing different ideas, architectures, and approaches, just like China’s AI labs are shipping and iterating. 1/5🧵
Jul 22 · 16:02
When Jensen created the Nemotron Committee, it restricted the flow of different approaches and created groupthink, even though open source is all about the ability to experiment freely. Based on the results, the Nemotron Committee is clearly not the path forward for American frontier OSS. 2/5🧵
Jul 22 · 15:00
BREAKING: X-ray of a Lego fireman by STEEL revealed an internal defect, a subsurface overfill excursion localized to the rear end. This excursion raises serious questions about migrating flagship processes away from TSMC, whose process control has never let a chip go out the door with a suspiciously plump behind. Broadcom remains optimistic, noting the defect "gives it character." We have flagged it for failure analysis. The Millennium Falcon teardown is ongoing, but is still 7 pieces short.
Jul 22 · 14:01
Kimi K3 DESTROYS the Nemotron Committee. If America wants to lead in open-source AI, Jensen should stop with his elitist committee crap. Progress comes from independent labs and free-market principles—testing different ideas, architectures, and approaches, just like China’s AI labs are shipping and iterating. Committees are formed to centralize power, just like in the U.S.S.R., while open source is all about the ability to freely experiment with different approaches.
Jul 22 · 11:00
AMD HAS JUST ANNOUNCED AN ANTHROPIC DEAL WHERE THEY ARE INVESTING UP TO 5 BILLION IN EXCHANGE FOR ANTHROPIC BUYING 2 GW OF AMD MI455 UALOE72 AND FUTURE GPUS. 🚨 This is similar to what we said about Anthropic three days ago. x.com/SemiAnalysis_/…
Jul 22 · 09:01
Coding drives over 70% of lab API revenue. So why are enterprises rationing tokens for people writing emails?
"Coding, slash software engineering, is just by far the most token hungry use case."
"Recording a point in time right now: token budgeting, Meta's compute strategy, MSL futures, the release of Fable, SOL, and Anthropic profit margins."
Video
Jul 22 · 09:01
Full episode with the tokenomics team: token budgeting, Meta's neocloud clawback strategy, and the subscription margin problem. youtu.be/uLeUpgllI-4?si…
Jul 21 · 23:58
Meta’s Infrastructure Team Needs A Culture Reset
Meta Infrastructure has become bloated, with middle managers expending resources on over-engineered solutions that lose sight of broader organizational needs. This culture issue is costing Meta billions. newsletter.semianalysis.com/p/metas-infras…
Jul 21 · 20:57
BREAKING NEWS: NVIDIA RUBIN (SM107) SUPPORT HAS BEEN ADDED TO PYTORCH AND WILL BE ADDED TO A BUNCH OF PUBLIC GITHUB REPOSITORIES OVER THE COMING DAYS 🔥
Jul 21 · 16:51
ALERT🚨🚨: META's CUSTOM AMD MI400-series chip will be half the size of a normal MI455X chip. It is "optimized" for recsys workloads and $/Memory Bandwidth. It will use ~144GB of HBM instead of 432GB. We break it down below👇️ 1/7🧵
Jul 21 · 16:51
Compared to a normal MI455X package, it will use six HBM4 8Hi stacks instead of 12 HBM4 12Hi stacks. The reasoning is that Meta’s recsys infrastructure strategy wanted to have a CPU compute-to-GPU compute ratio tuned for recsys and to optimize memory $/BW. 2/7🧵
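The capacity figures in this thread line up with simple stack arithmetic: number of stacks times dies per stack times capacity per die. The sketch below assumes 24 Gb (3 GB) HBM4 DRAM dies, which the posts do not state, and reads the custom part's stacks as 8-high. The function name is hypothetical.

```python
def hbm_capacity_gb(stacks: int, dies_per_stack: int,
                    gb_per_die: float = 3.0) -> float:
    """Total HBM capacity in GB.

    gb_per_die defaults to 3 GB (a 24 Gb die), an assumption that is
    not stated in the posts.
    """
    return stacks * dies_per_stack * gb_per_die


standard = hbm_capacity_gb(stacks=12, dies_per_stack=12)  # normal MI455X
custom = hbm_capacity_gb(stacks=6, dies_per_stack=8)      # Meta's custom part

print(f"standard: {standard:.0f} GB, custom: {custom:.0f} GB")
print(f"capacity ratio: {custom / standard:.2f}")
print(f"stack-count ratio: {6 / 12:.2f}")
```

Under that assumption the totals come out to 432 GB and 144 GB, matching the thread. Capacity falls to a third while the stack count only halves, so if per-stack bandwidth is unchanged (not stated in the posts), bandwidth per GB of HBM goes up, which fits the stated goal of optimizing memory $/BW.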
Jul 21 · 14:52
SCOOP – AMD's MI500X ISA JUST GOT RELEASED AND IT HAS 1 TDM MOE DESCRIPTOR SHARED ACROSS ALL EXPERTS. Just like how MI455X is adding TDM (nicknamed RMA), which is a copy of NVIDIA's SM90 Hopper ISA TMA.
Jul 21 · 14:01
Hittesh, the cousin of @dwarkesh_sp and @dylan522p, just launched his new Instagram Reels account, where he shows off his dance moves! Please drop him a follow to show your support 🔥
Video
Jul 21 · 10:30
Datacenters have always kept reciprocating engines on site as emergency standby, kicking in when the grid drops and running a handful of hours a year. That role is changing, as recips are increasingly being repurposed to prime power, energizing datacenters around the clock. We estimate that recip OEMs (e.g., Caterpillar, INNIO, Cummins) have been contracted to supply ~1 GW of BTM power this year, and 4+ GW in each of 2027 and 2028. Our SemiAnalysis Energy Model tracks these BTM contracts OEM-by-OEM, project-by-project. Yet, that is dwarfed by the opportunity to come. (1/3)🧵
Jul 21 · 10:30
We at SemiAnalysis undertook a detailed modelling exercise of US grid constraints, showing that existing headroom will be exhausted by 2027/28, and planned utility-scale generation through 2030 will not be sufficient to meet load growth. (2/3)
Jul 21 · 01:00
POV: the year is 2027 and u open TikTok to this
Video
Jul 20 · 21:00
Two guys, a couple of drinks, and a 7,000-pound double-wide rack. @dylan522p and @Supermicro CBO Vik Malyala talk racks, storage, and memory prices at Computex 2026. Full interview below
0:00 Cheers from AI Lumina at Computex 2026
0:33 The B300 HGX platform and the memory squeeze
1:00 Helios: launch partner for AMD MI450X
2:32 Why a double-wide rack works now
3:42 Vera Rubin: a smoother ramp than GB
4:50 Startups, d-Matrix, Positron and custom architectures
6:57 Connectivity: PCIe switches, Tomahawk 6, UALink
9:24 Storage takes center stage
10:48 Petascale, PCIe Gen 6 and CXL 3.0
11:40 Cooling Gen 6: liquid loops and bus bars
Video
Jul 20 · 17:19
Great work by the AMD @sgl_project team on enabling nightly disaggregated serving CI to improve code quality! It has already caught and prevented 2 massive bugs from reaching customers, as we explained before 👇️ 1/7🧵
Jul 20 · 17:19
The first AMD DI CI PR, #29084, was implemented three months after SemiAnalysis’s initial request and multiple meetings with @AnushElangovan and Vamsi (Head of AI), as well as a meeting with @LisaSu early in the year regarding improvements to code quality. We will walk through a couple of examples of bugs that AMD DI CI has already caught. 2/7🧵
Jul 20 · 14:06
We cut open the Kirin 9030 and put it under an electron microscope. The smallest metal pitch measures 32.5 nanometers. That is tighter than Intel 18A, Intel's brand-new leading-edge node. A Chinese fab with no EUV is out-pitching Intel's EUV node by roughly 10 percent. What's going on here? 🤯
"This is the HiSilicon Kirin 9030, the chip inside Huawei's newest flagship phone. A few weeks ago, we cut it open, put it under an electron microscope, and measured the smallest wires inside the chip, the metal pitch. And what we saw was unexpected. The smallest metal pitch inside the Kirin 9030 measures only 32.5 nanometers. That’s smaller than the metal pitch in Panther Lake, which is based on Intel's brand-new 18A node. A Chinese fab, cut off from the most advanced tools, without EUV, is packing its wires about ten percent tighter than Intel's leading edge EUV node. What’s going on here?"
Video
Jul 20 · 14:06
Full breakdown here: youtube.com/watch?v=NAbpji…
Jul 19 · 19:49
ANTHROPIC WILL BE AN AMD CUSTOMER, ACCORDING TO THE PUBLIC GITHUB OF AMD’S SENIOR DIRECTOR OF AI 🚨🚨 We explain the GitHub code and nuances below👇️ 1/4🧵
Jul 19 · 19:49
In the AMD senior director’s public GitHub, the YAML code file lists Anthropic as a “customer” that is probably currently in the evaluation/testing phase. It shows that Anthropic has received the maximum 30 “priority boost points,” putting it in the same tier as existing hyperscale customers like Meta. 2/4🧵
Jul 19 · 00:48
What's your favorite kind of rack?
Jul 18 · 18:35
Similar to DeepSeek in January 2025, Panicans may think that the AI networking switch TAM will massively shrink because Kimi K3 uses KDA Attention, which reduces KV-transfer networking bandwidth by up to 10x. But the opposite is true, as we explain below. 👇️ 1/8🧵
Jul 18 · 18:35
While it is true that Kimi K3 uses Kimi Delta Linear Attention (KDA) in 3 out of every 4 layers and that KDA reduces KV-cache transfer bandwidth by up to 10x compared with comparable full global-attention models, the important missing piece is that Kimi K3 requires WideEP to serve. 2/8🧵
Jul 18 · 08:00
A year ago, the big three were OpenAI, Anthropic, and Google. Things have changed. Moonshot's Kimi K3 sits above Gemini on every composite benchmark, and it's open source in 10 days. New episode: what K3 reveals about frontier margins, model sizes, and who's actually still in the game.
00:11 Is Kimi K3 the Third Best Model?
04:04 Why Delay the Weights?
05:30 2.8T Parameters and Serving Constraints
06:48 Frontier Margins and the 3x Price Hike
11:10 New Architecture, What Comes Next
14:09 Will Open Source Catch Closed?
19:51 Built for Chinese Accelerators
22:57 The Harness Is the Product
28:49 We're Still Early
Video
Jul 18 · 00:00
MASSIVE DELAY ALERT TO ORACLE’S STARGATE SITE AND BLOOM ENERGY🚨🚨 Oracle’s Project Jupiter behind-the-meter datacenter project in New Mexico that plans to use Bloom Energy is at risk of a 1-2 year delay due to permitting and pipeline building blockers. (1/8)🧵
Jul 18 · 00:00
As we continue to monitor the status of datacenter delays, whether they are real or fake, some are outright delayed because of two blockers: building gas pipelines and receiving permits for power generation equipment. (2/8)
Jul 17 · 19:49
Kimi K3 is actually quite positive for NVIDIA, as large-model inference is where the NVL72 shines. Because K3 has more than 2.8 trillion parameters, it requires a large scale-up domain to store its weights. 2/8🧵
Jul 17 · 19:49
Secondly, although Kimi Delta Attention has up to 10× lower networking requirements for KV-cache transfers, its large weights require even more network bandwidth to implement an optimization called WideEP, which spreads the weights across different GPUs. 3/8🧵