Script for Expert Call: NVIDIA H20 Rumors / RTX Pro 6000 / CoWoP / GPT-5 / OpenAI / AMD / Intel
SemiconSam Original Report
Q: There are rumors that Nvidia recently placed an additional order for 210,000 or 300,000 H20 units with TSMC. What is the source of this demand? A: According to confirmations from multiple vendors, this news is likely untrue. Nvidia has not yet received firm delivery orders from domestic (Chinese) customers. The core reason is that while the U.S. government has announced that H20 deliveries to China may resume, the actual export licenses have not yet been issued. Top domestic internet companies such as ByteDance and Alibaba remain in wait-and-see mode, with only Tencent providing preliminary indications of demand. These companies will not place large-scale orders before obtaining export licenses, to avoid the risk of applications being rejected and products becoming undeliverable. Furthermore, Nvidia holds an existing inventory of over 300,000 unpackaged chips, so there is no need for additional production in the short term. Even if production capacity needed to be adjusted later, it would take 9-10 months from line adjustment to final delivery, including a roughly 3-month cycle for line adjustment and over 4 months for production. Some industry insiders speculate that the rumor originated from a misreading of Nvidia resuming CoWoS packaging orders as a new H20 order; Nvidia has not explicitly instructed TSMC to adjust capacity for additional H20 products.
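As a side note on the timeline arithmetic: the stated sub-steps (3 months of line adjustment, over 4 months of production) sum to roughly 7 months, so the 9-10 month end-to-end figure implies a further 2-3 months for steps the call does not itemize, presumably packaging, test, and delivery. A minimal sketch, with that attribution as an assumption:

```python
# Minimal sketch of the lead-time arithmetic quoted above. The 3-month line
# adjustment and "over 4 months" production figures are from the call;
# attributing the remainder to packaging, test, and delivery is an assumption
# made here to reconcile them with the 9-10 month end-to-end total.
line_adjustment_months = 3
production_months = 4  # lower bound, per "over 4 months"

for total_months in (9, 10):
    remainder = total_months - line_adjustment_months - production_months
    print(f"{total_months}-month total leaves ~{remainder} months for "
          f"packaging, test, and delivery")
```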
Q: Did the Cyberspace Administration of China (CAC) speak with Nvidia about H20 security vulnerabilities? How do domestic customers view this issue? A: The functionality in question has existed in Nvidia products since the A-series and carried over into the H-series. Although it raises some privacy concerns among customers, it is not expected to pose a significant data-leakage risk. From an industry perspective, the authorities' move is more likely a tactic to pressure the U.S. side into relaxing restrictive conditions on similar products in the future, while also demanding that Nvidia disable potentially risky functions, rather than a response to existing security vulnerabilities alone. Domestic customers are concerned about privacy and security, but their concern about actual data-leakage risk is limited; they are more focused on how restrictive conditions on subsequent products will be adjusted.
Q: In the 1.2-1.4 million unit delivery forecast for the RTX PRO 6000 in the overseas market this year, what is the customer distribution? A: In the overseas delivery forecast, non-CSP (Cloud Service Provider) customers account for the larger share. CSP customers mainly use the card for inference but have not deployed it at scale, because the RTX PRO 6000 offers outstanding cost-effectiveness for inference while its training performance cannot meet large-scale training demands. In terms of customer structure, internet companies account for about 30%-40%, with most of the remainder being smaller, scattered professional users, such as those doing professional-grade small-model design or B2B business. These users have clear professional needs for the product, but their purchase volumes are relatively dispersed.
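For scale, a quick calculation of what those percentages imply in units. The 1.2-1.4 million forecast and the 30%-40% internet-company share are from the call; applying the share uniformly across the forecast range is an assumption:

```python
# Implied unit volumes from the quoted forecast and customer-mix percentages.
FORECAST_LOW, FORECAST_HIGH = 1_200_000, 1_400_000
INET_SHARE_LOW, INET_SHARE_HIGH = 0.30, 0.40

inet_low = int(FORECAST_LOW * INET_SHARE_LOW)          # 360,000
inet_high = int(FORECAST_HIGH * INET_SHARE_HIGH)       # 560,000
rest_low = int(FORECAST_LOW * (1 - INET_SHARE_HIGH))   # 720,000
rest_high = int(FORECAST_HIGH * (1 - INET_SHARE_LOW))  # 980,000

print(f"internet companies: ~{inet_low:,} to ~{inet_high:,} units")
print(f"professional users and others: ~{rest_low:,} to ~{rest_high:,} units")
```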
Q: What is the demand for the RTX PRO 6000 in the domestic (Chinese) market? What is the demand from second-tier internet companies? A: In the domestic market, ByteDance remains the largest customer for the RTX PRO 6000, followed by Tencent. However, both ByteDance and Alibaba still hold large inventories of undeployed H20 or MI308 chips, so their demand for the RTX PRO 6000 has decreased, although the overall delivery share structure has not changed significantly. Notably, second-tier internet companies have strong demand for the RTX PRO 6000, with each company's demand estimated at 20,000-40,000 units, contributing nearly 200,000 units in total and making them an important component of domestic demand.
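A back-of-envelope check on the second-tier figures: dividing the ~200,000-unit total by the quoted 20,000-40,000 per-company range gives the implied number of buyers. All figures are from the call; the division is the only step added here:

```python
# Implied count of second-tier internet buyers from the quoted totals.
TOTAL_UNITS = 200_000
PER_COMPANY_LOW, PER_COMPANY_HIGH = 20_000, 40_000

min_buyers = TOTAL_UNITS // PER_COMPANY_HIGH  # 5
max_buyers = TOTAL_UNITS // PER_COMPANY_LOW   # 10
print(f"implied second-tier buyers: {min_buyers} to {max_buyers} companies")
```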
Q: What are the current specifications for RTX PRO 6000 servers? How is the ODM (Original Design Manufacturer) share distributed? A: In terms of specifications, the current mainstream configuration is an 8-card single machine, with air cooling as the primary thermal solution. Some customers are also attempting to develop a 16-card version, with some adopting liquid cooling solutions to improve thermal efficiency. Regarding the ODM share, Supermicro has a clear advantage, holding a 40% share. Quanta primarily serves foreign internet clients, accounting for 20%-30%. Wistron and Inventec, which have close partnerships with Lenovo, each hold less than 10%. Other manufacturers such as ASUS, Gigabyte, and MSI have even smaller shares. The overall ODM market is dominated by the leading manufacturers.
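Summing the quoted ODM shares shows roughly how much is left for the smaller players. A small sketch; the midpoint for Quanta and the sub-10% figures for Wistron and Inventec are assumptions where the call gives only ranges or bounds:

```python
# Residual ODM share after the named manufacturers, under assumed midpoints.
shares = {
    "Supermicro": 0.40,  # quoted directly
    "Quanta": 0.25,      # assumed midpoint of the quoted 20%-30%
    "Wistron": 0.08,     # assumed, quoted only as "less than 10%"
    "Inventec": 0.08,    # assumed, quoted only as "less than 10%"
}
residual = 1.0 - sum(shares.values())
print(f"left for ASUS, Gigabyte, MSI, and others: ~{residual:.0%}")
```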
Q: Is the CoWoP (Chip-on-Wafer-on-PCB) solution feasible? There are rumors it's being tested on GB100 and will be applied on GR150. A: The feasibility of the CoWoP solution is low; the core issue is its extremely demanding requirements on PCB (Printed Circuit Board) processes. Mainstream domestic PCB manufacturers report that production with existing processes would cut PCB yield by 30%-40%, making mass production difficult in the short term. Currently, only Japanese PCB manufacturers can provide adequately reliable quality, but their capacity is extremely limited and their prices high, so they cannot support large-scale mass production. Testing on the GB100 is possible, but the feasibility of applying the solution to the GR150 is only 30%-40%, which is extremely risky. The solution also presents significant challenges in soldering technology and post-production maintenance, and companies like TSMC are only in the early stages of testing and validation. Nvidia is therefore more likely to treat CoWoP as a technology reserve rather than a near-term mass production solution.
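To make the yield point concrete: if the 30%-40% figure is read as a multiplicative hit on baseline yield (an assumption; the call does not specify), cost per good board scales inversely with yield, and the baseline cancels out of the ratio:

```python
# Cost impact of the quoted PCB yield drop, assuming the drop is multiplicative
# on whatever the baseline yield is (the baseline cancels in the ratio).
for yield_drop in (0.30, 0.40):
    cost_per_good_board = 1 / (1 - yield_drop)
    print(f"{yield_drop:.0%} yield drop -> ~{cost_per_good_board:.2f}x cost "
          f"per good board, before rework and scrap handling")
```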
Q: Is OpenAI's plan to have 1 million GPUs online by the end of the year feasible? What is its current scale and future development pace? A: OpenAI recently completed the training of GPT-5 using a total of 170,000-180,000 GPUs. After GPT-5 goes live, user demand is expected to increase substantially, and both Microsoft (for Copilot) and third-party CSPs are actively in discussions, so deploying 1 million GPUs by year-end is somewhat feasible. Because OpenAI's self-developed ASIC cannot meet demand in the short term, it is actively negotiating with AMD, aiming eventually for a 50/50 market-share split between Nvidia and AMD; the realistic expectation is 60% Nvidia and 40% AMD, with 50/50 possible if AMD's actual performance meets expectations. In terms of partnerships, AMD is working closely with Microsoft and Oracle: it has committed to delivering 250,000 units to Oracle this year, and Microsoft is expected to purchase 400,000 units of MI350 and MI355X in 2025, which would bring Microsoft's procurement volume from AMD close to its volume from Nvidia.
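The split expectations translate into the following unit counts if applied to the full 1-million-GPU target; the 60/40 and 50/50 figures are from the call, and applying them to the whole fleet is a simplifying assumption:

```python
# Nvidia/AMD unit splits for a 1,000,000-GPU fleet under the quoted scenarios.
FLEET = 1_000_000
scenarios = {"expected 60/40": 0.60, "aspirational 50/50": 0.50}

for name, nvidia_share in scenarios.items():
    nvidia_units = int(FLEET * nvidia_share)
    amd_units = FLEET - nvidia_units
    print(f"{name}: Nvidia {nvidia_units:,} / AMD {amd_units:,}")
```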
Q: Is OpenAI's first-generation Titan ASIC design very similar to the TPU? Is this related to OpenAI recently renting more TPUs? A: From a technical perspective, it is reasonable that OpenAI's first-generation Titan ASIC design is similar to the TPU. The TPU was a successful first-generation ASIC design, and many subsequent ASICs have, to some extent, referenced its concepts. This architectural similarity is beneficial for the continuity of future models and can better adapt to existing and future model R&D needs. OpenAI's recent increase in renting TPUs may be related to its self-developed ASIC not being mature yet and requiring external computing power support in the short term. While there is no direct causal link, the architectural similarity provides a certain technical basis for adapting to rented TPUs.
Q: What is your take on the new partnership agreement between OpenAI and Microsoft? How is AGI defined? What is the substantive significance for their subsequent relationship and cooperation? A: Microsoft urgently needs to leverage OpenAI to compensate for its own shortcomings in model R&D, such as optimizing Copilot functionality and upgrading the Bing search engine. Since internal resources and third parties cannot solve these problems, it has no choice but to deepen the cooperation. Microsoft chose to define the cooperation terms in a vague and flexible manner, which both secures short-term technical support and avoids over-dependence, allowing for quick adjustments or termination in the future. Its long-term goal is to develop independent capabilities and build an open platform ecosystem. This cooperation did not explicitly define AGI and is more focused on technological complementarity. The substantive significance is that Microsoft is using OpenAI to maintain its competitiveness in the short term while buying time for its own independent R&D.
Q: After the completion of GPT-5 training, is there any preliminary information on its performance and release date? A: GPT-5 is expected to be released in early August. It has entered the testing phase, with reasoning and mini-version testing options opened to some customers on Copilot and ChatGPT. Its most significant feature is a notable improvement in multimodal capability, making it a fully multimodal model. It performs excellently on multiple large-model evaluation metrics and is expected to top most leaderboards upon release, likely delivering a "wow" experience to the market. This progress will further solidify OpenAI's leading position in the large-model field and drive the expansion of related application scenarios.
Q: What are the reasons behind the recent layoff of over a hundred people in the AWS cloud computing division? A: The layoffs in the AWS cloud computing division were primarily concentrated in customer service and project management positions; core technical personnel were not affected. The main reason for the layoffs is that the functions of these positions are being gradually replaced by artificial intelligence tools. AI tools can handle customer inquiries and project coordination more efficiently, reducing the reliance on human labor. This adjustment is a measure by AWS to optimize its cost structure and improve operational efficiency, aiming to allocate more resources to core technical areas such as underlying architecture design to ensure long-term competitiveness.
Q: How is Jaguar Shores currently progressing? What are its specifications, performance expectations, and target customers? A: Jaguar Shores is progressing slowly, with samples expected by the end of 2027. Its specifications have not been updated for nearly a year and remain at a level close to the B200, for example 288 GB of HBM4 memory capacity. At this pace, the product may lack competitiveness by the end of 2027, as Nvidia's products by then could be a generation or more ahead. Its target customers are indeed in training scenarios, but given the slow progress and modest performance expectations, whether it can achieve a market breakthrough remains doubtful; it is unlikely to affect the existing market landscape in the short term.
Q: What is the significance of Intel providing a 14A process Design Kit to Apple and Nvidia, and what is the subsequent process? A: Intel providing a 14A PDK (Process Design Kit) to Apple and Nvidia is the foundational first step of cooperation: it lets customers use the toolkit for routing and simulation in back-end product design, which is of significant value. However, the subsequent validation process is lengthy, requiring validation over at least one full product generation and involving 2-3 revisions of the PDK, for a total of about 2.5 years; the process is expected to be mature and capable of large-scale service by 2028. This marks progress for Intel in expanding its advanced-process customer base, but there is still a long cycle before actual commercial use.
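The implied cadence, for reference: 2-3 PDK revisions spread over roughly 2.5 years works out to a revision every 10-15 months, assuming even spacing (an assumption; the call gives only the totals):

```python
# Implied PDK revision cadence from "2-3 revisions over about 2.5 years".
TOTAL_MONTHS = 30  # ~2.5 years

for revisions in (2, 3):
    cadence = TOTAL_MONTHS / revisions
    print(f"{revisions} revisions -> one roughly every {cadence:.0f} months")
```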
Q: What is your take on Lip Bu Tan's comment about abandoning the 14A process if there are no substantial customers? What is his thinking? A: Lip Bu Tan's statement reflects Intel's pragmatic strategy in advanced process R&D. The investment in the 14A process is enormous, and without substantial customers, it would face the risk of not being able to recover the high costs. However, since 2028 is still several years away, market demand and the technological competitive landscape may change during this period, leaving room for the board's decision to be adjusted. Currently, the industry's view on its strategy is not yet clear. The key will depend on whether it can attract substantial orders from core customers like Apple and Nvidia in the future to support the continued R&D and mass production of the process.
Q: Which teams are primarily affected by the 2,400 layoffs at the IFS Oregon factory? Will it affect the progress of the 18A and 14A projects? A: This round of layoffs was primarily concentrated in teams related to 18A. The Oregon factory was originally responsible for early validation work. However, Lip Bu Tan demanded strict cost control and reduced the number of product validation iterations from five to one or two, which significantly lowered the factory's workload and thus led to the large number of layoffs. In addition, since 18A will no longer be offered to customers, positions that supported early customer service were also cut, with only essential personnel being retained. This adjustment is based on business optimization and will have little impact on the core progress of the 18A and 14A projects; it is more of a reallocation of costs and resources.
The info on the hyperscalers deploying AMD would mean a large gain in share for AMD. How reliable is this expert call, and have you cross-checked it with other experts and supply-chain contacts?