The scale at which Elon is accelerating is mind-blowing: there are 110k of these GB200s training for a soon-to-be-released upgrade. Currently, Grok 4 is running on a mixture of old and new hardware.
The current Colossus supercluster, used to train models like Grok 3, consists of 100,000 NVIDIA H100 GPUs. The planned expansion for future Grok models involves adding 110,000 NVIDIA GB200 superchips, each containing 2 Blackwell B200 GPUs (for a total of 220,000 B200 GPUs in that phase).

To compare their computational power for AI training:

- NVIDIA states that a Blackwell B200 GPU delivers 4 times the training performance of an H100 GPU on large-scale GPT models, accounting for factors like Tensor Core efficiency, memory bandwidth, and precision support.
- Therefore, each GB200 superchip (with 2 B200 GPUs) provides the equivalent training compute of 8 H100 GPUs (2 × 4).
- The full 110,000 GB200s thus offer the equivalent of 880,000 H100 GPUs (110,000 × 8).

This makes the GB200-based cluster 8.8 times more powerful than the current 100,000 H100 setup (880,000 ÷ 100,000 = 8.8).

Note that this comparison focuses on training throughput; for inference, the multiplier could be significantly higher (up to 30× per NVIDIA claims), but training is the primary bottleneck for developing advanced models like future Grok versions. Power efficiency also improves with Blackwell, but the question centers on overall power (i.e., compute capability).
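If it helps to sanity-check the arithmetic, here is a minimal Python sketch that just reproduces the numbers quoted above. The inputs (100k H100s, 110k GB200s, 2 B200s per GB200, and NVIDIA's claimed ~4× training speedup per B200) are taken straight from the post and from vendor marketing figures, not from independent benchmarks.

```python
# Sketch of the H100-equivalence arithmetic, using the figures quoted above.
# All constants are assumptions from the post / NVIDIA marketing claims.

H100_BASELINE = 100_000        # current Colossus H100 count
GB200_COUNT = 110_000          # planned GB200 superchips in the expansion
B200_PER_GB200 = 2             # Blackwell B200 GPUs per GB200 superchip
B200_VS_H100_TRAINING = 4.0    # claimed per-GPU training speedup vs. H100

# H100-equivalent training compute of the GB200 phase
h100_equivalents = GB200_COUNT * B200_PER_GB200 * B200_VS_H100_TRAINING
print(f"H100 equivalents: {h100_equivalents:,.0f}")   # -> 880,000

# Multiplier over the existing 100k-H100 cluster
print(f"vs current cluster: {h100_equivalents / H100_BASELINE:.1f}x")  # -> 8.8x
```

Swapping in a different speedup assumption (e.g., for inference rather than training) only changes the last constant; the rest of the arithmetic is identical.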