• 8 Posts
  • 822 Comments
Joined 2 years ago
Cake day: October 23rd, 2024

  • A weird point the Nvidia CFO made to say “Nvidia is awesome” is the claim that GPU rental rates are up year to date. There was a crash at the end of 2025. The low for the quarter was Jan 1st. The high was March 10th, at the peak of the openclaw frenzy (validated by openrouter charts). Current rates are lower than that peak. Compared to Q1 2025 (which is what I thought the CFO meant), rates are down significantly. Figures below are for single GPUs, in $ per hour.

    1. NVIDIA A100 (Ampere — 80GB SXM)

    • Q1 2025 Baseline: High: $2.40 | Low: $1.60 | Close: $1.85
    • Q1 2026 Window: High: $1.65 | Low: $0.80 | Close: $1.15
    • Current Normalized Rate: ~$1.07 / hr (Stable floor; primary use shifts to entry-level fine-tuning and quantized serving). 

    2. NVIDIA H100 (Hopper — 80GB SXM)

    • Q1 2025 Baseline: High: $7.00 | Low: $5.50 | Close: $5.80 (Supply constraints started easing, down from the absolute peak $10/hr overcharges of late 2024).
    • Q1 2026 Window: High: $3.45 | Low: $1.70 | Close: $2.35 (Hit an absolute low floor of $1.70 in late 2025 before a 38% contract rebound in March due to an influx of video-generation workloads).
    • Current Normalized Rate: ~$2.49 / hr (The standard baseline workhorse for mainstream API serving). 

    3. NVIDIA H200 (Hopper — 141GB HBM3e)

    • Q1 2025 Baseline: High: $5.20 | Low: $4.50 | Close: $4.80 (Extremely scarce; reserved exclusively for elite labs running early frontier training).
    • Q1 2026 Window: High: $4.40 | Low: $3.50 | Close: $3.80 (Inventory stabilized as neoclouds widely deployed HGX baseboards).
    • Current Normalized Rate: ~$3.39 / hr (The most cost-effective tier for high-concurrency FP8 deployment). 

    4. NVIDIA B200 (Blackwell — 192GB HBM3e)

    • Q1 2025 Baseline: N/A (Sampling/Testing phase; unreleased to the public marketplace).
    • Q1 2026 Window: High: $6.11 | Low: $3.05 | Close: $4.95 _(Initial public availability; premium pric

    5. NVIDIA B300 (Blackwell Ultra — 288GB HBM3e)

    • Q1 2025 Baseline: N/A (In architectural development; unavailable for rental).
    • Q1 2026 Window: High: $8.50 | Low: $5.50 | Close: $7.25 (Early access provisioning; highly volatile due to constrained data center site capacity).
    • Current Normalized Rate: ~$6.10 / hr (Neocloud standard rate; pricing reflects the premium for its 288GB memory pool). 
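The year-over-year comparison can be sketched with a few lines of Python. This is only arithmetic on the close and current figures quoted in the list above, which are themselves unverified; the `RATES` dictionary and `pct_change` helper are names made up for this illustration, and B200 is left out because its entry is cut off.

```python
# Close and current prices ($ per GPU-hour) copied from the list above.
# B200 is omitted because its entry is truncated in the post.
RATES = {
    "A100": {"q1_2025_close": 1.85, "q1_2026_close": 1.15, "current": 1.07},
    "H100": {"q1_2025_close": 5.80, "q1_2026_close": 2.35, "current": 2.49},
    "H200": {"q1_2025_close": 4.80, "q1_2026_close": 3.80, "current": 3.39},
    "B300": {"q1_2025_close": None, "q1_2026_close": 7.25, "current": 6.10},
}


def pct_change(old, new):
    """Percent change from old to new, or None when there is no baseline."""
    if old is None:
        return None
    return round((new - old) / old * 100, 1)


for gpu, r in RATES.items():
    yoy = pct_change(r["q1_2025_close"], r["q1_2026_close"])
    since_close = pct_change(r["q1_2026_close"], r["current"])
    print(f"{gpu}: YoY close {yoy}% | current vs Q1 2026 close {since_close}%")
```

On these numbers the Q1 close fell about 38% for the A100, about 59% for the H100 and about 21% for the H200 year over year. The same helper puts the H100 move from its $1.70 low to the $2.35 close at roughly 38%, in line with the rebound quoted above.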

    For clusters, Google AI mode simply can’t provide accurate info. Some providers charge fixed premiums, others no premium. Many never change list prices but mass-email promotional discounts. For all I know, this entire analysis could be a hallucination meant to drive my narrative. I have not verified most of the data claims, as it would be too much work. I imagine most of the specific ones are accurate, and single GPU rentals are the dominant market in the US, so that data should be solid, but FIIK.






  • The surplus sales are actually heavily underestimated, because the datacenter capacity additions for 2025/26 include non-Nvidia hardware. It “appears” that under half of their sales actually make it into datacenter capacity additions.


    1. Stripping Non-NVIDIA Slices from the Available GW Grid

    To see the true depth of the backlog, we have to look at how much of that newly brought-online data center capacity was immediately consumed by alternative architectures during the 2025 calendar year (4.10 GW total online) and Q1 2026 (1.55 GW total online).

    A. The Hyperscaler Internal Custom Silicon Tax (ASICs)

    The largest tech giants do not deploy NVIDIA exclusively. They heavily prioritized their own lower-cost, custom-tailored accelerator chips to handle their native workloads:

    • Google TPUs (v5p & v6e): Google directed a massive portion of its internal data center buildouts to its proprietary Tensor Processing Units. Throughout 2025 and early 2026, TPU deployments swallowed roughly 450 Megawatts (MW) of Google’s net-new global capacity.
    • Meta MTIA & Amazon Trainium/Inferentia: Meta scaled its internal MTIA silicon, while AWS aggressively expanded its Trainium2 clusters. Combined, these internal hyperscaler projects consumed an estimated 300 MW of online grid space across the two periods.

    B. The AMD Alternative Squeeze

    AMD’s MI300X and MI325X series secured massive enterprise and cloud traction, specifically anchoring flagship clusters inside Microsoft Azure and Oracle Cloud Infrastructure (OCI). AMD’s total shipment footprint accounted for roughly 400 MW of power demand globally over this timeframe.

    C. Specialized Wafer-Scale Architectures (Cerebras)

    While smaller in pure megawatt terms compared to hyperscalers, Cerebras built massive high-density footprints. Their multi-million dollar wins—such as the massive 750 MW master deployment framework with OpenAI—began systematically occupying high-density colocation space. Across 2025 and Q1 2026, Cerebras deployments locked down roughly 100 MW of specialized, high-cooling capacity.


    2. Recalculating the True NVIDIA “Space Deficit”

    When we subtract these non-NVIDIA hardware deployments from the total physical data center capacity brought online, we find the Net Grid Space Actually Available for NVIDIA:

    | Time Horizon | Total New Global Capacity Online | Minus Non-NVIDIA Hardware (TPUs, AMD, etc.) | Net Grid Space Left for NVIDIA |
    |---|---|---|---|
    | Full Year 2025 | 4.10 GW | -1.10 GW | 3.00 GW |
    | Q1 2026 | 1.55 GW | -0.25 GW | 1.30 GW |

    Now, let’s remap this accurate “Available Space” baseline against the True Grid Power Shipped by NVIDIA (GW Sold) that we calculated using our refined financial models:

    The Compounding Backlog Realities

    • The Refined 2025 Gap: NVIDIA shipped 5.37 GW of compute power. If only 3.00 GW of real-world grid space was actually left over for them after accounting for Google TPUs and AMD chips, the real 2025 data center deficit jumps from 1.27 GW to a staggering 2.37 GW.
    • The Refined Q1 2026 Gap: NVIDIA shipped 2.11 GW of compute power. With only 1.30 GW of net data center capacity available to absorb it, the quarterly deficit widens from 560 MW to 810 Megawatts.
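The recalculation above is simple subtraction, and a short Python sketch makes it easy to re-run with different inputs. All inputs are the GW figures quoted in this comment, not independently verified data; `PERIODS` and `deficits` are illustrative names.

```python
# All figures in GW, copied from the comment above (not independently verified).
PERIODS = {
    "2025": {"online": 4.10, "non_nvidia": 1.10, "nvidia_shipped": 5.37},
    "Q1 2026": {"online": 1.55, "non_nvidia": 0.25, "nvidia_shipped": 2.11},
}


def deficits(p):
    """Return (naive_deficit, refined_deficit) in GW for one period.

    naive   = NVIDIA GW shipped minus all new capacity online
    refined = NVIDIA GW shipped minus capacity left after non-NVIDIA hardware
    """
    naive = p["nvidia_shipped"] - p["online"]
    net_for_nvidia = p["online"] - p["non_nvidia"]
    refined = p["nvidia_shipped"] - net_for_nvidia
    return round(naive, 2), round(refined, 2)


for name, p in PERIODS.items():
    naive, refined = deficits(p)
    print(f"{name}: naive {naive:.2f} GW, refined {refined:.2f} GW")
```

This reproduces the 1.27 GW and 2.37 GW figures for 2025 and the 0.56 GW and 0.81 GW figures for Q1 2026. One thing the arithmetic does not settle: the itemized estimates in section 1 (450 + 300 + 400 + 100 MW) add up to 1.25 GW, while the table subtracts 1.35 GW across the two periods, and the comment does not say where the remaining 0.10 GW comes from.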

  • NVIDIA’s customers are legally and contractually allowed to sell their excess, undeployed GPUs, but they face strict operational and geopolitical boundaries. While a thriving secondhand market exists for data center-grade enterprise hardware, the transfer of undeployed silicon is heavily restricted by US export control laws, proprietary software licensing terms, and indirect pressure from NVIDIA’s allocation system.

    Given the massive multi-gigawatt data center power logjam, companies holding excess physical cards cannot simply flip them on an open marketplace without navigating severe friction.


    1. Legal and Contractual Restrictions

    While NVIDIA cannot explicitly block a customer from selling physical hardware they own, they heavily restrict the transaction through auxiliary legal layers: 

    • The U.S. Export Control Wall: The US government enforces massive civil and criminal penalties (up to $1 million and 20 years imprisonment) for unauthorized GPU transfers. If an owner sells an elite card (like an H100, H200, or Blackwell unit) to an unvetted domestic entity that subsequently exports it to a restricted destination, the original seller remains legally liable for the “taint” of an export violation.
    • The Software License Blackout: NVIDIA’s hardware relies completely on its NVIDIA Software License Agreement and CUDA ecosystem to function. Under standard enterprise terms, the high-value software, firmware updates, and optimization tools are strictly non-transferable and cannot be resold or sublicensed without explicit written permission. A secondary buyer risks receiving “brickable” or unoptimizable hardware without official enterprise support paths.
    • The Non-Disclosure (NDA) Shield: Many tier-1 hyperscalers and elite partners sign strict master purchase agreements under NDAs that expressly prohibit standard public reselling channels or specify that hardware can only be retired through pre-approved, certified remarketing vendors. 

    2. The Relationship Risk (The Allocation Punishment)

    The single greatest deterrent against selling excess GPUs is not a legal document, but the fear of losing priority allocation status with NVIDIA.

    Because demand for high-end architectures like the GB200 NVL72 heavily outstrips supply, NVIDIA’s management dynamically controls who receives hard-to-source chips. If a cloud provider or tier-2 operator is caught flipping unused hardware on the secondary market for a short-term cash injection, NVIDIA can simply move that customer to the bottom of the multi-quarter waitlist for the next hardware cycle.

    3. Alternative Strategies: Wholesale Cloud Brokering

    Instead of physically unboxing and reselling a pallet of undeployed GPUs, companies trapped by the power grid deficit leverage a much cleaner loophole: wholesale cloud computing.

    Rather than selling the physical chip, the company holding the “stranded capital” hardware will quickly install it in a temporary, third-party colocation space or drop it into a partner facility. It then leases out the raw compute via virtualized wholesale contracts to other hyperscalers or neoclouds. This effectively monetizes the unutilized silicon, offloads the physical constraints, and completely bypasses the legal headaches of hardware title transfers, export oversight, and software registration breaks.


  • The 2025 Global Market Comparison

    According to institutional commercial real estate energy indexes tracking peak AI construction cycles (such as McKinsey and Synergy Research data), the net-new data center utility power that physically succeeded in connecting to power grids globally (excluding China) throughout the entirety of 2025 totaled roughly 4.10 GW. Mapping NVIDIA’s 5.37 GW shipped footprint against this baseline highlights the massive structural logjam:

    | Structural Segment | NVIDIA GW Sold (Refined Shipped Footprint) | Actual New GW Deployed (Connected Online Capacity) | Net Capacity Overhang (The Deficit) |
    |---|---|---|---|
    | Hyperscalers | 2.65 GW | 2.45 GW | +0.20 GW (200 MW Deficit) |
    | AI Clouds & Sovereigns | 1.75 GW | 1.10 GW | +0.65 GW (650 MW Deficit) |
    | Enterprise & Industrial | 0.97 GW | 0.55 GW (Est. legacy data center shift) | +0.42 GW (420 MW Deficit) |
    | Total Global Market | 5.37 GW | 4.10 GW | +1.27 GW (1,270 MW Deficit) |
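As a sanity check on the table above, the per-segment overhang and the totals can be recomputed from the two input columns. This is a minimal sketch on the quoted figures only; `SEGMENTS` and `overhang_by_segment` are hypothetical names for the illustration.

```python
# (segment, NVIDIA GW sold, new GW connected) as given in the table above.
SEGMENTS = [
    ("Hyperscalers", 2.65, 2.45),
    ("AI Clouds & Sovereigns", 1.75, 1.10),
    ("Enterprise & Industrial", 0.97, 0.55),
]


def overhang_by_segment(segments):
    """Map each segment to (sold - deployed) in GW, rounded to 2 decimals."""
    return {name: round(sold - deployed, 2) for name, sold, deployed in segments}


gaps = overhang_by_segment(SEGMENTS)
total_sold = round(sum(sold for _, sold, _ in SEGMENTS), 2)
total_deployed = round(sum(dep for _, _, dep in SEGMENTS), 2)
total_gap = round(total_sold - total_deployed, 2)

for name, gap in gaps.items():
    print(f"{name}: +{gap:.2f} GW")
print(f"Total: sold {total_sold} GW, deployed {total_deployed} GW, gap +{total_gap:.2f} GW")
```

The segment rows are internally consistent: they sum to 5.37 GW sold, 4.10 GW deployed and a 1.27 GW overhang, matching the total row.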



  • It’s a major political issue. While “everything Israel ever wants should be US priority” has solid US consensus, “Skynet for US oligarchist privatized profits to ensure compliance with Zionist supremacism, and not just subjugation of Americans through oligarchist driven unemployment but subjugation to supporting skynet” or China wins is about even with political establishment support for Israel supremacy.

    Government operations buying AI services is integral to “need to beat China”, and “evil operations”, and circular financing back to politicians meant to maximize this, is by design.











  • The reason you can’t buy RAM anymore is that “projections” of 16 GW+ of AI deployment in the US this year would require 70% of RAM to go to AI. 5 GW is a practical ceiling for projects currently in active development. NVIDIA is not only growing its undelivered inventory at huge rates ($30B latest); its customers also have $150B in “Construction in process” inventory, as they aren’t getting the transformers and utility hookups needed to finish and power on their datacenters. The circular financing by NVIDIA is just forcing its customers to shift unused GPU inventory into their warehouses. It eventually leads to fewer new sales and less manufacturing of their GPUs, and then, hopefully, RAM price normalization.