Key Insights from GTC: The Future of AI, Quantum Computing, and Data Centers
- giorgio sbriglia
- Mar 25
- 13 min read
Updated: Mar 26

The recent NVIDIA GTC conference delivered a wealth of groundbreaking developments that signal major shifts in the technology landscape. As industry leaders and innovators gathered to showcase cutting-edge technologies, several themes emerged that paint a comprehensive picture of where computing is headed. From quantum breakthroughs to photonic chips, from advanced cooling technologies to philosophical debates about AGI, the event provided deep insights into tomorrow's technological frontier. Here's an expanded analysis of the most significant takeaways.
Quantum Computing: From Skepticism to Hybrid Integration
One of the most revealing shifts at GTC concerned quantum computing. For months prior to the conference, NVIDIA's Jensen Huang had been notably skeptical about quantum technology's timeline, suggesting practical implementation might be 10-20 years away. At the conference, while Jensen remained skeptical about quantum computing's immediate impact, he did retract some of his previous statements - not about the technology's timeline, but acknowledging he was wrong to make such comments about publicly traded quantum computing companies. His attitude as moderator of the quantum computing session continued to reflect a certain skepticism and coolness toward the technology, with one notable exception: his markedly more positive engagement when Peter Shadbolt of PsiQuantum was speaking.
Most quantum computing companies present at GTC acknowledged that the future lies not in standalone quantum computers but in specialized quantum processing units (QPUs) that will work alongside traditional CPUs and GPUs. This hybrid approach appears to be gaining broad consensus within the industry, with only PsiQuantum - notably the only company using photonic quantum technology - advocating for pure quantum solutions across the entire computing stack.
The implications of this shift could be revolutionary. QPUs might excel at specific computational workloads, particularly the gradient descent algorithms that currently consume enormous processing power during AI model training. If successful, this specialization could dramatically reduce the need for the massive GPU clusters currently deployed for training. The consequences would be twofold: companies that have heavily invested in hyperscale clusters might see their assets rapidly depreciate, while demand for GPUs dedicated to training could significantly decline.
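To make the gradient descent point concrete, here is a minimal, illustrative sketch of the kind of iterative update that dominates training compute; the toy model, data, and step count are placeholders, not anything NVIDIA or the quantum vendors showed.

```python
import numpy as np

# Toy linear model trained with plain gradient descent.
# Real AI training repeats this kind of update across billions of parameters
# and trillions of tokens, which is where GPU clusters (and, speculatively,
# future QPUs) spend their time.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))        # synthetic inputs
true_w = rng.normal(size=16)
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(16)
lr = 0.01
for step in range(500):                # each step is one full gradient pass
    grad = X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
    w -= lr * grad                     # parameter update

print(f"final error: {np.mean((X @ w - y) ** 2):.4f}")
```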
It's understandable why this prospect might be concerning for NVIDIA and its GPU business. With their dominant position in AI training hardware, any technology that potentially reduces GPU demand represents a direct threat to a core revenue stream. This context helps explain Jensen's continued skepticism despite allowing quantum computing companies a platform at GTC.
The Strategic Pivot from Training to Inference
Throughout the conference, a strategic shift in focus from model training to inference was unmistakable. This transition appears to be driven partly by market concerns about how recent advancements, such as DeepSeek's allegedly super-efficient training approach, might challenge traditional compute requirements for training. With such innovations potentially reducing the computational intensity of training, there is growing industry concern about diminished demand for high-density GPU clusters. NVIDIA's presentations, particularly Jensen Huang's keynote, emphasized reasoning processes and inference efficiency, highlighting how reasoning-capable models consume significantly more tokens than traditional systems like Llama 70B.
This emphasis serves a dual purpose: it demonstrates that reasoning processes continue to drive substantial computational demand even as training methodologies evolve, while positioning NVIDIA strongly in the inference market that's rapidly gaining prominence. The conference also highlighted how inference requires different optimization strategies than training, with a greater focus on latency, throughput, and cost-efficiency.
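As a rough, back-of-the-envelope illustration of why reasoning-style inference drives token demand, the sketch below compares hypothetical per-request token counts; every figure is an assumption for illustration, not a number from the keynote.

```python
# Illustrative only: compare tokens generated per answer by a model that
# replies directly versus one that emits a long reasoning trace first.
direct_answer_tokens = 200            # hypothetical direct response
reasoning_trace_tokens = 3000         # hypothetical internal reasoning
reasoning_answer_tokens = 200

requests_per_day = 1_000_000          # hypothetical deployment load

direct_total = requests_per_day * direct_answer_tokens
reasoning_total = requests_per_day * (reasoning_trace_tokens + reasoning_answer_tokens)

print(f"direct model:    {direct_total / 1e9:.1f}B tokens/day")
print(f"reasoning model: {reasoning_total / 1e9:.1f}B tokens/day "
      f"({reasoning_total / direct_total:.0f}x more inference work)")
```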
In this evolving landscape, companies specializing in inference optimization like Groq are gaining attention. Previously somewhat overlooked in favor of training-focused companies, Groq and similar inference-centered businesses could see significant growth as the industry reorients toward efficient model deployment and operation rather than just model creation.
Dynamo: NVIDIA's Comprehensive Operating System for Supercomputers
Another major highlight was NVIDIA's Dynamo ecosystem. This comprehensive platform functions essentially as an operating system for supercomputers, analogous to how iOS or Windows serve consumer devices but optimized for extreme computing environments.
Dynamo's positioning builds on NVIDIA's successful experience with CUDA in creating software ecosystems for GPU management and accelerated computing. At its core, NVIDIA Dynamo is an open-source, modular inference framework designed to run AI models across thousands of distributed GPUs with minimal latency. It intelligently handles resource allocation and communication so that even massive AI models (like large language models or "reasoning" AI agents) can operate at full throttle on supercomputers. Key capabilities include:
Dynamic GPU Orchestration: Dynamo can add, remove, or reallocate GPU resources on the fly as workload demands fluctuate, ensuring hardware is neither idle nor overwhelmed.
Smart Request Routing: Incoming inference requests are directed to the GPUs best positioned to serve them - for example, those already holding relevant cached context - minimizing redundant recomputation.
Disaggregated Model Serving: For complex AI models (such as giant language models), Dynamo splits the workload into stages – for example, separating the initial data processing (prefill/context) from the result generation (decode) – and runs each on specialized GPUs (nvidianews.nvidia.com). By tailoring each phase to the best-suited hardware, it maximizes throughput and efficiency (much like an assembly line for AI computations).
Open and Compatible Ecosystem: NVIDIA has made Dynamo fully open-source, and it supports all major AI frameworks and inference engines (PyTorch, NVIDIA TensorRT, vLLM, and more). This broad compatibility means it can plug into existing supercomputing workflows with ease, protecting investments and avoiding vendor lock-in.
Developers and researchers can adopt Dynamo, customize it, or contribute to it, just as one would with an open operating system, fostering a community around extreme-scale AI computing. Sound a bit like a CUDA strategy 2.0? Yes, it does.
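As a conceptual illustration of the disaggregated-serving idea described above (not Dynamo's actual API, which lives in NVIDIA's open-source repository), a minimal sketch might route prefill and decode work to separate worker pools:

```python
# Conceptual sketch of disaggregated inference serving: prefill (context
# processing) and decode (token generation) run on separate GPU pools.
# This is NOT NVIDIA Dynamo's real API; names and structure are hypothetical.
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str

class PrefillPool:
    """GPUs tuned for high-throughput context processing."""
    def run(self, req: Request) -> dict:
        # In a real system this would build the KV cache for the prompt.
        return {"kv_cache": f"<cache for {len(req.prompt)} chars>"}

class DecodePool:
    """GPUs tuned for low-latency token generation."""
    def run(self, kv_cache: dict, max_tokens: int = 64) -> str:
        # In a real system this would stream tokens using the transferred cache.
        return f"generated {max_tokens} tokens from {kv_cache['kv_cache']}"

class Router:
    """Routes each request through the two specialized stages."""
    def __init__(self):
        self.prefill, self.decode = PrefillPool(), DecodePool()

    def serve(self, req: Request) -> str:
        cache = self.prefill.run(req)      # stage 1: prefill
        return self.decode.run(cache)      # stage 2: decode

print(Router().serve(Request("Explain disaggregated serving.")))
```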
This strategic positioning accomplishes several goals simultaneously. It makes users increasingly dependent on NVIDIA's software ecosystem beyond just their hardware, establishing barriers to switching similar to Apple's integrated approach: once users join the ecosystem, moving to Android becomes significantly harder. It also optimizes hardware performance by handling low-level programming tasks that would otherwise fall to programmers and operators, effectively functioning as an abstraction layer that simplifies complex computing operations.
By establishing Dynamo as the de facto operating system for supercomputing environments, NVIDIA strengthens its position throughout the entire computing stack, not just at the hardware level. This software-focused strategy represents a crucial evolution in NVIDIA's business model and reflects the growing importance of integrated solutions in high-performance computing environments.
Revolutionary Photonic Chips: A New Direction in Computing

Perhaps one of the most surprising announcements was NVIDIA's advancement in photonic chip technology, with products scheduled for release between late this year and early 2026. These silicon photonic switches represent a significant technological leap over current optical switching technologies used in data centers.
Unlike current active optical switches that use movable mirrors to redirect network traffic, NVIDIA's technology employs semiconductor-based switches that release photons in response to voltage changes – essentially the optical equivalent of how traditional semiconductors release electrons. This approach eliminates the inefficient transceivers that also limit bandwidth, enabling much more efficient, higher-speed connectivity.
The implications extend beyond mere performance improvements. This technology potentially opens an entirely new market segment for NVIDIA in telecommunications infrastructure, putting them in competition with established players like Qualcomm. For data center operators, these switches promise substantial improvements in both East-West traffic (between servers within a data center) and North-South traffic (between the data center and the outside world).

Notably, these photonic switches will be liquid-cooled, emphasizing the industry's continued push toward more efficient thermal management technologies. The photonic approach also has interesting connections to quantum computing developments, as precise photon control is essential for certain quantum computing architectures.
Vera Rubin Ultra: The Limits of Cooling and Computing Density
The conference showcased NVIDIA's impressive Vera Rubin Ultra system, representing the pinnacle of current GPU computing density. This system features an unprecedented 576 GPUs per rack, consuming up to 600 kilowatts – a power density that would have been unthinkable just a few years ago.
This remarkable density is made possible through revolutionary cooling techniques. The fundamental principle driving this development is simple but challenging to execute: to maximize parallel computing efficiency and minimize latency, GPUs must be physically placed as close together as possible. This proximity generates enormous heat that must be efficiently removed.
The Vera Rubin Ultra addresses this challenge through extensive use of copper and other metals for thermal management, likely pushing rack weights beyond two tons. This increased weight presents its own set of infrastructure challenges for data centers, requiring reinforced flooring and enhanced structural support systems.
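To put the thermal challenge in perspective, here is a back-of-the-envelope estimate of the coolant flow needed to remove 600 kW from a single rack, assuming a water-like coolant and a 10-degree temperature rise; both assumptions are illustrative, not published specifications.

```python
# Back-of-the-envelope coolant flow for a 600 kW rack.
# Assumptions (not published specs): water-like coolant, 10 K temperature rise.
rack_power_w = 600_000          # heat to remove, in watts
cp = 4186                       # specific heat of water, J/(kg*K)
delta_t = 10                    # coolant temperature rise, K

mass_flow = rack_power_w / (cp * delta_t)   # kg/s, from Q = m_dot * cp * dT
litres_per_min = mass_flow * 60             # 1 kg of water is roughly 1 litre

print(f"required flow: {mass_flow:.1f} kg/s (~{litres_per_min:.0f} L/min)")
```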
The performance improvements are expected to be substantial: models trained on Ultra systems equipped with photonic switches are projected to deliver two to three times the performance per megawatt of current technologies. This advancement will significantly impact both training and inference capabilities, enabling the next generation of more sophisticated and capable AI models.
The detailed photos and videos from the conference revealed the sheer engineering complexity of these systems, with intricate cooling channels, substantial copper heat sinks, and densely packed components designed to maximize computational density while managing the resulting thermal load.
The AGI Debate: Technical Achievement vs. Energy Sustainability
One of the more philosophical and thought-provoking discussions at GTC centered around the definition and current status of Artificial General Intelligence (AGI). The debate revealed multiple perspectives on what constitutes "general" intelligence and whether current systems qualify.
Some attendees and speakers argued that models like OpenAI's o1 and o1 Pro have already reached AGI status based on performance metrics – they outperform 99.9% of humans across a wide range of tasks and domains. With reasoning-capable models showing minimal hallucination and demonstrating impressive problem-solving capabilities, these advocates suggest we've already crossed the AGI threshold by performance standards.
However, others raised important counterpoints focused on energy sustainability. While current models may achieve impressive computational results, they do so at enormous energy cost – requiring kilowatts of power compared to the human brain's ability to operate at the energy level of "a sandwich." This energy inefficiency suggests that while we may have achieved the computational aspects of AGI, we haven't yet developed truly sustainable artificial intelligence.
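A rough calculation makes the gap concrete; the brain's roughly 20-watt draw is a commonly cited estimate, while the server-side figure below is a purely illustrative assumption.

```python
# Rough energy comparison: human brain vs. an AI inference server.
# Brain power (~20 W) is a commonly cited estimate; server power is illustrative.
sandwich_kcal = 400
sandwich_joules = sandwich_kcal * 4184

brain_watts = 20
server_watts = 10_000            # hypothetical multi-GPU inference server

print(f"brain runtime on one sandwich: {sandwich_joules / brain_watts / 3600:.1f} hours")
print(f"server runtime on same energy: {sandwich_joules / server_watts:.0f} seconds")
```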
This debate connects directly back to quantum computing developments, as QPUs might eventually help achieve similar performance levels with dramatically lower energy consumption.
Another proposed definition for AGI, attributed to Stefan Schiefer, focused on economic impact – specifically, the ability to increase GDP by 10% without requiring additional human labor. This metric shifts the focus from technical capabilities to practical economic outcomes, though measuring such impacts accurately presents significant challenges.
The Rise of European Neo-Clouds in a Competitive Landscape
A notable development at GTC was the increased European presence in the cloud computing space, particularly in the specialized AI cloud segment. Companies like Nebius and Emskiro have managed to position themselves strategically in this rapidly evolving market, representing progress for Europe, which has traditionally lagged behind its American and Asian competitors.
However, the competitive landscape remains challenging. Many European cloud providers have found it necessary to expand into American markets where demand for AI computing resources remains strongest. This suggests that while Europe is making progress, the center of gravity for AI compute demand continues to be primarily in the US, with growing influences from Asia.
The discussion also revealed limitations in enterprise adoption of specialized AI cloud services. A significant portion of computing demand still comes from research labs and specialized AI development companies rather than traditional enterprises. The industry continues to work on effectively monetizing enterprise clients, with the largest buyers of GPU compute time continuing to be companies that resell AI services at the retail level rather than end-users.
The impending availability of GB200 GPUs is creating strategic calculations throughout the neo-cloud market. With these more powerful chips becoming available, providers need to carefully consider their cooling infrastructure, with liquid cooling becoming increasingly essential for competitive operation. Additionally, pricing strategies will be crucial, as enterprise customers typically request quotes from multiple providers and are highly price-sensitive.
Nordic data centers appear particularly well-positioned in the European market due to their natural cooling advantages, potentially making them the most attractive locations for AI compute services in Europe, especially as power density and cooling requirements continue to increase.
Physical AI and Robotics: Bridging Digital and Physical Worlds
NVIDIA highlighted robotics and physical AI as a major emerging trend that represents the next frontier in artificial intelligence. The current focus within this field is on creating sophisticated digital training environments to accelerate development, as real-world training happens at the inherently limited speed of physical reality.
Companies are developing advanced simulators that can generate synthetic training datasets at accelerated rates, allowing for much faster training cycles than would be possible with physical robots in real environments. This approach aims to achieve for robotics what large language models have achieved for natural language processing – the ability to train on vast amounts of data to develop generalizable capabilities.
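A toy sketch of the accelerated-simulation idea follows; it is purely illustrative and bears no relation to any real simulation platform shown at GTC.

```python
# Toy illustration of why simulation accelerates robot training:
# a simulated environment can be stepped far faster than real time,
# generating many synthetic trajectories per wall-clock second.
import random
import time

def simulate_episode(steps: int = 100) -> list[tuple[float, float]]:
    """Generate one synthetic (state, action) trajectory with randomized physics."""
    friction = random.uniform(0.1, 0.9)       # simple domain randomization
    return [(t * friction, random.random()) for t in range(steps)]

start = time.time()
dataset = [simulate_episode() for _ in range(10_000)]   # 10k synthetic episodes
elapsed = time.time() - start

print(f"generated {len(dataset)} synthetic episodes in {elapsed:.2f}s "
      f"(a physical robot might need minutes per episode)")
```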
The conference showcased diverse approaches to robotics platforms. Chinese manufacturer Unitree appears particularly well-positioned with economical humanoid robots priced around $10,000-16,000 and offering approximately 4 hours of autonomy. Their open-stack approach follows an important trend toward creating open platforms where various software solutions can be integrated, similar to how Android became a dominant mobile platform by providing an open ecosystem for app developers.
A particularly interesting development was the joint venture between NVIDIA, Disney, and Google DeepMind to create a Star Wars-inspired robot. This collaboration smartly combines NVIDIA's hardware expertise, Google DeepMind's AI development capabilities, and Disney's cultural influence to create robots that might gain greater public acceptance than more intimidating designs like those from Boston Dynamics. This approach recognizes that social acceptance will be as important as technical capability in driving robotics adoption.
The discussion also touched on how robot intelligence will likely be distributed, with System 1 (fast, instinctive processing) operating locally on the robot while System 2 (slower, more deliberative reasoning) might remain in data centers for the foreseeable future due to energy and processing constraints. This hybrid approach mirrors the discussions around quantum computing integration.
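A schematic sketch of that split might look like the following; the structure and names are hypothetical, not any vendor's actual robot stack.

```python
# Schematic sketch of distributed robot intelligence: System 1 runs locally
# at high frequency; System 2 is consulted occasionally, standing in for a
# slower reasoning service that would live in a data center.
import random

def system1_reflex(sensor_reading: float) -> str:
    """Fast, local, instinctive control (runs every cycle on the robot)."""
    return "brake" if sensor_reading < 0.2 else "continue"

def system2_plan(goal: str) -> list[str]:
    """Slow, deliberative planning (would run remotely in a data center)."""
    return [f"step {i}: work toward '{goal}'" for i in range(3)]

plan = system2_plan("deliver package")        # infrequent remote call
for cycle in range(5):                        # high-frequency local loop
    distance = random.random()
    print(f"cycle {cycle}: sensor={distance:.2f} -> {system1_reflex(distance)}")
print("current plan:", plan)
```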
Data Center Efficiency: Debunking PUE Marketing Claims
The conference sparked considerable discussion about Power Usage Effectiveness (PUE) in data centers, with some companies making what appeared to be unrealistically low PUE claims of around 1.02. These marketing claims often result from defining PUE narrowly at the computer level rather than comprehensively at the data center level.
A true data center PUE must account for all accessory systems, including uninterruptible power supplies (UPS), electrical cables, bus bars, voltage losses, and cooling systems. Even with the most efficient liquid-to-liquid end-to-end cooling systems, physics dictates a minimum overhead of at least 2-3%. When factoring in additional unavoidable losses, realistic minimum PUEs start around 1.04 (roughly 4% overhead).
The situation becomes much worse for air-based cooling systems due to their inherently lower density and efficiency compared to liquid cooling. During summer months when ambient temperatures rise, such systems typically require chillers or air conditioners that consume substantial energy, further degrading overall efficiency.
The discussion highlighted the importance of understanding exactly what components are included in PUE calculations when evaluating marketing claims. For data center operators and customers, the PUE ultimately reflects the total electricity costs per unit of IT power – how many megawatts must be paid for on the total electric bill for each megawatt of actual IT equipment power.
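As a concrete example of the full-facility calculation, the sketch below adds illustrative overhead fractions on top of the IT load; the individual loss figures are assumptions for the example, not measurements from any operator.

```python
# Illustrative full-facility PUE calculation.
# Loss fractions are assumptions for the example, not measured values.
it_power_mw = 10.0                   # actual IT equipment load

overheads = {
    "cooling (liquid-to-liquid)": 0.03,
    "UPS conversion losses":      0.02,
    "cables, bus bars, voltage":  0.01,
    "lighting and ancillaries":   0.005,
}

total_facility_mw = it_power_mw * (1 + sum(overheads.values()))
pue = total_facility_mw / it_power_mw

print(f"total facility power: {total_facility_mw:.2f} MW")
print(f"PUE: {pue:.3f}")   # ~1.065 with these assumptions, well above 1.02 claims
```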
GB200 Deployment: Power, Performance, and Industry Transformation

The industry is now eagerly awaiting the widespread deployment of NVIDIA's GB200 GPUs, with the first operational installations having occurred just prior to the conference. There are already concerns about their power profiles, with reports suggesting power peaks might reach 200kW rather than the stated 134kW specification.
This has implications for uninterruptible power supplies (UPS) and power management systems, as they may need to perform "peak shaving" to manage these higher-than-specified power demands. The situation has reportedly caused concern among infrastructure providers like Schneider Electric who must account for these peaks in their system designs.
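A simple sizing exercise shows why the gap matters; only the 134 kW specification and the roughly 200 kW peaks come from the reported discussion, while the rack count is a hypothetical deployment.

```python
# Illustrative sizing impact of power peaks above the rated specification.
# Only the 134 kW spec vs. ~200 kW peak come from the reported discussion;
# the rack count and headroom policy are hypothetical.
rated_kw_per_rack = 134
peak_kw_per_rack = 200
racks = 50                            # hypothetical deployment

rated_total = rated_kw_per_rack * racks
peak_total = peak_kw_per_rack * racks

print(f"power budget at spec: {rated_total / 1000:.1f} MW")
print(f"power budget at peak: {peak_total / 1000:.1f} MW")
print(f"extra headroom or peak shaving needed: {(peak_total - rated_total) / 1000:.1f} MW "
      f"({(peak_total / rated_total - 1) * 100:.0f}% above spec)")
```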
The technological leap from the Hopper architecture to GB200 has been enabled largely by advanced liquid cooling systems that allow cards to be placed closer together, reducing communication latency between GPUs. This proximity is expected to drive significant performance improvements in large language models and other AI workloads, with some estimates suggesting performance gains of 2-3x per megawatt of power.
This rapid advancement raises strategic questions for companies that have invested heavily in current-generation hardware. For instance, Elon Musk's xAI reportedly has 100,000-200,000 H100 GPUs in its "Colossus" cluster, which may face competitive disadvantages compared to newer installations built with GB200s. These considerations will influence investment decisions throughout the AI industry as companies weigh the benefits of upgrading against the substantial costs involved.
The discussion also touched on international competition, particularly noting the presence of Chinese engineers and executives at the conference. Despite export restrictions on advanced AI chips to China, the transfer of knowledge through professional interactions remains difficult to control, raising questions about how effective technology restrictions can be in practice.
Enterprise AI Adoption: Reducing Development Barriers

The conference revealed important progress in tools designed to accelerate enterprise AI adoption. One significant development has been the growth of the NVIDIA NeMo ecosystem, which now offers numerous "blueprints" or templates that significantly reduce the development time required to implement AI solutions.
These blueprints address a key barrier to enterprise adoption: the substantial capital expenditure required for customized software development to implement AI pipelines. By providing pre-built, customizable templates, NVIDIA is helping reduce the time and cost of implementing AI systems, potentially accelerating enterprise adoption.
The sessions focused heavily on agents and Retrieval-Augmented Generation (RAG) frameworks, suggesting these are becoming the primary approaches for enterprise AI implementation. There's also growing interest in collaborative frameworks that allow different agents to work together even when built with different underlying libraries.
NVIDIA has developed its own framework for collaborative agents that remains agnostic regarding the underlying libraries, allowing developers to mix technologies like LangChain and CrewAI behind common interfaces. This interoperability will likely prove important as the AI ecosystem continues to diversify.
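As a minimal sketch of what a library-agnostic agent interface can look like (hypothetical code, not NVIDIA's actual framework and not the real LangChain or CrewAI APIs), a common protocol lets agents built on different libraries cooperate:

```python
# Hypothetical sketch of a library-agnostic agent interface.
# This is not NVIDIA's framework and not the real LangChain/CrewAI APIs;
# it only illustrates how a common protocol lets agents from different
# libraries cooperate.
from typing import Protocol

class Agent(Protocol):
    name: str
    def act(self, task: str) -> str: ...

class ResearchAgent:
    """Stand-in for an agent built with one library (e.g. a LangChain-style chain)."""
    name = "researcher"
    def act(self, task: str) -> str:
        return f"[{self.name}] findings about: {task}"

class WriterAgent:
    """Stand-in for an agent built with a different library (e.g. CrewAI-style)."""
    name = "writer"
    def act(self, task: str) -> str:
        return f"[{self.name}] draft based on: {task}"

def run_pipeline(agents: list[Agent], task: str) -> str:
    """Pass the task through each agent, whatever library it was built with."""
    result = task
    for agent in agents:
        result = agent.act(result)
    return result

print(run_pipeline([ResearchAgent(), WriterAgent()], "summarize GTC takeaways"))
```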
The LLMaaS (Large Language Models as a Service) trend is also gaining momentum, with companies adding this capability to expand their value chain. Those offering competitive pricing in this segment appear to be gaining significant market share, suggesting price sensitivity remains high even for advanced AI services.
Another interesting observation was the ability to use external GPUs with services like AWS, reflecting a maturing market where interoperability and flexibility are becoming increasingly important.
Industry Transformation: From Developer Conference to Business Summit
A meta-observation about GTC itself revealed how the conference has transformed from a developer-focused technical event to a business-centric summit. This evolution reflects the maturing AI industry as it moves from research and development phases toward widespread commercial deployment.
Looking Ahead: Predictions and Industry Direction
As the industry moves forward, several predictions emerged from the GTC discussions:
The technological leap enabled by advanced cooling in the GB200 architecture will drive significant performance improvements in large language models, with expected gains of 2-3x per megawatt.
European cloud providers will continue gaining ground but will need to expand beyond European markets to achieve scale.
Robotics and physical AI will emerge as the next major frontier, with open platforms and developer ecosystems becoming crucial differentiators.
Enterprise AI adoption will accelerate as pre-built templates reduce development costs and time-to-implementation.
Cooling technology and energy efficiency will become even more critical differentiators as power densities continue to increase.
International competition in AI development will intensify despite export restrictions, with knowledge transfer occurring through professional interactions.
Conclusion
GTC provided a comprehensive view of an industry in rapid transformation. From the integration of quantum processing units to the rise of photonic computing, from unprecedented cooling challenges to philosophical debates about AGI, the conference revealed an industry grappling with both extraordinary technological opportunities and significant challenges.
The shift from training to inference, the emergence of specialized processing units for specific workloads, and the growing importance of software ecosystems all point to an increasingly mature and specialized AI landscape. As these technologies continue to develop and converge, we can expect further disruption across industries and continued evolution in how computing resources are designed, deployed, and utilized.
For businesses and technologists alike, staying informed about these rapidly evolving trends will be essential for making strategic decisions in an environment where today's cutting-edge technology may become tomorrow's legacy system at an unprecedented pace.