A working demonstration of NVIDIA’s three-computer architecture: Edge capture on NVIDIA Jetson Thor, simulation on NVIDIA DGX Spark, and open foundation models trained on NVIDIA DGX Systems

For Immediate Release

San Jose, California, March 16, 2026 — At NVIDIA GTC 2026, Connect Tech and CTai LABS are running a live Physical AI pipeline that streams from real cameras, tracks people in 3D, builds a synchronized digital twin of the scene, and uses AI reasoning models to detect and log critical events, all in real time, all at the Edge. The demonstration closes a loop that Physical AI developers have historically had to stitch together from separate tools: vision analytics, simulation, cloud connectivity, and model iteration now run as a single continuous workflow at the Edge.

The workflow incorporates NVIDIA’s three-computer architecture. Real-world data is captured by D3 Embedded cameras connected to the Connect Tech Anvil-T5 Edge System, which is powered by NVIDIA Jetson Thor. The Edge system tracks 3D body poses and localization data in real time using NVIDIA Metropolis workflows. This data streams locally to an NVIDIA DGX Spark desktop supercomputer, where NVIDIA Isaac Sim renders a live digital twin of the physical scene, populating it with digital human representations derived directly from the Edge sensor data.

“Running this entire workflow locally, we are enabling a simulation-first AI approach directly at the Edge,” said Doruk Sonmez, AI Solutions Architect, CTai LABS. “To truly scale Physical AI, developers need hardware that can keep up with the evolution of foundation models and simulation. Our demonstration at GTC shows that, with NVIDIA’s three-computer architecture, teams can transition their AI workloads from the DGX Spark Digital Twin directly into rugged, real-world environments.”

Inside that simulation, CTai LABS’ Scene Analyzer Agent, built on NVIDIA’s Video Search and Summarization (VSS) Blueprint, connects to a virtual camera that can be repositioned on demand within the Digital Twin. The agent uses NVIDIA Cosmos Reason2, an open reasoning vision-language model that currently ranks first on the Physical Reasoning Leaderboard on Hugging Face, to interpret the scene and identify events of interest. NVIDIA Nemotron LLMs then summarize those findings and store them in a local database with spatio-temporal relationships, making the event history queryable and available for actionable insights.

Once model policies are validated in simulation, they are redeployed to the Anvil-T5 with NVIDIA Jetson Thor T5000 for real-world Physical AI applications. This closes the Real2Sim loop: the physical world informs the simulation, and the simulation improves what runs in the physical world.

Why This Matters

Physical AI development has a Real2Sim-Sim2Real gap problem. Models trained or tested in simulation often behave differently in the real world because the simulation was built from static assets, not live sensor data. This demonstration inverts that approach: the Digital Twin is continuously updated from real-world camera feeds, so the simulation reflects, at low latency, what is actually happening in the physical environment.

The entire workflow runs locally, with no cloud dependency. NVIDIA DGX Spark handles simulation, AI inference, and event logging. NVIDIA Jetson Thor T5000 on the Anvil-T5 System handles production Edge compute. This is relevant to robotics, industrial monitoring, and autonomous systems teams that operate in environments where cloud connectivity is limited or workloads are latency-sensitive.

The Three Compute Layers

Edge: Connect Tech Anvil-T5 with NVIDIA Jetson Thor T5000 captures live scene data via D3 Embedded cameras, runs NVIDIA Metropolis workflows, streams spatial data to NVIDIA DGX Spark, and runs the CTai LABS Scene Analyzer Agent for on-device video search and summarization.

Simulation: NVIDIA DGX Spark (Grace Blackwell GB10, up to 1 petaFLOP of FP4 AI performance, 128 GB unified memory) hosts NVIDIA Isaac Sim with a custom CTai LABS extension that loads Digital Twins built in NVIDIA Omniverse or generated via Gaussian Splatting. It ingests live Edge data to spawn and animate digital human representations, hosts the Scene Analyzer Agent connected to a virtual in-sim camera, and stores event summaries and spatio-temporal relationships in a local database.

Foundation Models: Trained on NVIDIA DGX Systems and released openly, NVIDIA Cosmos Reason2 (2B and 8B variants) provides vision-language reasoning with physics and spatio-temporal awareness, while NVIDIA Nemotron LLMs handle summarization and agentic reasoning. Both are deployed across the Edge and simulation layers.

The CTai LABS Physical AI Evaluation Kit at NVIDIA GTC 2026 represents a significant step forward for engineering teams building autonomous and industrial AI systems. By combining Connect Tech’s production-proven Anvil-T5 System with NVIDIA’s full Physical AI stack, CTai LABS delivers a validated, hardware-ready pipeline that reduces the friction between AI development and real-world deployment. Teams working in robotics, industrial monitoring, and autonomous systems can move from concept to field deployment faster, with simulation grounded in live sensor data and Edge inference powered by the same foundation models used in training. Visit CTai LABS at Booth 1641 at NVIDIA GTC 2026 in San Jose to see the full Real2Sim pipeline running live.

For media inquiries or interviews, please contact:

###

About Connect Tech

Founded in 1985 and headquartered in Guelph, Canada, Connect Tech Inc. (CTI) is a global leader in rugged edge AI computing platforms engineered for robotics, autonomous systems, and mission-critical applications. For over 40 years, CTI has designed and manufactured in-house, delivering reliable edge AI architectures that unify compute, vision, networking, and thermal technologies to accelerate customers’ time to market in demanding real-world environments. Through its CTai LABS AI Engineering team, CTI transforms its EdgeAI Stack into deployable, production-ready physical AI solutions at scale. For more information, visit connecttech.com.