Hurricane Sim

GCP Hurricane Simulation Scale: Omnibond® Powers Clemson's 2.1M vCPU Record on Google Cloud for Urgent Evacuation Modeling

In the face of natural disasters like hurricanes, every minute counts. Evacuation planning isn't just logistics—it's life-saving computation, processing massive datasets to model traffic flows, predict bottlenecks, and optimize routes in real time. Clemson University's School of Computing tackled this head-on with a groundbreaking simulation on Google Cloud Platform (GCP), analyzing 210 TB of video data from 8,500 traffic cameras over a 10-day disaster scenario. This wasn't a theoretical exercise; it was a proof-of-concept for "urgent HPC" (high-performance computing), demonstrating how organizations can burst to the cloud for time-critical workloads without disrupting on-premises operations.

But what made this run a world record—2.14 million virtual CPUs (vCPUs) across 133,573 concurrent instances, completing in just 4 hours—was Omnibond®'s expertise in hybrid cloud orchestration. Leveraging our advanced technology and TrafficVision application, Clemson achieved an order-of-magnitude leap over their prior efforts, all while keeping costs low and integration seamless. Here's how it came together.

The Challenge: Urgent HPC in a Hurricane's Path

Clemson’s Palmetto Cluster, their on-premises supercomputer, is a powerhouse for day-to-day research. But for urgent simulations like hurricane evacuations, it falls short—reserving resources for one-off bursts means halting other critical work, and scaling to millions of vCPUs on-prem is impractical. The team needed a solution that could:

  • Scale massively: Handle 2 million+ vCPUs to process 2 million hours of video data, detecting accidents, stopped vehicles, and anomalies.
  • Burst affordably: Use preemptible instances for up to 80% discounts without sacrificing performance.
  • Integrate hybrid: Federate with on-prem tools like SLURM for familiar workflows, avoiding vendor lock-in.

Traditional cloud setups require weeks of configuration—YAML hell, API hopping, and compatibility fires. Clemson needed something point-and-click, autoscaling, and optimized for CPU-bound workloads like video analysis.
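A back-of-the-envelope calculation shows why on-prem resources fall short. The per-video-hour compute ratio below is an assumption inferred from the published totals (roughly 6M vCPU hours for roughly 2M hours of video), not a measured TrafficVision benchmark:

```python
# Rough sizing estimate: how many vCPUs are needed to process
# 2 million hours of video within a ~4-hour window?
# The 3.0 vCPU-hours-per-video-hour ratio is an assumption derived from
# the published totals, not a benchmark of TrafficVision itself.

VIDEO_HOURS = 2_000_000          # footage from 8,500 cameras over 10 days
VCPU_HOURS_PER_VIDEO_HOUR = 3.0  # assumed processing cost per hour of video
WALL_CLOCK_HOURS = 4             # target completion time

total_vcpu_hours = VIDEO_HOURS * VCPU_HOURS_PER_VIDEO_HOUR
required_vcpus = total_vcpu_hours / WALL_CLOCK_HOURS

print(f"Total compute: {total_vcpu_hours:,.0f} vCPU hours")
print(f"Sustained fleet needed: {required_vcpus:,.0f} vCPUs")
```

Under these assumptions, a sustained fleet of about 1.5 million vCPUs is required—an order of magnitude beyond any single campus cluster, and well in line with the 2.1M-vCPU peak the run actually reached during ramp-up.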

Omnibond®'s Solution: Advanced Technology and Expert Collaboration

Omnibond® stepped in with our battle-tested tools, honed from years of helping universities and agencies bridge on-prem and multi-cloud environments. Only three systems were required to spin up this leadership-class supercomputer:

  • Omnibond® TrafficVision: Our core application for real-time traffic analysis. Provided free for this experiment, TrafficVision processes video streams from fixed or PTZ cameras, automatically recalibrating for panning/zooming and operating in all weather conditions. It detects stopped vehicles, debris, pedestrians, lane changes, smoke, or fog—essential for evacuation modeling. CPU-bound with a low memory footprint, it benchmarks exceptionally on GCP's N2 general-purpose machines (Intel Xeon Scalable processors), delivering comparable performance to on-prem clusters.
  • Omnibond®'s Advanced Provisioning Engine: Our turnkey HPC provisioning tool, launched from the GCP Marketplace. It automates instance setup, supporting dynamic storage access and hybrid federation across 6 regions. No more manual YAML or API wrangling—it provisions Intel Xeon-based compute on-demand, scaling from zero to 133K+ VMs in hours.
  • CCQ Meta-Scheduler: Our frontend to SLURM, handling autoscaling for massive environments. CCQ prevents denial-of-service from thousands of instances flooding the scheduler, optimizing for spare capacity across GCP, AWS, and Azure. It dynamically allocates to preemptible instances, slashing idle costs while enforcing zero-trust policies.
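The scheduler-protection idea behind CCQ can be illustrated with a toy sketch: rather than letting 133,000+ instances register with SLURM simultaneously, requests are released in bounded batches. This is an illustrative simplification, not Omnibond's actual CCQ code:

```python
# Conceptual sketch of meta-scheduler throttling: release instance
# requests in bounded batches so the backend scheduler (e.g. SLURM)
# is never flooded. Illustrative toy, not Omnibond's CCQ implementation.

from collections import deque

def release_in_batches(instance_requests, batch_size=500):
    """Yield batches of at most batch_size requests at a time."""
    queue = deque(instance_requests)
    while queue:
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        yield batch

requests = [f"vm-{i}" for i in range(133_573)]
batches = list(release_in_batches(requests))
print(f"{len(requests):,} requests released in {len(batches)} batches")
```

The real CCQ layers on top of this basic idea: spot-capacity awareness across clouds, preemptible-instance allocation, and zero-trust policy enforcement.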

Omnibond®'s expertise was key: We collaborated with Clemson to customize their PAW (Provisioning And Workflow) management system for GCP, ensuring workflows federated across on-prem Palmetto and cloud bursts. This hybrid approach allowed Clemson to leverage their existing SLURM jobs without rewrite, turning GCP into an extension of their data center.

Scale and Performance: Breaking Records on GCP

The simulation duplicated some data to test scalability, totaling 6,022,964 vCPU hours—the equivalent of running all 2.1M vCPUs continuously for nearly three hours. Key highlights:

  • Scale: 2.14M vCPUs across 133,573 instances, surpassing Clemson's 2017 AWS record of 1.1M vCPUs.
  • Speed: Completed in ~4 hours, with ramp-up to 967K vCPUs in the first hour and 2.13M after 3 hours.
  • Efficiency: Peak Google Cloud Storage IO of 128 GB/s, using custom-16-16384 machine types for optimal CPU utilization.
  • Hybrid Integration: Federated workflows across 6 GCP regions, mirroring on-prem Palmetto for seamless handoff.
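The storage figures above are internally consistent, as a quick sanity check shows. Decimal units (1 TB = 1,000 GB) are assumed for simplicity:

```python
# Sanity check on the storage numbers: the average throughput needed to
# move 210 TB within the ~4-hour run, versus the reported 128 GB/s peak.
# Assumes decimal units (1 TB = 1,000 GB).

DATA_TB = 210
RUN_SECONDS = 4 * 3600
PEAK_GBPS = 128

avg_gbps = DATA_TB * 1000 / RUN_SECONDS
print(f"Average IO: {avg_gbps:.1f} GB/s (peak reported: {PEAK_GBPS} GB/s)")
```

An average of roughly 15 GB/s against a 128 GB/s peak leaves ample headroom for the bursty IO pattern of a massive ramp-up.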

Brandon Posey from Clemson noted: "We can spin this up at scale wherever we want... By utilizing a solution that has already been tested at a large scale on another cloud provider, we have the added benefit of a solution that can be deployed to multiple cloud providers which unlocks more available resources for urgent processing."

Outcomes: Cost Savings, Insights, and Broader Impact

The run cost just $52,598.64—an average of $0.0087 per vCPU hour, roughly 80% cheaper than on-prem equivalents. Beyond the numbers, it delivered actionable insights: verified models for predictive evacuation planning, usable in real disasters. Clemson proved urgent HPC is viable for any organization—universities, agencies, enterprises—without massive upfront investment.
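The headline cost figure checks out against the published totals:

```python
# Verify the quoted average: total spend divided by total vCPU hours
# should land on the $0.0087-per-vCPU-hour figure.

TOTAL_COST_USD = 52_598.64
TOTAL_VCPU_HOURS = 6_022_964

cost_per_vcpu_hour = TOTAL_COST_USD / TOTAL_VCPU_HOURS
print(f"${cost_per_vcpu_hour:.4f} per vCPU hour")
```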

Professor Amy Apon emphasized: "Organizations do not need to reserve massive amounts of computing or to stop all other on-going work at an organization to process massive amounts of data." The run validated drawing on worldwide spare cloud capacity instead of maintaining millions of idle cores—reassuring for the many evacuations that require far less scale, while keeping extreme scenarios within reach.

Bringing It to projectEureka

This GCP triumph informs projectEureka: Our dashboard fuses Omnibond®'s advanced orchestration with data governance and VDI, making 2M vCPU bursts intuitive across GCP, AWS, Azure, and K8s. No silos—just accelerated AI innovation.

Sources: Based on Google Cloud's blog for core details on the 2.1M vCPU run, scale, and cost, and The Next Platform article for context on urgent HPC, cost efficiency insights, and faculty quotes.
