04 / Lab Capability

AI Compute Stack

GPU-accelerated training and simulation pipelines for RF neural models.

Training neural networks for wireless physical-layer problems is its own discipline. Channel models must be differentiable, or replaced by differentiable surrogates, so that gradients can flow end to end from receiver loss back to transmitter parameters. Datasets must be generated by simulation pipelines rather than crawled from the web. Models must meet strict inference latency budgets for edge deployment. Our compute stack is built for this specific class of problem: wireless deep learning rather than generic deep learning.
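To make the differentiability requirement concrete, here is a minimal sketch in plain Python. It assumes a toy real-valued flat-fading channel with additive Gaussian noise, a transmitter with a single learnable gain, and a squared-error loss. All names (`channel`, `loss_and_grad`, `finite_difference`) are illustrative and not part of our stack. The noise sample is drawn once and held fixed, so the channel output is a smooth function of the transmitter parameter, and the chain-rule gradient can be checked against a finite difference.

```python
import random


def channel(x, h, sigma, eps):
    """Toy flat-fading channel with the noise sample passed in: y = h*x + sigma*eps."""
    return h * x + sigma * eps


def loss_and_grad(a, s, h, sigma, eps):
    """Squared error between the received sample and the sent symbol, plus dL/da."""
    x = a * s                       # transmitter: learnable gain a applied to symbol s
    y = channel(x, h, sigma, eps)   # differentiable in x because eps is held fixed
    err = y - s
    loss = err * err
    # Chain rule through the channel: dL/dy = 2*err, dy/dx = h, dx/da = s
    grad = 2.0 * err * h * s
    return loss, grad


def finite_difference(a, s, h, sigma, eps, step=1e-6):
    """Central-difference estimate of dL/da, used only to check the analytic gradient."""
    lo, _ = loss_and_grad(a - step, s, h, sigma, eps)
    hi, _ = loss_and_grad(a + step, s, h, sigma, eps)
    return (hi - lo) / (2.0 * step)


rng = random.Random(0)
eps = rng.gauss(0.0, 1.0)           # one fixed noise draw, shared by both evaluations
loss, grad = loss_and_grad(a=0.8, s=1.0, h=0.6, sigma=0.1, eps=eps)
numeric = finite_difference(a=0.8, s=1.0, h=0.6, sigma=0.1, eps=eps)
print(abs(grad - numeric) < 1e-6)
```

In practice an automatic-differentiation framework computes this gradient. The point of the sketch is only that every stage between transmitter and loss, including the channel, must expose a usable derivative.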

Hardware Foundation

We operate GPU workstations equipped with NVIDIA RTX 4090 and RTX A6000 cards for model training and EM simulation acceleration. For larger experiments, we use elastic cloud compute from major providers. The configuration is deliberately lean: research-stage compute should match research-stage problem sizes rather than scale prematurely.

Pipelines Built for RF

Standard ML pipelines assume the dataset already exists. In wireless, datasets often must be synthesized with channel simulators (NVIDIA Sionna, MATLAB 5G Toolbox) before training begins. We maintain reproducible pipelines that span simulation, dataset curation, training, evaluation, and export to edge-inference formats (ONNX, TensorRT, TFLite).
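The simulation-first idea can be sketched without any simulator dependency. The example below is a toy stand-in, not a Sionna or 5G Toolbox workflow. It assumes QPSK symbols over a single-tap Rayleigh channel with additive noise, and the function names (`synthesize`, `split`, `detect`) are hypothetical. The seeded generator stands in for the reproducibility requirement: the same seed yields the same dataset on every run.

```python
import math
import random

# Unit-energy QPSK constellation; the list index doubles as the class label.
QPSK = [complex(i, q) / math.sqrt(2) for i in (1, -1) for q in (1, -1)]


def synthesize(num_samples, snr_db, seed):
    """Generate (received, channel, label) triples from a toy Rayleigh + AWGN link."""
    rng = random.Random(seed)              # fixed seed makes the dataset reproducible
    noise_var = 10 ** (-snr_db / 10)       # unit symbol energy, so noise power = 1/SNR
    sigma = math.sqrt(noise_var / 2)       # standard deviation per real dimension
    dataset = []
    for _ in range(num_samples):
        label = rng.randrange(len(QPSK))
        h = complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))
        n = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        dataset.append((h * QPSK[label] + n, h, label))
    return dataset


def split(dataset, train_fraction=0.8):
    """Deterministic train/test split by position."""
    cut = int(len(dataset) * train_fraction)
    return dataset[:cut], dataset[cut:]


def detect(y, h):
    """Classical baseline: zero-forcing equalization, then nearest constellation point."""
    z = y / h
    return min(range(len(QPSK)), key=lambda k: abs(z - QPSK[k]))


data = synthesize(num_samples=1000, snr_db=20.0, seed=42)
train, test = split(data)
errors = sum(detect(y, h) != label for y, h, label in test)
print(len(train), len(test), errors / len(test))
```

The classical detector gives a reference error rate on the held-out split, which is the kind of baseline a learned receiver would be evaluated against before export.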

Edge Deployment Considerations

A model that works in the GPU lab is only half the problem solved. The other half is making it run on a power-constrained edge device within a millisecond-scale latency budget. We invest heavily in model quantization, pruning, and operator fusion. Our deployment targets range from automotive-grade SoCs to low-power IoT MCUs.
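As an illustration of the quantization step, the following sketch applies symmetric per-tensor int8 quantization to a handful of weights. It is a simplified assumption rather than our deployment toolchain: production flows typically also quantize activations and calibrate ranges on representative data. The helper names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0   # avoid division by zero for all-zero tensors
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to floating point to measure the rounding error."""
    return [scale * v for v in q]


weights = [0.82, -1.27, 0.003, 0.5, -0.64]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Because each weight is rounded to the nearest integer code, the reconstruction error per weight is bounded by half the scale. Small weights such as 0.003 can collapse to zero, which is one reason accuracy must be re-evaluated after quantization.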

Key Concepts

Sionna: NVIDIA's open-source library for link- and system-level wireless simulation, optimized for GPU-accelerated training of physical-layer neural networks.

Quantization: Compressing neural network weights from floating-point to lower-precision integer formats to enable efficient inference.

ONNX: Open Neural Network Exchange, a standardized format for representing trained models across frameworks.

Operator Fusion: An inference optimization that combines multiple computational operations into a single kernel to reduce overhead.
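Operator fusion is easiest to see on an elementwise chain. The sketch below uses plain Python lists as a stand-in for tensors and compares three separate passes (scale, bias, ReLU) with a single fused pass. It is only an analogy for what an inference compiler or runtime does with real kernels, and the function names are illustrative.

```python
def unfused(x, scale, bias):
    """Three separate passes, each materializing an intermediate list."""
    scaled = [v * scale for v in x]
    shifted = [v + bias for v in scaled]
    return [max(0.0, v) for v in shifted]


def fused(x, scale, bias):
    """One pass: scale, bias, and ReLU applied per element with no intermediates."""
    return [max(0.0, v * scale + bias) for v in x]


x = [-2.0, -0.5, 0.0, 1.5, 3.0]
print(unfused(x, 2.0, 1.0) == fused(x, 2.0, 1.0))
```

Both versions compute the same result. The fused one reads and writes each element once, which on real hardware translates into fewer kernel launches and less memory traffic.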

