In the race to build faster, more flexible AI infrastructure, the interconnect fabric often becomes the bottleneck. Astera Labs has introduced a solution that challenges the dominance of proprietary technologies like NVLink: a PCIe-based switch that brings surprising capabilities to rack-scale systems. The Astera Labs PCIe switch, known as Scorpio X, delivers 320 lanes of PCIe 6.0 connectivity with 5.12 TB/s of bidirectional bandwidth. But raw throughput is only part of the story: Astera has also embedded in-network compute, optimized multicast operations, and a vendor-agnostic philosophy into the chip.

1. Leveraging PCIe 6.0 for High Bandwidth Without Proprietary Lock-In
PCIe has long been the standard for connecting peripherals inside a server, but it rarely competed with dedicated interconnects like NVLink for scale-up fabrics. Astera Labs changes that by building a massive PCIe switch that can handle the demanding bandwidth needs of dozens of accelerators. The Scorpio X chip packs 320 lanes of PCIe 6.0, each lane running at 64 GT/s, yielding a total bidirectional bandwidth of 5.12 TB/s. That is enough to link many GPUs, CPUs, and storage devices into a single coherent fabric.
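The headline numbers check out with simple arithmetic. PCIe 6.0 runs each lane at 64 GT/s, which works out to roughly 8 GB/s of raw throughput per lane in each direction:

```python
# Back-of-the-envelope check of the Scorpio X bandwidth figure.
LANES = 320
LINE_RATE_GBPS = 64                 # PCIe 6.0: 64 GT/s ~ 64 Gb/s raw per lane, per direction

per_lane_GBps = LINE_RATE_GBPS / 8                   # 8 GB/s per lane, per direction
per_direction_TBps = LANES * per_lane_GBps / 1000    # 2.56 TB/s across all lanes
bidirectional_TBps = 2 * per_direction_TBps          # 5.12 TB/s, matching the spec sheet

print(per_direction_TBps, bidirectional_TBps)  # 2.56 5.12
```

Note this is raw line rate; usable throughput is slightly lower after FLIT-mode encoding and protocol overhead.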
Why does this matter? Nvidia's NVSwitch 6 offers 14.4 TB/s, nearly three times the bandwidth. But NVSwitch requires NVLink support on every accelerator, which locks you into Nvidia hardware. The Astera Labs PCIe switch works with any accelerator that already uses PCIe — which is nearly all of them. For example, Nvidia's RTX Pro 6000 Server cards lack NVLink entirely, so a PCIe switch is the only way to stitch them together. By using standard PCIe, you avoid vendor lock-in and can mix chips from different manufacturers in the same fabric. This flexibility is crucial for data center architects who want to avoid being tied to a single supplier.
Imagine you are designing a cluster for a research lab that uses a mix of Nvidia, AMD, and Intel accelerators. With a proprietary interconnect, you would need custom hardware or bridges. With the Scorpio switch, you simply plug each accelerator into the PCIe fabric, and they can communicate directly. The bandwidth is sufficient for many scale-up workloads, especially inference, where latency matters more than peak throughput.
2. In-Network Compute to Accelerate Collective Communications
Astera hasn’t just built a bigger PCIe switch. It has embedded in-network compute capabilities that are usually found only in specialized switches like Nvidia’s NVSwitch. These capabilities accelerate collective communications — operations like all-reduce, all-gather, and broadcast that are essential for training and inference in large language models.
In traditional setups, when GPUs need to synchronize gradients or share data, they send messages across the network. The switch simply routes packets. But with in-network compute, the switch itself can perform operations on the data as it passes through. For example, during gradient aggregation, the switch can sum the gradients from multiple GPUs and send the result back, reducing the number of messages and the load on the GPUs. This cuts down on GPU idle time waiting for network operations to complete.
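The message-count savings are easy to see in a toy model. The sketch below is illustrative only (Astera has not published its reduction protocol): each GPU sends its gradient once to the switch, the switch sums the contributions, and a single aggregated result goes back to every GPU — instead of every GPU exchanging data with every other GPU.

```python
def switch_allreduce(gradients):
    """Toy in-network all-reduce: the 'switch' sums per-GPU gradient
    vectors and returns one aggregated copy per GPU.
    Traffic: N uplink messages + N downlink messages = 2*N total,
    versus N*(N-1) for a naive all-to-all exchange."""
    total = [sum(vals) for vals in zip(*gradients)]   # reduction happens "in the switch"
    return [list(total) for _ in gradients]

n_gpus = 8
grads = [[float(i), float(i) * 2] for i in range(n_gpus)]
results = switch_allreduce(grads)

print("messages: naive =", n_gpus * (n_gpus - 1), " switch-assisted =", 2 * n_gpus)
```

For 8 GPUs that is 16 messages instead of 56, and the gap widens quadratically as the fabric grows.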
The astera labs pcie switch implements these collective operations in hardware, which is much faster than doing them in software on the GPU. For generative AI inference, where large models are split across many chips, these collective communications happen frequently. By offloading them to the switch, the GPUs spend more time computing and less time waiting. This is especially beneficial for mixture-of-experts (MoE) models, which we will discuss next.
3. Hypercast Multicast for Mixture-of-Experts Efficiency
Mixture-of-experts models have become popular for scaling large language models without proportional compute cost. These models consist of multiple “expert” sub-networks. For each token generated, a router selects a small subset of experts to process it. This means that different experts on different GPUs may be activated for each token, creating a highly dynamic communication pattern.
Standard multicast operations in PCIe switches have limitations: they support a fixed number of groups and cannot change groups on the fly. Astera developed a custom multicast operation called Hypercast specifically to handle the dynamic nature of MoE inference. Hypercast allows the switch to form temporary groups of GPUs for each token, broadcast the input data to only the selected experts, and then collect the results. This is done without involving the host CPU or requiring complex software coordination.
Why is this a big deal? In MoE inference, the communication pattern changes with every token. If the switch cannot adapt quickly, the GPUs must spend extra time reconfiguring the network. Hypercast reduces that overhead, making MoE inference more efficient. For a data center running many concurrent inference requests, this can translate to higher throughput and lower latency. The Astera Labs PCIe switch is one of the first PCIe switches to offer such specialized support for MoE models, giving it an edge over generic PCIe switches.
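The per-token routing pattern can be sketched as follows. This is a hypothetical model of the behavior, not Hypercast's actual wire protocol (which Astera has not published): a router scores the experts for each token, and the switch replicates the activation to exactly that group, with no pre-configured multicast membership.

```python
def route_token(expert_scores, k=2):
    """Pick the k highest-scoring experts for this token (top-k routing)."""
    return sorted(range(len(expert_scores)), key=lambda e: -expert_scores[e])[:k]

def multicast(payload, expert_group, expert_inboxes):
    """Stand-in for the switch replicating data only to the chosen experts,
    rather than broadcasting to a fixed, pre-registered group."""
    for e in expert_group:
        expert_inboxes[e].append(payload)

n_experts = 8
inboxes = {e: [] for e in range(n_experts)}
scores = [0.1, 0.9, 0.3, 0.7, 0.2, 0.05, 0.4, 0.6]   # router output for one token

group = route_token(scores, k=2)                     # experts 1 and 3 win
multicast("token-0 activation", group, inboxes)
```

The key property is that `group` is recomputed for every token; a switch with only static multicast groups would need software intervention each time the selected experts change.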
4. Vendor-Agnostic Flexibility for Heterogeneous AI Fabrics
One of the biggest challenges in building large AI systems is the diversity of accelerators. Different vendors have different interconnects, memory architectures, and software stacks. Astera’s approach with the Scorpio switch is to be vendor-agnostic, using PCIe as the common language. This allows data center architects to mix and match accelerators for disaggregated inference architectures.
For example, you could use one type of accelerator for the compute-heavy prefill phase of inference (processing the input prompt) and a different type for the memory-intensive decode phase (generating tokens one by one). Companies like Groq, AWS, and Cerebras have demonstrated such disaggregated designs, often using Ethernet to connect the chips. But PCIe offers lower latency and higher bandwidth for chip-to-chip communication within a rack. With the Scorpio switch, you can connect a prefill accelerator to a decode accelerator directly over PCIe, without going through a network switch.
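The prefill/decode split described above can be modeled in a few lines. Function names and sizes here are purely illustrative, not a real API: a compute-optimized device builds the KV cache from the prompt, the cache moves across the fabric (standing in for a PCIe peer-to-peer copy), and a memory-optimized device generates tokens against it.

```python
def prefill(prompt_tokens, d_model=8):
    """Compute-heavy phase: build one KV-cache row per prompt token."""
    return [[float(t)] * d_model for t in prompt_tokens]   # stand-in for K/V tensors

def transfer_over_fabric(kv_cache):
    """Stand-in for a direct PCIe peer-to-peer copy between accelerators,
    avoiding a round-trip through the host or an Ethernet switch."""
    return [row[:] for row in kv_cache]

def decode(kv_cache, steps=3):
    """Memory-bound phase: each step reads the entire cache, then grows it."""
    outputs = []
    for _ in range(steps):
        outputs.append(sum(map(sum, kv_cache)))   # touch every cached value
        kv_cache.append([0.0] * len(kv_cache[0]))
    return outputs

cache = transfer_over_fabric(prefill([1, 2, 3]))
tokens = decode(cache, steps=2)
```

The design point is that the two phases stress hardware differently, so letting each run on the accelerator class that suits it, connected over one PCIe fabric, can beat a homogeneous deployment.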
This flexibility also extends to storage and networking. Since PCIe is ubiquitous, you can attach NVMe storage, NICs, and other devices to the same fabric. The Scorpio P-series switches, ranging from 32 to 320 lanes, allow you to scale the fabric according to your needs. All of them work with the COSMOS management suite, which we will cover next. For organizations that want to avoid vendor lock-in while still achieving high performance, the Astera Labs PCIe switch provides a compelling path forward.
5. COSMOS Management Suite for Robust Fabric Monitoring
Building a large PCIe fabric with dozens of switches and hundreds of endpoints introduces new management challenges. Astera addresses this with its COSMOS management suite, a hardware monitoring platform that works with all Scorpio switches. COSMOS provides telemetry on link health, temperature, power consumption, and error rates across the entire fabric. It can detect issues like signal degradation or lane failures before they cause system crashes.
For a data center architect, this is invaluable. Imagine you have 32 GPUs connected through multiple Scorpio switches. If one link starts dropping packets, COSMOS can alert you and even suggest corrective actions, such as adjusting equalization settings or rerouting traffic. The suite also integrates with standard management protocols like Redfish and SNMP, making it easy to plug into existing orchestration tools.
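The alerting logic behind such a workflow is conceptually simple. The sketch below is hypothetical — the metric names and thresholds are invented for illustration, and COSMOS's real telemetry surface is exposed through standard interfaces such as Redfish — but it shows the threshold-based health checking the suite performs across fabric links.

```python
# Invented thresholds for illustration; a real deployment would tune
# these per platform and pull live values from the management interface.
THRESHOLDS = {"temp_c": 95.0, "ber": 1e-9, "corrected_errors": 1000}

def check_link(telemetry):
    """Return an alert string for each metric that crosses its threshold."""
    return [
        f"{metric} = {telemetry[metric]} exceeds {limit}"
        for metric, limit in THRESHOLDS.items()
        if telemetry.get(metric, 0) > limit
    ]

sample = {"temp_c": 71.5, "ber": 3e-9, "corrected_errors": 12}
alerts = check_link(sample)   # only the bit error rate is out of bounds
assert len(alerts) == 1 and alerts[0].startswith("ber")
```

In practice the value is in catching a slowly degrading link (a creeping bit error rate, say) early enough to re-equalize or reroute before a training job fails.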
Astera’s switches are currently sampling, with production expected to ramp in the second half of 2026. That gives infrastructure teams time to evaluate the hardware and prepare their management software. By including a comprehensive management suite, Astera ensures that the switch is not just a piece of hardware but a complete solution for building reliable, high-performance AI fabrics. The Astera Labs PCIe switch thus offers not only raw performance and flexibility but also the operational tools needed to keep it running smoothly at scale.
In summary, Astera Labs has crafted a PCIe switch that goes beyond simple connectivity. It brings in-network compute, specialized MoE support, vendor independence, and robust management to the table. For anyone building next-generation AI infrastructure, this switch deserves a close look.