
QsNet High Performance Interconnect


QsNetII overview

QsNetII is the leading high-performance interconnect for supercomputer systems. The combination of high bandwidth, ultra-low latency and scalability has made it the network of choice for many of the world's fastest computer systems. Using QsNetII, multi-teraflop systems can be constructed from commodity compute servers. The technology has been developed from the outset to support the requirements of supercomputer-class systems, with the emphasis on performance, resilience, security and data integrity.



QsNetII is designed to connect servers through high performance PCIe/PCI-X interfaces. It uses parallel copper interconnect to deliver over 900Mbytes/s of user space to user space bandwidth. Optional optical interconnect extends the maximum link length to over 100m. QsNetII uses a 'fat tree' topology, which permits scaling up to 4096 nodes. The nodes themselves typically have multiple CPUs, permitting systems of >10,000 CPUs to be constructed. Multiple parallel QsNetII networks can be employed in a system to maintain the compute to communications ratio where high CPU count SMP nodes are employed.

QsNetII hardware is just one part of a complete family of products for building high performance clusters. Optimized libraries for common distributed memory programming models exploit the full capabilities of the base hardware. The kernel communication layer allows system services to take advantage of the performance of QsNet. This software is available as open source for the Linux platform.


Network Interface Architecture

QsNetII interfaces to the host computer through the industry standard PCIe/PCI-X buses. The architecture of the network interface has been developed to offload the entire task of interprocessor communication from the main processor, and to avoid the overhead of system calls for user process to user process messaging. A DMA transfer between two user processes can be initiated with a short sequence of writes to the network interface, with no requirement for an expensive system call.

Uniquely, QsNetII supports I/O to and from paged virtual memory. This means users can communicate to and from anywhere in their process space without the overhead of copying or locking down pages. QsNetII is designed for use within SMP systems: multiple concurrent processes can use the network interface at any time, without any task switching overhead. Since each client process accesses its own virtual communication processor, each may run its own set of protocols without compromising process to process security.

Data transfer is handled by a DMA engine for message output, and a hardware input packet handler for message receipt. A dedicated I/O processor is provided to offload protocol handling from the main CPU. Local memory on the PCI card provides storage for buffers, translation tables and I/O adapter code. This ensures that all the available PCI bandwidth is dedicated to data communication.

The actual system performance of QsNetII is determined by the PCIe/PCI-X bus bridge implementation of the host system. Performance scales beyond the capacity of a single bus in multi-rail systems, where each PCI segment of an SMP node is connected to an independent QsNetII network. Up to 8 independent rails can be used, provided the base SMP systems have sufficient PCI buses available.

The QsNetII architecture supports 64 bit processor architectures such as the Intel Itanium and AMD Opteron. QsNetII provides full support for 64 bit virtual address translation on all network interface functional units, permitting zero copy transfers across the entire 64 bit address space.

Figure: Elan4 ASIC
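The multi-rail scaling described above can be sketched as a simple calculation. This is only an illustrative model: the 900Mbytes/s per rail and the limit of 8 rails come from the text, while the assumption that bandwidth adds up linearly across rails, and the function name itself, are hypothetical.

```python
# Figures quoted in the text. The linear scaling model is an assumption.
SUSTAINED_MBYTES_PER_RAIL = 900   # user space to user space, per rail
MAX_RAILS = 8                     # upper limit on independent rails


def aggregate_bandwidth_mbytes(rails: int, pci_buses: int) -> int:
    """Estimate per-node bandwidth when each rail has its own PCI bus."""
    if rails < 1:
        raise ValueError("at least one rail is required")
    if rails > MAX_RAILS:
        raise ValueError(f"at most {MAX_RAILS} rails are supported")
    if rails > pci_buses:
        raise ValueError("each rail needs its own PCI bus segment")
    return rails * SUSTAINED_MBYTES_PER_RAIL
```

Under this model a node with two rails on two PCI buses would reach roughly 1800Mbytes/s. Real figures depend on the host bus bridge, as noted above.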



Software Integration

QsNetII is supported under Linux for the Intel® Xeon and Itanium processor families and the AMD Opteron architecture. In addition, QsNet is available for the HP Alpha processor running Tru64 UNIX.

Quadrics MPI provides an optimized implementation of MPI 1.2 that makes full use of the capabilities of the hardware. The Quadrics MPI implementation is based on MPICH from Argonne National Laboratory, with extensions that make use of the broadcast and global operations of the QsNetII network, and the programmable I/O processor on the network interface. A subset of MPI-2 operations providing one-sided communications is also supported.

The Shmem communications library maps its get and put operations directly to the remote read and write hardware primitives, giving access to the basic network read and write operations with minimal overhead. It is also possible to use the Quadrics native communication library, libelan, where portability to other interconnects is not an issue. As each application has its own direct and protected access to the network interface, without going through a traditional protocol stack, it is possible to develop and deploy new communications libraries in one part of the machine without compromising the integrity of other applications running on the machine.

The operating system uses QsNetII through a reliable kernel messaging layer. This supports a range of services such as IP, cluster membership, and bulk data transfer for high performance file systems.


Data Network

The basic component of the QsNetII switch network is an 8 port custom switch ASIC. These can be combined in a 'fat tree' network that scales, in powers of 4, up to many thousands of nodes. Fat tree networks have many properties that make them attractive for a high performance switch fabric. Most importantly, the bisectional bandwidth of the network scales linearly with growth in network size. The topology is also inherently highly resilient, with large amounts of redundancy in the higher levels of the switch.

In this topology packets are routed 'up' the tree to the level from which the destination is reachable. At each stage there are up to 4 alternate up routes. The packet is then routed back down the tree to the destination. As the packet is routed through the network it constructs a fast return path for the packet acknowledge generated by the destination. The 'up' route is selected using adaptive routing, where the packet is routed to the most lightly loaded alternate path. This ensures efficient use of the network, and also routes around any unconnected or disabled links.

The fat tree topology also enables QsNetII to provide an innovative range selected broadcast. In this case the packet is routed up the tree to the point at which the entire broadcast range is reachable. When the packet is routed back down, the switch components automatically copy the message across a range of destinations. The acknowledgements from all the destinations are recombined in the network, so that a broadcast only succeeds when all destinations have been successfully reached. Hardware broadcast allows global operations such as a barrier synchronize to be implemented with excellent scaling behaviour. For further information on optimised collectives see Optimised Collectives on QsNetII.

Figure: Elite4 ASIC
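The routing rules above can be sketched in a few lines. This is a minimal illustration and not the switch's actual logic: it assumes nodes are numbered consecutively with 4 nodes under each bottom stage switch, and all function names are hypothetical.

```python
RADIX = 4  # each switch element has up to 4 links down and 4 links up


def turn_level(*nodes: int) -> int:
    """Stages a packet climbs before all given nodes share one subtree.

    Pass (src, dst) for a normal packet, or (src, first, last) for a
    range selected broadcast over nodes first..last.
    """
    level, span = 1, RADIX
    while len({n // span for n in nodes}) > 1:
        level += 1
        span *= RADIX
    return level


def pick_up_link(loads, enabled):
    """Adaptive routing: choose the least loaded usable up link."""
    candidates = [i for i, ok in enumerate(enabled) if ok]
    if not candidates:
        raise RuntimeError("no usable up link")
    return min(candidates, key=lambda i: loads[i])


def broadcast_succeeded(acks):
    """Recombined acknowledge: success only if every destination agreed."""
    return all(acks)
```

For example, under this numbering nodes 0 and 3 share a bottom stage switch, so `turn_level(0, 3)` is 1, while nodes 0 and 63 only meet at the third stage. A disabled link is never chosen by `pick_up_link`, which mirrors the way adaptive routing avoids unconnected links.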



Stand-alone Systems

Quadrics offers a number of stand-alone switch chassis based on QsNetII technology for 8 to 128 way cluster configurations. The Enterprise-Series (E-Series) combines ultra-low latency and high bandwidth with cost effective configurations, and is targeted at industrial customers as well as research institutions. Potential uses include dedicated ISV codes for industries such as aerospace and automotive. The E-Series is supported under Linux for the Intel Xeon® and Itanium® processor families and the AMD Opteron architecture.

Figure: QS32, 32-port standalone switch



Federated systems

The basic building block of QsNetII switch networks is the QS5A switch chassis. A single chassis can be configured to provide up to 64 ports of switching, implemented as a 3 stage fat tree network. For switches of greater than 64 ports, multiple switch chassis are used in a 'federated' network. "Federated switching" is a packaging solution which enables very large networks to be implemented with two stages of switch chassis. Although the switch is now physically distributed between multiple chassis, this partitioning is not visible to applications, as the basic switch network topology is unchanged. The lower level of switch chassis, the node switches, have 64 'down links' connected to processing nodes and 64 'up links' connecting to higher levels of the switch network. The up links are connected to multiple independent switches, packaged in the same standard switch chassis.

Figure: Configuration of a 1024-way system with 16 node level switches, each serving 64 nodes, and 8 top level switches (full bandwidth).



Network size | Node switch chassis | Top switch size | Top switch chassis
256 | 4 | x4 | 2
1024 | 16 | x16 | 8
4096* | 64 | x64 | 64
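The first two columns of the table follow directly from the 64 down links and 64 up links per node switch chassis. The sketch below reproduces them under one stated assumption that the page does not spell out: in a full bandwidth configuration, each up link of a node chassis goes to a different top level switch, so every top level switch needs one port per node chassis. The function name and the returned keys are hypothetical, and the number of top switch chassis is deliberately not derived.

```python
DOWN_LINKS_PER_NODE_CHASSIS = 64
UP_LINKS_PER_NODE_CHASSIS = 64


def federated_layout(nodes: int) -> dict:
    """Derive node chassis count and top switch size for a network size."""
    if nodes < 1:
        raise ValueError("network size must be positive")
    # Round up so a partly filled chassis still counts as one chassis.
    node_chassis = -(-nodes // DOWN_LINKS_PER_NODE_CHASSIS)
    return {
        "node_switch_chassis": node_chassis,
        # Assumption: one port per node chassis on every top level switch.
        "top_switch_size": node_chassis,
        # Assumption: one independent top level switch per up link.
        "top_switches": UP_LINKS_PER_NODE_CHASSIS,
        "top_level_ports": node_chassis * UP_LINKS_PER_NODE_CHASSIS,
    }
```

For a 1024 node network this gives 16 node switch chassis and x16 top level switches, which agrees with the table row and with the 1024-way configuration described above.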



QsNetII Spec

Bus interfaces | PCI-X 1.0/PCIe 1.0a
Peak bus bandwidth | 1064Mbytes/s
QsNet link width | 10 bits
QsNet line rate | 1.333Gbaud
Sustainable transfer rate | 900Mbytes/s
On chip cache | 32Kbytes D + 16Kbytes I
Local Memory | 64Mbytes ECC DDR SDRAM
Peak Memory Bandwidth | 2.67Gbyte/s
IO processor | 200MHz 64 bit
Physical Addressing | 52 bits
Virtual Address | 64 bit VA, 4K/16K contexts
MMU | 2 x 64 entry TLB + hash table
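The peak bus bandwidth figure can be checked with a short calculation. The sketch assumes the 1064Mbytes/s figure refers to a 64 bit PCI-X bus clocked at 133MHz, as listed for the QM500 adapter. The function names are illustrative only.

```python
PCIX_CLOCK_MHZ = 133        # PCI-X 1.0 bus clock
PCIX_WIDTH_BYTES = 8        # 64 bit bus
SUSTAINED_MBYTES = 900      # sustainable transfer rate from the spec


def peak_bus_mbytes(clock_mhz: int, width_bytes: int) -> int:
    """Peak transfer rate with one bus-width transfer per clock cycle."""
    return clock_mhz * width_bytes


def bus_efficiency(sustained_mbytes: float, peak_mbytes: float) -> float:
    """Fraction of the peak bus rate delivered to user processes."""
    return sustained_mbytes / peak_mbytes


peak = peak_bus_mbytes(PCIX_CLOCK_MHZ, PCIX_WIDTH_BYTES)
efficiency = bus_efficiency(SUSTAINED_MBYTES, peak)
```

Here `peak` is 1064 and `efficiency` is about 0.85, so under this assumption the sustained rate is roughly 85% of the peak bus rate.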



Figures: QM500LP PCI-X adapter, QM509 PCIe adapter, QM700 PCIe adapter



Network Adapters: QM509 (PCIe), QM500 (PCI-X)

QM509/QM500
Processor | Quadrics Elan 4 network processor.
Bus Interface | QM509: x4 PCIe Rev. 1.0a. QM500: 64 bit & 128 bit, 133MHz PCI Bus. PCI-X 1.0
Link Physical Layer | Full duplex 10 bit, 1.3 Gbaud Quadrics QsNetII Link. 900MBytes/s peak each direction, after protocol.
Link Logical Layer | Remote virtual write - Quadrics proprietary.
I/O processor | 200 MHz integrated I/O processor.
DMA processor | Integrated DMA engine. Automatic packetisation and scheduling.
DMA processor | Dedicated input packet processing engine.
Cache | 32KByte on chip d-cache, 16KByte on chip i-cache.
On board Memory | 64MBytes onboard DDR-SDRAM with ECC.
MMU | Dual 128 entry integrated TLB + table walk engine supporting full 64 bit virtual addressing.
Supported OS | Tru64 UNIX, Linux.
Communications libraries | MPI 1.2 + MPI 2.0 remote read and write. Shmem*, kernel messaging & IP.
Physical | Low profile, half length PCI card (167.65mm x 68.90mm). Standard height (111.15mm) faceplate also available.
Power | QM509 (<12W), QM500 (<10W), typical



E-Series Standalone switches

Model | QS8A | QS32 | QS5A-LA
Number of links | 8 ports | 32 ports | 128 ports
Link Physical Layer | Full duplex 10 bit, 1.3 Gbaud Quadrics QsNetII Link. | Full duplex 10 bit, 1.3 Gbaud Quadrics QsNetII Link. | Full duplex 10 bit, 1.3 Gbaud Quadrics QsNetII Link.
Switch architecture | Single stage | 2 stage fully instantiated fat tree. | 3 stage fully instantiated fat tree.
Bisectional Bandwidth | 7.2 GByte/s | 28.8 GByte/s | 57.6 GByte/s
Physical | 2U 19 inch rack mountable (90mm x 43mm x 260mm) | 4U 19 inch rack mountable (180mm x 430mm x 400mm). | 17U 19 inch rack mountable (750mm x 440mm x 510mm).
Power | 40W 120/240 Vac | 180W max 120/240 Vac 50/60Hz | 700W 120/240 Vac 50/60Hz
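The port counts in this table are consistent with a simple model of a fat tree built from 8 port switch elements. The sketch below is an inference from the figures on this page and not a statement of the actual internal wiring: it assumes every stage below the top splits its ports 4 down and 4 up, and that a standalone switch uses all 8 ports of its top stage elements as down links. The function names are hypothetical.

```python
PORTS_PER_ELEMENT = 8
RADIX = PORTS_PER_ELEMENT // 2   # 4 down links per element below the top


def standalone_ports(stages: int) -> int:
    """Node ports of a standalone switch with no up links at the top."""
    if stages < 1:
        raise ValueError("a switch needs at least one stage")
    return PORTS_PER_ELEMENT * RADIX ** (stages - 1)


def down_links_with_uplinks(stages: int) -> int:
    """Down links when the top stage keeps 4 ports free as up links."""
    if stages < 1:
        raise ValueError("a switch needs at least one stage")
    return RADIX ** stages
```

With this model, 1, 2 and 3 stages give 8, 32 and 128 ports, matching the QS8A, QS32 and QS5A-LA columns, while a 3 stage switch that retains its up links has 64 down links, matching the QS5A node switch.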



QS5A 64 Port QsNetII switch

Number of down links | 16 to 64 ports from up to 4 QM501-C 16 port cards.
Number of up links | 16 to 64 ports from up to 4 QM502-C 16 port cards.
Link Physical Layer | Full duplex 10 bit, 1.3 Gbaud Quadrics QsNetII Link.
Switch architecture | 3 stage fully instantiated fat tree.
Bisectional Bandwidth | 58 GByte/s
Physical | 17U 19 inch rack mountable (0.75m x 0.44m x 0.51m).
Power | Dual redundant 1.25KW PSU 120/240 Vac.



Further Information
Product Guide (pdf)
QsNet Brochure (pdf)
Quadrics whitepaper on QsNetII (pdf)
QsNetII performance evaluation (pdf)



Notes

QsNet performance is dependent upon the host PCI interface. Performance figures given in this document are indicative of what can be achieved, but do not represent a commitment for any particular system.
Tru64 UNIX is a registered trademark of Hewlett-Packard.
Linux is a registered trademark of Linus Torvalds.
Quadrics® is a registered trademark of Quadrics Ltd.
All other trademarks are the property of their respective owners.


