White Paper

Massive Disaggregated Processing for Sensors at the Edge


Low latency assurance is essential at every node

Data bandwidth is not the only area seeing significant increases as technologies evolve; compute architecture is also benefiting from the latency reductions offered by accelerators such as DPUs. These devices combine powerful CPUs, such as Arm cores, directly with high-speed networking and other accelerators to deliver the efficient, low-latency I/O that many edge applications require, removing the traditional CPU bottlenecks that can introduce latency at each node.

New AI-based techniques, like cognitive radar, must operate in near-real time and will add further processing requirements

New, still-evolving application areas are adding further edge processing requirements. For example, cognitive radar applies AI techniques to extract information from a received return signal and then uses that information to improve transmit parameters such as frequency, waveform shape, and pulse repetition frequency. To be effective, the cognitive radar must execute those AI algorithms in near-real time, which in turn requires powerful graphics processing units (GPUs) in the processing chain. (A minimal sketch of this kind of feedback loop appears at the end of this page.)

THE DPU CONCEPT

Data centers have a similar challenge

Fortunately for edge applications, new technology has emerged to address a similar set of challenges across cloud, data center, and edge environments. Whether in the data center or at the edge, high-speed data movement is essential to application efficiency, and that movement demands a significant percentage of all computing cycles if it is managed only by general-purpose CPUs. Sometimes the data in a stream must be processed directly by a CPU, while other streams are directed to storage. Many emerging AI applications operate on continual high-bandwidth data streams that are sent to GPUs, where immense numbers of math operations are executed in parallel. All of these nodes and data streams need security. CPUs, designed for decision-making and general-purpose computing, can be programmed to manage any of those tasks, but they are not optimized for directing data streams, nor for storage and security management.

What a DPU is and what it does

The data center solution for high-speed, low-latency networking is the data processing unit (DPU), a new class of programmable processor that is joining CPUs and GPUs as one of the three pillars of computing. Architecturally, a DPU is a system-on-a-chip (SoC) that combines three elements:

▪ A lightweight, multicore CPU
▪ A high-performance networking interface focused on parsing, processing, and moving data at line-rate speeds (e.g., 400G Ethernet)
▪ Programmable hardware acceleration engines for specific tasks, most especially controlling data storage, implementing security, and improving application performance for AI and machine learning

Working together, these elements allow a DPU to offload, accelerate, and isolate software-defined network connectivity. One function that is particularly important to edge applications is the ability to feed networked data directly to GPUs using direct memory access (DMA), without any involvement by a system CPU; the second sketch at the end of this page contrasts this with the traditional CPU-mediated path. More than just a smart NIC (network interface card), DPUs can be used as standalone embedded processors that

Low latency: eliminates bottlenecks

Figure 1: NVIDIA BlueField DPU
Figure 2: NVIDIA BlueField DPU Block Diagram
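The cognitive radar feedback loop described above can be summarized in a few lines. The following is a minimal, illustrative Python sketch of that loop, not an implementation from this white paper: the names (estimate_interference, TxParams, adapt) and the deadline value are hypothetical placeholders, and the inference step merely stands in for a trained model that would actually run on a GPU.

```python
# Minimal sketch of a cognitive-radar feedback loop (illustrative only).
# All names and values are hypothetical; the "AI" step is a placeholder
# for a GPU-resident model that adapts transmit parameters per dwell.
import time
from dataclasses import dataclass

@dataclass
class TxParams:
    center_freq_hz: float      # carrier frequency
    prf_hz: float              # pulse repetition frequency
    waveform: str              # e.g. "lfm_up", "lfm_down"

def estimate_interference(return_signal):
    """Placeholder for the AI inference step (would run on a GPU)."""
    # Pretend the model reports which sub-band is most congested (0..3).
    return sum(return_signal) % 4

def adapt(params: TxParams, congested_band: int) -> TxParams:
    """Move the carrier away from the congested sub-band and tweak the PRF."""
    band_width = 50e6
    new_freq = 9.5e9 + band_width * ((congested_band + 2) % 4)
    new_prf = params.prf_hz * (1.1 if congested_band == 0 else 1.0)
    return TxParams(new_freq, new_prf, params.waveform)

def radar_loop(deadline_s=1e-3):
    params = TxParams(center_freq_hz=9.6e9, prf_hz=2_000.0, waveform="lfm_up")
    for pulse in range(5):                           # stand-in for a live dwell loop
        start = time.perf_counter()
        returns = [pulse * 3 + k for k in range(8)]  # stand-in for received samples
        band = estimate_interference(returns)        # AI step, inside the latency budget
        params = adapt(params, band)                 # update transmit parameters
        elapsed = time.perf_counter() - start
        assert elapsed < deadline_s, "missed the near-real-time deadline"
        print(f"pulse {pulse}: freq={params.center_freq_hz/1e9:.2f} GHz, "
              f"prf={params.prf_hz:.0f} Hz")

if __name__ == "__main__":
    radar_loop()
```

The point of the sketch is the shape of the loop: sense, infer, adapt the next transmission, all inside a fixed latency budget, which is why the inference step must sit on an accelerator rather than a general-purpose CPU.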
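The difference between a CPU-mediated receive path and the DPU's DMA-to-GPU path can likewise be modeled conceptually. The sketch below is not driver, DOCA, or GPUDirect code; the classes and copy counts are illustrative assumptions meant only to show where the host CPU drops out of the data path.

```python
# Conceptual model of the two data paths described above: a CPU-mediated receive
# (NIC -> host memory -> CPU copy -> GPU memory) versus a DPU/DMA path that
# writes straight into GPU memory. Illustrative only; not real driver code.
class HostCPU:
    def __init__(self):
        self.copies = 0
    def copy(self, buf):
        self.copies += 1          # every packet costs host CPU cycles
        return bytes(buf)

class GPUMemory:
    def __init__(self):
        self.buffers = []
    def write(self, buf):
        self.buffers.append(buf)

def cpu_mediated_path(packets, cpu, gpu_mem):
    """Traditional path: the host CPU touches every buffer before the GPU sees it."""
    for pkt in packets:
        staged = cpu.copy(pkt)    # bounce through host memory
        gpu_mem.write(staged)

def dpu_direct_path(packets, gpu_mem):
    """DPU path: the NIC/DPU DMAs each buffer into GPU memory, no host CPU copy."""
    for pkt in packets:
        gpu_mem.write(bytes(pkt))

if __name__ == "__main__":
    packets = [bytearray(64) for _ in range(1000)]
    cpu, gpu_a, gpu_b = HostCPU(), GPUMemory(), GPUMemory()
    cpu_mediated_path(packets, cpu, gpu_a)
    dpu_direct_path(packets, gpu_b)
    print(f"CPU-mediated: {cpu.copies} host copies for {len(gpu_a.buffers)} buffers")
    print(f"DPU direct DMA: 0 host copies for {len(gpu_b.buffers)} buffers")
```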
