White Paper

Xeon-D Vs Xeon-E for Embedded Radar Applications

Issue link: https://read.uberflip.com/i/1173442


7 Xeon D Vs E

As of today there are embedded products available with dual server-class Xeon D-1540/1557 and Intel Xeon E5-2648L v3 processors. An illustration of a dual Xeon D versus dual Xeon E architecture is shown in Figure 9.

Figure 9. Dual Xeon D and E architecture comparison

Each board in Figure 9 carries two Xeon processors, and all of these processors have AVX2 arithmetic units. The Xeon E design is derived from a multi-socket high-performance server computer. With twice the memory bandwidth and a high-speed processor interlink, it is clearly designed for higher-performance applications. It can also run in SMP mode (one common OS running on both CPUs) with all 24 cores sharing a common high-speed memory.

The eight-core Xeon D-1540 has 12 MB of cache and the twelve-core Xeon D-1557 has 18 MB. The twelve-core Xeon E5-2648L v3 has 30 MB. Cache size is critical for many applications, so this is a major difference. A larger cache allows a processor to work on larger data sets, and cache size becomes even more important when more cores share a common cache.

The Xeon D has two memory channels per processor and the Xeon E has four. As a result, the Xeon D has half the memory bandwidth of the Xeon E. The effect of the difference in cache size and memory bandwidth is illustrated in Figure 10 (FFT with Intel IPP).

Figure 10. In-place interleaved complex FFT

If the Xeon D were run at the same 1.8 GHz clock speed as the Xeon E (E5-2648L v3) on small FFT sizes, the performance per core would be similar. For larger FFT sizes, performance is limited by cache size and memory bandwidth (memory bound), so using more cores or increasing the core clock speed would not improve it. To clarify this further, Figure 11 shows the performance when the FFT runs on 6 and 12 cores simultaneously, all competing for the same cache and memory resources.

Figure 11. Measured Gflop/s, 6 Vs 12 cores

As shown in Figure 11, for memory-bound functions six Xeon E cores can outperform twelve Xeon D cores.

If we run the full STAP benchmark on both Xeon D and E, the Xeon D keeps up with the Xeon E up to four cores. This is shown in Figure 12.

Figure 12. STAP timing Xeon D Vs E

When compute-bound functions run in isolation they perform similarly on both processors, but the cost of partitioning the full benchmark across cores, and the distributed corner-turns that result, leaves the Xeon D struggling. When using all twelve cores, the benchmark takes more than twice as long on the Xeon D. However, it seems likely that the original software can be redesigned to improve scalability for both Xeon D and E.

Because the Xeon D lacks a direct high-speed interlink (QPI) between the processors, scaling between on-board processors is no better than scaling across boards (over the fabric). On a single board the dual Xeon D processors are interconnected with 16 lanes of PCIe3 at 8 GT/s per lane, giving about 16 GB/s, compared with 38 GB/s of bandwidth for the Xeon E's dual QPI. The result is that moving data between on-board processors takes more than twice as long (38/16 ≈ 2.4).

In applications demanding higher memory bandwidth and more inter-processor communication, the Xeon E's larger cache, doubled memory bandwidth and inter-processor QPI links become even more important. Such applications might be ground-based radar systems with 100+ channels and a higher pulse repetition frequency (PRF), implementing an ordered-statistic CFAR algorithm operating in all three dimensions (range, pulse and channel). In such systems, whilst computational complexity could be somewhat reduced, data movement is more demanding.

Another application where memory bandwidth and data movement are important is higher-resolution Synthetic Aperture Radar (SAR). The analysis of using server-class processors in such applications could be a subject for further studies.
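The cache and interlink figures quoted in this section can be put side by side with a little arithmetic. The following Python sketch is only an illustration, not part of the measured benchmarks. It assumes the cache is shared evenly between all cores of a processor and that a link delivers its full quoted bandwidth with no protocol overhead. The 256 MB block size is a hypothetical value chosen for the example, and the function names are invented here.

```python
# Figures quoted in the text: (cache in MB, cores per processor).
PROCESSORS = {
    "Xeon D-1540": (12, 8),
    "Xeon D-1557": (18, 12),
    "Xeon E5-2648L v3": (30, 12),
}

# Inter-processor link bandwidth on a dual-processor board, in GB/s (from the text).
LINK_GB_PER_S = {
    "Xeon D (PCIe3 x16)": 16.0,
    "Xeon E (dual QPI)": 38.0,
}


def cache_per_core_mb(cache_mb, cores):
    """Average cache share per core, assuming an even split across busy cores."""
    return cache_mb / cores


def transfer_ms(n_bytes, gb_per_s):
    """Ideal time in milliseconds to move n_bytes over a link (no overhead)."""
    return n_bytes / (gb_per_s * 1e9) * 1e3


if __name__ == "__main__":
    for name, (cache_mb, cores) in PROCESSORS.items():
        share = cache_per_core_mb(cache_mb, cores)
        print(f"{name}: {share:.2f} MB cache per core")

    # Hypothetical 256 MB block exchanged during a corner turn (assumed size).
    block_bytes = 256 * 1000**2
    for name, bandwidth in LINK_GB_PER_S.items():
        print(f"{name}: {transfer_ms(block_bytes, bandwidth):.1f} ms")
```

Under these assumptions both Xeon D parts have 1.5 MB of cache per core against 2.5 MB for the Xeon E, and the same block takes 38/16 ≈ 2.4 times longer to cross the Xeon D's PCIe link than the Xeon E's QPI links. Real transfers will be slower on both links once protocol overhead and contention are included.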
