What Every Programmer Should Know About Memory
Ulrich Drepper
Red Hat, Inc.
drepper@redhat.com
November 21, 2007


2.3 Other Main Memory Users

Besides the CPUs, other system components can access the main memory. High-performance cards such as network and mass-storage controllers cannot afford to pipe all the data they need or provide through the CPU. Instead, they read or write the data directly from or to the main memory (Direct Memory Access, DMA). In Figure 2.1 we can see that the cards can talk through the South- and Northbridge directly with the memory. Other buses, like USB, also require FSB bandwidth, even though they do not use DMA, since the Southbridge is connected via the Northbridge to the processor through the FSB, too.

While DMA is certainly beneficial, it means that there is more competition for the FSB bandwidth. During periods of high DMA traffic the CPU might stall more than usual while waiting for data from the main memory. There are ways around this given the right hardware. With an architecture as in Figure 2.3 one can make sure the computation uses memory on nodes which are not affected by DMA. It is also possible to attach a Southbridge to each node, distributing the load on the FSB equally across all the nodes. There are myriad possibilities. In Section 6 we will introduce techniques and programming interfaces which help achieve the improvements that are possible in software.

Finally, it should be mentioned that some cheap systems have graphics systems without separate, dedicated video RAM. Those systems use parts of the main memory as video RAM. Since access to the video RAM is frequent (for a 1024×768 display with 16 bpp at 60 Hz we are talking about 94 MB/s) and system memory, unlike RAM on graphics cards, does not have two ports, this can substantially influence the system's performance and especially the latency. It is best to ignore such systems when performance is a priority. They are more trouble than they are worth. People buying those machines know they will not get the best performance.
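The 94 MB/s figure can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the original text: the function name `scanout_bandwidth` and its parameters are hypothetical, and it assumes the frame buffer is read exactly once per refresh with no blanking overhead or compression.

```python
def scanout_bandwidth(width, height, bits_per_pixel, refresh_hz):
    """Bytes per second read from the frame buffer.

    Assumption (illustrative only): the whole frame is fetched
    exactly once per refresh cycle.
    """
    bits_per_second = width * height * bits_per_pixel * refresh_hz
    return bits_per_second // 8


# The display mode quoted in the text: 1024x768, 16 bpp, 60 Hz.
rate = scanout_bandwidth(1024, 768, 16, 60)
print(f"{rate} bytes/s")
print(f"{rate / 1e6:.1f} MB/s (decimal megabytes)")
print(f"{rate / 2**20:.1f} MiB/s (binary megabytes)")
```

With decimal megabytes the result rounds to the 94 MB/s quoted above. This is constant read traffic that competes with the CPU for the same single-ported system memory.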


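Besides the CPU, other system components can access memory. High-performance network cards and mass-storage controllers cannot afford to route their data through the CPU. Instead, they use DMA (Direct Memory Access) to read or write memory directly, bypassing the CPU. In Figure 2.1 we can see that devices communicate with memory through the South- and Northbridge. Other buses, such as USB, also consume FSB bandwidth: even though they do not use DMA, the Southbridge reaches the processor via the Northbridge and the FSB.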

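Using DMA is very beneficial, but it means there is competition for FSB bandwidth. When DMA traffic is heavy, the CPU sees higher latency than usual because it has to wait for data from memory. The right hardware can address this problem. In the architecture of Figure 2.3, a CPU that uses its directly attached local memory is not affected by DMA on other nodes. We can also attach a Southbridge to each CPU node so that the FSB load is balanced across all nodes. Many technical approaches are possible. Section 6 introduces techniques and programming interfaces that help achieve these improvements at the software level.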

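Finally, some cheap systems deserve a mention: their graphics systems have no separate, dedicated graphics memory. These systems use part of the main memory as graphics memory. Because graphics memory is accessed frequently (a 1024×768, 16 bpp, 60 Hz display mode needs a transfer rate of 94 MB/s), and because system memory, unlike the memory on graphics cards, does not have two ports, this affects system performance and especially latency. When performance matters, it is best to avoid such systems. The problems they create far outweigh their value. People who buy them should know that they will not get the best performance.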
