===== Memory and PCIe Bandwidth =====
For PCIe bandwidth, there is a substantial difference between the 2 local GPUs and the 2 GPUs local to the other CPU in the same node.
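
The GPU bandwidth numbers come from the CUDA ''bandwidthTest'' sample. A minimal sketch of such a local-vs-remote comparison, assuming ''bandwidthTest'' is built from the CUDA samples and ''numactl'' is installed; which GPU indices sit behind which CPU is system-dependent (check with ''nvidia-smi topo -m''), so the device numbers below are examples:

  # Pin the process (and its host buffers) to CPU/NUMA node 0, then test a GPU on the same socket...
  numactl --cpunodebind=0 --membind=0 ./bandwidthTest --device=0 --memory=pinned
  # ...and a GPU attached to the other CPU, so the transfer crosses the inter-socket link
  numactl --cpunodebind=0 --membind=0 ./bandwidthTest --device=2 --memory=pinned
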
===== Infiniband =====
Each ''
==== IPoIB: TCP and UDP ====
An application that uses the Infiniband (''
We used the ''
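
As a hedged sketch (the exact tool and options used here are cut off above), IPoIB throughput between two nodes can be measured with ''iperf'', binding to the IPoIB addresses of the nodes; the address, stream count, and rates below are placeholders:

  # On the receiving node: start an iperf server on its IPoIB address
  iperf -s -B 10.0.0.2
  # On the sending node: TCP test, 4 parallel streams, 30 seconds
  iperf -c 10.0.0.2 -P 4 -t 30
  # UDP test with an explicit target rate (raise -b to probe the UDP limit)
  iperf -c 10.0.0.2 -u -b 1000M -t 30
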
RDMA (Remote Direct Memory Access) allows an application to directly access memory on another node. Although some initial administration is set up via the OS kernel, the actual transfer commands and completion handling do not go via the kernel. This also avoids data copies on sender and receiver, and reduces CPU usage.
Typical applications that may use RDMA are applications that use MPI (Message Passing Interface), such as COBALT, or (hopefully) the LUSTRE client. NFS can also be set up to use RDMA. You can program directly into the ''
We used the ''
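
A minimal sketch of an RDMA bandwidth measurement with ''ib_write_bw'' from the ''perftest'' package (whether this is the exact tool used here is not visible above; the host name and message size are examples):

  # On the server node: wait for an incoming RDMA write bandwidth test
  ib_write_bw
  # On the client node: run the test against the server with 64 KiB messages
  ib_write_bw -s 65536 drg02
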
===== 10 Gbit/s Ethernet =====
The ''
Total size: 14x 18982895616 bytes = 247.5087890625 GiByte
=> 7.2 Gbit/s (14 scp streams (idle sys), no dynamic load-balancing)
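
A minimal sketch of how such a multi-stream transfer can be launched from the shell; the file names and destination host are hypothetical and the actual command lines used are not shown above:

  # One scp stream per file, started in the background; static assignment, no load balancing
  for f in /data1/L*.MS.tar; do
    scp "$f" dragproc:/data/ &
  done
  time wait   # elapsed time is set by the slowest stream
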
+ | |||
+ | |||
+ | ====== Storage ====== | ||
+ | |||
+ | Only rough write tests have been done with a sequential dd(1). Disk I/O bandwidth changes across the platters. Actual file I/O also depends on how the filesystem lays out the data. | ||
+ | |||
+ | |||
===== drgXX nodes =====
+ | |||
+ | Scratch space on '' | ||
+ | |||
+ | Another cp test on a 85+% full target filesystem: | ||
+ | |||
  [amesfoort@drg23 data1]$ time (cp /
  
  real    9m56.566s
  user    0m0.799s
  sys     4m20.007s
+ | |||
+ | With a file size of 2 * 75931582464 bytes, that's a read/write speed of 242.8 MiB/s. | ||
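
For reference, the quoted figure follows from the byte count and the measured wall-clock time of 9m56.566s = 596.566 s:

  # 2 * 75931582464 bytes in 596.566 s, converted to MiB/s
  echo 'scale=4; 2 * 75931582464 / 596.566 / 1024 / 1024' | bc
  # prints about 242.77, i.e. ~242.8 MiB/s
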
===== dragproc node =====
+ | |||
+ | On '' | ||