Differences
This shows you the differences between two versions of the page.
dragnet:benchmarks_of_the_lotaas_pipelines [2016-08-06 10:56] – Sotiris Sanidas | dragnet:benchmarks_of_the_lotaas_pipelines [2017-03-08 15:27] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 62: | Line 62: | ||
- | ==== Data transfering | + | ==== Data transferring |
32-bit to 8-bit downsampling on CEP2 (per observation): | 32-bit to 8-bit downsampling on CEP2 (per observation): | ||
Line 71: | Line 71: | ||
==== Benchmarks for filterbank creation with psrfits2fil ==== | ==== Benchmarks for filterbank creation with psrfits2fil ==== | ||
- | A series of tests was run on dragnet (drg01), directly (now through slurm), on fits files from a random LOTAAS observation. The number of cores is the number of different fits files processed simultaneously. The total time is an extrapolation of this benchmark.\\ | ||
- | Only one test was run for each occasion. Repeating them showed differences in execution | + | psrfits2fil |
- | === Input/ | + | Using the same disk the following cases were tried: 1, |
- | 1-core:354sec\\ | + | for 2 disks: 1, |
- | Total time for 16 files:5664sec\\ | + | |
- | 3-cores:630sec\\ | + | {{dragnet:benchmarks:psrfits2fil1a.png? |
- | Total time for 16 files:3360sec\\ | + | |
- | 4-cores:720sec\\ | + | Using multithreading with 2 disks gives smooth, linear performance up to 24 cores; beyond that it becomes slightly worse, probably due to I/O. |
- | Total time for 16 files: | + | |
- | 5-cores: | + | Using the above results, I extrapolated the time needed with each work strategy in order to compute 32 filterbanks.\\ |
- | Total time for 16 files: | + | |
- | 8-cores:1735sec\\ | + | {{dragnet:benchmarks:psrfits2fil1b.png? |
- | Total time for 16 files:3360sec\\ | + | |
- | 16-cores: | + | When using the same disk, the fastest execution |
- | Total time for 16 files: | + | |
- | When writing the filterbank in the same disk with the fits file, the best performance is achieved | + | Using 2 disks, the performance is significantly better, and the best results are achieved |
- | === Input/ | + | ==== rfifind benchmarks ==== |
- | 1-core: | + | I ran the same tests twice. |
- | Total time for 16 files: | + | |
- | 4-cores:523sec\\ | + | I created rfi masks running rfifind in parallel for 4, |
- | Total time for 16 files: | + | The following plots show the number of parallel instances of rfifind executed (x-axis) and the time taken for these to complete (y-axis).\\ |
- | 8-cores:901sec\\ | + | {{dragnet:benchmarks: |
- | Total time for 16 files:1802sec\\ | + | {{dragnet: |
- | 16-cores: | + | In the following plots, I extrapolated the above results to find the optimal number of parallel jobs for computing 32 rfi masks |
- | Total time for 16 files: | + | |
- | 24-cores (hyperthreaded):2252sec scales perfectly!\\ | + | {{dragnet:benchmarks: |
- | 32-cores (hyperthreaded):3346sec performance loss already evident; | + | {{dragnet: |
- | Using all the cores in the dragnet nodes gives the best performance. Moreover, multithreading behaves exactly as having | + | From the above, we can conclude that using 1 or 2 disks does not make a big difference. Also, hyperthreading works smoothly, and indeed |
- | Moving a fil file from /data2 to /data1 takes 1 minute. | + | ==== Cartesius Benchmarks ==== |
+ | Processing 1 full pointing on Cartesius using either /dev/shm or HDDs | ||
+ | |||
+ | {{dragnet: |
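
The "Total time for 16 files" figures on the removed side of this diff appear to follow from scaling each measured batch time linearly to 16 files (for example, 354 s × 16 = 5664 s for 1 core and 630 s × 16/3 = 3360 s for 3 cores). The following is a minimal sketch of that arithmetic, not the script used for the page. The function name `extrapolate_total_time` is hypothetical, and the linear-scaling rule is an assumption inferred from those figures.

```python
def extrapolate_total_time(batch_seconds, n_parallel, n_files=16):
    """Scale the wall time of one batch of n_parallel files to n_files.

    Assumes linear scaling (no rounding up to whole batches), which is
    what the quoted totals on the page appear to use.
    """
    return batch_seconds * n_files / n_parallel


# Measured batch times (seconds) quoted in the removed benchmark lines,
# keyed by the number of fits files processed simultaneously.
measured = {1: 354, 3: 630, 4: 720}

for cores, secs in sorted(measured.items()):
    total = extrapolate_total_time(secs, cores)
    print(f"{cores} core(s): {secs} s per batch -> {total:.0f} s for 16 files")
```

Passing `n_files=32` gives the corresponding estimate for the 32-file extrapolations mentioned in the newer revision, under the same linear-scaling assumption.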