dragnet:cluster_usage: revisions 2016-08-31 17:01 (amesfoort) and 2019-01-07 15:06 (current, Reinoud Bokhorst)
===== Access and Login =====
To get an account, get permission from the Dragnet PI: Jason Hessels (''…'').
Easiest is to ask him to send his permission …
You can also provide RO Admin your (e.g. home) IP(s) to add to a LOFAR portal …

Having an account, ssh to hostname ''dragnet'':
$ ssh USERNAME@dragnet
$ cat .ssh/…
$ chmod 600 .ssh/…
(For completeness: …)
To make login between nodes more reliable, you can disable the ssh host identification verification within DRAGNET.
StrictHostKeyChecking no
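Rather than disabling host-key checking globally, you can scope the setting to the DRAGNET nodes in ''~/.ssh/config''; a minimal sketch (the ''Host'' patterns and the ''UserKnownHostsFile'' line are assumptions, adjust them to your setup):

```
# ~/.ssh/config -- disable host-key verification only for internal DRAGNET nodes.
# The Host patterns below are assumptions; adjust to your environment.
Host drg* dragproc dragnet
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null   # optional: do not record the internal host keys
```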
Now test if password-less login works by logging in and out to ''drg23'':
ssh drg23 exit
===== Finding Applications =====
Re-login (or enter the ''…'' …)
If you want to keep using the same tool version instead of automatically upgrading when updates are installed, then specify the versioned module name (when available), e.g. ''aoflagger/2.9.0''.
Type ''…''
List of available modules (July 2017):
$ module avail
--------------------------------------------------------------------- /…
dot …
---------------------------------------------------------------------------- /…
aoflagger/…
aoflagger/2.9.0
aoflagger/…
casa/…
Add latest lofar module to your env:
To run the prefactor and factor imaging pipelines, you may want to use only the following command (do not add ''…''):
$ module add local-user-tools wsclean/2.4 aoflagger/…
If you log in and want to use CASA instead, better run ''/…''
prepend-path PYTHONPATH /…
-------------------------------------------------------------------
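The ''prepend-path'' line above comes from a module file, which is a small Tcl fragment read by the Environment Modules system; a minimal sketch, where the tool name and paths are placeholders and not the actual DRAGNET module files:

```
#%Module1.0
## Hypothetical module file; the tool name and paths below are placeholders.
prepend-path PATH       /opt/mytool/bin
prepend-path PYTHONPATH /opt/mytool/lib/python2.7/site-packages
```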
- | |||
- | |||
- | ===== Copying Staged Data into DRAGNET ===== | ||
- | |||
- | To copy data sets from outside the LOFAR network (e.g. staged archive data) into DRAGNET, there is unfortunately only the login 1 Gbit/s link across the LOFAR portal available. (Atm, there is no 10G line available for this; the computing and network infra were designed with another usage pattern in mind. This may be solved in the future.) | ||
- | Since the portal is used by all users to login, it is important not to overload it. Load is monitored and too hungry copying processes may be killed if they harm other users. | ||
- | |||
- | So please rate-limit your download from outside into DRAGNET! A reasonable chunk of 1 Gbit/s is 400 Mbit/s (= 50 MByte/s), such that if somebody else does this too, there is still a bit of bandwidth for dozens of login sessions from other users. (Yes, this is hardly a foolproof strategy.) Please use: | ||
- | $ scp -l 400000 ... # value in kbit/s | ||
- | or | ||
- | $ wget --limit-rate=50m ... # value in MByte/s | ||
- | Rate-limited copying may take longer, but if the 1 Gbit/s portal link fills up, other users have problems working. A member of the DRAGNET team in Dwingeloo gets a visit from a sysadmin to call (or directly terminate the programs of) whatever DRAGNET user is causing it. | ||
- | |||
- | For those interested, you can use '' | ||
- | |||
- | |||
- | ===== Hostname Hell and Routing Rampage ===== | ||
- | If you are just running some computations on DRAGNET, skip this section. But if you need fast networking, or are already deep in the slow data transfers and rapid-fire connection errors, here is some info that may save you time wrt the multiple networks and network interfaces. (Or just tell us your needs.) | ||
- | |||
- | === Hostnames === | ||
- | * dragnet(.control.lofar) | ||
- | * dragproc(.control.lofar) | ||
- | * drg01(.control.lofar) - drg23(.control.lofar) | ||
- | |||
- | === Networks === | ||
- | Control/ | ||
- | 10G network: | ||
- | Infiniband network (IPoIB): NODENAME-ib.dragnet.infiniband.lofar (56 Gb) (all drgXX nodes) | ||
- | (There is also a 1 Gb IPMI network.) | ||
- | |||
- | ==== Cross-Cluster ==== | ||
- | When writing scripts that (also) have to work cross-cluster, | ||
- | |||
- | In most cases, you will use the network as deduced from the destination hostname or IP. Indicate a 10G name to use the 10G network. Idem for infiniband (IPoIB). (Exception: CEP 2, see below.) | ||
- | |||
- | //Note//: Copying large data sets at high bandwidth to/from other clusters (in particular CEP 2) may interfere with running observations as long as CEP 2 is still in use. If you are unsure, ask us. It is ok to use potentially oversubscribed links heavily, but please coordinate with Science Support! | ||
- | |||
- | |||
- | === CEP 2 === | ||
- | Initiate connections for e.g. data transfers from CEP 2 to HOSTNAME-10g.online.lofar and you will go via 10G. | ||
- | |||
- | The reverse, connecting from DRAGNET to CEP 2, by default will connect you via DRAGNET 1G (e.g. for login). To use 10G (e.g. to copy datasets), you need to bind to the local 10G interface name or IP. The program you are using has to support this via e.g. a command-line argument. | ||
Line 173: | Line 133: | ||
==== Shell Loop and SSH ====
Examples:
$ for host in $(seq -f drg%02g 1 10); do ssh $host "hostname && …"; done
$ for host in drg01 drg17; do ssh $host "df -h"; done
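The ''seq -f'' pattern above can be sanity-checked locally without contacting any node; this sketch only prints the generated hostnames:

```shell
# Print the zero-padded hostnames drg01..drg23 that the loops above iterate over.
for host in $(seq -f 'drg%02g' 1 23); do
  echo "$host"
done
```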
Be careful with complex commands!
===== Data Copying =====
Generic data copying info, plus cluster-specific subsections.

To copy large data sets between nodes or into / out of DRAGNET, you can use ''scp'':
$ scp -B -c arcfour <…>

The ''bbcp'' program can also be used, e.g.:
$ bbcp -A -e -s 4 -B 4M -r -g -@ follow -v -y dd -- drg23-10g.online.lofar:/…

Notes:
* OpenSSH-6.7 no longer allows the ''arcfour'' cipher by default.
* The ''…''
* For ''…''

==== Hostname Hell and Routing Rampage ====
If you are just running some computations on DRAGNET, skip this section. But if you need fast networking, or are already deep in slow data transfers and rapid-fire connection errors, here is some info that may save you time with regard to the multiple networks and network interfaces. (Or just tell us your needs.)

=== Hostnames ===
Control network:
* dragnet(.control.lofar)
* dragproc(.control.lofar)
* drg01(.control.lofar) - drg23(.control.lofar)

10G network:
* dragproc-10g(.online.lofar)
* drg01-10g(.online.lofar) - drg23-10g(.online.lofar)

Infiniband network (~54G):
* drg01-ib(.dragnet.infiniband.lofar) - drg23-ib(.dragnet.infiniband.lofar)

(There is also a 1 Gb IPMI network.)

Note that for copying files between hard disks, there is some benefit to using the 10G network. If you have data to copy on ''/…''

==== Cross-Cluster ====
When writing scripts that (also) have to work cross-cluster, …

In most cases, you will use the network as deduced from the destination hostname or IP. Indicate a 10G name to use the 10G network. Likewise for infiniband. (Exception: CEP 2, see below.)

//Note//: Copying large data sets at high bandwidth to/from other clusters (in particular CEP 2) may interfere with running observations as long as CEP 2 is still in use. If you are unsure, ask us. It is ok to use potentially oversubscribed links heavily, but please coordinate with Science Operations and Support!

=== CEP 2 ===
Initiate connections for e.g. data transfers from CEP 2 to HOSTNAME-10g.online.lofar to transfer via 10G.

The reverse, connecting from DRAGNET to CEP 2, by default connects you via DRAGNET 1G (e.g. for login). To use 10G (e.g. to copy data sets), you need to bind to the local 10G interface name or IP. The program you are using has to support this via e.g. a command-line argument.

=== CEP 3 ===
Use the ''…''

=== CEP 4 ===
CEP 4 has a Lustre global file system. Copying data to DRAGNET is supposed to happen via ''…''

A Lustre mount has also been set up on DRAGNET, but the storage name is not mounted by default.

=== External Hosts (also LTA Staged Data) ===

To copy data sets from outside the LOFAR network (e.g. staged long-term archive data) into DRAGNET, there is unfortunately only 1 Gbit/s available, which is shared with other LOFAR users. A 10G link may become available in the future.

There are 3 cases to distinguish:

== 1. Access external hosts (but not lofar-webdav.grid.sara.nl) from the LOFAR network ==
This all uses the LOFAR portal / public internet link (1 Gbit/s). Since the LOFAR portal is used by all users to log in, it is important not to overload it. Load is monitored, and copy processes that are too hungry may be killed if they harm other users.

So please rate-limit your downloads from outside into DRAGNET and CEPx! A reasonable chunk of 1 Gbit/s is 400 Mbit/s (= 50 MByte/s), such that if somebody else does this too, there is still a bit of bandwidth left for dozens of login sessions from other users. (Yes, this is hardly a foolproof strategy.) Please use:
$ scp -l 400000 ...           # value in kbit/s
or
$ wget --limit-rate=50m ...   # value in MByte/s
or
$ curl --limit-rate=50m ...   # value in MByte/s
or
$ rsync --bwlimit=51200 ...   # value in kByte/s
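The limit values above all encode the same 400 Mbit/s budget in each tool's own unit; a quick arithmetic check (variable names are illustrative):

```shell
# 400 Mbit/s expressed in the units the tools above expect.
rate_mbit=400
scp_l=$((rate_mbit * 1000))          # scp -l takes kbit/s    -> 400000
wget_rate=$((rate_mbit / 8))         # wget/curl take MByte/s -> 50
rsync_bw=$((rate_mbit / 8 * 1024))   # rsync takes kByte/s    -> 51200
echo "$scp_l $wget_rate $rsync_bw"
```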
+ | |||
+ | For those interested, you can use '' | ||
+ | |||
== 2. Download via http(s) from lofar-webdav.grid.sara.nl to the LOFAR network ==

An http(s) ''…''

http_proxy=lexar004.control.lofar:…

Put that username and password in a ''~/.wgetrc'' file:
proxy-user=yourusername
proxy-password=yourpassword
then keep it reasonably private by making that file only accessible to you:
chmod 600 ~/.wgetrc

If you use this only for lofar-webdav.grid.sara.nl, …

//Note:// This also works for other http(s) destinations than SurfSara servers; however, then you need to rate-limit your http(s) traffic as described above under **1.**. Do **not** use this for other LTA sites than SurfSara, as at the moment this interferes with data streams from some international stations!

== 3. Between ASTRON internal 10.xx.xx.xx nodes and the LOFAR network ==
Specifically for ASTRON hosts with an internal ''…''
===== SLURM Job Submission =====
From any DRAGNET node (typically the ''dragnet'' head node), …
Use ''srun'' for interactive jobs, e.g.:
$ srun --nodes=5 --nodelist=drg01,…
dir1 dir2 file1 file2 [...]
Use ''sbatch'' to submit a batch script, e.g.:
$ sbatch --mail-type=END,…
Submitted batch job <…>
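A minimal batch-script sketch for ''sbatch''; the job name, output pattern, and the final command are placeholders, not DRAGNET specifics:

```
#!/bin/bash
#SBATCH --job-name=myjob        # placeholder job name
#SBATCH --nodes=1
#SBATCH --mail-type=END,FAIL    # mail on job end/failure, as in the example above
#SBATCH --output=myjob_%j.log   # %j expands to the job id

srun hostname                   # replace with your actual command
```

Submit it with ''sbatch myjob.sh''; SLURM then prints the job id as shown above.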
\\
Show list and state of nodes. When submitting a job, you can indicate one of the partitions listed or a (not necessarily large enough) set of nodes that must be used. Please hesitate indefinitely when trying to submit insane loads to the ''…''
$ sinfo --all
PARTITION AVAIL TIMELIMIT …
workers*  …
head      …
lofarobs  …
If you get an error on job submission that there are no resources in the cluster to ever satisfy your job, and you know this is wrong (no typo), you can see with the ''…''
More detail:
$ sinfo -o "%10N %8z %8m %40f %10G %C"
NODELIST   …
dragnet,dr 1+:…
drg[01-23] 2:8:1 128500 …
where in the last column A = Allocated, I = Idle, O = Other, T = Total
==== Hints on using more SLURM capabilities ====
* either number of nodes or CPUs
* number of GPUs, if any are needed. If no GPUs are requested, any GPU program will fail. (Btw, this policy is not fully as intended, so if technically it can be improved, we can look into it.)
* memory, if you want to run more than 1 job on a node at the same time (in general; no longer needed on DRAGNET or CEP4). Just reserve per job: 128500 / NJOBS_PER_NODE. By default, SLURM reserves all the memory of a node, preventing other jobs from running on the same node(s). This may or may not be the intention. (If it is, better use %%--%%exclusive.)
Note that a ''…''
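For example, to let five jobs share one 128500 MB node, each job would reserve 128500 / 5 = 25700 MB; the flag usage in the final comment is a sketch that is only meaningful on the cluster:

```shell
# Memory to reserve per job when running 5 jobs per 128500 MB node.
njobs_per_node=5
mem_per_job=$((128500 / njobs_per_node))
echo "$mem_per_job"   # 25700
# On the cluster one would then pass e.g.: sbatch --mem=$mem_per_job ...
```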
To indicate a scheduling resource constraint on 2 GPUs, use the %%--%%gres option (//gres// stands for //generic resource//):
$ srun --gres=gpu:2 …
To indicate a list of nodes that must be used (the list may be smaller than the number of nodes requested), some examples:
$ srun --nodelist=drg23 ls
$ srun --nodelist=drg05-drg07,…
$ srun --nodelist=./…
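The last form reads node names from a file; a sketch of creating such a file (the filename ''./mynodes'' is arbitrary):

```shell
# Write a node-list file that e.g. srun --nodelist=./mynodes could consume.
printf '%s\n' drg05 drg06 drg07 > ./mynodes
cat ./mynodes
```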
Bring a fixed node back into its partition from state DOWN to state IDLE (logged in as slurm):
$ scontrol update NodeName=drg23 state=idle
Users can resume their (list of) job(s) after SLURM found it/they cannot be run (network errors or so) and sets the status to something like ''…''
This can also be executed …
$ scontrol resume 100
$ scontrol resume [1000,2000]
==== SLURM Troubleshooting ====
== Nodes in "drain" state ==

If you expect that there should be enough resources but SLURM submission fails, some nodes could be in "drain" state. ''sinfo'' will then show something like this, where nodes drg06 and drg08 are in drain state:

$ sinfo
PARTITION AVAIL TIMELIMIT …
workers*  …
workers*  …
workers*  …
head      …
+ | |||
+ | To " | ||
+ | $ scontrol update NodeName=drg08 State=DOWN Reason=" | ||
+ | $ scontrol update NodeName=drg08 State=RESUME | ||