Differences
This shows you the differences between two versions of the page.
public:lta_howto [2013-06-21 15:22] – [Retrieving data] Adriaan Renting | public:lta_howto [2020-10-30 16:32] – [Staging Transient Buffer Board (TBB) data] Sander ter Veen | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Long Term Archive Howto ====== | ====== Long Term Archive Howto ====== | ||
- | This is a short manual on how to search for and retrieve data from the Long Term Archive. | + | This is a short manual on how to search for and retrieve data from the LOFAR Long Term Archive. |
- | ===== User Access ===== | + | To access the LTA, go to: [[https:// |
- | To access the LTA you need to have an account | + | For background information and in case of problems, please refer to the [[:public: |
- | - This automatically happens if you were a member | + | |
- | - Otherwise Science Support needs to add you to the project to which you need access. | + | |
- | - For public | + | |
- | If you were not originally a member of the project in MoM and Science Support adds you to it, you might get an email asking you to set a new password in [[https:// | + | ====== Release Notes ====== |
- | ===== Finding | + | ^Release^Description| |
+ | |July 2018|1) All data is now searchable, not only released data or data of projects you're a member of. Note that for downloading data (staging) the proprietary restrictions still apply. \\ 2) All projects can now be selected by all users, not only by project members. This provides the means to filter data based on project. \\ 3) Cone search algorithm implemented based on the Haversine formula for angular distance calculation. The calculated angular distance to the reference coordinates is now displayed in the search results. \\ 4) Search keys: added "time resolution", | ||
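For reference, the angular distance that a haversine-based cone search computes can be sketched as follows. This is an illustration only and not the LTA's actual implementation: the function name `angular_distance_deg` and the use of degrees for both input and output are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of the LTA): angular separation in degrees
# between two sky positions, computed with the haversine formula.
# Arguments: ra1 dec1 ra2 dec2, all in degrees.
angular_distance_deg() {
    awk -v ra1="$1" -v dec1="$2" -v ra2="$3" -v dec2="$4" 'BEGIN {
        pi  = atan2(0, -1)
        rad = pi / 180                      # degrees to radians
        s1  = sin((dec2 - dec1) * rad / 2)  # half the declination difference
        s2  = sin((ra2 - ra1) * rad / 2)    # half the right ascension difference
        h   = s1 * s1 + cos(dec1 * rad) * cos(dec2 * rad) * s2 * s2
        if (h > 1) h = 1                    # guard against rounding above 1
        d   = 2 * atan2(sqrt(h), sqrt(1 - h))
        printf "%.4f\n", d / rad            # back to degrees
    }'
}

# Two points on the celestial equator, a quarter circle apart.
angular_distance_deg 0 0 90 0   # prints 90.0000
```

A search result would fall inside the cone when this distance to the reference coordinates is no larger than the search radius.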
- | Once your account is set up, you can navigate to [[http:// | + | ====== User Access ====== |
- | Login into the website by clicking in ' | + | === I forgot my password === |
- | {{:public:lta_howto0.png|}} | + | Please visit [[https:// |
- | Currently you can only search the LTA catalogue per project. This means you need to select a project first by clicking on the ' | + | === Searching / retrieving data === |
- | {{:public:lta_howto1.png|}} | + | The LTA catalogue can be searched directly without needing any account. Access to all projects and search queries will return results of the entire catalogue because metadata are public |
- | Once you have selected your project, you can use either: | + | Staging and subsequent downloading of __**public**__ |
- | - The //Search// screen which allows you to search | + | |
- | - The //Show Latest// screen which shows you the most recently added data for this project. | + | |
- | The result | + | To stage and retrieve project-related data in the LTA which are __**proprietary**__ |
- | {{:public: | + | Please read the [[https://old.astron.nl/ |
+ | === Step-by-step guide to search and retrieve data === | ||
- | If you have a list of observations, | + | __Basic search__: |
- | ===== Retrieving | + | * log in to [[https:// |
+ | * click SEARCH DATA in the top menu | ||
+ | * specify the data product types of interest and a target name or coordinates | ||
+ | * click on " | ||
+ | * from the screen that follows, you should be able to stage the data products | ||
+ | |||
+ | __Advanced search__: | ||
+ | |||
+ | * log in to [[https:// | ||
+ | * click SEARCH DATA in the top menu | ||
+ | * click on the side Advanced Search drop-down list | ||
+ | * specify the data product types of interest from the drop down list | ||
+ | * select product features and specify a target name or coordinates | ||
+ | * click on " | ||
+ | * from the screen that follows, you should be able to stage the data products | ||
+ | |||
+ | __Project search (to restrict all data searches to that project only)__: | ||
+ | |||
+ | * log in to [[https:// | ||
+ | * click BROWSE PROJECTS in the top menu | ||
+ | * at this level membership can be checked, with the first column showing if you are a member of the project or for finding public projects. Available options are: | ||
+ | * click on the project name to view the project details | ||
+ | * use the ' | ||
+ | * use the 'show data' button to select the project and to show all data in it | ||
+ | * from the screen that follows, you should be able to either search / select / stage the data products | ||
+ | |||
+ | ====== How to find data in the archive ====== | ||
+ | |||
+ | Once your account is set up, or as an anonymous user, you can navigate the catalogue. In the former case you can log in by clicking the LOGIN button at the top right, shown below. | ||
+ | |||
+ | === Page navigation === | ||
+ | |||
+ | The LTA menu, as shown below, gives access to the main functionality. | ||
+ | |||
+ | {{: | ||
+ | |||
+ | A search in the LTA catalogue can be initiated by clicking on the SEARCH DATA button on the menu. At this point a default basic search is set up, where users can select the data product type of interest and perform a cone search. An advanced search mode, with additional parameters per data type, can also be selected by clicking on the drop-down menu on the left side. | ||
+ | |||
+ | A " | ||
+ | |||
+ | * Click on the project name to view the project details and, if desired, select it. | ||
+ | * Use the ' | ||
+ | * Use the 'show data' button to select the project and to show all data in it. | ||
+ | |||
+ | === Finding Data === | ||
+ | |||
+ | {{: | ||
+ | |||
+ | Depending on the search parameters, e.g., which data products were requested (observation, | ||
+ | |||
+ | - select observations/ | ||
+ | - select observations and "show pipelines" | ||
+ | - select observations/ | ||
+ | |||
+ | Note that observations often have no raw data in the archive, but the metadata is visible because subsequent pipelines have processed the raw data further. To get to the pipelines related to observations, | ||
+ | |||
+ | To see whether observations or pipelines have data products in the LTA, look for the " | ||
+ | |||
+ | Once you have a list of dataproducts on your screen, the " | ||
+ | |||
+ | There is a separate page with **[[: | ||
+ | |||
+ | ==== Unspecified Data/ | ||
+ | |||
+ | Some data has had problems somewhere in the automation and control part of the LOFAR software during observation or processing. Sometimes a few subbands might be affected, sometimes an entire observation. Science Data Centre Operations will check the data, (re)run things manually or fix things if needed and then archive the data. This does mean that the automation and control sometimes loses track of the files and the archiving process has no information beyond the Observation ID and filename itself. In such cases a few subbands or an entire observation might end up under " | ||
+ | |||
+ | If an Observation is missing, or is missing subbands, please check if it ended up under Unspecified. | ||
+ | |||
+ | ===== Staging | ||
Once you have a list of dataproducts, | Once you have a list of dataproducts, | ||
- | {{:public:lta_howto2.png|}} | + | {{:public:lta_staging_1.png?900}} |
+ | |||
+ | The LOFAR Archive stores data on magnetic tape. This means that it cannot be downloaded right away, but has to be copied from tape to disk first. This process is called ' | ||
+ | |||
+ | When you have made your selection of files, click on //stage//. This shows you the following message. It means that a request has been sent to the LTA staging service to start retrieving the requested files from the tape and make them available on disk. You will get a confirmation e-mail, to acknowledge that your staging request was received and the process was queued. When the files are staged, you will get a notification email informing you that your data are ready for retrieval. | ||
+ | |||
+ | {{: | ||
+ | |||
+ | The e-mail that you get when the staging on disk is complete gives you a list of files and has several attachments. Amongst them are two files '' | ||
+ | |||
+ | {{: | ||
+ | |||
+ | There are two different ways to download your files with these attachments: | ||
+ | |||
+ | We also attach plain lists of the files/SURLs that were scheduled for staging (in the confirmation mail), those that were successfully staged, and (if any) those that could not be staged (in the success / partial success notifications). | ||
+ | |||
+ | === Please take note of the following === | ||
+ | |||
+ | - Unless you have an extremely fast connection (10 Gbit/s or more), **it is in general advisable to stage no more than 5 TB at a time** | ||
+ | - On a 1 Gbit/s connection as a general rule of thumb, you should be able to retrieve data at about 100-500 GB/hour, especially if you try to retrieve 4-8 files concurrently. If you see speeds much lower than this, you might have some kind of network problem and should in general contact your IT staff. | ||
+ | - Staging the data from tape to disk might take quite a bit of time. In the large data centres that the LTA uses, the tape drives are shared with all users and requests are queued. These users include not only LOFAR users but also other large data projects like the LHC. This might mean that it takes anywhere from a few hours to a day or more to stage a copy of your data from tape to disk. | ||
+ | - The amount of space available for staging data is quite large but limited, and it is shared between all LOFAR LTA users. This includes LTA operations, which buffer data from CEP to the LTA before it gets moved to tape. If many users are staging data at the same time, and/or SDCO is transferring large amounts of data, the system might temporarily run low on disk space. You might then get a message that your request was only partially successful. In general the request will still finish 1-2 days later; we monitor requests and restart any that get stuck. | ||
+ | - We strive to keep a copy of data that was staged on disk for 1-2 weeks so you have some time to download it. After that it might get removed to make space for more recent requests. The copy of the data on tape is only read and will still be available if you need to access the data again at a later stage but you might need to stage a copy to disk again. | ||
+ | - We are continuously trying to improve the reliability and speed of the available services. Please contact SDCO if you have any problems or suggestions for improvement. | ||
+ | - The data centres the LTA uses also have maintenance or small outages sometimes. If you are having trouble accessing data, SDCO can advise you whether this is the case and when it is planned to end. In general this will not be on the same dates as the LOFAR stop days. | ||
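To get a feel for the numbers in the notes above, the download time for a staged request can be estimated from its size and the sustained transfer rate. The snippet below is a rough sketch under stated assumptions: the function name `download_hours` is made up for this example, sizes use decimal units (1 TB = 1000 GB), and the rate is assumed to stay constant for the whole transfer.

```shell
#!/bin/sh
# Hypothetical helper: estimated download time in hours.
# Arguments: request size in TB, sustained rate in GB/hour (must be non-zero).
download_hours() {
    awk -v tb="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", tb * 1000 / rate }'
}

# A 5 TB request at both ends of the 100-500 GB/hour rule of thumb.
download_hours 5 100   # prints 50.0
download_hours 5 500   # prints 10.0
```

Compare the result with the 1-2 weeks that staged copies are normally kept on disk before deciding how much to stage in one request.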
+ | |||
+ | ==== Staging Transient Buffer Board (TBB) data ==== | ||
+ | |||
+ | TBB data needs to be staged by hand. Please send a request at [[https:// | ||
+ | < | ||
+ | |||
+ | wget --no-check-certificate https:// | ||
+ | |||
+ | Note: the filename should start or be prepended with srm:////// | ||
- | When you have made your selection of files, you click on //stage//. This shows you the following message. It means that a request has been sent to the LTA staging service to start retrieving the requested files from tape storage and make them available. You will get an e-mail when this tape retrieval is complete. | + | </file> |
- | {{: | + | You will need a valid LTA account to access this data. If the downloaded file is very small, you can inspect its contents (e.g. with cat) for error messages and report them to the helpdesk. |
- | The e-mail | + | //==== = Download data ===== You can download your requested data with the files from your e-mail |
- | {{: | + | to retrieve a single file. You need '' |
- | There are two ways you can use this list to retrieve the files: http and srm | + | If you do experience insufficient transfer speeds with srmcp, you may want to look into using srmcp with a [[:public:srmclientinstallation# |
- | === Please take note of the following ==== | ||
- | - Unless you have an extremely fast connection (10 Gbit/s or more), it is in general advisable to stage no more than 10 TB at a time (see also point 4). At maximum efficiency a 1 Gbit/s connection will already take 24 hours to retrieve 10 TB of data, in practice it will often take quite a bit more. | + | ===== Troubleshooting ===== |
- | - On a 1 Gbit/s connection as a general rule of thumb, you should be able to retrieve data at about 100-500 GB/hour, especially if you try to retrieve 4-8 files c concurrently. If you see speeds much lower than this, you might have some kind of network problem and should in general contact your IT staff. | + | |
- | - Staging the data from tape to disk might take quite a bit of time, even a full day or more. In general in the large data centres that the LTA uses, the tape drives are shared with all users and requests are queued. This is not just users of LOFAR but other projects like the LHC. This is why data needs to be staged to disk. | + | |
- | - The amount of space available for staging data is limited although quite large. This space is however shared between all LOFAR LTA users, including LTA operations for buffering data from CEP to the LTA before it gets moved to tape. If many users are staging data at the same time, and/or LOFAR operations is transferring large amounts of data, the system might temporarily run low on disk space. You might then get a message that your request was only partially successful. In general the request will still finish 1-2 days later and we do monitor if requests don't get stuck and restart if needed. | + | |
- | - We strive to keep a copy of data that was staged on disk for 1-2 weeks so you have some time to download it. After that it might get removed to make space for more recent requests. The the copy of the data on tape is only read and will still be available if you need to access the data again at a later stage. | + | |
- | - We are continuously trying to improve the reliability and speed of the available services. Please contact Science Support if you have any problems or suggestions for improvement. | + | |
- | - The data centres the LTA uses also have maintenance or small outages sometimes. Science Support can advice you if this is the case and when it is planned to end, if you are having trouble accessing data. In general this will not be at the same dates as the LOFAR stop days. | + | |
- | ==== HTTP download | + | |
- | If you open '' | + | * There is a [[: |
- | For wget you can use the following command line: | + | \\ |
- | wget -i html.txt | + | |
- | This will download the files in '' | + | |
- | user=lofaruser | + | |
- | password=secret | + | |
- | Set access authorizations of the .wgetrc file to user only so that the credentials are not exposed to anybody else, e.g.: | + | |
- | chmod 600 .wgetrc | + | |
- | There is no easy way to have wget rename the files as part of the command directly. It does not accept the -O flag inside a file it gets with -i. You can either rename files afterward, or add the -O option to each line in html.txt but then feed each line to wget separately like this: cat '' | + | |
- | ==== SRM download ==== | + | |
- | If you open the file '' | ||
- | An example command line would be: | ||
- | srmcp -server_mode=passive -copyjobfile=srm.txt | ||
- | to retrieve all requested files contained in srm.txt or e.g. | ||
- | srmcp -server_mode=passive srm:// | ||
- | to retrieve a single file. You need '' | ||