Differences
This shows you the differences between two versions of the page.
public:lta_howto [2015-01-23 10:31] – [HTTP download] Joern Kuensemoeller | public:lta_howto [2025-01-17 11:00] (current) – [Change of account registration method] Hanno Holties | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Long Term Archive Howto ====== | ====== Long Term Archive Howto ====== | ||
- | This is a short manual on how to search for and retrieve data from the Long Term Archive. | + | This is a short manual on how to search for and retrieve data from the LOFAR Long Term Archive. |
+ | To access the LTA, go to: [[https:// | ||
+ | |||
+ | For background information and in case of problems, please refer to the [[: | ||
+ | |||
+ | ====== Release Notes ====== | ||
+ | |||
+ | ^Release^Description| | ||
+ | |July 2018|1) All data is now searchable, not only released data or data of projects you're a member of. Note that for downloading data (staging) the proprietary restrictions still apply. \\ 2) All projects can now be selected by all users, not only by project members. This provides the means to filter data based on project. \\ 3) Cone search algorithm implemented based on the Haversine formula for angular distance calculation. The calculated angular distance to the reference coordinates is now displayed in the search results. \\ 4) Search keys: added "time resolution", | ||
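The July 2018 release notes mention a cone search based on the Haversine formula for angular distance. The following is only a rough sketch of that formula, not the LTA's actual implementation; the function names and the choice of degrees for all inputs are assumptions made for illustration.

```python
import math


def angular_separation_deg(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Angular distance in degrees between two RA/Dec positions (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    # Haversine of the central angle between the two positions
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    # Clamp to [0, 1] to guard against floating-point rounding
    h = min(1.0, max(0.0, h))
    return math.degrees(2.0 * math.asin(math.sqrt(h)))


def in_cone(ra_deg, dec_deg, ref_ra_deg, ref_dec_deg, radius_deg):
    """True if the position lies within radius_deg of the reference position."""
    return angular_separation_deg(ra_deg, dec_deg, ref_ra_deg, ref_dec_deg) <= radius_deg
```

A cone search then amounts to keeping every catalogue entry for which `in_cone` is true for the chosen reference coordinates and radius.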
====== User Access ====== | ====== User Access ====== | ||
- | To access the LTA you need to have an account in [[https:// | + | ==== Change |
- | - This automatically happens if you were a member | + | |
- | - Otherwise Science Support needs to add you to the project to which you need access. | + | |
- | - For public data you can use an anonymous | + | |
- | If you were not originally a member | + | :!: We are in the process |
+ | === I forgot my password === | ||
- | ====== How to retrieve data from the archive ====== | + | Please visit [[https:// |
- | Once you have a MoM account, there is a three step procedure to get your data: | + | === Searching / retrieving |
- | - [[#Finding data|Find data you want to download]] | + | |
- | - [[#Staging data (Prepare for download)|Stage data (Prepare for download)]] | + | |
- | - [[#Download data|Download data]] | + | |
- | These steps are explained in detail below. | + | The LTA catalogue can be searched directly without needing any account. Access to all projects and search queries will return results of the entire catalogue because metadata |
- | ===== Finding | + | Staging and subsequent downloading of __**public**__ |
- | Once your account is set up, you can navigate | + | To stage and retrieve project-related data in the LTA which are __**proprietary**__ |
- | Login into the website | + | Please read the [[https:// |
+ | === Step-by-step guide to search and retrieve data === | ||
- | {{:public: | + | __Basic search__: |
- | Currently you can only search the LTA catalogue per project. This means you need to select | + | * log in to [[https:// |
+ | * click SEARCH DATA in the top menu | ||
+ | * specify the data product types of interest and a target name or coordinates | ||
+ | * click on " | ||
+ | * from the screen that follows, | ||
- | {{:public: | + | __Advanced search__: |
- | Once you have selected your project, you can use either: | + | * log in to [[https://lta.lofar.eu/|https://lta.lofar.eu/]] |
- | - The //Search// screen which allows you to search by RA/Dec, ObservationId, | + | |
- | - The //Show Latest// | + | * click on the side Advanced Search drop-down list |
+ | * specify the data product types of interest from the drop down list | ||
+ | * select product features and specify a target name or coordinates | ||
+ | * click on " | ||
+ | * from the screen | ||
- | The result of either query will be a list of data products or observations similar | + | __Project search (to restrict all data searches |
- | {{:public: | + | * log in to [[https://lta.lofar.eu/|https:// |
+ | * click BROWSE PROJECTS in the top menu | ||
+ | * at this level you can check membership: the first column shows whether you are a member of a project, which also helps in finding public projects. Available options are: | ||
+ | * click on the project name to view the project details | ||
+ | * use the ' | ||
+ | * use the 'show data' button to select the project and to show all data in it | ||
+ | * from the screen that follows, you should be able to either search / select / stage the data products | ||
- | If you have a list of observations, | ||
+ | ====== How to find data in the archive ====== | ||
- | If you hover with your mouse over the DataProductsIdentifier in the detailed DataProducts view, you can get more information, | + | Once your account is set up, or as an anonymous user, you can navigate | ||
- | {{: | + | === Page navigation === |
+ | The LTA menu, as shown below, gives access to the main functionality. | ||
- | [[lta_tricks|There is a separate page with more detailed information and advanced tricks to help find and download your data]] | + | {{: |
+ | A search in the LTA catalogue can be initiated by clicking on the SEARCH DATA button on the menu. At this point a default basic search is set up, where users can select the data product type of interest and perform a cone search. An advanced search mode, with more advanced parameters per data type, can also be selected by clicking on the drop-down menu on the left side. | ||
+ | |||
+ | A " | ||
+ | |||
+ | * Click on the project name to view the project details and, if desired, select it. | ||
+ | * Use the ' | ||
+ | * Use the 'show data' button to select the project and to show all data in it. | ||
+ | |||
+ | === Finding Data === | ||
+ | |||
+ | {{: | ||
+ | |||
+ | Depending on the search parameters, e.g., which data products were requested (observation, | ||
+ | |||
+ | - select observations/ | ||
+ | - select observations and "show pipelines" | ||
+ | - select observations/ | ||
+ | |||
+ | Note that observations often have no raw data in the archive, but the metadata is visible because subsequent pipelines have processed the raw data further. To get to the pipelines related to observations, | ||
+ | |||
+ | To see whether observations or pipelines have data products in the LTA, look for the " | ||
+ | |||
+ | Once you have a list of dataproducts on your screen, the " | ||
+ | |||
+ | There is a separate page with **[[: | ||
==== Unspecified Data/ | ==== Unspecified Data/ | ||
- | Some data has had problems somewhere in the automation and control part of the LOFAR software during observation or processing. Sometimes a few subbands might be affected, sometimes an entire observation. Science | + | Some data has had problems somewhere in the automation and control part of the LOFAR software during observation or processing. Sometimes a few subbands might be affected, sometimes an entire observation. Science |
If an Observation is missing, or is missing subbands, please check if it ended up under Unspecified. | If an Observation is missing, or is missing subbands, please check if it ended up under Unspecified. | ||
Line 64: | Line 108: | ||
Once you have a list of dataproducts, | Once you have a list of dataproducts, | ||
- | {{:public:lta_howto2.png|}} | + | {{:public:lta_staging_1.png?900}} |
- | The LOFAR Archive stores data on magnetic tape. This means, that it cannot be downloaded right away, but has to be copied from tape to disk first. This process is called ' | + | The LOFAR Archive stores data on magnetic tape. This means that it cannot be downloaded right away, but has to be copied from tape to disk first. This process is called ' |
- | When you have made your selection of files, | + | When you have made your selection of files, click on //stage//. This shows you the following message. It means that a request has been sent to the LTA staging service to start retrieving the requested files from the tape and make them available |
- | {{:public:lta_howto3.png|}} | + | {{:public:lta_howto_3.png?900}} |
- | The e-mail that you get when the tape retrieval | + | The e-mail that you get when the staging on disk is complete gives you a list of files and has several |
- | {{: | + | {{: |
- | There are two ways you can use this list to retrieve the files: [[#HTTP Download|http]] and [[#SRM Download|srm]] | + | There are two different |
- | === Please take note of the following ==== | + | We also attach plain lists of the files/SURLs that were scheduled for staging (in the confirmation mail), those that were successfully staged, and (if any) those that could not be staged (in the success / partial success notifications). |
- | | + | === Please take note of the following === |
+ | |||
+ | | ||
- On a 1 Gbit/s connection as a general rule of thumb, you should be able to retrieve data at about 100-500 GB/hour, especially if you try to retrieve 4-8 files concurrently. If you see speeds much lower than this, you might have some kind of network problem and should in general contact your IT staff. | - On a 1 Gbit/s connection as a general rule of thumb, you should be able to retrieve data at about 100-500 GB/hour, especially if you try to retrieve 4-8 files concurrently. If you see speeds much lower than this, you might have some kind of network problem and should in general contact your IT staff. | ||
 - Staging the data from tape to disk might take quite a bit of time. In the large data centres that the LTA uses, the tape drives are shared with all users and requests are queued. This includes not just LOFAR users but also other large data projects such as the LHC. This might mean that it takes anywhere from a few hours to a day or more to stage a copy of your data from tape to disk. | - Staging the data from tape to disk might take quite a bit of time. In the large data centres that the LTA uses, the tape drives are shared with all users and requests are queued. This includes not just LOFAR users but also other large data projects such as the LHC. This might mean that it takes anywhere from a few hours to a day or more to stage a copy of your data from tape to disk. | ||
- | - The amount of space available for staging data is limited although quite large. This space is however shared between all LOFAR LTA users. This includes LTA operations for buffering data from CEP to the LTA before it gets moved to tape. If many users are staging data at the same time, and/ | + | - The amount of space available for staging data is limited although quite large. This space is however shared between all LOFAR LTA users. This includes LTA operations for buffering data from CEP to the LTA before it gets moved to tape. If many users are staging data at the same time, and/ |
- | - We strive to keep a copy of data that was staged on disk for 1-2 weeks so you have some time to download it. After that it might get removed to make space for more recent requests. The the copy of the data on tape is only read and will still be available if you need to access the data again at a later stage but you might need to stage a copy to disk again. | + | - We strive to keep a copy of data that was staged on disk for 1-2 weeks so you have some time to download it. After that it might get removed to make space for more recent requests. The copy of the data on tape is only read and will still be available if you need to access the data again at a later stage but you might need to stage a copy to disk again. |
- | - We are continuously trying to improve the reliability and speed of the available services. Please contact | + | - We are continuously trying to improve the reliability and speed of the available services. Please contact |
- | - The data centres the LTA uses also have maintenance or small outages sometimes. | + | - The data centres the LTA uses also have maintenance or small outages sometimes. |
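As a back-of-the-envelope check on the rule of thumb in the notes above: a 1 Gbit/s link carries at most 125 MB/s, which is roughly 450 GB/hour before any protocol overhead. The small sketch below (function names are illustrative, decimal gigabytes are assumed) can be used to estimate how long a retrieval is likely to take at a given rate.

```python
def max_gb_per_hour(link_gbit_per_s):
    """Theoretical upper bound in GB/hour for a link speed given in Gbit/s.

    Uses decimal units (1 GB = 8 Gbit) and ignores protocol overhead.
    """
    gb_per_second = link_gbit_per_s / 8.0
    return gb_per_second * 3600.0


def hours_to_download(total_gb, rate_gb_per_hour):
    """Estimated duration in hours to transfer total_gb at the given rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return total_gb / rate_gb_per_hour
```

For example, `hours_to_download(1000, 100)` gives 10 hours for 1 TB at the lower end of the quoted range.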
+ | |||
+ | ==== Staging Transient Buffer Board (TBB) data ==== | ||
+ | |||
+ | TBB data needs to be staged by hand. Please send a request at [[https:// | ||
+ | < | ||
+ | |||
+ | wget --no-check-certificate https:// | ||
+ | |||
+ | The filename should start or be prepended with srm:// | ||
+ | |||
+ | </ | ||
+ | |||
+ | You will need a valid LTA account to access this data. If the filename is very short, you can inspect it (e.g. with cat) to see any errors that have occurred. | ||
===== Download data ===== | ===== Download data ===== | ||
- | You can download your requested data with the files from your e-mail notification. | + | You can download your requested data with the files from your e-mail notification. There are different possibilities and tools to do this. If you're unsure which one to use, please refer to the corresponding [[: | ||
==== HTTP download ==== | ==== HTTP download ==== | ||
- | If you open '' | + | If you open '' |
For wget you can use the following command line: | For wget you can use the following command line: | ||
- | | + | < |
- | This will download the files in '' | + | |
+ | wget -i html.txt | ||
+ | |||
+ | </ | ||
+ | |||
+ | This will download the files in '' | ||
Preferably, | Preferably, | ||
- | | + | |
+ | < | ||
+ | wget -ci html.txt | ||
+ | |||
+ | </ | ||
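The notes earlier on this page suggest that retrieving 4-8 files concurrently can help throughput. One possible way to do that with wget, sketched here under assumptions, is to split `html.txt` into several smaller list files and run one `wget -ci` per list. The helper names and the `html_part_` file prefix are hypothetical and not part of any LTA tooling.

```python
def split_round_robin(urls, parts):
    """Distribute urls over at most `parts` lists, round-robin; drop empty lists."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    chunks = [[] for _ in range(parts)]
    for index, url in enumerate(urls):
        chunks[index % parts].append(url)
    return [chunk for chunk in chunks if chunk]


def write_chunk_files(list_file, parts, prefix="html_part_"):
    """Read a URL list file and write one smaller list file per chunk.

    Returns the names of the files written.
    """
    with open(list_file) as handle:
        urls = [line.strip() for line in handle if line.strip()]
    names = []
    for number, chunk in enumerate(split_round_robin(urls, parts)):
        name = "{}{}.txt".format(prefix, number)
        with open(name, "w") as out:
            out.write("\n".join(chunk) + "\n")
        names.append(name)
    return names
```

Each resulting file can then be passed to its own wget process, for instance in separate terminals.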
Do not set the username and password on the wget command line because this allows other users on the system to view them in the process list. Instead you should create a file ~/.wgetrc with two lines according to the following example: | Do not set the username and password on the wget command line because this allows other users on the system to view them in the process list. Instead you should create a file ~/.wgetrc with two lines according to the following example: | ||
- | user=lofaruser | ||
- | password=secret | ||
- | **Note:** This is only an example, you have to edit the file and enter your own personal user name and password! | + | < |
- | + | user=lofaruser | |
+ | password=secret | ||
+ | |||
+ | </ | ||
+ | |||
+ | Note: This is only an example, you have to edit the file and enter your own personal user name and password! | ||
Set the permissions of the .wgetrc file so that only your own user can access it and the credentials are not exposed to anybody else, e.g.: | Set the permissions of the .wgetrc file so that only your own user can access it and the credentials are not exposed to anybody else, e.g.: | ||
- | | + | |
+ | < | ||
+ | chmod 600 .wgetrc | ||
+ | |||
+ | </ | ||
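If you prefer to create the credentials file from a script, the following sketch writes the two lines and restricts the file to the owner in one go, which is the same effect as the chmod 600 shown above. The function name is illustrative, and the path, user name and password are passed in by the caller rather than assumed.

```python
import os


def write_wgetrc(path, user, password):
    """Write a wgetrc-style credentials file accessible by the owner only."""
    # Create the file with mode 0600 from the start so it is never world-readable
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write("user={}\n".format(user))
        handle.write("password={}\n".format(password))
    # Also fix the mode explicitly in case the file already existed
    os.chmod(path, 0o600)
```

Calling it with `os.path.expanduser("~/.wgetrc")` as the path would overwrite an existing file, so check first whether you already have one.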
There is no easy way to have wget rename the files as part of the command directly. It does not accept the -O flag inside a file it gets with -i. You can either rename files afterward, e.g. using the following command: | There is no easy way to have wget rename the files as part of the command directly. It does not accept the -O flag inside a file it gets with -i. You can either rename files afterward, e.g. using the following command: | ||
- | | + | |
+ | < | ||
+ | find . -name " | ||
+ | |||
+ | </ | ||
or add the -O option to each line in html.txt but then feed each line to wget separately like this: cat '' | or add the -O option to each line in html.txt but then feed each line to wget separately like this: cat '' | ||
The following Python script will take care of renaming and untarring the downloaded files: | The following Python script will take care of renaming and untarring the downloaded files: | ||
- | |||
< | < | ||
+ | |||
#M.C. Toribio | #M.C. Toribio | ||
# | # | ||
Line 144: | Line 226: | ||
print outname+' | print outname+' | ||
+ | |||
+ | </ | ||
+ | |||
+ | Another Python script renames the downloaded (and previously untarred) files by removing the random part of the filename before the .tar extension: | ||
+ | |||
+ | < | ||
+ | import os | ||
+ | import sys | ||
+ | import glob | ||
+ | |||
+ | # AUTHOR: J.B.R. OONK (ASTRON/ | ||
+ | # - changes LTA retrieval filename to standard filename | ||
+ | # - run in the directory where LTA files are located | ||
+ | |||
+ | # FILE DIRECTORY | ||
+ | path = " | ||
+ | |||
+ | filelist = glob.glob(path+' | ||
+ | print ' | ||
+ | |||
+ | #FILE STRING SEPARATORS | ||
+ | sp1d=' | ||
+ | sp2d=' | ||
+ | extn=' | ||
+ | extt=' | ||
+ | |||
+ | #LOOP | ||
+ | print '##### | ||
+ | for infile_orig in filelist: | ||
+ | |||
+ | #GET FILE | ||
+ | infiletar | ||
+ | infile | ||
+ | print 'doing file: ', infile | ||
+ | |||
+ | spl1=infile.split(sp1d)[11] | ||
+ | spl2=spl1.split(sp2d)[1] | ||
+ | spl3=spl2.split(extn)[0] | ||
+ | newname = spl3+extn+extt | ||
+ | |||
+ | # SPECIFY FILE MV COMMAND | ||
+ | command=' | ||
+ | print command | ||
+ | |||
+ | # CARRY OUT FILENAME CHANGE !!! | ||
+ | # - COMMENT FOR TESTING OUTPUT | ||
+ | # - UNCOMMENT TO PERFORM FILE MV COMMAND | ||
+ | # | ||
+ | |||
+ | print ' | ||
+ | |||
</ | </ | ||
Note that wget does not overwrite existing files. If you use the continue option (' | Note that wget does not overwrite existing files. If you use the continue option (' | ||
- | There are some small example links if you browse to [[https:// | + | There are some small example links if you browse to [[https:// |
==== SRM download ==== | ==== SRM download ==== | ||
- | If you open the file '' | + | If you open the file '' |
- | An example command line would be: | + | < |
- | srmcp -server_mode=passive -copyjobfile=srm.txt | + | |
+ | srmcp -server_mode=passive -copyjobfile=srm.txt | ||
+ | |||
+ | </ | ||
to retrieve all requested files contained in srm.txt or e.g. | to retrieve all requested files contained in srm.txt or e.g. | ||
- | | + | |
- | to retrieve a single file. You need '' | + | < |
+ | srmcp -server_mode=passive srm:// | ||
+ | |||
+ | </ | ||
+ | |||
+ | to retrieve a single file. You need '' | ||
+ | |||
+ | If you do experience insufficient transfer speeds with srmcp, you may want to look into using srmcp with a [[: | ||
===== Troubleshooting ===== | ===== Troubleshooting ===== | ||
- | * If you download files with http/wget and then have trouble extracting the data from the tar file, check if the files are much smaller than you expect. Something might have gone wrong with the transfer. One thing you can do to check, | + | * There is a [[: |
- | * We have seen the error "All Ready slots are taken and Ready Thread Queue is full", which means the system is overloaded and you should try again in a few hours. | + | |
- | * If the downloads time out even after you have properly staged the data, you can check if the servers at SARA or Jülich are down: http:// | + |
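The advice above about checking whether downloaded files are much smaller than expected can be automated. The sketch below is one possible check, not an official tool: it flags files below a size threshold (the 1024-byte default is an arbitrary assumption) and files that Python's `tarfile` module does not recognise as a tar archive.

```python
import os
import tarfile


def check_download(path, min_bytes=1024):
    """Return a short status string for a downloaded file.

    min_bytes is an arbitrary threshold below which the file is assumed
    to hold an error message rather than data.
    """
    size = os.path.getsize(path)
    if size < min_bytes:
        return "suspiciously small ({} bytes)".format(size)
    if not tarfile.is_tarfile(path):
        return "not a readable tar archive"
    return "ok"
```

A file reported as suspiciously small is worth opening in a text viewer, since it may contain an error message from the server.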