public:lta_faq

Differences

This shows you the differences between two versions of the page.

public:lta_faq [2020-10-18 19:44] Marco Iacobelli
public:lta_faq [2023-02-17 10:48] (current) Hanno Holties
Line 1: Line 1:
 ===== Long Term Archive Frequently Asked Questions ===== ===== Long Term Archive Frequently Asked Questions =====
  
-This page is targeted at users who want to learn more on the LOFAR LTA (Long Term Archive). It should answer the most common questions and provide help in case of difficulties in data retrieval. If the information on this page did not succeed to solve any issues you may encounter, please submit a support request to the [[https://support.astron.nl/rohelpdesk|Science Data Centre helpdesk]] as detailed [[http://astron.nl/radio-observatory/submit-support-request/jira-ro-helpdesk-ticketing-system|here]] (Please [[:public:lta_faq#i_want_to_contact_science_support_what_information_should_i_include|read this]] before you do!).+This page is targeted at users who want to learn more on the LOFAR LTA (Long Term Archive). It should answer the most common questions and provide help in case of difficulties in data retrieval. If the information on this page did not succeed to solve any issues you may encounter, please submit a support request to the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]] as detailed [[http://astron.nl/radio-observatory/submit-support-request/jira-ro-helpdesk-ticketing-system|here]] (Please [[:public:lta_faq#i_want_to_contact_science_support_what_information_should_i_include|read this]] before you do!).
  
 ===== Questions ===== ===== Questions =====
Line 14: Line 14:
   * **Q**: [[:public:lta_faq#there_are_different_ways_to_download_which_one_is_the_best|There are different ways to download. Which one is the best]]   * **Q**: [[:public:lta_faq#there_are_different_ways_to_download_which_one_is_the_best|There are different ways to download. Which one is the best]]
   * **Q**: [[:public:lta_faq#my_downloads_are_too_slow_what_can_i_do|My download speeds are too slow. What can I do]]   * **Q**: [[:public:lta_faq#my_downloads_are_too_slow_what_can_i_do|My download speeds are too slow. What can I do]]
-  * **Q**: [[:public:lta_faq#i_want_to_contact_science_support_what_information_should_i_include|I want to contact Science DAta Centre Operations. What information should I include]]+  * **Q**: [[:public:lta_faq#i_want_to_contact_science_support_what_information_should_i_include|I want to contact Science Data Centre Operations. What information should I include]]
 ==== Troubleshoot ==== ==== Troubleshoot ====
  
Line 56: Line 56:
 Note that the larger your request, the longer it takes until you can retrieve the first file. Also, please limit the number of requests running in parallel to a few, especially when they contain many files. In principle, we avoid introducing hard limits, but rely on reasonable user behavior. This also means that you can block the system for a long time or, in the worst case, even bring it down. So please act responsibly or we might have to enforce some limits in the future to keep the system available for other users. Be aware, that we may cancel your request(s) in excessive cases to maintain LTA operation. Note that the larger your request, the longer it takes until you can retrieve the first file. Also, please limit the number of requests running in parallel to a few, especially when they contain many files. In principle, we avoid introducing hard limits, but rely on reasonable user behavior. This also means that you can block the system for a long time or, in the worst case, even bring it down. So please act responsibly or we might have to enforce some limits in the future to keep the system available for other users. Be aware, that we may cancel your request(s) in excessive cases to maintain LTA operation.
  
-If you, by accident, staged some 100'000 files or 100 TB of data, please contact the [[https://support.astron.nl/rohelpdesk|Radio Observatory helpdesk]], so that we can stop these requests, thanks!+If you, by accident, staged some 100'000 files or 100 TB of data, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]], so that we can stop these requests, thanks!
  
 === What is all this SRM / 'staging' stuff about? === === What is all this SRM / 'staging' stuff about? ===
  
-These are technical terms that refer to the storage backend of the LTA. Each of the three LTA sites (in Amsterdam, Juelich and Groningen) operates an SRM (Storage Resource Management) system. Each SRM system consists of magnetic tape storage and hard disk storage. Both are addressed by a common file system, where each file has a specific locality: it can be either on disk ('online') or on tape ('nearline') or both. The usual case for LTA data is, that it is on tape only. Since the tape is not directly accessible but placed in a library shelf, the data on it first has to be copied from tape to disk, in order to retrieve it. This process is called 'staging'. Only while the data is (also) on disk, you will be able to download it. (In physics terms, think of it as an excited state.) To save cost, the disk pool is of limited capacity and only meant for temporary caching data that a user wants to access right now. After 7 days, all data is automatically 'released', which means that it may be deleted from the disk storage, as soon as the space is required for other data. It then has to be staged again in order to become accessible again.+These are technical terms that refer to the storage backend of the LTA. Each of the three LTA sites (in Amsterdam, Juelich and Poznan) operates an SRM (Storage Resource Management) system. Each SRM system consists of magnetic tape storage and hard disk storage. Both are addressed by a common file system, where each file has a specific locality: it can be either on disk ('online') or on tape ('nearline') or both. The usual case for LTA data is, that it is on tape only. Since the tape is not directly accessible but placed in a library shelf, the data on it first has to be copied from tape to disk, in order to retrieve it. This process is called 'staging'. Only while the data is (also) on disk, you will be able to download it. (In physics terms, think of it as an excited state.)
To save cost, the disk pool is of limited capacity and only meant for temporary caching data that a user wants to access right now. After 7 days, all data is automatically 'released', which means that it may be deleted from the disk storage, as soon as the space is required for other data. It then has to be staged again in order to become accessible again.
  
 Usually, you don't have to worry about the details. But be aware, that data retrieval is a two-step procedure: 1) preparation for download ('staging') and 2) the download itself. Also, take care not to request [[:public:lta_faq#what_is_an_appropriate_amount_of_data_to_retrieve|too much data at the same time.]] Usually, you don't have to worry about the details. But be aware, that data retrieval is a two-step procedure: 1) preparation for download ('staging') and 2) the download itself. Also, take care not to request [[:public:lta_faq#what_is_an_appropriate_amount_of_data_to_retrieve|too much data at the same time.]]
Line 66: Line 66:
 === Do I have to make new requests via the web catalog? === === Do I have to make new requests via the web catalog? ===
  
-In principle, yes, this is the only supported procedure, at the moment. There are development versions of programming interfaces, with which it is possible to query the catalog and talk to the staging service, e.g. from scripts. But these are not generally made available, unsupported and still in development. If you are an 'expert user', are self-dependent enough to figure out how to work with this, and have a good reason, please contact the [[https://support.astron.nl/rohelpdesk|Radio Observatory helpdesk]] for some instructions and an emphatic admonition to take extra care.+In principle, yes, this is the only supported procedure, at the moment. There are development versions of programming interfaces, with which it is possible to query the catalog and talk to the staging service, e.g. from scripts. But these are not generally made available, unsupported and still in development. If you are an 'expert user', are self-dependent enough to figure out how to work with this, and have a good reason, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]] for some instructions and an emphatic admonition to take extra care.
  
 === There are different ways to download. Which one is the best? === === There are different ways to download. Which one is the best? ===
Line 103: Line 103:
 === I did not receive a mail notification that my request was scheduled! === === I did not receive a mail notification that my request was scheduled! ===
  
-If the LTA catalog did not show any error when you submitted your request, then it is safe to assume that your request was registered in our staging system. Usually, you should get a notification mail that this has happened within a few minutes. If you did not receive the notification within an hour, then our staging service may be down. Note that your request is not lost in this case and will be picked up after the service is back online. In urgent cases or if you are not sure that something went wrong while submitting your request, please contact the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]].+If the LTA catalog did not show any error when you submitted your request, then it is safe to assume that your request was registered in our staging system. Usually, you should get a notification mail that this has happened within a few minutes. If you did not receive the notification within an hour, then our staging service may be down. Note that your request is not lost in this case and will be picked up after the service is back online. In urgent cases or if you are not sure that something went wrong while submitting your request, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]].
  
 === I did not receive a mail notification that my data is ready for retrieval! Has my request gone lost? === === I did not receive a mail notification that my data is ready for retrieval! Has my request gone lost? ===
  
-After you got a notification that your requests was scheduled, it is in our database and there's hardly a possibility that it got lost. Staging requests can take up to a day or two, but will finish a lot sooner in most cases. This depends on your request's size but also on how busy the storage systems are by other user's requests at the moment. Sometimes, the LTA storage systems are down for maintenance and this can delay the whole procedure. You can [[http://web.grid.sara.nl/cgi-bin/lofar.py|check for downtimes here]].+After you got a notification that your requests was scheduled, it is in our database and there's hardly a possibility that it got lost. Staging requests can take up to a day or two, but will finish a lot sooner in most cases. This depends on your request's size but also on how busy the storage systems are by other user's requests at the moment. Sometimes, the LTA storage systems are down for maintenance and this can delay the whole procedure. You can [[https://ganglia.grid.surfsara.nl/cgi-bin/lofar.py|check for downtimes here]].
  
-It is not alarming when your request did not finish in 24 hours, even when your last request finished within 10 minutes. In urgent cases or if you did not receive a notification after 48 hours, please contact the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]].+It is not alarming when your request did not finish in 24 hours, even when your last request finished within 10 minutes. In urgent cases or if you did not receive a notification after 48 hours, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]].
  
 === I got an email that says my staging request has failed! What happened? === === I got an email that says my staging request has failed! What happened? ===
  
-This means that the SRM server could not fulfill the request at all. This might mean that the system itself is fine, but none of the files from your request could be staged (e.g. missing files). Check the error message from your mail notification for details. The notification can also indicate that there is a general problem with the SRM system or with the staging service itself, i.e. something is broken or down for maintenance. We try to detect all temporary issues and only inform users in case that something is wrong with their request itself, but we cannot foresee all eventualities. If you cannot make sense out of the error message, or don't know how to deal with it, please contact the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]].+This means that the SRM server could not fulfill the request at all. This might mean that the system itself is fine, but none of the files from your request could be staged (e.g. missing files). Check the error message from your mail notification for details. The notification can also indicate that there is a general problem with the SRM system or with the staging service itself, i.e. something is broken or down for maintenance. We try to detect all temporary issues and only inform users in case that something is wrong with their request itself, but we cannot foresee all eventualities. If you cannot make sense out of the error message, or don't know how to deal with it, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]].
  
 If you used the xmlrpc interface to submit your request, please first check whether you made a mistake and e.g. entered the wrong SURLs. If you used the xmlrpc interface to submit your request, please first check whether you made a mistake and e.g. entered the wrong SURLs.
Line 121: Line 121:
 === I got an email that says my staging request was only partially successful! What's going on? === === I got an email that says my staging request was only partially successful! What's going on? ===
  
-In general, this means that the SRM system works fine, but there was a problem processing your request. As a result, some of your files could be [[:public:lta_faq#what_is_all_this_srm_staging_stuff_about|staged]], some could not. Your mail notification should include a list of which files could not be prepared for download successfully and also include an error message to indicate the cause. If the error message says 'Incorrect URL: host does not match', this means that you combined files in a requests that are stored on two different SRM locations (e.g. one file at surfSARA and one file at Target). When one SRM location gets the request, it can only stage the local files. You have to request the files from different locations independently, to prevent this. Other messages should be self-explanatory, e.g. if a file is missing. If you cannot make sense out of the error message, or don't know how to deal with it, please contact the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]].+In general, this means that the SRM system works fine, but there was a problem processing your request. As a result, some of your files could be [[:public:lta_faq#what_is_all_this_srm_staging_stuff_about|staged]], some could not. Your mail notification should include a list of which files could not be prepared for download successfully and also include an error message to indicate the cause. If the error message says 'Incorrect URL: host does not match', this means that you combined files in a requests that are stored on two different SRM locations (e.g. one file at surfSARA and one file at Target). When one SRM location gets the request, it can only stage the local files. You have to request the files from different locations independently, to prevent this. Other messages should be self-explanatory, e.g. if a file is missing. 
If you cannot make sense out of the error message, or don't know how to deal with it, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]].
  
 If you used the xmlrpc interface to submit your request, please first check whether you made a mistake and e.g. entered the wrong SURLs. If you used the xmlrpc interface to submit your request, please first check whether you made a mistake and e.g. entered the wrong SURLs.
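The 'Incorrect URL: host does not match' case described above can be avoided by splitting a list of SURLs per SRM host before submitting anything. The following is a minimal sketch, not part of any LTA tooling: the function name and the example host names are hypothetical, and it only assumes that a SURL has the usual `srm://host:port/path` shape.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_surls_by_host(surls):
    """Return {hostname: [surl, ...]} so that each SRM site gets its own request.

    Hypothetical helper for illustration; it does not talk to any service.
    """
    groups = defaultdict(list)
    for surl in surls:
        host = urlsplit(surl).hostname
        if host is None:
            # A SURL without a host part is most likely a typo.
            raise ValueError(f"cannot determine host of SURL: {surl!r}")
        groups[host].append(surl)
    return dict(groups)


if __name__ == "__main__":
    # Placeholder SURLs; real ones come from the LTA catalog.
    example = [
        "srm://srm.site-a.example:8443/data/L1_SB000.tar",
        "srm://srm.site-b.example:8443/data/L1_SB001.tar",
        "srm://srm.site-a.example:8443/data/L1_SB002.tar",
    ]
    for host, members in group_surls_by_host(example).items():
        print(host, len(members))
```

Each resulting group can then be submitted as a separate request, so that every SRM location only receives files it actually holds.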
Line 129: Line 129:
 === Oops! I made a mistake! How can I stop a request? === === Oops! I made a mistake! How can I stop a request? ===
  
-Unfortunately, this is currently not possible for you as a user. Stay calm and ask [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]] to stop the request for you.+Unfortunately, this is currently not possible for you as a user. Stay calm and ask [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]] to stop the request for you.
  
 === My files only contain some error message instead of data === === My files only contain some error message instead of data ===
  
-Most errors should result in a 404/50x return code. However, some error messages are still returned as a message. Please read the error message carefully. In many cases, it should give you some indication of what went wrong. If this does not help you, please contact the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]] or retry after a few hours.+Most errors should result in a 404/50x return code. However, some error messages are still returned as a message. Please read the error message carefully. In many cases, it should give you some indication of what went wrong. If this does not help you, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]] or retry after a few hours.
  
 **Important:**  If you use wget with option '-c', please note the following: wget does not check the contents of an existing file, so when restarting wget with option '-c' (continue) to retrieve the failed files, it will append the later data chunk to the existing file that contains the error message (and not the first section of you data). Make sure to delete the existing error files (should be obvious by the small file size) before calling 'wget -ci' again, to avoid corrupted data. If you already ended up with a corrupted file, you have to delete that and re-retrieve the whole file. **Important:**  If you use wget with option '-c', please note the following: wget does not check the contents of an existing file, so when restarting wget with option '-c' (continue) to retrieve the failed files, it will append the later data chunk to the existing file that contains the error message (and not the first section of you data). Make sure to delete the existing error files (should be obvious by the small file size) before calling 'wget -ci' again, to avoid corrupted data. If you already ended up with a corrupted file, you have to delete that and re-retrieve the whole file.
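The clean-up step before re-running 'wget -ci' can be scripted. The sketch below lists files in a download directory that are smaller than a size threshold and deletes them only when asked to. The function names and the 10 kB default threshold are assumptions chosen for illustration; pick a threshold well below the smallest data file you expect.

```python
from pathlib import Path


def find_suspect_files(directory, max_bytes=10_000):
    """List regular files smaller than max_bytes, sorted by name.

    The threshold is an assumption, not an LTA-defined value.
    """
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.stat().st_size < max_bytes
    )


def remove_suspect_files(directory, max_bytes=10_000, dry_run=True):
    """Return the suspect files; delete them only when dry_run is False."""
    suspects = find_suspect_files(directory, max_bytes)
    if not dry_run:
        for path in suspects:
            path.unlink()
    return suspects


if __name__ == "__main__":
    # Dry run by default: only report what would be deleted.
    for path in remove_suspect_files("."):
        print("would delete:", path)
```

Running it as a dry run first and inspecting the list is the safer order, since a genuinely small data file would otherwise be removed as well.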
Line 139: Line 139:
 === My data files are corrupted === === My data files are corrupted ===
  
-Check if the files are much smaller than you expect. Something might have gone wrong with the transfer. Please check the beginning of your files, e.g. with 'less'. If there is an error message, please [[:public:lta_faq#my_files_only_contain_some_error_message|refer to this answer]]. Otherwise, please try to re-retrieve an affected file. If this does not help, please contact the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]].+Check if the files are much smaller than you expect. Something might have gone wrong with the transfer. Please check the beginning of your files, e.g. with 'less'. If there is an error message, please [[:public:lta_faq#my_files_only_contain_some_error_message|refer to this answer]]. Otherwise, please try to re-retrieve an affected file. If this does not help, please contact the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]].
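As an alternative to opening each file with 'less', the beginning of a file can be inspected with a few lines of Python. This is a rough sketch under stated assumptions: the helper names are hypothetical, and the "mostly printable bytes" test is only a heuristic for spotting a plain-text error message where binary data was expected.

```python
def peek_head(path, num_bytes=200):
    """Return the first num_bytes of a file, decoded leniently for display."""
    with open(path, "rb") as handle:
        head = handle.read(num_bytes)
    return head.decode("utf-8", errors="replace")


def looks_like_text(path, num_bytes=200):
    """Heuristic: True when nearly all leading bytes are printable ASCII."""
    with open(path, "rb") as handle:
        head = handle.read(num_bytes)
    if not head:
        return False
    printable = sum(1 for b in head if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(head) > 0.95


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        if looks_like_text(name):
            print(f"{name}: starts with text, possibly an error message:")
            print(peek_head(name))
```

A file flagged this way should still be read by eye before it is deleted and retrieved again.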
  
 === My downloads fail with error "All Ready slots are taken and Ready Thread Queue is full" === === My downloads fail with error "All Ready slots are taken and Ready Thread Queue is full" ===
Line 147: Line 147:
 === My downloads don't start / time out === === My downloads don't start / time out ===
  
-Maybe the SRM system is down for maintenance, please check [[http://web.grid.sara.nl/cgi-bin/lofar.py|http://web.grid.sara.nl/cgi-bin/lofar.py]]. If there is nothing going on, there is probably something wrong with the download service. Please try again a bit later and submit a support request to the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]], if the issue persists.+Maybe the SRM system is down for maintenance, please check [[https://ganglia.grid.surfsara.nl/cgi-bin/lofar.py|https://ganglia.grid.surfsara.nl/cgi-bin/lofar.py]]. If there is nothing going on, there is probably something wrong with the download service. Please try again a bit later and submit a support request to the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]], if the issue persists.
  
 === Http downloads randomly fail with "503 Service Temporarily Unavailable" === === Http downloads randomly fail with "503 Service Temporarily Unavailable" ===
Line 194: Line 194:
 === SRM/Grid commands fail and I cannot figure out why! === === SRM/Grid commands fail and I cannot figure out why! ===
  
-Retry with option '-debug', which will print a lot of debug information to stdout. If this does not help yourself to figure out what is going wrong, submit a support request to the [[https://support.astron.nl/rohelpdesk|Science Data Centre Operations helpdesk]]. (Please [[:public:lta_faq#i_want_to_contact_science_support_what_information_should_i_include|read this]] before that!).+Retry with option '-debug', which will print a lot of debug information to stdout. If this does not help yourself to figure out what is going wrong, submit a support request to the [[https://support.astron.nl/rohelpdesk|ASTRON helpdesk]]. (Please [[:public:lta_faq#i_want_to_contact_science_support_what_information_should_i_include|read this]] before that!).
  
  
  • Last modified: 2020-10-18 19:44
  • by Marco Iacobelli