cluster-lbt:getting_support
Differences

Below are the differences between two revisions of the page.


cluster-lbt:getting_support [2021/02/18 17:40] (Current version) – created - external edit 127.0.0.1
Line 1: Line 1:
 +===== Getting support =====
 +==== Trying to solve issues by yourself ====
 +The first thing to do when a job crashes is to check its output files (the paths given with the -o and -e PBS options). In most cases, the details of the crash are written there in human-readable form.
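
As a minimal sketch of where those two files come from, the header of a job script might set them as shown here. The job name and file names (`myjob`, `myjob.out`, `myjob.err`) are hypothetical placeholders and are not taken from the script provided for this cluster.

```shell
#!/bin/bash
# Hypothetical job header: the name and paths below are placeholders.
# Job name:
#PBS -N myjob
# File receiving the job's standard output:
#PBS -o myjob.out
# File receiving the job's standard error:
#PBS -e myjob.err
```

With a header like this, `myjob.out` and `myjob.err` are the two files to read first after a crash.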
  
 +Example:\\
 +Your job crashes every time you run it and, in your output files, you can read:
 +<code bash output>
 +[...]
 +Traceback (most recent call last):
 + File "<stdin>", line 1, in <module>
 + File "/shared/compilers/python/2.7.5/gnu/lib/python2.7/site-packages/h5py-2.5.0-py2.7-linux-x86_64.egg/h5py/__init__.py", line 13, in <module>
 +   from . import _errors
 +ImportError: libhdf5.so.9: cannot open shared object file: No such file or directory
 +</code>
 +
 +As the last line shows, the library libhdf5.so.9 cannot be found. To solve this yourself, work out where the expected library is located (evidently not in the default environment), then load the corresponding module:
 +<cli>
 +$ module load hdf5
 +</cli>
 +This example is simple, but it represents the vast majority of user support requests.
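
The check described above can be sketched as a small shell helper that scans an output file for this kind of error. The function name `missing_libs` is hypothetical, and the sketch assumes the loader message has the same wording as in the example (`cannot open shared object file`).

```shell
# Print the names of shared libraries reported missing in a job output file.
# Usage: missing_libs <output-file>
# Assumes the error wording "<library>: cannot open shared object file".
missing_libs() {
    grep -o '[^ ]*: cannot open shared object file' "$1" \
        | sed 's/: cannot open shared object file//' \
        | sort -u
}
```

Run on the output file from the example, it would print `libhdf5.so.9`, which points to the hdf5 module. Running `module avail` lists the modules that can be loaded.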
 +==== Asking your PI for help ====
 +After trying to solve the problem yourself, ask your PI for help: they probably have long experience with the computing resources available here and/or elsewhere.
 +==== Asking the user community of LBT's computing resources for help ====
 +As a user of the LBT's computing resources, you belong to a user community with its own mailing list, where you can share issues and experience by sending an e-mail to <cluster-users@ibpc.fr>. If you are not yet on this list, contact the IBPC IT team (at <lbt-info@ibpc.fr>) to be registered.
 +
 +This mailing list is also used to announce changes to the computing resources.
 +==== Asking the IT manager for help ====
 +All user-support requests concerning LBT's computing resources should be sent by e-mail to <lbt-info@ibpc.fr>.
 +
 +That said, if (and only if) your request concerns the LBT's computational and storage resources (not the IBPC network and services, nor desktop machines), you can contact me directly by sending a well-formed e-mail to <geoffrey.letessier@cnrs.fr>.
 +
 +What I mean by a "well-formed e-mail": if possible, your e-mail should include the following 4 pieces of information:
 +  * A description of your issue.
 +  * Whether your issue is reproducible (i.e., does it happen every time under the same conditions?).
 +  * The full path of your crashing job.
 +  * The 2 output files produced by the Torque resource manager (the files generated by the -e and -o PBS options in your job script), making sure you are using the script I provided [[cluster-lbt:quick_start_guide|here]].
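
The four items above can be gathered into one plain-text report before writing the e-mail. The helper below is only a sketch: the function name `support_report`, its argument order, and the report layout are assumptions for illustration, not part of any cluster tooling.

```shell
# Assemble a plain-text support report on standard output.
# Usage: support_report <description> <reproducible: yes|no> <job-path> <stdout-file> <stderr-file>
# Hypothetical helper: name, arguments, and layout are illustrative only.
support_report() {
    printf 'Description: %s\n' "$1"
    printf 'Reproducible: %s\n' "$2"
    printf 'Job path: %s\n' "$3"
    # Append the content of both output files, or a note if one is unreadable.
    for f in "$4" "$5"; do
        printf '=== %s ===\n' "$f"
        if [ -r "$f" ]; then
            cat "$f"
        else
            printf '(file not found)\n'
        fi
    done
}
```

For example, `support_report "crash at start-up" yes "$PWD" myjob.out myjob.err > report.txt` (with placeholder file names) produces a file that can be pasted into or attached to the e-mail.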
 +
 +<note> If your problem is that your job is stuck in the queue, please don't kill it; send me the output of the "**checkjob -vv <job-ID>**" command.</note>
cluster-lbt/getting_support.txt · Last modified: 2021/02/18 17:40 by 127.0.0.1