====== New Arrivals Guide ======

===== Welcome aboard =====

At this point, you should have met several smiling people and settled into your office. Comfy chair, splendid view of the surrounding mountains... Welcome to Grenoble!

Since you are reading this page, you have been given your temporary password and successfully logged in. You should change your password as soon as possible. In fact, you should change it right now. :-) [[https://password.inria.fr/|INRIA password change]]

If you are coming from a different INRIA centre, you will keep your account, but it needs to be updated manually with Jean-Marc Joseph. In some cases you have to change your password anyway to force the update.

We leave for lunch between 12:00 and 13:00. You will need your lunch card; Nathalie (our team assistant) should have briefed you on how to get it (I think you pick it up at the restaurant with an ID card, but I'm not sure). Your card has an associated balance that you can top up by debit card, cash, or online.

For any computer-related question for which you cannot find an answer in this wiki, or when in doubt before doing something on your machine or on the team's servers, your [[:system_administrators|system administrators]] are here to help.

----

===== Mail =====

You have a new mail address in the format "firstname.lastname@inria.fr". To access your mail, use the webmail [[https://zimbra.inria.fr|Zimbra]].

If you want to redirect your mail to a personal address, you need to set up a filter in Zimbra:

  - Go to "Préférences" (Preferences), "Filtres" (Filters), "Nouveau filtre" (New filter).
  - Under "Exécuter les actions suivantes" (Perform the following actions), click on the green "+" icon and set the action to "Rediriger vers l'adresse" (Redirect to address).
  - Enter your address, click OK, and in the top left corner click "Enregistrer" (Save).

Filters are very handy to automatically organize your mail as it arrives.

== Thunderbird ==

If you want to use an email client like Thunderbird (or Icedove), you can use the following settings:

  * Mail: firstname.lastname@inria.fr
  * Password: your_password
  * Incoming server: IMAP, zimbra.inria.fr, port 993, security SSL/TLS, authentication by normal password
  * Outgoing server (SMTP): smtp.inria.fr, port 587, security STARTTLS, authentication by normal password
  * Username (incoming/outgoing): your_login

----

===== Calendar =====

== Calendar from Zimbra ==

You can use Thunderbird/Lightning to read and edit your calendars from the Zimbra interface. To do so, add a new calendar in Thunderbird, select ''distant'' (on the network) and then the ''CalDAV'' format. The location of your default calendar is:
<code>https://zimbra.inria.fr/dav/your_login/calendar</code>
To use another calendar, replace ''calendar'' by its ID (which is available via the ''share calendar'' link in Zimbra).

== Calendar from the team ==

You can also read the team calendar (http://thoth.inrialpes.fr/seminars) in Thunderbird/Lightning by adding a new remote calendar in ''iCal'' format, available at the following address:
<code>https://calendar.google.com/calendar/ical/kmd5s2qkot725dd3h57ukblqi0%40group.calendar.google.com/public/basic.ics</code>

----

===== Your work station =====

All desktop machines run a Linux Ubuntu distribution (16.04 LTS or 18.04 LTS).\\
You have basic user access to all of the desktop machines. The credentials (username + password) are the same as your INRIA account. Administrator privileges are given only to the [[:system_administrators|system administrators]].
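You can, however, install standard Ubuntu packages yourself (the install command is given just below). If you do not know the exact package name, you can look it up first with the regular ''apt'' tooling; no special rights are needed, and the package names here are only examples:
<code>
# Search the Ubuntu package index for a keyword (no root rights needed)
apt-cache search eigen
# Show the details of a specific package
apt-cache show libeigen3-dev
</code>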
To install a package, use the ''sudo apt-get'' command:
<code>sudo apt-get install nameOfYourPackage</code>

You can choose your desktop environment at the login screen, in the top right corner. The recommended desktop environments are Xubuntu, MATE and KDE.

The default X keyboard layout can be changed with:
<code>/home/clear/lear/tools_clear/lear-set-keyboard us</code>
Replace 'us' by 'fr' for the French layout.

One thing to remember: desktops stay online 24/7. You should not shut down your computer when you leave work. The reason for this is that data is stored on every desktop, and this data should be accessible to other people at all times. Sometimes a reboot is necessary; in those cases, talk to a [[system_administrators|system administrator]] first.

Regarding your personal machine (laptop): it will not have access to the wired network nor to the institutional wifi network ''inria-grenoble''. However, you can use the following wifi networks: ''eduroam'' or ''inria-guest'' (see [[new_arrivals_guide#configuring_wifi|below]]).

----

===== Printing =====

You can use ''printer_color_gra'' to print documents. The closest printer is in H110 (next to Nathalie's office). Swipe your Inria badge on the reader on the printer to identify yourself.

You can also print from this web service: [[http://print.inria.fr/]]. Its main advantages are that you can use it from outside Inria's intranet and that it prints two-sided documents by default.

----

===== Moving around =====

==== Setup SSH with RSA key pair ====

To move around the internal network and access it from outside, you have to set up your SSH key.

  - Open a terminal, type "ssh-keygen" and press Enter.
  - Save the key in the default location.
  - Choose a passphrase when you are prompted to (**do not leave this field empty**).

This takes care of key generation. You also need to tag your key as "authorized" for it to work between machines:
<code>
cd ~/.ssh/
cat id_rsa.pub >> authorized_keys
</code>
Now you're ready to connect to a different machine:
<code>ssh clear</code>

==== Customizing your .bashrc, .bash_profile ====

By default, your terminal prompt will look barebones. In particular, it does not show which machine you are connected to, nor the current directory. You need to edit the file ~/.bashrc to trick out your terminal with some much-needed information. Add this line:
<code>export PS1="\u@\h \w: "</code>
From now on, the local shells you create will have the "user@host pwd:" format. To obtain the same result over SSH, append these lines to ~/.bash_profile:
<code>
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
</code>

==== PATH and LD_LIBRARY_PATH ====

It is **strongly recommended to set PATH and LD_LIBRARY_PATH inside a bash function**, and to call that function only when necessary. If you do not follow this simple rule, things will work until they fail catastrophically. If at some point one of the paths in these environment variables is not accessible, **your terminals will be unresponsive and you won't be able to log in**. In some cases you can't even fix the problem yourself. Far from ideal.

Save yourself the trouble: customize this snippet of code and insert it in your .bashrc:
<code>
function setenv_cuda() {
    export PATH=/path/to/your/cuda/bin:${PATH}
    export LD_LIBRARY_PATH=/path/to/your/cuda/lib64:${LD_LIBRARY_PATH}
}
</code>
You can then call this function in your scripts or your terminal:
<code>
setenv_cuda
</code>
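To double-check that the function did what you expect, you can inspect your environment afterwards. This is a minimal sketch using plain shell commands; the CUDA paths are just the placeholders from the snippet above:
<code>
setenv_cuda                # the function defined in your .bashrc above
which nvcc                 # should now point into /path/to/your/cuda/bin
echo "$LD_LIBRARY_PATH"    # should start with /path/to/your/cuda/lib64
</code>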
==== Configuring Wifi ====

You can try to connect to ''inria-grenoble'' with your Inria credentials.

The ''eduroam'' network is available at the centre and also on the Grenoble campus (and on many other campuses in France and across the world). Again, use the same credentials. To enable it, follow the instructions on this page: [[https://wiki.inria.fr/support/Eduroam]]

Otherwise, you can use ''inria-guest'' as a fallback. It is a "closed" network at first: you need to open a browser page to bring up the login screen. Keep in mind, however, that the connection is less secure on ''inria-guest''.

==== Working from home ====

[[tutorials:system:working_from_home|A complete page is dedicated to this section.]]

----

===== Storage =====

==== Home directory ====

Your home directory is accessible from all machines with this path:
<code>/home/thoth/yourUsername</code>
It is designed to store __low volume, highly critical data__, such as your code. The space available is only ~10 GB. You can hardly store videos, descriptors or experimental results on it, and it shouldn't be used for that.

Your home directory data is **relatively safe**. It is stored by the centre's IT. If you have important data, regularly make backups yourself on a laptop or an external hard drive, or upload your code somewhere (see [[new_arrivals_guide#online_storage]] for details).

**Important:** Home directories are backed up. If you do something wrong (delete a file, etc.), old versions of any directory in your home directory can be accessed in:
<code>/home/thoth/yourUsername/path/to/yourDirectory/.snapshot</code>
This hidden directory stores different versions of the files from ''/home/thoth/yourUsername/path/to/yourDirectory'' (from the last 4 hours, the last 14 days and up to the last 4 weeks). It is impossible to retrieve a file modified more than 4 weeks ago.

==== Online storage ====

  * **Versioning and saving code**: it is highly recommended to use development tools that allow you to version and back up your code. To version your code, you can use [[https://git-scm.com/doc|git]] or [[https://subversion.apache.org/|svn]] (git is now recommended). Inria provides services to manage git (and svn) projects online: the [[https://gitlab.inria.fr|Inria GitLab]] (where you can log in with your Inria credentials) or the older [[info:inria_gforge|Inria GForge]].
  * **Sharing and backing up files**: Inria provides a [[https://www.seafile.com|SeaFile]] sync service, available at [[https://mybox.inria.fr]], that basically works like any file syncing service (Dropbox, Google Drive, etc.). You have a 10 GB storage space where you can create libraries, and you can install a client on your machine for automatic sync. You can log in with your Inria credentials. People you collaborate with who are not at Inria can create an account and contribute to your libraries (they will NOT get a 10 GB storage space of their own).
  * **External services**: as a member of Inria, it is highly recommended that you use the tools mentioned above to store and share code/data online. In particular, for privacy reasons, you should avoid services that are not hosted by Inria or its institutional partners. For instance, you should not use Dropbox, Google Drive or equivalent services to store work-related data. Using the Inria GitLab is also encouraged (over GitHub, for instance).

==== Local storage ====

Every machine has a system disk where you can store your browser cache and other local data. The partition has the following path:
<code>/local_sysdisk/</code>
To use it, ask your [[system_administrators|system administrator]] for access. They will create a directory "/local_sysdisk/yourUserName" to which you have write access. This directory is only visible locally and is not redundant.
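If you want to check how much room is left on a disk before copying data onto it, ''df'' works on any of these paths (the same goes for the scratch spaces described in the next section). A quick example using the local system disk mentioned above:
<code>
# Show size, usage and free space of the local system disk (or any other mount point)
df -h /local_sysdisk
</code>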
==== Shared storage ====

There are numerous "scratch" spaces in the THOTH team. They are data volumes exported via NFS to **all** machines. Scratch spaces are designed to store __high volumes of data__ with some redundancy; however, they **should not be considered reliable**. Also, the data on scratch spaces is not backed up: if you remove the wrong file, you cannot recover it.

Some examples of scratch space paths:
<code>
/scratch/clear/
/scratch2/clear/
/home/clear/
/scratch/albireo/
</code>
To see all the available scratch spaces, check the [[http://lear.inrialpes.fr/private/disk_space.txt|disk usage table]].

To use a scratch space, ask your [[system_administrators|system administrator]] for access. They will create a directory "/scratch/machine/yourUserName" to which you have write access.

----

===== Thoth Reading group =====

Thoth regularly organizes reading groups. During a reading group, one person analyzes an interesting paper in Computer Vision / Machine Learning and presents it to the whole group.

----

===== Resources =====

The team has a large amount of computing resources. Using them is highly encouraged. However, with great power comes great responsibility. This part presents the bountiful resources at your disposal. You should then read the dedicated page [[tutorials:oar_tutorial|OAR Tutorial]], where you will learn how to submit jobs in practice.

==== CPU cluster ====

37 nodes, 618 physical cores, 10Gb/1Gb interconnect

The CPU cluster of the team is now part of the "shared cluster" of the centre, where you can use computing resources from other teams. You have priority access to the THOTH computing nodes.

To access the CPU cluster, do <code>ssh access2-cp</code> to connect to the frontend (Ubuntu submission node), or ''ssh access1-cp'' to access a Fedora submission node.\\
To manage and share resources between multiple users, the submission node uses the [[http://oar.imag.fr|OAR reservation system]].\\
You can check the cluster status and running jobs at [[http://visu-cp.inrialpes.fr/monika]].\\
Job isolation is pretty well managed on the CPU cluster; your main concern should be memory usage. Before submitting a job, you should estimate the memory it will require and request a computation node with sufficient memory. Since OAR does not monitor how much memory submitted jobs actually use, multiple memory-heavy jobs can be assigned to the same computation node, causing problems and potentially crashing the node. It is therefore recommended that your experiments do not waste memory (for safety, you can use the ''ulimit'' bash command).

Reservations are done through the "oarsub" command. The simplest reservation looks like this:
<code>
oarsub -I    # Interactive
</code>
If the cluster isn't full, your job will be executed almost immediately. The ''-I'' option stands for "interactive": it opens a shell on a cluster node where you can run your experiment. The interactive mode is useful for running small tests (the session is limited in time) before using the standard submission mode.

For more information about reserving resources using OAR (which resources for which duration) and to learn the best practice rules, see the [[tutorials:oar_tutorial|OAR tutorial]].

==== GPU cluster ====

27 nodes, 66 GPUs, 10Gb/1Gb interconnect

To access the GPU cluster, do ''ssh edgar'' to connect to the frontend.\\
''edgar'' also uses OAR for reservations.
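As a rough illustration of the two submission modes summarized in the TL;DR at the end of this page, here is what typical reservations can look like. Treat it as a sketch: the walltimes and the script name ''run_experiment.sh'' are placeholders, and the exact resource properties to request on ''edgar'' are described in the [[tutorials:oar_tutorial|OAR tutorial]].
<code>
# Interactive session: opens a shell on a node for at most two hours
oarsub -I -l "walltime=2:00:00"

# Passive (batch) submission of a script in the best-effort queue;
# -t idempotent lets the job restart automatically if it gets killed
oarsub -t besteffort -t idempotent -l "walltime=12:00:00" ./run_experiment.sh
</code>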
You can check the GPU cluster status and running jobs at [[http://edgar/monika]].\\
You have to be very cautious with the GPU cluster: job isolation is not as effective with GPUs as it is on the CPU cluster, so you can potentially mess with jobs from other users. Best practice rules can be found in the [[tutorials:oar_tutorial|OAR tutorial]].

==== Desktop machines ====

The list of desktop machines is available here:
[[http://thoth/private/machines.list]]\\
[[http://thoth/private/machines.ods]]

Under certain circumstances it is acceptable to run experiments on desktop machines.\\
There are, however, best practices to follow:

  * Almost every machine has a designated user. If your experiment is big, ask for their permission first.
  * Only run code that you **know** is stable.
  * Don't crash your friend's machine by allocating too much memory.

===== All set =====

This concludes Thoth's introductory course.\\
I hope your thirst for knowledge is satiated. If you have any questions, drop by someone's office ASAP! Have fun. :-)

===== TL;DR =====

A quick recap:

  * Every user gets a ''home'' directory that can store up to 10 GB and that is backed up regularly. These home directories are mounted on all the machines via ''NFS''. Store important information there, but avoid installing large libraries: it can very much hold your code, but not your data.
  * Every user can get access to one or multiple ''scratch'' directories, also mounted on all the machines via ''NFS'', where data/checkpoints/logs can be stored. If you need any ''scratch'' space, inform [[:system_administrators|us]] and we will create one for you.
  * There is a GPU cluster that we host and administer ourselves and that holds 66 GPUs of various types. We use ''OAR'' as the job scheduler. There are two modes: ''interactive'', where you get a shell on a GPU instance (max 12h), and ''passive'', where you simply launch the job in the background (unlimited time), be it with a ''.sh'' script or the full python command inside the ''OAR'' command itself. In addition, there are two queues: ''default'', where each user has a maximum of 3 instances that cannot be killed until the job stops, and ''besteffort'', where the name is self-explanatory and jobs can get killed to make space for others (based on "karma", availability...) but can also be restarted automatically when adding the ''-t idempotent'' option. Note that the GPU cluster front node is ''edgar'': launch your ''OAR'' commands there, but do not run ''python'' jobs directly on this machine!
  * There is also a CPU cluster that [[:system_administrators|we]] do not administer directly and that contains a lot of CPUs. Here again we use ''OAR'', and the same rules as above hold. Note that the CPU cluster front node is ''access2-cp''.
  * Every user manages their own ''python'' environment, but we strongly recommend using ''conda'', which can be installed with ''Miniconda'' (a minimal install sketch is given below). [[:system_administrators|We]] can help you set that up!
  * Every user has access to a ''gitlab'' account (using their INRIA credentials) where code can be stored under version control.
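For the ''conda'' point above, a minimal Miniconda setup usually looks like the following. This is a sketch, not an official procedure: the installer URL is the standard upstream one, the environment name and Python version are placeholders, and your [[:system_administrators|system administrators]] can tell you whether it is better to install it under your home directory or a scratch space, given the 10 GB home quota.
<code>
# Download the Miniconda installer and install it into your home directory
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3

# Make conda available in the current shell, then create and activate a first environment
source $HOME/miniconda3/etc/profile.d/conda.sh
conda create -n myenv python=3.8
conda activate myenv
</code>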