Atlas

ETAP login through sshgate

The instructions here are mostly for the desktop Linux PCs that were installed by Andreas Nussbaumer (or via his boot server) and that have been modified in such a way that central administration is not possible (or is rendered ineffective). If you run your own ssh server, please be sure to follow the same guidelines.

This setup is currently in a test phase. Please use it as much as you can.

Background

The university is trying to reduce the number of ports that are open to the outside world. One of these is port 22 for SSH. Although SSH is considered to have few vulnerabilities when set up correctly and kept up to date, SSH servers are common targets of brute-force attacks that can either find weak passwords or reduce availability for users.

The ETAP Linux desktops (and some servers) set up by Andreas Nussbaumer usually block IP addresses (not users) for hours if the password has been entered incorrectly 3 times. You can also use ssh-keys (see the section on creating ssh-keys) for added security, ideally protected by a password as well (see the remarks on using ssh-keys). The university LDAP user authentication blocks users (not IP addresses) when many failed login attempts have been made within a short time.

Both measures can be supported by accepting ssh connections only from the university network and allowing outside connections only through the sshgate.

Please read more about the sshgate ZDV here.

Creating ssh-keys

On Linux you can create ssh-keys by using ssh-keygen -b 4096.

-b 4096 ensures large keys that are considered safe at the moment (2020, see Wikipedia on RSA here)

Please note that ssh-keygen suggests the default key files id_rsa and id_rsa.pub. If you already have a key under these names and confirm the overwrite prompt, the existing key is overwritten. You can also generate more than one key and use different keys for different servers.
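To keep an existing key safe, a new key can be written to a separate file with the -f option of ssh-keygen. The following is a minimal sketch; the helper function make_key and the file name id_rsa_sshgate are assumptions made for illustration and are not part of the ETAP setup.

```shell
# make_key KEYFILE: create a 4096-bit RSA key pair in KEYFILE and KEYFILE.pub,
# but refuse to touch an existing file. ssh-keygen asks for a passphrase itself.
make_key() {
    keyfile="$1"
    if [ -e "$keyfile" ]; then
        echo "refusing to overwrite existing key: $keyfile" >&2
        return 1
    fi
    ssh-keygen -t rsa -b 4096 -f "$keyfile"
}

# Usage (hypothetical file name):
# make_key "$HOME/.ssh/id_rsa_sshgate"
```

A key stored under a non-default name has to be selected explicitly, for example with ssh -i or with IdentityFile in the ssh-config.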

More information can also be found here.

ssh-keys on sshgate

Please deploy your ssh-key in your account (link to account page) by following the instructions: ZDV - SSH Gate.

Most importantly, the comment that you enter when adding your key to your account must contain the string SSHGATE, HPCGATE or LINUXGATE (multiple purposes can be separated by a comma, e.g. HPCGATE,SSHGATE). This also means that people who have already deployed a key for Mogon do not need to register a new key.

Remarks on using ssh-keys

Always protect your ssh-key with a password. If someone steals your ssh-key files (private key), the attacker then still needs the password to decrypt the key. The password should follow common password guidelines (strong, not reused elsewhere, etc.).

Your ssh-key consists of two files/parts: the public key (id_rsa.pub) and the private key (id_rsa). The public key can be distributed freely; you can give it to other (even untrusted) people, websites or computers. The private key must be protected from other people (ssh even refuses to use a private key file that is accessible by anyone other than its owner).

On the ETAP computers the ssh-key can be used for login. However, your home directory needs to be mounted first, and this only works with your password. So the first time you log into a PC, you might need to enter your password; only after that is your ssh-key accepted.
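If ssh complains about the access rights of the private key file, the permissions can be tightened with chmod. A minimal sketch, assuming the key pair follows the usual NAME / NAME.pub naming; the helper function secure_key is hypothetical.

```shell
# secure_key PRIVATE_KEY: make the private key readable and writable by its
# owner only, and the matching public key (PRIVATE_KEY.pub) world-readable.
secure_key() {
    chmod 600 "$1"
    if [ -e "$1.pub" ]; then
        chmod 644 "$1.pub"
    fi
}

# Usage with the default key name from the text:
# secure_key "$HOME/.ssh/id_rsa"
```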

Connection through the sshgate

This works similarly to the login to Mogon through the hpcgate. You do not need the sshgate when you connect directly to the hpcgate, or when you are inside the university network (physically or through a VPN).

The option -J automatically uses the sshgate as a jump host (or jump proxy). You can use two different usernames for the two logins, and you can also use different ssh-keys (not shown in the example below); by default ssh assumes your local username ($USER) and the default ssh-key (id_rsa).

ssh -J UNIVERSITYUSERNAME@sshgate.zdv.uni-mainz.de USERNAME@TARGET

For other operating systems or programs other than ssh, please look for an option called jump proxy or jump host. If you want to use scp, rsync or similar, you need to use the corresponding option, or the option that sets the ssh command, and pass the command and options above.
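For scp and rsync with OpenSSH, this could look like the sketch below. The user and host names are placeholders, and the file names results.root and plots/ are hypothetical. The commands are only built and printed here so that they can be inspected before being run.

```shell
# Placeholders; replace them with your own values.
GATE="UNIVERSITYUSERNAME@sshgate.zdv.uni-mainz.de"
TARGET="USERNAME@TARGET"

# scp: pass the jump host as an ssh option.
scp_cmd="scp -o ProxyJump=$GATE results.root $TARGET:"

# rsync: set the ssh command (including -J) with -e.
rsync_cmd="rsync -av -e 'ssh -J $GATE' plots/ $TARGET:plots/"

# Print the commands for inspection; run one with e.g.: eval "$scp_cmd"
echo "$scp_cmd"
echo "$rsync_cmd"
```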

For older ssh versions you can try the options on this webpage WikiBooks ssh.
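One commonly used alternative for OpenSSH versions that predate the -J option (which arrived in OpenSSH 7.3) is ProxyCommand together with -W. This is a sketch with placeholder names and has not been checked against the linked page.

```shell
# Placeholders; replace them with your own values.
GATE="UNIVERSITYUSERNAME@sshgate.zdv.uni-mainz.de"
TARGET="USERNAME@TARGET"

# -W %h:%p asks the gate to forward stdin/stdout to the target host and port.
old_ssh_cmd="ssh -o ProxyCommand='ssh -W %h:%p $GATE' $TARGET"

# Print the command for inspection; run it with: eval "$old_ssh_cmd"
echo "$old_ssh_cmd"
```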

For more convenience use your ssh-config (see below).

Setting up the SSH-Configuration

In short, the ssh-config allows you to use an alias when connecting with ssh, and it can set all the options for a certain connection.

Host sshgate
    HostName sshgate.zdv.uni-mainz.de
    User <username>
    IdentityFile ~/Path/To/Private/Key1

Host myhost
    HostName myhost.physik.uni-mainz.de
    User <username>
    ProxyJump sshgate
    ForwardX11 yes
    IdentityFile ~/Path/To/Private/Key2

In this example, two aliases, sshgate and myhost, are defined, and myhost refers to sshgate through ProxyJump. Each alias sets a number of options, such as the username and the path to the ssh-key, so that you only need to type ssh myhost to reach myhost.physik.uni-mainz.de with your correct username(s) through the sshgate.
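To check which settings ssh would apply to an alias without opening a connection, the -G option of OpenSSH prints the resolved configuration. A minimal sketch; the helper function show_alias is hypothetical, and it reads ~/.ssh/config unless another file is given.

```shell
# show_alias ALIAS [CONFIGFILE]: print the host name, user, jump host and key
# files that ssh would use for ALIAS, without connecting.
show_alias() {
    alias_name="$1"
    config="${2:-$HOME/.ssh/config}"
    ssh -G -F "$config" "$alias_name" |
        grep -E '^(hostname|user|proxyjump|identityfile) '
}

# Usage:
# show_alias myhost
```

With the configuration above, the output should contain the lines "hostname myhost.physik.uni-mainz.de" and "proxyjump sshgate".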

ATLAS on MOGON II

In the following, MOGON is used as a synonym for MOGON II.

All machines on MOGON II run a version of CentOS 8, so that software can be run in a similar way to LXPLUS.

For general usage of ATLAS software have a look at common ATLAS Wiki pages like here. See below for some specific topics on setting up software.

The ZDV hosts a wiki about MOGON with some useful hints: https://mogonwiki.zdv.uni-mainz.de

Please consider joining the HPC Mattermost group as well: https://mattermost.gitlab.rlp.net/hpc-support

Limitations

Generally, the MOGON login and worker nodes have limited access to the internet. The worker nodes have only the necessary network connections, so access to computers outside the MOGON network may behave differently on worker nodes than on login nodes.

Login nodes have limited access to the university network which includes

  • gitlab.rlp.net
  • all wetap/etap machines

In addition some common websites have been enabled by the HPC team (this is subject to changes by the HPC team)

  • common websites for python pip installations
  • gitlab.cern.ch for gitlab access via https

Since ssh connections to university computers (and from university computers via lpcgate) are not limited, you can use ssh to “bridge” connections. Please note that you must understand what you are doing, and connections should only be opened when needed. Any misuse can lead to a ban from using MOGON or to stricter limitations on ssh for everyone.

Storage

Your starting directory is your MOGON home directory, which is different from your university home directory. You should store your code here, along with other files that you do not change often.

There is a so-called project/scratch space available at /gpfs/fs7/atlas/, where you can create your own directory and store your output files. The size is 125TB for all Mainz ATLAS/ETAP users.

A large fraction of the storage is realized as a grid storage with the name MAINZ_LOCALGROUPDISK. Writing to this storage should be done via grid tools (see below). You can directly read the files from the storage in the directory /lustre/miifs02/storm/atlas/atlaslocalgroupdisk/rucio/ which then follows the rules of local storage of grid files. How to find your particular file is also explained below.

For your jobs on MOGON you should make use of a temporary directory $TMPDIR that is specifically created for the job. It will be on a very fast storage system and it will be deleted at the end of the job. Before the end of the job you can copy your output files to the project/scratch space.
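The last step of such a job could be sketched as follows. The helper function stage_out, the program name my_analysis and the per-user directory under /gpfs/fs7/atlas/ are assumptions made for illustration.

```shell
# stage_out SRC_DIR DEST_DIR: copy everything from the job's temporary
# directory to a destination directory, creating the destination if needed.
stage_out() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" && cp -r "$src"/. "$dest"/
}

# Inside a job script (directory and program names are hypothetical):
# cd "$TMPDIR"
# ./my_analysis            # writes its output files into $TMPDIR
# stage_out "$TMPDIR" "/gpfs/fs7/atlas/$USER/my_job_output"
```

Copying once at the end of the job keeps the frequent small writes on the fast temporary storage.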

File transfer from/to MOGON with ssh or samba

gitlab

Please read the section above about using sshgate and ssh-keys.

Software: CvmFS

CvmFS is installed on the login nodes as well as on the worker nodes.

Setup

Put this in your .bashrc:

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'

Then, you can enable the ATLAS environment. Due to the CentOS 8 operating system, this currently runs within a Singularity container and can be called with:

setupATLAS -c centos7 -b

If you want to run a script inside the container, you have to specify the script before running setupATLAS:

export ALRB_CONT_RUNPAYLOAD="<YourScript>"

In order to mount gpfs and lustre, create a file ~/alrb_container.cfg.sh containing:

if [ -d /lustre ]; then
    set_ALRB_ContainerEnv ALRB_CONT_CMDOPTS "-B /lustre:/lustre"
fi

# set HOME to be the same as that of regular home
set_ALRB_ContainerEnv ALRB_CONT_POSTSETUP "export HOME='$HOME'"

Then follow the common instructions for setting up ATLAS software via CvmFS, e.g. lsetup or showVersions.

MOGON II Gridsite

General remarks

  • You should only store data on MOGON that is related to your work on MOGON. The fileserver is not intended as a backup system.

  • We want to reserve miifs02 for the grid site. All your personal (MOGON related) data should be stored on /gpfs/fs7 (more details below).

  • Data you want to archive and do not need to access on a regular basis can be stored in the MOGON archive using iRODS

Introduction and HowTo: Click Here

Rucio

Request samples

Use rucio to store samples on our grid site (using https://rucio-ui.cern.ch/r2d2/request) instead of downloading them to a local folder. This way users can share datasets, and data that is no longer needed is removed after the lifetime you define in the request. All rucio operations require a grid certificate. Due to the privacy policy on Mogon, you have to create the grid proxy on a machine with CVMFS that is connected to the internet (e.g. lxplus). There you call:

setupATLAS
lsetup rucio
voms-proxy-init -voms atlas --valid 48:00 --out gridproxy.x509

Move gridproxy.x509 to Mogon and call:

setupATLAS #add suitable options here
export X509_USER_PROXY=/home/$(whoami)/gridproxy.x509
lsetup rucio
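After copying the proxy file, it can be worth checking that it arrived and that only you can access it, since grid tools typically reject proxy files that are accessible by group or others. A minimal sketch; the helper function check_proxy is hypothetical.

```shell
# check_proxy FILE: succeed only if FILE exists and neither group nor others
# have any access rights to it.
check_proxy() {
    proxy="$1"
    if [ ! -f "$proxy" ]; then
        echo "proxy file not found: $proxy" >&2
        return 1
    fi
    # Characters 5-10 of the ls -l mode string are the group and other bits.
    perms=$(ls -l "$proxy" | cut -c5-10)
    if [ "$perms" != "------" ]; then
        echo "proxy file is accessible by group or others: $proxy" >&2
        return 1
    fi
}

# Usage with the path from the commands above:
# check_proxy "/home/$(whoami)/gridproxy.x509" || chmod 600 "/home/$(whoami)/gridproxy.x509"
```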

Once you have stored a DID on the grid site, you can find the corresponding files using:

rucio list-file-replicas --rse MAINZ_LOCALGROUPDISK --link /atlas/:/lustre/miifs02/storm/atlas/ DID

Upload samples

Instead of copying the results of your analysis to the scratch space, you can store them on our grid site using rucio upload:

rucio upload --rse MAINZ_LOCALGROUPDISK --register-after-upload --lifetime 15552000 --name NAME FILE
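The lifetime is given in seconds; the value 15552000 used above corresponds to 180 days. A small sketch for computing other lifetimes (the helper function days_to_seconds is hypothetical):

```shell
# days_to_seconds DAYS: convert a lifetime in days to seconds.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

days_to_seconds 180   # prints 15552000
```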

Alternatively, you can perform the same for a group of files, e.g.:

rucio upload --rse MAINZ_LOCALGROUPDISK --register-after-upload user.dta:Embedding_DAODs folder/files_in_folder.*.root

--register-after-upload registers the file in rucio only after a successful upload, which is especially important when uploading large datasets. Just adjust the username (dta in this case) and the files to create a group. These files can be found via:

rucio list-dataset-replicas user.dta:Embedding_DAODs

Blacklisting of sites

Blacklisting is no longer necessary. ATLAS has implemented a distance matrix and a multi-hop scheme that should take care of the issues with Australia-ATLAS and TRIUMF-LCG2.

Transfer to FZK

If your datasets are only at one of these sites, please request a replica (DaTRI user request in PANDA web interface) to Karlsruhe FZK-LCG2_SCRATCHDISK. When the replica is complete the exclusion should work.

Cancellation of data transfers

First, you have to identify the dataset’s name. Go to Panda and fill in the “Data Pattern” field with the name of the dataset (e.g., user.tlin*). Set “Request status” to “transfer” and click the “list” button to get all your datasets that are currently being transferred.

Second, click on the name of the dataset whose transfer you would like to stop; this leads you to a page with details on the transfer. Check the “Status” and change it to “Stop”. The transfer should now be stopped. You can check the status again as described in the first step; it should now be “stopped”.

Monitoring

Some links to check the status of Mainz:

Grid

EGI

Atlas

In case of problems

Footnotes