Atlas
ETAP login through sshgate
The instructions here are mostly for the desktop Linux PCs that have been installed by Andreas Nussbaumer (or with the boot server by Andreas) and that have been modified in such a way that central administration is not possible (or is rendered ineffective). If you have your own ssh server, please be sure to follow the same guidelines.
This setup is currently in a test phase. Please try to use it as much as you can.
Background
The university is trying to reduce the number of open ports to the outside world. One of the open ports is port 22 for SSH. Although SSH is considered to have few vulnerabilities when set up correctly and kept up to date, SSH servers are frequent targets of brute-force attacks that can either find weak passwords or reduce availability for users.
The ETAP Linux desktops (and some servers) that are set up by Andreas Nussbaumer usually block IP addresses (not users) for hours if the password has been entered incorrectly 3 times. You can also use ssh-keys (see the section about ssh-keys) for added security (ideally protected by a password as well, see the remarks on using ssh-keys). The university LDAP user authentication blocks users (not IP addresses) when many failed log-in attempts have been made in a short amount of time.
Both measures can be supported by allowing ssh connections only from the university network and allowing outside connections only through the sshgate.
Please read more about the sshgate in the ZDV documentation.
Creating ssh-keys
On Linux you can create ssh-keys by using ssh-keygen -b 4096. The option -b 4096 ensures large keys that are considered safe at the moment (2020, see Wikipedia on RSA).
Please note that, if you already have a key and keep the default file name, the key files id_rsa and id_rsa.pub will be overwritten once you confirm the prompt from ssh-keygen.
You can also generate more than one key and use different keys for different servers.
More information can also be found here .
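As a minimal sketch of creating an additional key without touching an existing one: the helper name new_ssh_key and the file name id_rsa_sshgate are made up for illustration, while -t, -b and -f are standard ssh-keygen options.

```shell
# new_ssh_key FILE: create a 4096-bit RSA key pair at FILE and FILE.pub.
# Hypothetical helper; it refuses to run if FILE already exists, so an
# existing key is never overwritten. ssh-keygen itself asks for the password.
new_ssh_key() {
    keyfile="$1"
    if [ -e "$keyfile" ] || [ -e "$keyfile.pub" ]; then
        echo "refusing to overwrite existing key: $keyfile" >&2
        return 1
    fi
    ssh-keygen -t rsa -b 4096 -f "$keyfile"
}

# Example call (the file name is an assumption):
#   new_ssh_key "$HOME/.ssh/id_rsa_sshgate"
```

A key stored under a non-default name can then be selected with ssh -i or with IdentityFile in the ssh-config.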
ssh-keys on sshgate
Please deploy your ssh-key in your account (link to account page) by following the instructions: ZDV - SSH Gate.
Most important is that the comment you enter when adding your key to your account must contain the string SSHGATE, HPCGATE or LINUXGATE (multiple purposes can be separated by a comma, e.g. HPCGATE,SSHGATE).
This also means that people who have already deployed a key for Mogon do not need to register a new key.
Remarks on using ssh-keys
Always protect your ssh-key with a password: if someone steals your ssh-key files (private key), the attacker still needs the password to decrypt the key. The password should follow common password guidelines (a strong password, not reused elsewhere, etc.).
Your ssh-key consists of two files/parts, the public key (id_rsa.pub) and the private key (id_rsa). The public key is free to be distributed and you can give it to other (even untrusted) people, websites or computers. The private key must be protected from other people (ssh even refuses to use a private key file that is accessible to anyone other than its owner).
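The access rights mentioned above can be set as in the following sketch. The helper name secure_key is hypothetical; the modes are the ones commonly used for ssh key files.

```shell
# secure_key KEYFILE: restrict a private key to its owner and keep the
# public key readable. Hypothetical helper name.
secure_key() {
    key="$1"
    chmod 700 "$(dirname "$key")"   # key directory: owner only
    chmod 600 "$key"                # private key: owner read/write only
    if [ -e "$key.pub" ]; then
        chmod 644 "$key.pub"        # public key may be readable by everyone
    fi
}

# Example call for the default key:
#   secure_key "$HOME/.ssh/id_rsa"
# To add or change the password of an existing key:
#   ssh-keygen -p -f "$HOME/.ssh/id_rsa"
```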
On the ETAP computers the ssh-key can be used for login. However, your home directory needs to be mounted first, and this only works with your password. So the first time you log into a PC, you might need to enter your password, and only after that is your ssh-key accepted.
Connection through the sshgate
This works similarly to the login to Mogon through the hpcgate. You do not need to use the sshgate when you connect directly to the hpcgate or when you are inside the university network (physically or through a VPN).
The option -J automatically uses the sshgate as a jump host (or jump proxy). You can have two different usernames for the login and you can also use different ssh-keys (not shown in the example below); by default, ssh assumes your local username ($USER) and the default ssh-key (id_rsa).
ssh -J UNIVERSITYUSERNAME@sshgate.zdv.uni-mainz.de USERNAME@TARGET
For other operating systems or programs other than ssh, please look for an option called jump proxy or jump host. If you want to use scp, rsync or similar, you need to use the equivalent option, or the option that sets the ssh command, together with the command and options above.
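For scp and rsync this could look like the following sketch. It assumes a reasonably recent OpenSSH (ProxyJump and -J exist since version 7.3); the file and directory names are examples, and the user and host names are the same placeholders as in the ssh command.

```shell
# Placeholders as in the ssh -J example; replace them with your own values.
GATE="UNIVERSITYUSERNAME@sshgate.zdv.uni-mainz.de"
TARGET="USERNAME@TARGET"

# scp: pass the jump host as an ssh option (file name is an example).
SCP_CMD="scp -o ProxyJump=$GATE myfile.txt $TARGET:"

# rsync: set the ssh command with -e (directory name is an example).
RSYNC_CMD="rsync -av -e 'ssh -J $GATE' mydir/ $TARGET:mydir/"

# Print the assembled commands so they can be checked before running them.
echo "$SCP_CMD"
echo "$RSYNC_CMD"
```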
For older ssh versions you can try the options described on the WikiBooks ssh page.
For more convenience use your ssh-config (see below).
Setting up the SSH-Configuration
In short, the ssh-config allows you to use an alias when connecting with ssh, and it can set all the options for a certain connection.
In this example, two aliases, sshgate and myhost, are created and then referenced. For both aliases a number of options, such as the username and the path to the ssh key, are defined, so that you only need to use ssh myhost to access myhost.physik.uni-mainz.de with your correct username(s) through the sshgate.
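A configuration matching this description might look like the sketch below. The option names (Host, HostName, User, IdentityFile, ProxyJump) are standard ssh-config keywords; the usernames are placeholders, and the location of the example file is an assumption chosen so that an existing ~/.ssh/config is not modified.

```shell
# Write the example to a separate file first (path is an assumption), so the
# real ~/.ssh/config is only changed once the entries have been checked.
CONFIG_EXAMPLE="${TMPDIR:-/tmp}/ssh_config_example"

cat > "$CONFIG_EXAMPLE" <<'EOF'
# Alias for the gate itself
Host sshgate
    HostName sshgate.zdv.uni-mainz.de
    User UNIVERSITYUSERNAME
    IdentityFile ~/.ssh/id_rsa

# Alias for the target PC; ProxyJump references the sshgate alias above
Host myhost
    HostName myhost.physik.uni-mainz.de
    User USERNAME
    IdentityFile ~/.ssh/id_rsa
    ProxyJump sshgate
EOF

# Try it out without touching ~/.ssh/config:
#   ssh -F "$CONFIG_EXAMPLE" myhost
```

Once the entries work, the two Host blocks can be copied into ~/.ssh/config.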
ATLAS on MOGON II
In the following MOGON is used as a synonym for MOGON II.
All machines on MOGON II are running a version of CentOS 8, so that software can be run much like on LXPLUS.
For general usage of ATLAS software have a look at the common ATLAS Wiki pages. See below for some specific topics on setting up software.
The ZDV hosts a wiki about MOGON with some useful hints: https://docs.hpc.uni-mainz.de
Please consider joining the HPC Mattermost group as well: https://mattermost.gitlab.rlp.net/hpc-support
Limitations
Generally, the MOGON login and worker nodes have limited access to the internet. The worker nodes have only the necessary network connections, so behaviour regarding access to computers outside of the MOGON network might differ from that on the login nodes.
Login nodes have limited access to the university network which includes
- gitlab.rlp.net
- all wetap/etap machines
In addition, some common websites have been enabled by the HPC team (this is subject to changes by the HPC team):
- common websites for python pip installations
- gitlab.cern.ch for gitlab access via https
Since ssh connections to university computers (and from university computers via hpcgate) are not limited, you can use ssh to “bridge” connections. Please note that you must understand what you are doing, and connections should only be opened when needed. Any misuse can lead to a ban from using MOGON or to stricter limitations on ssh for everyone.
Storage
Your starting directory is your home directory which is different from your university home directory. You should store your code here and other files that you do not change often.
There is a so-called project/scratch space available at /gpfs/fs7/atlas/ where you can create your own directory and store your output files. The size is 125TB for all Mainz ATLAS/ETAP users.
A large fraction of the storage is realized as a grid storage with the name MAINZ_LOCALGROUPDISK. Writing to this storage should be done via grid tools (see below). You can directly read the files from the storage in the directory /lustre/miifs02/storm/atlas/atlaslocalgroupdisk/rucio/ which then follows the rules of local storage of grid files. How to find your particular file is also explained below.
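To illustrate what such storage rules can look like, the sketch below computes a file path under the assumption that the site uses Rucio's default deterministic layout (two directory levels taken from the md5 sum of "scope:name"). Whether MAINZ_LOCALGROUPDISK is configured this way is an assumption, and the helper name, scope and file name are made up; the rucio lookup described below remains the authoritative way to find a file.

```shell
# rucio_local_path SCOPE NAME: print the expected path of a file below the
# storage prefix, ASSUMING Rucio's default deterministic ("hash") layout:
# <scope>/<md5[0:2]>/<md5[2:4]>/<name>, where the md5 sum is taken over
# "scope:name" and dots in user/group scopes become slashes.
rucio_local_path() {
    scope="$1"; name="$2"
    prefix="/lustre/miifs02/storm/atlas/atlaslocalgroupdisk/rucio"
    hash=$(printf '%s' "$scope:$name" | md5sum | cut -c1-4)
    case "$scope" in
        user*|group*) dir=$(printf '%s' "$scope" | tr '.' '/') ;;
        *)            dir="$scope" ;;
    esac
    first=$(printf '%s' "$hash" | cut -c1-2)
    second=$(printf '%s' "$hash" | cut -c3-4)
    printf '%s/%s/%s/%s/%s\n' "$prefix" "$dir" "$first" "$second" "$name"
}

# Example with a made-up scope and file name:
#   rucio_local_path user.jdoe user.jdoe.example.root
```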
For your jobs on MOGON you should make use of a temporary directory $TMPDIR that is specifically created for the job. It will be on a very fast storage system and it will be deleted at the end of the job. Before the end of the job you can copy your output files to the project/scratch space.
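The last step of a job could be sketched as follows. The helper name stage_out, the program name and the directory layout below /gpfs/fs7/atlas/ are assumptions for illustration only.

```shell
# stage_out SRC DEST: copy everything in SRC to DEST, creating DEST if needed.
# Hypothetical helper for the last step of a job script.
stage_out() {
    src="$1"; dest="$2"
    mkdir -p "$dest" && cp -r "$src"/. "$dest"/
}

# Possible use inside a job (program and directory names are assumptions):
#   mkdir -p "$TMPDIR/out"
#   ./my_analysis --output "$TMPDIR/out/result.root"
#   stage_out "$TMPDIR/out" "/gpfs/fs7/atlas/$USER/myjob"
```

Writing into a subdirectory of $TMPDIR keeps the copy step limited to the output files.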
File transfer from/to MOGON with ssh or samba
gitlab
ssh/ssh-keys related information
Please read the section above about using sshgate and ssh-keys.
Software: CvmFS
CvmFS is installed on the login nodes as well as on the worker nodes.
Setup
Put this in your .bashrc:
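A minimal sketch of such .bashrc lines, assuming the standard ATLASLocalRootBase location on CvmFS as used in the general ATLAS setup instructions (an assumption, not taken from this page):

```shell
# Assumed standard ATLASLocalRootBase setup (not copied from this page).
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'
```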
Then, you can enable the ATLAS environment. Due to the CentOS 8 operating system, this currently runs within a Singularity container and can be called with:
If you want to run a script inside the container, you have to specify the script before running setupAtlas:
In order to mount gpfs and lustre, create a file ~/alrb_container.cfg.sh containing:
Follow the common instructions on setting up ATLAS software via CvmFS, e.g. lsetup or showVersions.
MOGON II Gridsite
General remarks
You should only store data on MOGON that is related to your work on MOGON. The fileserver is not intended as a backup system.
We want to reserve miifs02 for the grid site. All your personal (MOGON related) data should be stored on /gpfs/fs7 (more details below).
Data you want to archive and do not need to access on a regular basis can be stored in the MOGON archive using iRODS.
Introduction and HowTo: Click Here
Rucio
Request samples
Use rucio to store samples on our grid site (using https://rucio-ui.cern.ch/r2d2/request ) instead of downloading them to a local folder. This way users can share datasets, and data that is no longer needed is removed after the lifetime you define there. For all rucio operations, a grid certificate is needed. Due to the privacy policy on Mogon, you have to create the grid proxy (gridproxy.x509) on a machine that has CVMFS and is connected to the internet (e.g. lxplus). There you call:
Move gridproxy.x509 to Mogon and call:
Once you stored a DID on the grid site you can find the corresponding files using:
Upload samples
Instead of copying the results of your analysis to the scratch space, you can store them on our grid site using rucio upload:
Alternatively, you can perform the same for a group of files, e.g.:
The option --register-after-upload registers the file in rucio only after a successful upload, which is especially important when uploading large datasets. Just adjust the username (dta in this case) and the files to create a group. These files can be found via:
Blacklisting of sites
Blacklisting is no longer necessary! ATLAS implemented a distance matrix and a multi-hop scheme that should take care of the issues with Australia-ATLAS and TRIUMF-LCG2.
Transfer to FZK
If your datasets are only at one of these sites, please request a replica (DaTRI user request in PANDA web interface) to Karlsruhe FZK-LCG2_SCRATCHDISK. When the replica is complete the exclusion should work.
Cancellation of data transfers
First, you have to identify the dataset’s name. Go to Panda and fill in the “Data Pattern” with the name of the dataset (e.g., user.tlin*). Choose “transfer” as the “Request status” and click the button “list” to get all your datasets that are currently being transferred.
Second, click on the name of the dataset whose transfer you would like to stop; this will lead you to a page with details on the transfer. Check the “Status” and change it to “Stop”. The transfer should now be stopped. You can check the status again as described in the first step. It should have the status “stopped”.
Monitoring
Some links to check the status of mainz.
Grid
EGI
- Nagios
- Central Operations Portal - Master Instance
- Availabilities & Reliabilities, Last 30 days
- MyEGI Service Availability Monitoring Portal
- All Tickets
- Wiki , SAM Instances , IGTF , Benchmark values , NGI_DE regional tools , GOCDB , NGI_DE Operations Center , Decommissioning
- Accounting Portal , APEL Publication Test , APEL Synchronisation Test
Atlas
- ADC Monitoring
- Panda Monitor BigPanda: https://bigpanda.cern.ch/
- RSE (Rucio SE)
- Blacklisted Sites
- AGIS
- Status of LGD #sub
- GridKa Cloud
- Atlas Installation System 2
- Dataset Accounting
In case of problems
- DaTri, grid certificates, grid writes: hn-atlas-dist-analysis-help (Distributed Analysis Help)