2.2 [SSH access, keys and things](#22-ssh-access-keys-and-things)
2.3 [Login nodes](#23-login-nodes)
2.4 [Set up Cisco Duo](Cisco-Duo)
3. [Data transfer](#3-data-transfer)
4. [Filesystem](#4-filesystem)
4.1 [Backing up your data](#41-backing-up-your-data)
5. [Submitting jobs](#5-submitting-jobs)
6. [module - selecting your software environment](#6-module---selecting-your-software-environment)
7. [Getting help](#7-getting-help)
<!-- 8. [FAQ (JARDS, Backup, ...)](faq) -->
## 1. RAMSES - Main specifications:
(**R**esearch **A**ccelerator for **M**odeling and **S**imulation with **E**nhanced **S**ecurity)
<!-- Image of ramses specs -->
<!--  -->
- 164 compute nodes, 10 Kubernetes nodes
- 348 CPUs = 31576 Cores
- Accelerators
...
...
@@ -37,7 +40,7 @@
- high-speed interconnect
- HDR100 InfiniBand
## Get access
## 2. Access to RAMSES
To gain access to Ramses, you need to fulfill three requirements (in any order):
...
...
@@ -51,104 +54,110 @@ Apply for a **user account**:
New users can apply for a trial account with limited core/GPU hours without a project description. Applications for a full account need a project description to be reviewed. For projects of up to 15 million core hours, a technical review (reasonable usage of resources) is sufficient. Beyond that, a scientific review (importance of research) will be necessary.
### 2FA
### 2.1 Multi-factor-authentication
For security reasons, you can't log in with a combination of username and password. Instead, we use a system called **2-Factor-Authentication** (2FA/MFA), meaning you need to prove your identity with two different (as in different systems/locations) 'factors':
For security reasons, you can't log in with a combination of username and password. Instead, we use a system called **Multi-Factor-Authentication** (MFA, sometimes 2FA for two-factor), meaning you need to prove your identity with two different (as in different systems/locations) 'factors':
- The first factor is an SSH public key. Please send your SSH _public_ key to the HPC team. [General information on public key authentication](https://www.ssh.com/academy/ssh/public-key-authentication)
- As the second factor we use Cisco Duo. To use it, you will need to enroll your account, see [cisco-duo-setup.pdf](uploads/cd518a29f4362a9383c7345a975ed065/cisco-duo-setup.pdf).
- The first factor is given by the SSH public key. Please send your SSH _public_ key to the HPC team. [General information on public key authentication](https://www.ssh.com/academy/ssh/public-key-authentication)
- The second factor involves the Cisco Duo app. To use it, you will need to enroll your account, see [cisco-duo-setup.pdf](uploads/cd518a29f4362a9383c7345a975ed065/cisco-duo-setup.pdf).
If you own a [Yubikey](https://en.wikipedia.org/wiki/YubiKey) hardware token, you can also use it (in OTP mode) as the second authentication factor instead of Cisco Duo. If you are interested in using a Yubikey, please contact the [HPC-Team](mailto:hpc-mgr@uni-koeln.de). Please note: we can't provide Yubikeys to users, but it could be a worthwhile investment for about 50€.
After you have successfully enrolled in Duo and prepared your SSH key pair, please send us your _public_ key.
### Generate SSH keys
Here is a quick intro to ssh keys: There is always a private (as in **private - don't share, don't give away**) and a public key in a key pair. The public key (\*.pub) is put into the file `~/.ssh/authorized_keys` on Ramses. When you have the matching private key, this makes the login authentication work. Do not give away the private key and protect it with a passphrase. The keypairs are usually stored in a hidden directory (folder) named .ssh (same on Linux/Mac/WIN).
cat ~/.ssh/id_ed25519.pub # copy-paste and send to hpc-mgr@uni-koeln.de
# Optional: avoid retyping the passphrase at every ssh login:
eval "$(ssh-agent -s)" # set ssh-agent's environment variables
ssh-add ~/.ssh/id_ed25519 # provide private-key identity to agent
ssh-add -l # list managed identities (should show at least one entry)
# Done, or keep reading below for more details
```
There is always a private (as in **private - don't share, don't give away**) and a public key in an SSH key pair. As with physical keys, one does not want to share private keys or leave copies thereof in other locations/computers. Instead, create new SSH key pairs on each host you use regularly. Let's outline a **3-step procedure** to get you "keyed-in".
You are asked for a file location, just press ENTER
The type of key to be generated is specified with the -t option, where we recommend the type "ed25519" for enhanced security. You can then confirm the default file location by hitting ENTER.
```
Enter file in which to save the key (/home/<USERNAME>/.ssh/id_ed25519):
```
Enter file in which to save the key (/home/<username>/.ssh/id_ed25519):
which will produce the private-and-public key pair `~/.ssh/id_ed25519` and `~/.ssh/id_ed25519.pub`.
Outdated SSH versions may not allow for the key type "ed25519". In this case, use
```
ssh-keygen -t rsa -b 4096 -C "<YOUR NAME>"
```
which will produce `~/.ssh/id_rsa` and `~/.ssh/id_rsa.pub`. Below, we keep assuming type "ed25519".
Next you **have to** enter a passphrase (you could use this [passphrase generator](https://www.tu-braunschweig.de/it-sicherheit/pwsec/pwgen)):
Next you **have to** enter a passphrase. A weak passphrase undermines the protection of your private key, so you may want to use this [passphrase generator](https://www.tu-braunschweig.de/it-sicherheit/pwsec/pwgen) to pick a secure one.
```
Enter passphrase (empty for no passphrase):
```
**Don't** leave it empty!! As usual, store your password in a secure place or use a password-manager ([e.g. KeePass](https://keepass.info/download.html)).
**We stress the importance of a non-empty and secure** passphrase! As usual, store the passphrase in a secure place or use a password-manager ([e.g. KeePass](https://keepass.info/download.html)). In Step 3 (2.2.3) below, we outline how to dodge typing that passphrase multiple times.
```
Enter same passphrase again:
Your identification has been saved in /home/<username>/.ssh/id_ed25519
Your public key has been saved in /home/<username>/.ssh/id_ed25519.pub
Your identification has been saved in /home/<USERNAME>/.ssh/id_ed25519
Your public key has been saved in /home/<USERNAME>/.ssh/id_ed25519.pub
The key fingerprint is:
SHA256:mDIS+q3blablaBLABLAjePkEMEoR4sAIVumEoJiCXDNVs Your Name
You will now see the key pair `~/.ssh/id_ed25519` and `~/.ssh/id_ed25519.pub`.
You can ignore the rest of the output. The keypair is stored under \~/.ssh/id_ed25519(.pub). You can now send us the **public** key (id_ed25519\*\*.pub\*\*), either as a file or just copy/paste the key itself:
#### 2.2.2 Step 2: Transfer the public key
You can now send us the **public** key (`id_ed25519.pub`), either as a file or just copy-paste the public-key file content:
```
cat ~/.ssh/id_ed25519.pub
```
**Please send the public key to: hpc-mgr@uni-koeln.de**
On Ramses, being the remote machine, the public key (\*.pub) will be a line-item in `~/.ssh/authorized_keys`. Together with your matching local private key, it enables ssh-authentication without password-request.
If the `ssh` program on your computer is outdated, it will not know the key type ed25519. In this case use
#### 2.2.3 Step 3 (optional): Wanna pass on the passphrase? - Call your agent!
`ssh-agent` is a program which can automatically do your authentication when logging in to a remote machine via ssh. Thus, to avoid providing the passphrase during multiple subsequent sessions, you can load the private key into memory using the ssh-agent.
First, check if your agent is at home by typing
```
ssh-keygen -t rsa -b 4096 -C "Your Name"
ssh-add -l # with `-l` as in "list"
```
and send us the file `~/.ssh/id_rsa.pub` instead.
To avoid having to enter the passphrase every time you log in, you can load the key into memory using the ssh-agent.
On most Linux and Mac systems, the agent is pre-installed, you can check with the command `ssh-add -l`. This should not return an error, but usually `This agent has no identities`. Otherwise you can start the ssh-agent:
This will list all identities currently represented by the agent. Think of an "identity" as an SSH key that you want to add to the SSH authentication agent. In case you see a message like this:
`Could not open a connection to your authentication agent.`
you need to activate the agent first:
```
ssh-agent # start the ssh-agent
eval "$(ssh-agent -s)"
```
Then you add the public key you just created:
When the agent is active, you may see a listing of identities, and you're good to go if your Ramses-key is part of it. However, if you see a message like `This agent has no identities.`, your key still needs to be added:
```
ssh-add [ path to your key file, ~/.ssh/id_rsa or id_ed25519 ]
ssh-add ~/.ssh/id_ed25519 # supply your private-key, may also be ~/.ssh/id_rsa
```
You can usually just run `ssh-add` since `ssh-add` can find the files on its own. `ssh-add` asks for the password you set in the `ssh-keygen` step and afterwards `ssh-add -l` should list your key like this:
Alternatively, you can just run `ssh-add`, which adds all keys found under their default file names. In either case, `ssh-add` will ask for the passphrase (as set with `ssh-keygen`) once more. Afterwards, `ssh-add -l` will produce a listing similar to
You can now use it within your session without having to re-enter the passphrase during subsequent logins.
You can now use it within your session without having to re-enter your SSH Key password.
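The check-then-add steps above can be combined into one small helper. This is only a sketch: the function name `ensure_key_loaded` is made up for illustration, and it relies on the exit status of `ssh-add -l` (0: identities listed, 1: agent has no identities, 2: no agent reachable).

```shell
# Sketch: make sure an agent is running and holds a key.
# ensure_key_loaded is a hypothetical helper name, not an OpenSSH tool.
ensure_key_loaded() {
    key="${1:-$HOME/.ssh/id_ed25519}"   # private key; default as generated above

    status=0
    ssh-add -l >/dev/null 2>&1 || status=$?

    if [ "$status" -eq 2 ]; then
        # No agent reachable: start one for this shell session.
        eval "$(ssh-agent -s)" >/dev/null
        status=1                        # a fresh agent holds no identities
    fi

    if [ "$status" -eq 1 ]; then
        # Agent is running but empty: add the key (asks for the passphrase).
        ssh-add "$key" || return 1
    fi

    ssh-add -l                          # list what the agent holds now
}
```

Calling `ensure_key_loaded` without arguments uses `~/.ssh/id_ed25519`; pass `~/.ssh/id_rsa` if you generated an RSA key. Note that the helper does not add anything if the agent already holds some other identity, so check the final listing for your Ramses key.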
If you have to use a Windows System: [Key-based authentication in OpenSSH for Windows](https://learn.microsoft.com/en-gb/windows-server/administration/openssh/openssh_keymanagement)
#### 2.2.4 Additional SSH information
- For the Windows System: [Key-based authentication in OpenSSH for Windows](https://learn.microsoft.com/en-gb/windows-server/administration/openssh/openssh_keymanagement)
If you already have access to Ramses but you are using the CHEOPS key, we advise you to create your own SSH key on your local machine/laptop and then add the public key to your `.ssh/authorized_keys` file in your home on Ramses. Any text editor will work for this.
- For CHEOPS users: If you already have access to Ramses but you are using the CHEOPS key, we advise you to create your own SSH key on your local machine/laptop and then add the public key to your `.ssh/authorized_keys` file in your home on Ramses. Any text editor will work for this.
**PLEASE NOTE**: Do not share SSH Keys with other people and do not copy private keys to other computers. Just create new SSH Key pairs on each computer you use regularly. You can also use [SSH Agent Forwarding](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/using-ssh-agent-forwarding), where an SSH Key is taken along into an SSH session to a remote computer, eliminating the need to create many keys.
- How to avoid creating multiple key pairs for multiple remote machines: [SSH Agent Forwarding](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/using-ssh-agent-forwarding). This feature allows an SSH Key to be taken along into another SSH session. Essentially, it lets you authenticate to other servers through an initial SSH connection.
### LOGIN
### 2.3 Login nodes
Ramses currently has two login servers available:
`ramses1.itcc.uni-koeln.de`
`ramses4.itcc.uni-koeln.de`
Ramses has four login servers: ramses1.itcc.uni-koeln.de to ramses4.itcc.uni-koeln.de . Do not use ramses2 or ramses3, they are for internal use only for now.
When logging in to ramses1, the public key you sent us is authenticated with the private key on your computer (1st factor, you will be asked for the ssh passphrase). If successful, a verification request is automatically pushed to the Duo App on your device where you need to confirm the login (2nd factor).
#### Login to `ramses1`
When logging in to **ramses1**, the public key you sent us is authenticated with your host's private key (1st factor, you will be asked for the ssh passphrase, unless you have ssh-agent configured as described above). If successful, a verification request is automatically pushed to the Duo-app on your device where you need to confirm the login (2nd factor).
On your terminal you should see something like this:
...
...
@@ -164,8 +173,8 @@ rpabel2@ramses1:~>
Even though the message `Autopushing...` appears twice, only one push is executed and only one verification is needed.
On ramses4, you can choose different Cisco Duo authenticators, if you have configured any:
#### Login to `ramses4`
On **ramses4**, you can choose different Cisco Duo authenticators, if you have configured any:
```
rpabel2@soliton:~> ssh ramses4
Duo two-factor login for rpabel2
...
...
@@ -181,10 +190,14 @@ Enter a passcode or select one of the following options:
In this example, if you choose '1', an authentication request is pushed to your phone and you just have to confirm it with a tap on the screen. Alternatively, instead of choosing a number in the above example, you could also open the Duo Mobile app on your device and enter the 6-digit passcode shown in the app. This code changes every 30 seconds.
**PLEASE NOTE**: Be careful with scripted logins: Any login attempt with your SSH key that triggers Duo Autopush is counted by Duo. If you don't respond in your App, your account will be blocked after 10 attempts (and has to be unlocked by an admin).
**IMPORTANT NOTE**: Be careful with scripted logins: Any login attempt with your SSH key that triggers Duo Autopush is counted by Duo. If you don't respond in your Duo-app, your account will be blocked after 10 attempts. Only an admin can then unlock it.
#### Facilitating your login
To facilitate login, we suggest setting an alias in your shell initialization file (`~/.bashrc`, `~/.zshrc` or similar):
Make adjustments according to your preferences. A note on `ServerAliveInterval` (in seconds): by default it is 0, meaning your SSH client sends no keep-alive messages, which may cause automatic logouts after a relatively short idle time. Overriding this behaviour, as done above, can also be done globally on your host by editing the file `/etc/ssh/ssh_config` (requires sudo). Then, it would simply be a line entry like `ServerAliveInterval 1000`.
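As a per-user alternative to the global file, the same option can live in a host entry of your own `~/.ssh/config`. The following is a minimal sketch, not an official recommendation: the helper name `add_ramses_host`, the host alias `ramses` and the user name `jdoe` are placeholders chosen for illustration.

```shell
# Sketch: append a "ramses" host entry to an SSH client config file.
# add_ramses_host is a hypothetical helper; alias and user are placeholders.
add_ramses_host() {
    config="$1"     # path to the SSH client config, e.g. ~/.ssh/config
    user="$2"       # your Ramses user name

    mkdir -p "$(dirname "$config")"
    touch "$config"

    # Do nothing if an entry for the alias already exists.
    if grep -q '^Host ramses$' "$config"; then
        return 0
    fi

    cat >> "$config" <<EOF

Host ramses
    HostName ramses1.itcc.uni-koeln.de
    User $user
    IdentityFile ~/.ssh/id_ed25519
    ServerAliveInterval 1000
EOF
    chmod 600 "$config"
}
```

After running `add_ramses_host ~/.ssh/config jdoe` (with your own user name), `ssh ramses` should be enough to start a login. Running the helper twice leaves the file unchanged.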
## Data transfer
## 3. Data transfer
To transfer your data to the cluster, we recommend using [scp](https://tldr.inbrowser.app/pages/common/scp) (**s**ecure **c**o**p**y) - either on the command line (CLI/Terminal) or with a graphical client (e.g. WinSCP).
...
...
@@ -209,9 +222,9 @@ Please note: you can transfer data ONLY to the login nodes (ramses1 ... ramses4)
- if you prefer interactive transfer with a shiny GUI: e.g. [FileZilla (Linux/Mac/Win)](https://filezilla-project.org/), [WinSCP (Win only)](https://winscp.net/eng/download.php), [Cyberduck (Mac only)](https://cyberduck.io/download/)
## Filesystem
## 4. Filesystem
The filesystem setup is exactly as on CHEOPS:
Ramses replicates the filesystem setup of CHEOPS, specifically:
- /home/\<user\>
- size per user 100GB / 100.000 files
...
...
@@ -226,10 +239,11 @@ The filesystem setup is exactly as on CHEOPS:
- created on request
- NO AUTOMATIC BACKUP
### 4.1 Backing up your data
*we are working on filling in information about data backups on Ramses*👷
## Submitting jobs
## 5. Submitting jobs
There are several partitions/queues in slurm intended for general usage:
...
...
@@ -277,30 +291,26 @@ Each user has a default group account in slurm which corresponds to his workgrou
sacctmgr show assoc -n user=$USER format=Account
```
## Backup your data
\[coming soon\]
## 6. `module` - selecting your software environment
## Environment Modules
`module` is a user interface to the Modules package, which provides for the dynamic modification of the user's environment via module-files. This helps avoid software conflicts due to incompatibilities, versioning, dependencies, etc. Further, module-files allow for concurrent usage of different software versions, for example, when cross-checking executable-output of new compiler versions.
To avoid software conflicts (resulting from incompatibilities, versioning, dependencies, etc.), software is provided as Environment Modules. By using Modules, it is possible to have different versions of software installed on the system.
Your shell-initialization script can select frequently used modules. For example, an entry in your `~/.bashrc` file might look like this:
`module load lang/Julia/1.9.3-linux-x86_64 lang/Python/3.11.5-GCCcore-13.2.0 # load My Favorite Things`
You can select the module(s) you need directly on the command line or in your scripts.
Basic `module` commands are:
| Syntax | Description |
| ----------- | ----------- |
| `module avail` | list all available modules |
| `module avail <module-name>` | search for specific module name (name can be truncated) |
| `module load <category>/<module>[/version]` | load a specific module |
| `module unload <module-name>` | unload a specific module |
| `module purge` | unload all modules |
| `module search <string>` | searches the module-whatis information for the specified string, useful for exploring available tools without knowing their module-name |
Basic commands are:
```
- module avail : list **available** modules
- module avail <string> : search for specific module name
For more details on the environment module system, see the [Software](/rpabel2/itcc-hpc-ramses/-/wikis/Software#Module-System) section.
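In scripts, it can help to start from a clean environment before loading what you need, so that results do not depend on whatever was loaded earlier in the session. The following is a small sketch under that assumption; `load_clean` is a hypothetical helper name, and the module name in the usage note is only an example.

```shell
# Sketch: start from a clean environment, then load the requested modules.
# load_clean is a hypothetical helper name, not part of the Modules package.
load_clean() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found (not on a cluster node?)" >&2
        return 1
    fi
    module purge          # unload all modules
    module load "$@"      # load the modules given as arguments
    module list           # show what is loaded now
}
```

For example, `load_clean lang/Python/3.11.5-GCCcore-13.2.0` would leave only that module (and its dependencies) loaded.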
## Getting help
## 7. Getting help
#### HPC support
...
...
@@ -318,7 +328,7 @@ Please have a look at the "Get access" section above for information on how to o
If you send a **support request**, please provide all relevant information to describe your case. In particular, **error messages** are crucial for analysis and should be provided with the request. Depending on your application, error messages are usually printed to the standard error (`stderr`) and/or the standard output (`stdout`) stream so that you will either see them passing by on the screen or find them in a corresponding file. In addition, accompanying information is often helpful to track down errors. For instance, if the batch system fails to run a job, you should provide the job identifier (`<jobid>`) with your report. If building an application fails, you should provide name and version of the compiler and the libraries used.
The HPC team handles hundreds of support requests per year. In order to ensure efficient and timely resolution of issues, please include in your request as much as possible of the following information:
In order to ensure efficient and timely resolution of issues, please include in your request as much as possible of the following information: