SSH
Connections and Agents
Connecting to other computers sounds like useful thing to do: whether you want to get data from another computer, copy data to another location or just want to send a couple of computations to a cluster, it all involves accessing computers via a network. SSH (secure shell) is the most popular protocol to do so.
This post is a quick introduction into SSH and a couple of tricks when using it.
We assume that you are working on a UNIX (Mac or Linux) machine.
This post does not cover Windows.
SSH on Windows is not that much fun. You will have to use programs like PuTTY
and Pageant
to achieve a similar result.
TL;DR
- SSH (secure shell) is a protocol to connect to other machines over the networks
- SSH keys are a more secure version to connect to clusters and help you to avoid typing your password hundreds of times a day. They consist of two parts: a private and a public part. As the name suggests, the public part goes to the remote machine and lets you connect without typing your password.
- SSH agents store your decrypted private key in memory and you can connect to remote machines without typing your password (after setting up the SSH keys)
Connecting to a remote machine
Most scientific compute clusters are not directly connected to the internet. They are either available via the local network, or via a dedicated gateway server. The cluster itself is usually divided into login nodes and the compute nodes. The goal of this post is to get you to the login node. The inner workings of the cluster will be covered in another post.
As an example, we are going to consider the xmaris
cluster at the physics department of Leiden university.
If you want to connect to a cluster, you always connect to the login node.
This node manages the connections, runs tests or compiles your code.
It is not meant to do any heavy lifting.
The computing nodes are doing that.
However, usually don’t connect directly to the computing nodes.
The workload manager will distribute the payload across the nodes.
For security reasons, it is not possible to connect to the cluster’s login node.
The institute’s network is separated from the internet via a firewall.
In order to connect to the network, you have to connect via a gateway machine, in the case of xmaris
it is called styx
.
If you are connecting from inside of the institute’s network, you could directly connect to the login node.
That is not possible from outside the network, i.e. from your home.
Luckily, there are clever ways to avoid the double connection by using a jump server (see below).
If you have access, the cluster of the theory group can be accessed via
ssh <cluster_user>@xmaris.lorentz.leidenuniv.nl
where <cluster_user>
is your username on the cluster.
If your username on the cluster is the same as your local username, you can use the shorter command
ssh xmaris.lorentz.leidenuniv.nl
Additionally, you have to be in the institute network (\emph{eduroam is not enough}).
If not, you first have to log into styx
with
ssh <cluster_user>@styx.lorentz.leidenuniv.nl
and then log into xmaris
ssh xmaris
or activate your VPN.
For further details, have a look here. In the next part, we will set up a jump server via SSH to automate the process.
Proxy Jumps with the SSH config
Naturally, typing your password twice and connecting by hand via another machine is kind of annoying. So, is there a better way to do this. Yes, there is!
The ssh
command on Unix (Mac and Linux) systems is controlled by a local configuration file located under ~/.ssh/config
.
It gives us a lot of useful tools to customize the ssh
command.
As a first thing, we can define some alias names for the servers.
Nobody really wants to remember URLs.
If file ~/.ssh/config
does not exist yet, you can just create it.
It is just a textfile and enter the following text:
Host styx
HostName styx.lorentz.leidenuniv.nl
User <username>
Host xmaris
HostName xmaris.lorentz.leidenuniv.nl
User <username>
ProxyJump <username>@styx.lorentz.leidenuniv.nl
A quick note: You need to substitute <username>
with the your username on the cluster (not your local username on the computer that you are using; the two might coincide).
Now, we can just type ssh styx
and the ssh
command will automatically use the correct username and URL to connect to.
However, this still does not solve the problem of typing the password multiple times.
We would like to make the gateway (styx
) transparent, i.e. we don’t notice that we connect via styx
to xmaris
.
Luckily, the config
comes to our rescue.
We cannot only define aliases for long server names and but also tell SSH how to connect to a certain server.
In our case, we are interested in connecting to xmaris
via a jump server because the login node of the xmaris
cluster is not visible to the internet.
By connecting to the gateway styx
first, we enter the Lorentz Institute network and can connect to the xmaris
server in a second step.
However, we really couldn’t care less about the gateway server.
It serves only as a crutch to avoid a VPN.
Luckily, SSH has an option for that: ProxyJump
.
In short, add the command ProxyJump <username>@styx.lorentz.leidenuniv.nl
to the entry of xmaris
in your ssh
config file.
The configuration for the cluster could look like:
Host styx
HostName styx.lorentz.leidenuniv.nl
User <username>
ForwardX11 no
Host xmaris
HostName xmaris.lorentz.leidenuniv.nl
User <username>
ProxyJump <username>@styx.lorentz.leidenuniv.nl
ForwardX11 no
Let’s note a couple of things:
The line ForwardX11 no
avoids the forwarding of an X server connection from the cluster to the local computer.
The X server is responsible for creating a graphical user interface, but who needs a GUI on a cluster anyway?!
So far, so good. But we still have to type our password multiple times. Isn’t there still a better way to do this?
SSH keypairs
The answer to less passwords are SSH keys with an SSH agent. Let’s start with SSH keys.
If you haven’t set up a key pair yet, you will have to type your password twice; once to enter the gateway and then to enter the login node. You can avoid typing passwords at all if you set up a keypairs for the connections (and store the private key in an agent, more about that later).
Create a key pair
The commandssh-keygen
a new key pair. It consists of a private and a public key. As the name suggests, the public key will be public and the private key should not be distributed. Pictorially, you can think of the public key as a lock. Everybody can have your lock, you don’t care too much. The private key is the key (see graphics above). You really don’t want to loose your key.
The commandssh-keygen
will ask for two things: a location and a password for the private key. The default location for keys is~/.ssh/
(the folder where also the SSH config lives). Naming keys in a reasonable manner is good practice. It helps you to later recognize which key belongs to which server or service. One could be namedgitlab
, another onexmaris
, etc. For the matter of this example, let’s name the keyxmaris-key
. Please do set a password for your private keys. In case your computer should get compromised, you don’t want that people can impersonate you on all clusters that you are working on!Distribute the public key Now, we have to tell the server about the public key, i.e. we have to send the pictorial lock to the server. The easiest way to do that is the command
ssh-copy-id
. In the example ofstyx
, this could bessh-copy-id -i ~/.ssh/xmaris-key.pub styx
and thenssh-copy-id -i ~/.ssh/xmaris-key.pub xmaris
. The suffix .pub is important here. You want to distribute your public key not the private one. Now, the public part of thexmaris-key
key is stored on both,styx
andxmaris
.Log in with the private key Finally, you should be able to log into the server with
ssh -i ~/.ssh/xmaris-key xmaris
. The command will not ask for your password anymore, but for the password of the private key. There is of course a more convenient way to tell SSH which key pair to use. You can add it to thestyx
gateway and toxmaris
by adding the lineIdentityFile <path to file>
to your configuration.
Host styx
HostName styx.lorentz.leidenuniv.nl
User <username>
ForwardX11 no
IdentityFile ~/.ssh/xmaris-key
Host xmaris
HostName xmaris.lorentz.leidenuniv.nl
User <username>
ProxyJump <username>@styx.lorentz.leidenuniv.nl
ForwardX11 no
IdentityFile ~/.ssh/xmaris-key
For further insights about SSH public key authentication, I recommend this beauty.
SSH Agents
Although we are using SSH keys now, we still have to type the password of the private key. As a final step, we will get rid of that password and make working on remote machines completely seamless.
The main ingredient for this step is the SSH agent. It is a possibility to store decrypted private keys for a certain amount of time. The default time varies from 24h to ’next shutdown'.
In order to check if you have an SSH agent running on your system by default, please execute ssh-add -l
.
If you see an error message, then you don’t have an agent running and should execute eval $(ssh-agent)
.
It will start an agent that we can use to store the private keys.
To add keys to the agent, we use the command line program ssh-add
.
With ssh-add <path to private key>
, you can add a private key to the agent.
It will store a decrypted version and make it to SSH upon request.
After adding xmaris-key
with ssh-add ~/.ssh/xmaris-key
, you should be able to connect to xmaris
directly with an automatic proxy jump via styx
by just typing:
ssh xmaris
Finally, there is a cool feature of SSH keys which is forwarding:
The ForwardAgent yes/no
in the config enables or disables the forwarding of a local SSH agent to the cluster.
This choice here is entirely personal, yours might differ.
Your local agent can be sent to the cluster via the SSH connection and makes private keys available on the cluster without storing them there.
Personally, I like to have the agent on xmaris to connect to gitlab/github.
This way, I don’t have to store my private keys on the cluster permanently.
They are transferred as part of the agent and I can use them on the cluster as if they were stored there.
For more information on SSH agents, have a look here.
The final SSH configuration could look like this:
Host styx
HostName styx.lorentz.leidenuniv.nl
User <username>
IdentityFile ~/.ssh/xmaris-key
ForwardX11 no
ForwardAgent yes
Host xmaris
HostName xmaris.lorentz.leidenuniv.nl
User <username>
ProxyJump <username>@styx.lorentz.leidenuniv.nl
IdentityFile ~/.ssh/xmaris-key
ForwardX11 no
ForwardAgent yes
Although the commands in this post are given for a special cluster (xmaris
with the styx
gateway), very similar commands should work with other clusters as well.
If in doubt, contact your local cluster administrator and ask for advice.
Commands used in this post
ssh-keygen
: Generate a private-public key pairssh-copy-id <path_to_public_key> <server>
: Copy the public key to a servereval $(ssh-agent)
: Start an SSH agentssh-add <path to private key>
: Store a new private key in the agentssh-add -l
: List all private keys stored in the agentssh-add -D
: Delete all private keys from the agent