SSH Certificates
So remember back in chapter 32 when I set up Step-CA as our internal certificate authority? Step-CA also handle SSH certificates, which allows a less peer-to-peer model for authenticating between nodes. I'd actually tried to set these up before and it was an enormous pain in the pass and didn't really work well, so when I saw Step-CA included it in its featureset, I was excited.
It's very easy to allow authorized_keys
to grow without bound, and I'm fairly sure very few people actually read these messages:
The authenticity of host 'wtf.node.goldentooth.net (192.168.10.51)' can't be established.
ED25519 key fingerprint is SHA256:8xKJ5Fw6K+YFGxqR5EWsM4w3t5Y7MzO1p3G9kPvXHDo.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])?
So I wanted something that would allow seamless interconnection between the nodes while maintaining good security.
SSH certificates solve both of these problems elegantly. Instead of managing individual keys, you have a certificate authority that signs certificates. For user authentication, the SSH server trusts the CA's public key. For host authentication, your SSH client trusts the CA's public key.
It's basically the same model as TLS certificates, but for SSH. And since we already have Step-CA running, why not use it?
The Implementation
I created an Ansible role called goldentooth.setup_ssh_certificates
to handle all of this. Let me walk through what it does.
Setting Up the CA Trust
First, we need to grab the SSH CA public keys from our Step-CA server. There are actually two different keys - one for signing user certificates and one for signing host certificates:
- name: 'Get SSH User CA public key'
ansible.builtin.slurp:
src: "{{ step_ca.ca.etc_path }}/certs/ssh_user_ca_key.pub"
register: 'ssh_user_ca_key_b64'
delegate_to: "{{ step_ca.server }}"
run_once: true
become: true
- name: 'Get SSH Host CA public key'
ansible.builtin.slurp:
src: "{{ step_ca.ca.etc_path }}/certs/ssh_host_ca_key.pub"
register: 'ssh_host_ca_key_b64'
delegate_to: "{{ step_ca.server }}"
run_once: true
become: true
Then we configure sshd to trust certificates signed by our User CA:
- name: 'Configure sshd to trust User CA'
ansible.builtin.lineinfile:
path: '/etc/ssh/sshd_config'
regexp: '^#?TrustedUserCAKeys'
line: 'TrustedUserCAKeys /etc/ssh/ssh_user_ca.pub'
state: 'present'
validate: '/usr/sbin/sshd -t -f %s'
notify: 'reload sshd'
Host Certificates
For host certificates, we generate a certificate for each node that includes multiple principals (names the certificate is valid for):
- name: 'Generate SSH host certificate'
ansible.builtin.shell:
cmd: |
step ssh certificate \
--host \
--sign \
--force \
--no-password \
--insecure \
--provisioner="{{ step_ca.default_provisioner.name }}" \
--provisioner-password-file="{{ step_ca.default_provisioner.password_path }}" \
--principal="{{ ansible_hostname }}" \
--principal="{{ ansible_hostname }}.{{ cluster.node_domain }}" \
--principal="{{ ansible_hostname }}.{{ cluster.domain }}" \
--principal="{{ ansible_default_ipv4.address }}" \
--ca-url="https://{{ hostvars[step_ca.server].ipv4_address }}:9443" \
--root="{{ step_ca.root_cert_path }}" \
--not-after=24h \
{{ ansible_hostname }} \
/etc/step/certs/ssh_host.key.pub
Automatic Certificate Renewal
Notice the --not-after=24h
? Yeah, these certificates expire daily. Which means it's very important that the automatic renewal works 😀
Enter systemd timers:
[Unit]
Description=Timer for SSH host certificate renewal
Documentation=https://smallstep.com/docs/step-cli/reference/ssh/certificate
[Timer]
OnBootSec=5min
OnUnitActiveSec=15min
RandomizedDelaySec=5min
[Install]
WantedBy=timers.target
This runs every 15 minutes (with some randomization to avoid thundering herd problems). The service itself checks if the certificate needs renewal before actually doing anything:
# Check if certificate needs renewal
ExecCondition=/usr/bin/step certificate needs-renewal /etc/step/certs/ssh_host.key-cert.pub
User Certificates
For user certificates, I set up both root and my regular user account. The process is similar - generate a certificate with appropriate principals:
- name: 'Generate root user SSH certificate'
ansible.builtin.shell:
cmd: |
step ssh certificate \
--sign \
--force \
--no-password \
--insecure \
--provisioner="{{ step_ca.default_provisioner.name }}" \
--provisioner-password-file="{{ step_ca.default_provisioner.password_path }}" \
--principal="root" \
--principal="{{ ansible_hostname }}-root" \
--ca-url="https://{{ hostvars[step_ca.server].ipv4_address }}:9443" \
--root="{{ step_ca.root_cert_path }}" \
--not-after=24h \
root@{{ ansible_hostname }} \
/etc/step/certs/root_ssh_key.pub
Then configure SSH to actually use the certificate:
- name: 'Configure root SSH to use certificate'
ansible.builtin.blockinfile:
path: '/root/.ssh/config'
create: true
owner: 'root'
group: 'root'
mode: '0600'
block: |
Host *
CertificateFile /etc/step/certs/root_ssh_key-cert.pub
IdentityFile /etc/step/certs/root_ssh_key
marker: '# {mark} ANSIBLE MANAGED BLOCK - SSH CERTIFICATE'
The Trust Configuration
For the client side, we need to tell SSH to trust host certificates signed by our CA:
- name: 'Configure SSH client to trust Host CA'
ansible.builtin.lineinfile:
path: '/etc/ssh/ssh_known_hosts'
line: "@cert-authority * {{ ssh_host_ca_key }}"
create: true
owner: 'root'
group: 'root'
mode: '0644'
And since we're all friends here in the cluster, I disabled strict host key checking for cluster nodes:
- name: 'Disable StrictHostKeyChecking for cluster nodes'
ansible.builtin.blockinfile:
path: '/etc/ssh/ssh_config'
block: |
Host *.{{ cluster.node_domain }} *.{{ cluster.domain }}
StrictHostKeyChecking no
UserKnownHostsFile /dev/null
marker: '# {mark} ANSIBLE MANAGED BLOCK - CLUSTER SSH CONFIG'
Is this less secure? Technically yes. Do I care? Not really. These are all nodes in my internal cluster that I control. The certificates provide the actual authentication.
The Results
After running the playbook, I can now SSH between any nodes in the cluster without passwords or key management:
root@bramble-ca:~# ssh bramble-01
Welcome to Ubuntu 24.04.1 LTS (GNU/Linux 6.8.0-1017-raspi aarch64)
...
Last login: Sat Jul 19 00:15:23 2025 from 192.168.10.50
root@bramble-01:~#
No host key verification prompts. No password prompts. Just instant access.
And the best part? I can verify that certificates are being used:
root@bramble-01:~# ssh-keygen -L -f /etc/step/certs/ssh_host.key-cert.pub
/etc/step/certs/ssh_host.key-cert.pub:
Type: ssh-ed25519-cert-v01@openssh.com host certificate
Public key: ED25519-CERT SHA256:M5PQn6zVH7xJL+OFQzH4yVwR5EHrF2xQPm9QR5xKXBc
Signing CA: ED25519 SHA256:gNPpOqPsZW6YZDmhWQWqJ4l+L8E5Xgg8FQyAAbPi7Ss (using ssh-ed25519)
Key ID: "bramble-01"
Serial: 8485811653946933657
Valid: from 2025-07-18T20:13:42 to 2025-07-19T20:14:42
Principals:
bramble-01
bramble-01.node.goldentooth.net
bramble-01.goldentooth.net
192.168.10.51
Critical Options: (none)
Extensions: (none)
Look at that! The certificate is valid for exactly 24 hours and includes all the names I might use to connect to this host.