TinySSH vs others

Comparison of Dropbear, OpenSSH and TinySSH
2016-01-26 sysadmin

This post follows Installing TinySSH.

Disclaimer: I’m not a crypto expert! Don’t take anything below for granted.

Security features

  • OpenSSH implements privilege separation; neither Dropbear nor TinySSH does. IMHO this is a big deal.
  • TinySSH implements less, which is its main appeal.
  • Dropbear has more features but seems to be trailing security-wise; its change log looks especially scary.
  • Dropbear is often used in conjunction with busybox due to its small memory and executable footprint.

The clear winner here is OpenSSH due to its maturity and privilege separation feature. Microsoft recently committed to supporting OpenSSH, albeit still in a fork.

Setup

  • Local build of TinySSH @ 0281a0ca15
  • x64 host:
    • Up to date Ubuntu 14.04 on Dual Xeon E5-2690 @ 2.90GHz
    • Ubuntu’s OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.4, OpenSSL 1.0.1f 6 Jan 2014
    • Ubuntu’s Dropbear v2013.60
    • To test Dropbear, used: sudo dropbear -p 10021 -E
    • To test TinySSH, used: tcpserver -HRDl0 0.0.0.0 10022 ./tinysshd -v ./keys
  • ARM host:
    • Up to date Raspbian Jessie on Raspberry Pi2 on ARM Cortex-A7 BCM2709
    • Raspbian’s OpenSSH_6.7p1 Raspbian-5+deb8u1, OpenSSL 1.0.1k 8 Jan 2015
    • Raspbian’s Dropbear v2014.65
    • To test Dropbear, used: sudo dropbear -p 10021 -E
    • To test TinySSH: installed via systemd as described in my previous post
  • A third x64 host running up to date Ubuntu 14.04 is used for connection latency and I/O performance testing.
  • All hosts are connected to the same 1Gbit network switch with wifi disabled.
  • ~/.ssh/authorized_keys is cleared to contain only the key (RSA or Ed25519) being tested.
  • ~/.ssh/config is set to:

    • Default: empty file
    • Strict: with the following content:

      Ciphers chacha20-poly1305@openssh.com
      HostKeyAlgorithms ssh-ed25519
      KexAlgorithms curve25519-sha256@libssh.org
      

Caveats

  • I’m not a cryptography expert.
  • Both OpenSSH and Dropbear were running quite old versions; a follow-up should be done with more recent versions.
  • TinySSH wasn’t built with NaCl, which is claimed to improve performance.
  • Dropbear has a flag “-i inet mode” which could be leveraged to have it start via systemd. This wasn’t tested.
  • Measurements focus on real (wall-clock) time, i.e. the latency of running the command, not on the CPU time used by the client.
  • I don’t have an OSX 10.11 host with a 1Gbit wired ethernet connection to test performance. On the other hand, I realized that to be able to connect from OSX 10.11 to the TinySSH server, chacha20-poly1305@openssh.com needs to be enabled manually. Add the following to your ~/.ssh/config to enable chacha20 and the usual aes ciphers. In particular, this disables the CBC modes that are in the default list:

    Ciphers chacha20-poly1305@openssh.com,aes128-ctr,aes256-ctr
    

Preferred cipher stream protocol

Measurements

Cipher stream protocol preference was detected using:

ssh -vvv $HOST "exit 0" |& grep "debug1: kex: server->client"
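
The grep above prints the whole negotiation line. As a minimal sketch, the cipher and MAC can be pulled out of it with a small filter. The function name kex_cipher is hypothetical, and the two line layouts it handles (fields directly after server->client on older clients, labelled cipher: and MAC: fields on newer ones) are assumptions based on commonly seen client output; check them against your own ssh -vvv output.

```shell
# kex_cipher: read ssh debug output on stdin and print "<cipher> <mac>"
# for the server->client direction.
# Assumed layouts (not guaranteed across OpenSSH versions):
#   debug1: kex: server->client aes128-ctr hmac-md5 none
#   debug1: kex: server->client cipher: NAME MAC: NAME compression: none
kex_cipher() {
    awk '/debug1: kex: server->client/ {
            sub(/.*server->client /, "")          # drop the prefix; fields are re-split
            if ($1 == "cipher:") print $2, $4     # labelled (newer) layout
            else                 print $1, $2     # positional (older) layout
         }'
}
```

Usage: ssh -vvv $HOST "exit 0" 2>&1 | kex_cipher (ssh writes its debug lines to stderr, hence the redirection; |& is the bash shorthand for it).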

Results

Server   | Host | .ssh/config |                 Cipher chosen
-------- | ---- | ----------- | -----------------------------
dropbear |  ARM |     Default | aes128-ctr with hmac-md5
dropbear |  ARM |      Strict | Failure to connect
dropbear |  x64 |     Default | aes128-ctr with hmac-md5
dropbear |  x64 |      Strict | Failure to connect
openssh  |  ARM |     Default | aes128-ctr with hmac-sha1-etm
openssh  |  ARM |      Strict | chacha20-poly1305@openssh.com
openssh  |  x64 |     Default | aes128-ctr with hmac-md5-etm
openssh  |  x64 |      Strict | chacha20-poly1305@openssh.com
tinyssh  |  ARM |     Default | chacha20-poly1305@openssh.com
tinyssh  |  ARM |      Strict | chacha20-poly1305@openssh.com
tinyssh  |  x64 |     Default | chacha20-poly1305@openssh.com
tinyssh  |  x64 |      Strict | chacha20-poly1305@openssh.com
  • I was a bit saddened by the preferred hmac algorithm on Ubuntu but this is fixable with a properly configured ~/.ssh/config or on the server in /etc/ssh/sshd_config.
  • Dropbear trails in its support for recent cryptographic algorithms: it supports neither chacha20 nor Ed25519.

Connection latency

Connection latency is important for use cases like a git server, where short-lived connections are made frequently. In practice, we’d aim for under 200ms on a local network with high-performance machines to reduce the perceptible overhead.

Measurements

Measurements are done by taking the median value of 5 repetitions via:

(for i in {0..4}; do time ssh $HOST "exit 0"; done) |& grep real | sort -n
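
The pipeline above prints the five sorted timings and the median is then read off by eye. A minimal sketch of automating that last step follows, assuming bash's default time output format (lines such as "real 0m0.287s"); the helper name median_real_ms is hypothetical.

```shell
# median_real_ms: read bash `time` lines ("real 0m0.287s") on stdin,
# convert each to whole milliseconds and print the median.
# With an even number of samples the lower of the two middle values is used.
median_real_ms() {
    awk '/^real/ {
            split($2, t, /[ms]/)      # "0m0.287s" -> t[1]="0" (min), t[2]="0.287" (s)
            printf "%d\n", (t[1] * 60 + t[2]) * 1000 + 0.5
         }' |
    sort -n |
    awk '{ v[NR] = $1 }
         END { if (NR) print v[int((NR + 1) / 2)] }'
}
```

Usage: (for i in {0..4}; do time ssh $HOST "exit 0"; done) |& grep real | median_real_ms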

Results

Server   | Host | .ssh/config |    Algo | Latency
-------- | ---- | ----------- | ------- | -------
dropbear |  ARM |     Default |     RSA |   287ms
dropbear |  x64 |     Default |     RSA |    95ms
openssh  |  ARM |     Default |     RSA |   206ms
openssh  |  ARM |     Default | Ed25519 |   241ms
openssh  |  ARM |      Strict |     RSA |   218ms
openssh  |  ARM |      Strict | Ed25519 |   252ms
openssh  |  x64 |     Default |     RSA |   310ms
openssh  |  x64 |     Default | Ed25519 |   304ms
openssh  |  x64 |      Strict |     RSA |   309ms
openssh  |  x64 |      Strict | Ed25519 |   334ms
tinyssh  |  ARM |     Default | Ed25519 |   192ms
tinyssh  |  ARM |      Strict | Ed25519 |   197ms
tinyssh  |  x64 |     Default | Ed25519 |   103ms
tinyssh  |  x64 |      Strict | Ed25519 |   132ms
  • OpenSSH: the handshake being slower on x64 than on ARM is probably due to a methodology error on my part; I can’t imagine this is normal.
  • OpenSSH: Ed25519 is slightly slower (about 15%) on ARM, but the difference is small on x64.
  • Dropbear and TinySSH: both seem to gain a lot of performance from not using privilege separation on x64, but the benefit is much less visible on ARM.
  • I saw from the logs that the client first tried “none” authentication, then RSA, then Ed25519, so Ed25519 is slightly disadvantaged above by 2 RTT. The network latency is <0.2ms on the test network so this shouldn’t have significantly impacted the measurements.
  • I’m not sure why Strict mode influences the latency so much compared to Default for TinySSH on x64. The difference was consistently reproduced but is not as visible on ARM.

The results clearly warrant a follow-up with further diagnostics.

I/O Performance

Measurements

The test reads from /dev/zero and sends it over stdout. Since the stream is not compressed and network not encumbered, this measures the cipher stream performance with minimal CPU overhead and no disk I/O.

Measurements are done by taking the median value of 5 repetitions via:

(for i in {0..4}; do time ssh $HOST "dd if=/dev/zero count=1024 bs=1048576" > /dev/null; done) |& grep real | sort -n

Results

The Raspberry Pi2 has a 100Mbit ethernet port connected over USB, so it’s bound to be significantly slower.

Server   | Host | .ssh/config |    Algo |    Time for 1GiB |      Speed
-------- | ---- | ----------- | ------- | ---------------- | ----------
dropbear |  ARM |     Default |     RSA |         155942ms |   6.6MiB/s
dropbear |  x64 |     Default |     RSA |          10240ms | 100.0MiB/s
openssh  |  ARM |     Default |     RSA |         120216ms |   8.5MiB/s
openssh  |  ARM |     Default | Ed25519 |         114913ms |   8.9MiB/s
openssh  |  ARM |      Strict | Ed25519 |         114913ms |   8.9MiB/s
openssh  |  x64 |     Default |     RSA |           9464ms | 108.2MiB/s
openssh  |  x64 |     Default | Ed25519 |           9450ms | 108.4MiB/s
openssh  |  x64 |      Strict | Ed25519 |           9568ms | 107.0MiB/s
tinyssh  |  ARM |     Default | Ed25519 |         124963ms |   8.2MiB/s
tinyssh  |  x64 |     Default | Ed25519 |           9319ms | 109.9MiB/s
tinyssh  |  x64 |      Strict | Ed25519 |           9515ms | 107.6MiB/s
  • Dropbear is noticeably slower than other implementations.
  • TinySSH is on par with OpenSSH yet has no assembly code.
  • On x64, pretty much all servers are close to saturating the network bandwidth.
  • On the Raspberry Pi2, this is probably saturating its USB<->Fast Ethernet controller.
  • The small variance seems to be closely related to the handshake latency.
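
The Speed column follows directly from the measured time: 1024 MiB divided by the elapsed seconds. A small sketch of that arithmetic, with hypothetical helper names, also converts the result to Mbit/s to show how close the x64 runs get to the 1Gbit link. It counts payload bytes only and ignores SSH and TCP/IP framing overhead.

```shell
# mib_per_s MS [MIB]: throughput in MiB/s for MIB mebibytes (default 1024,
# as sent by the dd command) transferred in MS milliseconds.
mib_per_s() {
    awk -v ms="$1" -v mib="${2:-1024}" 'BEGIN { printf "%.1f\n", mib * 1000 / ms }'
}

# mbit_per_s MS [MIB]: the same transfer expressed in Mbit/s (10^6 bits per second).
mbit_per_s() {
    awk -v ms="$1" -v mib="${2:-1024}" 'BEGIN { printf "%.0f\n", mib * 1048576 * 8 / ms / 1000 }'
}

mib_per_s 9319     # tinyssh, x64, Default: prints 109.9
mbit_per_s 9319    # prints 922, i.e. about 92% of a 1000 Mbit/s link
```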

Memory use

Measurement

This is in no way scientific: I ssh’ed in and looked at the ps output of the server processes for the connection.

Results

The memory values are for up to 3 processes involved: daemon / priv separation / connection

The values vary a lot from one host to another (e.g. between two Ubuntu 14.04 x64 hosts), so take these with a large grain of salt.

Server   | Host | .ssh/config |    Algo |     Virtual Memory Size | Resident Set Size*
-------- | ---- | ----------- | ------- | ----------------------- | ------------------
dropbear |  ARM |     Default |     RSA |  2456 /    N/A /   2908 | 1648 /  N/A / 1860
openssh  |  ARM |     Default | Ed25519 |  7808 /  11072 /  11072 | 4388 / 4932 / 3104
openssh  |  ARM |     Default |     RSA |  7808 /  11072 /  11072 | 4388 / 4956 / 3048
tinyssh  |  ARM |     Default | Ed25519 |   N/A /    N/A /   3176 |  N/A /  N/A / 2292
dropbear |  x64 |     Default |     RSA | 11032 /    N/A /  19464 | 1300 /  N/A / 2268
openssh  |  x64 |     Default | Ed25519 | 61376 / 109796 / 109796 | 1304 / 6588 / 3904
openssh  |  x64 |     Default |     RSA | 61376 / 109796 / 109796 | 1304 / 6556 / 4108
tinyssh**|  x64 |     Default | Ed25519 |  4244 /    N/A /  15720 | 1136 /  N/A / 2792
  • Resident Set Size*: this is bound to be variable, so it is not meant to be taken as a hard value.
  • tinyssh**: Included memory usage of tcpserver for completeness. It is not included in the ARM readings since systemd is running anyway.
  • Dropbear could have similar saving to TinySSH if it were running in inet mode and started with systemd. This wasn’t tested.
  • The memory saving is dominated by the fact that OpenSSH implements privilege separation while neither Dropbear nor TinySSH does. It’s trading security for memory use.
  • Discounting the privilege separation overhead (the largest RSS value for both sshd), TinySSH uses significantly less RAM than Dropbear when used together with systemd, but it’s less efficient otherwise:
    • ARM: 89% less Virtual Memory Size and 81% less Resident Set Size when comparing TinySSH to OpenSSH.
  • It is clear from the readings that a lot of work has been done in Dropbear to reduce its memory usage.
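
The ARM percentages above can be reproduced from the table, assuming they compare the single TinySSH process against the sum of all three OpenSSH processes in the Ed25519 row (that reading is an inference from the numbers, not something stated in the measurements). The helper name pct_less is hypothetical.

```shell
# pct_less SMALL BIG: how much smaller SMALL is than BIG, in percent.
pct_less() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (1 - a / b) * 100 }'
}

# Virtual Memory Size on ARM: TinySSH vs OpenSSH (daemon + priv separation + connection)
pct_less 3176 $((7808 + 11072 + 11072))    # prints 89.4
# Resident Set Size on ARM, same rows
pct_less 2292 $((4388 + 4932 + 3104))      # prints 81.6
```

Both results are in line with the roughly 89% and 81% quoted in the list.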

5MB on a <=1GB machine is a welcome saving, yet OpenSSH’s privilege separation security benefit is undeniable.

Conclusion

It’s definitely too early to use TinySSH in production, if only because privilege separation is not implemented and, as far as I know, nobody has done an external review of the code. For security products, “it works” is not enough. I can see a niche on embedded devices: it could be a good choice performance-wise, especially for Linux-based systems using systemd, where the gain in memory use is real and the I/O performance significant. It could also be a good fit for specific uses like boot-time remote LUKS decryption on desktops.

In particular, Dropbear doesn’t seem to use sound default values, so replacing Dropbear with TinySSH should be considered in the coming years. TinySSH is significantly faster than Dropbear in connection latency on ARM and in throughput on all platforms.

As I noted in my previous post, I’m concerned by TinySSH’s coding style (lack of braces around conditional blocks: a recipe for another goto fail) but I like the philosophy and the fact that less secure algorithms (like md5) are simply not implemented. The coding style could probably be fixed if the author wanted to.

There are definitely caveats in moving from RSA keys to Ed25519. OSX 10.11 supports Ed25519 just fine, yet Gnome keyring still doesn’t support Ed25519 keys; one has to use OpenSSH’s agent. An Ed25519 key will also not be usable on older servers, requiring users to have 2 keys, one RSA, one Ed25519. But at the same time, if you connect to an old server not supporting Ed25519, you should question its security!

Updates:

2016-01-29: Added note about OSX 10.11 client default settings which fail with tinyssh.