`ssh-keyscan` Is a Fast Way to Build `known_hosts` Files, Not a Substitute for Verification

A practical guide to `ssh-keyscan` for developers who need to collect SSH host keys for automation, CI, or deployment scripts without pretending raw collection is the same thing as trust.

Why this command matters: SSH automation breaks in two opposite ways. Some teams do everything manually forever. Others automate host trust so casually that they stop thinking about verification at all. ssh-keyscan sits right in the middle of that tension.

What ssh-keyscan does

The OpenSSH manual describes ssh-keyscan as a utility for gathering the public SSH host keys of a number of hosts. It was designed to help build and verify ssh_known_hosts files, and it provides a minimal interface suited to shell and Perl scripts.

That means the command is not primarily about logging in. It is about collecting host key material efficiently.

A basic example:

ssh-keyscan github.com

That prints one line per host key, each containing the host name, the key type, and the base64-encoded public key, which is the layout a known_hosts file expects.

Why this command is so useful in automation

Modern deployment flows often need non-interactive SSH:

  1. CI pulling private dependencies
  2. deploy scripts connecting to servers
  3. build machines cloning private repos
  4. orchestration jobs touching many hosts

In those environments, the classic interactive SSH trust prompt is not a usable workflow. ssh-keyscan helps you prebuild the trust file so automation does not stall waiting for a human.

The part people misuse

The same man page also contains the warning many teams mentally skip: if an ssh_known_hosts file is constructed using ssh-keyscan without verifying the keys, users become vulnerable to man-in-the-middle attacks.

That sentence matters more than most blog posts admit.

ssh-keyscan gathers keys. It does not prove the network path is honest. It does not prove you contacted the right machine. It does not replace out-of-band trust.

That is the core mental model:

  1. collection is not verification
  2. automation is not trust
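
To make that distinction concrete, here is a minimal sketch of a gate that accepts scan output only when every scanned key also appears in a file of keys you verified by other means. The function name verify_scanned_key and the trusted-keys file are hypothetical, and the sketch assumes both inputs use the plain "host keytype key" layout that ssh-keyscan prints.

```shell
#!/bin/sh
# Sketch only: verify_scanned_key and the trusted-keys file are hypothetical names.
# Usage: ssh-keyscan HOST | verify_scanned_key TRUSTED_FILE
# Succeeds only if at least one key was scanned and every scanned
# "keytype key" pair is also present in TRUSTED_FILE.
verify_scanned_key() {
    trusted_file=$1
    awk '
        # First input (the trusted file): remember each "keytype key" pair.
        NR == FNR { if ($0 !~ /^#/ && NF >= 3) trusted[$2 " " $3] = 1; next }
        # Second input (stdin): ignore comments and malformed lines.
        /^#/ || NF < 3 { next }
        # Count every scanned key and note any that is not trusted.
        { seen++; if (!(($2 " " $3) in trusted)) bad++ }
        # Fail closed: no keys scanned, or any unknown key, means failure.
        END { exit !(seen > 0 && bad == 0) }
    ' "$trusted_file" -
}
```

The comparison ignores the host field on purpose, so it still works when the scan was run with -H. What it cannot do is tell you where the trusted file came from; that part remains a human decision.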

A practical CI example

Many teams do something like this:

mkdir -p ~/.ssh
ssh-keyscan github.com >> ~/.ssh/known_hosts

That can be acceptable when the trust assumptions are already understood and documented. It becomes sloppy when people paste it into scripts as a magic incantation without knowing what risk they are accepting.

The right question is not “does the command work?” The right question is “why am I comfortable trusting this key source in this environment?”
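
One way to answer that question is to stop scanning at job time altogether: verify the key once, commit it to the repository, and have CI copy the pinned line into place. The sketch below assumes a hypothetical pinned file checked in next to the pipeline config; the function name is also illustrative.

```shell
#!/bin/sh
# Sketch only: install_pinned_known_hosts and the pinned file are hypothetical.
# Usage: install_pinned_known_hosts PINNED_FILE [DEST]
# Appends each line of PINNED_FILE to DEST unless it is already there,
# so repeated CI runs do not pile up duplicate entries.
install_pinned_known_hosts() {
    pinned_file=$1
    dest=${2:-"$HOME/.ssh/known_hosts"}
    mkdir -p "$(dirname "$dest")"
    touch "$dest"
    # The extra test keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines in the pinned file.
        [ -n "$line" ] || continue
        # -x: match the whole line, -F: treat it as a literal string.
        grep -qxF -- "$line" "$dest" || printf '%s\n' "$line" >> "$dest"
    done < "$pinned_file"
}
```

With this approach a host key rotation shows up as a reviewed change to the pinned file rather than as a silent difference in what the network returned that day.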

Useful flags worth knowing

The manual highlights a few options that matter in real workflows:

  1. -t to select key types such as rsa, ecdsa, or ed25519
  2. -p to connect to a non-default port
  3. -H to hash hostnames and addresses in output
  4. -T to set a connection timeout
  5. -f to read many hosts from a file

These are not edge-case flags. They are what make the tool practical in real deployment environments.

For example:

ssh-keyscan -T 10 -t ed25519 my-server.example.com

That requests only the Ed25519 host key and waits up to 10 seconds for the server (the default is 5), instead of collecting every key type that ssh-keyscan requests by default.
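
The flags also combine. Below is a minimal sketch of a wrapper that scans every host in a file on a chosen port and hashes the host names in the output. The function name, the KEYSCAN_BIN override, and the hosts file are assumptions made for illustration, not part of OpenSSH.

```shell
#!/bin/sh
# Sketch only: scan_host_list and KEYSCAN_BIN are hypothetical names.
# KEYSCAN_BIN lets you dry-run the wrapper (for example KEYSCAN_BIN=echo);
# it defaults to the real ssh-keyscan.
KEYSCAN_BIN=${KEYSCAN_BIN:-ssh-keyscan}

# Usage: scan_host_list HOSTS_FILE [PORT]
scan_host_list() {
    hosts_file=$1
    port=${2:-22}
    # -T 10       give up on a host after 10 seconds without a response
    # -t ed25519  request only Ed25519 host keys
    # -p PORT     connect to PORT instead of the default
    # -H          hash host names and addresses in the output
    # -f FILE     read the hosts to scan from FILE, one per line
    "$KEYSCAN_BIN" -T 10 -t ed25519 -p "$port" -H -f "$hosts_file"
}
```

Setting KEYSCAN_BIN=echo prints the command line that would run, which is a cheap way to review the flags before touching the network.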

Why the scaling story matters

The manual also notes that ssh-keyscan uses non-blocking socket I/O and can contact many hosts in parallel efficiently. That is exactly why the command remains valuable. If you need keys from dozens or hundreds of hosts, interactive SSH acceptance is not a strategy.

Speed is the point.

But speed without verification discipline is where teams create quiet risk.
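
Part of that discipline is noticing what a bulk scan did not return, since a host that fails to answer simply contributes no key line. The sketch below lists hosts from the input file that have no entry in the scan output. It assumes one host name per line in the hosts file, the default port, and unhashed output (no -H), because hashed names cannot be matched back. The function and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: missing_hosts and both file names are hypothetical.
# Usage: ssh-keyscan -f hosts.txt > scan.txt; missing_hosts hosts.txt scan.txt
# Prints every host listed in HOSTS_FILE that has no key line in SCAN_FILE.
missing_hosts() {
    hosts_file=$1
    scan_file=$2
    awk '
        # Scan output: record the host field of each key line.
        FILENAME == ARGV[1] { if ($0 !~ /^#/ && NF >= 3) got[$1] = 1; next }
        # Host list: skip comments and blank lines.
        /^#/ || NF == 0 { next }
        # Report hosts that never produced a key.
        !($1 in got) { print $1 }
    ' "$scan_file" "$hosts_file"
}
```

An empty result means every listed host returned at least one key. It says nothing about whether those keys are the right ones.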

The right way to think about it

Use ssh-keyscan as a collection tool inside a larger trust process. That process might include:

  1. compare keys against provider documentation
  2. validate fingerprints out of band
  3. store trusted keys in versioned infrastructure config
  4. alert when host keys change unexpectedly

That is real operational maturity. Blindly appending keys in every script is not.
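
The last item in that list, alerting on unexpected changes, can be approximated by comparing a fresh scan against the trusted file kept in version control. This is a minimal sketch with hypothetical function and file names; it assumes both files use unhashed host names and the plain "host keytype key" layout.

```shell
#!/bin/sh
# Sketch only: report_key_changes and both file names are hypothetical.
# Usage: report_key_changes TRUSTED_FILE FRESH_SCAN_FILE
# Prints "CHANGED host keytype" when a known host presents a different key
# and "NEW host keytype" when the scan contains an entry that is not trusted yet.
report_key_changes() {
    trusted_file=$1
    fresh_file=$2
    awk '
        # Trusted file: remember the key stored for each host and key type.
        FILENAME == ARGV[1] { if ($0 !~ /^#/ && NF >= 3) known[$1 " " $2] = $3; next }
        # Fresh scan: skip comments and malformed lines.
        /^#/ || NF < 3 { next }
        {
            id = $1 " " $2
            if (!(id in known)) print "NEW " id
            else if (known[id] != $3) print "CHANGED " id
        }
    ' "$trusted_file" "$fresh_file"
}
```

A scheduled job could run this and raise an alert whenever the output is non-empty. Whether a reported change is a planned rotation or something worse is still a question to settle out of band.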

Final recommendation

Use ssh-keyscan when you need fast, scriptable host key collection. Just keep the important boundary in your head: it is a data-gathering tool, not a magical trust oracle. Fast automation is useful. Verified automation is better.
