Clusters with long names violate the SNI spec #52508

Open
espadolini opened this issue Feb 26, 2025 · 0 comments
Since #2243, Teleport has used a Server Name Indication (SNI) value of the form <cluster name encoded in hex>.teleport.cluster.local for clients to communicate with the Auth Service over TLS; this was initially just a way to make authentication more efficient (in a very strange old world where trust between clusters was mutual) but it is now a core part of how multiplexing between different clusters works (for the sake of Teleport Cloud).

The problem with that encoding is that the specification for the SNI extension (RFC 6066 §3) implicitly requires that the specified hostname is a valid hostname according to the IDN spec (RFC 5890 §2.3.1), which means that each label (each component of the hostname, delimited by dots) must not exceed 63 bytes. Hex encoding doubles the length of the name, so a 31-byte cluster name becomes a 62-byte label while a 32-byte name becomes a 64-byte one. This means that a cluster name longer than 31 bytes will result in an invalid SNI hostname.

While the Go TLS library and OpenSSL don't seem to care much, other libraries such as OpenJDK's standard library and Rustls will refuse to use and/or validate against such names.

While the mechanism to load certificate authorities for validation based on the client SNI is largely vestigial at this point (and we'd definitely benefit from not unmarshaling the same data over and over again just to handle some connections), and the real fix is to ignore hostname validation (while still checking the server certificate against the known host CA, obviously), we still need some way to specify the intended destination of a TLS connection for the sake of multi-tenancy.

If we decided to stick with a reversible encoding we could improve the situation by using base32, pushing the cluster name size limit from 31 to 38 bytes (as 39 would require exactly 63 bytes, but we likely need one byte to identify the encoding). Alternatively (and potentially only for clusters with "oversized" names) we could use an encoded, fixed-length hash of the name as the first label of the SNI (this would require searching through known CAs when deciding how to handle a connection, but that's easy enough to make efficient).

@espadolini espadolini added the bug label Feb 26, 2025