Ticket #787 (new defect)

Opened 13 years ago

Last modified 12 years ago

GSSAPIKeyExchange causes delays when sshing to non-MIT servers

Reported by: andersk
Owned by:
Priority: normal
Milestone: The Distant Future
Component: --
Keywords:
Cc:
Fixed in version:
Upstream bug:

Description

From the zlogs a year ago (-c debathena -i gssapikeyechange, 2010-01-12), and rediscovered by various other people since then:

GSSAPIKeyExchange causes ~3 second delays when sshing to non-MIT servers:

…
debug1: Local version string SSH-2.0-OpenSSH_5.2p1 Debian-1ubuntu1
debug2: fd 3 setting O_NONBLOCK
[1 second pause]
debug1: Unspecified GSS failure.  Minor code may provide more information
Server krbtgt/MIT.EDU@ATHENA.MIT.EDU not found in Kerberos database

[1 second pause]
debug1: Unspecified GSS failure.  Minor code may provide more information
Server krbtgt/MIT.EDU@ATHENA.MIT.EDU not found in Kerberos database

[1 second pause]
debug1: Unspecified GSS failure.  Minor code may provide more information


debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
…

From a Wireshark dump, it appears to be requesting, and failing to obtain, tickets for the following principals from kerberos.mit.edu:

host/longitude.lan
krbtgt/LAN
krbtgt/EDU
krbtgt/MIT.EDU
host/longitude.lan
krbtgt/LAN
krbtgt/EDU
krbtgt/MIT.EDU
host/longitude.lan
krbtgt/LAN
krbtgt/EDU
krbtgt/MIT.EDU

This is a distinct issue from #315/#653, as this is about a server with no Debathena or Kerberos configuration of any kind.

Change History

comment:1 Changed 13 years ago by andersk

Some users are seeing 30 s delays instead of just 3 s. One way this happens is when the network drops the queries to the KDC.

comment:2 Changed 13 years ago by ghudson

The delay comes from a combinatorial number of DNS and KDC requests. ssh tries to get credentials via three different GSS mechanisms; each attempt results in four KDC requests; each KDC request results in eight DNS queries. So, 12 KDC queries (3 x 4) and 96 DNS queries (12 x 8).
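The fan-out described above can be sketched as a small calculation. This is only an illustrative model: the function name `request_counts` and the constant names are made up for this example, and the per-step figures are the ones quoted in this comment, not measurements.

```python
# Illustrative model of the request fan-out; names are hypothetical.
GSS_MECHANISMS = 3               # credential attempts made by ssh
KDC_REQUESTS_PER_ATTEMPT = 4     # e.g. host principal plus three krbtgt lookups
DNS_QUERIES_PER_KDC_REQUEST = 8  # figure quoted in this comment


def request_counts(mechanisms, kdc_per_attempt, dns_per_kdc_request):
    """Return (kdc_requests, dns_queries) for one ssh connection attempt."""
    kdc_requests = mechanisms * kdc_per_attempt
    dns_queries = kdc_requests * dns_per_kdc_request
    return kdc_requests, dns_queries


kdc_total, dns_total = request_counts(
    GSS_MECHANISMS, KDC_REQUESTS_PER_ATTEMPT, DNS_QUERIES_PER_KDC_REQUEST
)
print(f"{kdc_total} KDC requests, {dns_total} DNS queries")
```

Because the factors multiply, reducing any one of them (fewer KDC requests per attempt, or fewer DNS queries per KDC request) shrinks the total proportionally.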

The krb5 developers are discussing what we might do in 1.10 to improve performance, such as turning off the realm walk by default (so only two KDC queries per operation), deferring getaddrinfo calls until we need the answers, and briefly caching getaddrinfo results. I don't know yet what will happen.

A possible, although perhaps inadvisable, workaround would be to put IP addresses instead of hostnames in krb5.conf; this reduces the delay to the twelve KDC requests, which should take less than a second. To reduce the number of DNS requests to 0 I think you'd also have to add a line "master_kdc = 18.7.21.144:88".

comment:3 Changed 13 years ago by ghudson

I've implemented one measure (deferring DNS requests) which should reduce the number of round trips to 12 DNS and 12 KDC requests; in my tests, this cuts the delay roughly in half. We'll likely implement another measure which will cut the delay in half again (6 DNS and 6 KDC requests), which should make the delay mostly imperceptible.

comment:4 Changed 13 years ago by kchen

I just encountered this for the first time (because the only non-MIT place I was ssh'ing to was previously using sshv1, which I guess doesn't do GSSAPI). For what it's worth, it takes about 6-11 seconds for me (maybe depending on whether the DNS responses are cached), rather than 3 seconds.

Total of 15 KDC requests:
host/foo.example.com: 3
krbtgt/EXAMPLE.COM: 3
krbtgt/COM: 3
krbtgt/EDU: 3
krbtgt/MIT.EDU: 3

Total of 312 DNS requests:
foo.example.com, A: 4
foo.example.com, AAAA: 1
foo.example.com.mit.edu, AAAA: 1
4.3.2.1.in-addr.arpa, PTR: 6
KERBEROS.MIT.EDU, A: 15
kerberos.mit.edu, A: 30
kerberos.mit.edu, AAAA: 30
kerberos.mit.edu.mit.edu, AAAA: 30
kerberos-1.mit.edu, A: 30
kerberos-1.mit.edu, AAAA: 30
kerberos-1.mit.edu.mit.edu, AAAA: 30
kerberos-2.mit.edu, A: 30
kerberos-2.mit.edu, AAAA: 30
kerberos-2.mit.edu.mit.edu, AAAA: 30
_kerberos-master._udp.ATHENA.MIT.EDU, SRV: 15
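As a rough cross-check of the DNS figures above, the tally can be reproduced in a few lines. The dictionary simply transcribes the counts listed in this comment (with hostnames spelled consistently); the variable names are made up for this sketch.

```python
# DNS query counts transcribed from the list above: (name, record type) -> count.
DNS_QUERIES = {
    ("foo.example.com", "A"): 4,
    ("foo.example.com", "AAAA"): 1,
    ("foo.example.com.mit.edu", "AAAA"): 1,
    ("4.3.2.1.in-addr.arpa", "PTR"): 6,
    ("KERBEROS.MIT.EDU", "A"): 15,
    ("kerberos.mit.edu", "A"): 30,
    ("kerberos.mit.edu", "AAAA"): 30,
    ("kerberos.mit.edu.mit.edu", "AAAA"): 30,
    ("kerberos-1.mit.edu", "A"): 30,
    ("kerberos-1.mit.edu", "AAAA"): 30,
    ("kerberos-1.mit.edu.mit.edu", "AAAA"): 30,
    ("kerberos-2.mit.edu", "A"): 30,
    ("kerberos-2.mit.edu", "AAAA"): 30,
    ("kerberos-2.mit.edu.mit.edu", "AAAA"): 30,
    ("_kerberos-master._udp.ATHENA.MIT.EDU", "SRV"): 15,
}

total = sum(DNS_QUERIES.values())

# Lookups where the search domain was appended to an already-qualified KDC name.
doubled_suffix = sum(
    count
    for (name, _rtype), count in DNS_QUERIES.items()
    if name.endswith(".mit.edu.mit.edu")
)

print(f"total DNS queries: {total}")
print(f"queries for *.mit.edu.mit.edu names: {doubled_suffix}")
```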

Disabling GSSAPIKeyExchange is effective, of course, but I figured I'd provide these stats.

comment:5 Changed 13 years ago by ghudson

I've implemented the other measure (eliminating the domain-based realm walk for client TGS requests) for krb5 1.10, which should reduce the typical delay to a second or so when the local KDC is reachable.

comment:6 Changed 12 years ago by tlyu

One way to cut down on the number of DNS queries is to include trailing dots for the KDC hostnames in krb5.conf. That will avoid the useless ".mit.edu.mit.edu" lookups.
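The effect of the trailing dot can be illustrated with a deliberately simplified model of a resolver search list. The function `candidate_names` is hypothetical and assumes a search domain of `mit.edu`; a real resolver's behavior also depends on settings such as `ndots` and on whether an earlier lookup already succeeded.

```python
def candidate_names(hostname, search_domains):
    """Names a search-list resolver might try, in a simplified model.

    Assumption: a name ending in "." is treated as fully qualified and
    looked up as-is; any other name may also be retried with each search
    domain appended.
    """
    if hostname.endswith("."):
        return [hostname]
    return [hostname] + [f"{hostname}.{domain}" for domain in search_domains]


search = ["mit.edu"]  # assumed search list for this sketch

print(candidate_names("kerberos.mit.edu", search))
print(candidate_names("kerberos.mit.edu.", search))
```

In this model, only the name without the trailing dot produces a "kerberos.mit.edu.mit.edu" candidate.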
