Ticket #1131 (closed defect: fixed)

Opened 9 years ago

Last modified 7 years ago

NM/dnsmasq possibly makes debathena-dns-config obsolete

Reported by: geofft
Owned by:
Priority: normal
Milestone: Current Semester
Component: --
Keywords:
Cc:
Fixed in version:
Upstream bug:

Description

I believe Ubuntu's doing something clever with a local DNS resolver by default, such that we don't really want debathena-dns-config any more.

Change History

comment:1 Changed 9 years ago by jdreed

As I understand it (and Greg or Andrew or someone should chime in here), the primary goal behind a local caching nameserver in the first place was to balance the large number of queries each Athena machine makes against 10/half networks and DNS servers that were still DECstations living under someone's desk. I don't think either of those conditions is true anymore (blah blah older dorm networks blah blah). Should we just give it up anyway?

I know we briefly discussed this in 2009 when there was that document with a list of Athena 9 packages and their relevance in a Debathena 10 world, but I can't find that page right now. Locally installing BIND has been screwing us since Lucid. Should we just punt it anyway?

comment:2 Changed 9 years ago by geofft

Well, the current good reason to do so is protection against various forms of attacks on DNS, which is, not coincidentally, one of the bigger reasons that Ubuntu is doing this by default upstream.

comment:3 Changed 9 years ago by jdreed

Seeing as how one of our design goals is to diverge as little as possible from upstream, I think we should do what upstream is doing.

comment:4 Changed 9 years ago by geofft

So the question, and I'm not at a Precise computer at the moment to answer it right now, is how much this dnsmasq stuff is tied to NetworkManager, which, while running, isn't managing DNS on cluster machines etc. (Which, I guess, we could fix.)

comment:5 Changed 9 years ago by ghudson

We used to have to run a local named to get class HS to work for Hesiod, before I made Hesiod use class IN (the normal class). Then we kept running one for network efficiency, but that's probably an obsolete concern. (Certainly, making the small percentage of machines which run Debathena more efficient about DNS queries isn't going to have much global impact on MITnet, although it could have an impact on some local networks. But it probably doesn't.)

comment:6 Changed 9 years ago by jweiss

Ops installed a caching resolver on all of our Linux servers within the last year. Previously, we'd only had one on a handful of machines (though back in the days of servers based on Athena, those always ran one). The primary reason for this is that we were seeing slowness with things as simple and interactive as sshing into the server (apparently the server performs some DNS queries in this case, I think reverse-resolving the client's IP address). These problems occurred when one of the MITnet DNS servers was less than fully responsive (which in the few cases I checked was due to some machine out on the internet hosing it down). So, I believe that having a caching resolver is generally beneficial. Additionally, as jdreed points out, it's good to avoid diverging from upstream.

comment:7 Changed 9 years ago by jdreed

  • Status changed from new to committed

I looked at this. dnsmasq works just fine with non-NM-managed interfaces, and setup is as simple as installing the dnsmasq package and dropping a file in /etc/dnsmasq.d that restricts it to the loopback interface and disables the dhcp/tftp features.

dns-config 1.7/ r25672 adds this.
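
For readers who want a picture of the drop-in described in comment:7, here is a minimal sketch that renders one possible file. The actual file added in r25672 is not quoted in this ticket, so the option choices and the function name `render_dnsmasq_dropin` are assumptions; the dnsmasq directives themselves (`interface`, `listen-address`, `bind-interfaces`, `no-dhcp-interface`) are standard ones.

```python
def render_dnsmasq_dropin(interface="lo", address="127.0.0.1"):
    """Return the text of a hypothetical /etc/dnsmasq.d drop-in.

    This is an illustration only, not the file shipped in dns-config 1.7.
    """
    lines = [
        "# Hypothetical drop-in: DNS cache on loopback only.",
        # Answer DNS queries only on the loopback interface and address.
        f"interface={interface}",
        f"listen-address={address}",
        # Bind only to the listed interface instead of the wildcard address.
        "bind-interfaces",
        # Never offer DHCP or TFTP on this interface (DNS is still served).
        f"no-dhcp-interface={interface}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dnsmasq_dropin(), end="")
```

dnsmasq only hands out DHCP leases when a `dhcp-range` is configured and only serves TFTP when `enable-tftp` is set, so leaving both out may already be enough; `no-dhcp-interface` just makes the intent explicit.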

comment:8 Changed 9 years ago by jweiss

Do we care that dnsmasq isn't recursive, and would presumably rely on the MITnet nameservers for all upstream resolution? After some contemplation, I think this is probably okay, but definitely different than the behavior we get using bind as our caching resolver.

comment:9 Changed 9 years ago by jdreed

I consider it a feature. I think we want a caching resolver, not a caching nameserver (which is what we had before). I've always thought that running our own bind was a bit heavy-handed. I suppose it's worth giving a heads-up to NIST to see if they have any concerns about a bunch of Athena machines suddenly talking to our DNS servers for non-mit.edu queries after however many years of silence.
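
To illustrate the distinction drawn in comment:8 and comment:9, the following is a toy sketch of a caching forwarder: it never walks the DNS hierarchy itself, it only hands each miss to an upstream resolver and remembers the answer until its TTL expires. The class name `CachingForwarder`, the `upstream` callable, and the injectable `clock` are all hypothetical names used for illustration; this is not how dnsmasq is implemented.

```python
import time


class CachingForwarder:
    """Cache answers from an upstream resolver; never recurse on our own."""

    def __init__(self, upstream, clock=time.monotonic):
        self.upstream = upstream  # callable: name -> (address, ttl_seconds)
        self.clock = clock        # injectable so expiry can be tested
        self.cache = {}           # name -> (address, expiry_time)

    def resolve(self, name):
        now = self.clock()
        hit = self.cache.get(name)
        if hit is not None and hit[1] > now:
            return hit[0]  # still fresh: no upstream traffic
        # Miss or expired entry: forward the query upstream and cache it.
        address, ttl = self.upstream(name)
        self.cache[name] = (address, now + ttl)
        return address


if __name__ == "__main__":
    def fake_upstream(name):
        # Stand-in for the site nameservers; returns a documentation address.
        return ("192.0.2.1", 60)

    forwarder = CachingForwarder(fake_upstream)
    print(forwarder.resolve("example.com"))
    print(forwarder.resolve("example.com"))  # served from the cache
```

The consequence jweiss raises follows directly from this shape: every cache miss, including lookups for names outside mit.edu, lands on the upstream servers rather than being resolved recursively on the local machine.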

comment:10 Changed 9 years ago by jdreed

Any other feedback on this? NIST is fine with the change. I'd like to either deploy this or fix #1166 before we go live.

comment:11 Changed 9 years ago by jdreed

OK, this is a little bit harder, since if you install this on an NM-managed interface, you can't restart dnsmasq in the postinst. (NM does deal if dnsmasq is already started at boot time.) So, do we want to:
a) make this work on non-NM-controlled interfaces; or
b) move dns-config to cluster-only, and let people who have -workstation use the upstream config? (This means that if they have -workstation AND a non-NM-managed configuration, they don't get any caching DNS.)

comment:12 Changed 8 years ago by jdreed

See also today's 3down note about the March 18 change. Since nobody has said anything in 7 months, I'm going to punt dns-config, and move its dnsmasq functionality to cluster only. People who have workstation use whatever their upstream network configuration is, because Debathena is all about diverging from upstream as little as possible. I'll also note this removes one of the major differences between workstation and login-graphical, if we ever want to move in the direction of unifying them.

comment:13 Changed 7 years ago by jdreed

OK, I've done a bit more research into this. The user (ostensibly) has a working DNS config when installing this package, so we can skip restarting dnsmasq, and just set the reboot flag. That having been said, the only way you _unknowingly_ end up _without_ dnsmasq is if you do a -workstation install via PXE. I'm not convinced we need to care about those cases, particularly if we can convince the installer to play nice with NM (which _should_ be a matter of setting netcfg/target_network_config to nm_config). For anyone else, if you install via the alternate or server CD, you might end up with an ifupdown config, but I don't think it's our job to scribble over that. I still want to move this to -cluster, but I'm concerned about the upgrade path. I think the right thing here is to just drop all the restart code in the postinst, set the reboot flag, and move on with life. Otherwise there are just too many variables (what if you have an ifupdown eth0 but an NM-managed eth1?)
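
As a rough sketch of the "skip the restart, set the reboot flag" approach from comment:13: the ticket does not say which mechanism the package uses, so this assumes the common Ubuntu convention of a `reboot-required` marker plus a `reboot-required.pkgs` list under /var/run. The function name `request_reboot` is hypothetical, and a real postinst would be a shell script rather than Python.

```python
import tempfile
from pathlib import Path


def request_reboot(package, flag_dir="/var/run"):
    """Record that `package` wants a reboot, without restarting any service.

    Assumes the Ubuntu-style marker files; adjust for other conventions.
    """
    flag_dir = Path(flag_dir)
    # The marker file's existence is what signals "reboot needed".
    (flag_dir / "reboot-required").touch()
    # Keep a de-duplicated list of the packages that asked for it.
    pkgs = flag_dir / "reboot-required.pkgs"
    existing = pkgs.read_text().split() if pkgs.exists() else []
    if package not in existing:
        with pkgs.open("a") as handle:
            handle.write(package + "\n")


if __name__ == "__main__":
    # Demonstrate against a scratch directory so no root access is needed.
    with tempfile.TemporaryDirectory() as scratch:
        request_reboot("debathena-dns-config", flag_dir=scratch)
        print(sorted(p.name for p in Path(scratch).iterdir()))
```

Deferring to a reboot sidesteps the mixed case raised above (an ifupdown eth0 next to an NM-managed eth1), because the maintainer script no longer has to work out who owns dnsmasq at install time.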

comment:14 Changed 7 years ago by jdreed

  • Status changed from committed to review

comment:16 Changed 7 years ago by jdreed

  • Status changed from review to development

comment:17 Changed 7 years ago by jdreed

Well, we managed to remove it from workstation, but forgot to add it to -cluster.

comment:19 Changed 7 years ago by jdreed

  • Status changed from development to closed
  • Resolution set to fixed

Done. I never want to see this package again.
