Ticket #1012 (closed defect: fixed)

Opened 13 years ago

Last modified 12 years ago

Clean up old kernels in /boot

Reported by: geofft
Owned by:
Priority: blocker
Milestone: Current Semester
Component: --
Keywords:
Cc:
Fixed in version: debathena-auto-update 1.42.2
Upstream bug:

Description

Lucid cluster machines seem to have used up 75% of /boot in the time since they were installed:

opus:~ geofft$ athinfo w20-575-30 df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/athena-root
                     234134408  12461768 209779292   6% /
/dev/sda1               233191    162536     58214  74% /boot

Alternatively, since GRUB 2 does LVM just fine, we could just not create a /boot partition. This is how I set up machines and I've never had a problem with it.

Change History

comment:1 Changed 12 years ago by jdreed

  • Priority changed from normal to blocker
  • Milestone changed from The Distant Future to Quantal Quetzal

This is now becoming a problem, because Precise will yell at you. I'm seeing workstations in the field with 95% /boots. Is there a simple cleanup-old-kernels thing we can do in a recovery hook? Or do we get to roll our own?

comment:2 Changed 12 years ago by jdreed

  • Status changed from new to committed

Recovery hook deployed today. Auto-update change committed in r25805. I think we want to do this by hand rather than playing with APT::NeverAutoRemove::, lest apt decide to leave you without any kernel package at all. This _shouldn't_ happen, but still, why tempt fate?
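
For illustration only, a minimal sketch of what choosing kernels "by hand" could look like. This is not the code from r25805. It assumes a policy of keeping the running kernel, the one before it, and anything newer; the function names and the `keep_previous` parameter are hypothetical, and version comparison is simplified to the numeric fields of the package name.

```python
import re


def version_key(version):
    """Turn '2.6.32-38-generic' into (2, 6, 32, 38) for numeric comparison."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def kernels_to_remove(installed, running, keep_previous=1):
    """Return the linux-image packages this (assumed) policy would remove.

    installed: package names, e.g. 'linux-image-2.6.32-38-generic'
    running:   the output of `uname -r`, e.g. '2.6.32-41-generic'
    """
    prefix = "linux-image-"
    # Skip metapackages such as 'linux-image-generic' (no version number).
    versions = [
        p[len(prefix):]
        for p in installed
        if p.startswith(prefix) and p[len(prefix):len(prefix) + 1].isdigit()
    ]
    versions.sort(key=version_key)

    keep = {running}
    older = [v for v in versions if version_key(v) < version_key(running)]
    if keep_previous:
        keep.update(older[-keep_previous:])
    # Never touch kernels newer than the running one.
    keep.update(v for v in versions if version_key(v) > version_key(running))

    return [prefix + v for v in versions if v not in keep]
```

Because the function only ever returns versioned packages older than the running kernel, it cannot leave a machine without any kernel package, which is the failure mode being avoided here.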

comment:3 Changed 12 years ago by jdreed

  • Status changed from committed to development

comment:4 follow-up: ↓ 5 Changed 12 years ago by jdreed

So, this works. The only potential issue I see is that subsequent apt transactions tell you to go ahead and autoremove the various -headers packages that correspond to the no longer needed kernels. Do we care? Or do we trust autoremove enough to do it on cluster machines?

comment:5 in reply to: ↑ 4 Changed 12 years ago by kaduk

Replying to jdreed:

So, this works. The only potential issue I see is that subsequent apt transactions tell you to go ahead and autoremove the various -headers packages that correspond to the no longer needed kernels. Do we care? Or do we trust autoremove enough to do it on cluster machines?

I am leery of trusting autoremove. We should probably do another update to the package to make it remove headers along with kernels.

comment:6 Changed 12 years ago by jdreed

What about autoremove with "-s"?

comment:7 Changed 12 years ago by jdreed

  • Status changed from development to committed
  • Fixed in version set to debathena-auto-update 1.42.1

comment:8 Changed 12 years ago by jdreed

  • Status changed from committed to development

comment:9 Changed 12 years ago by jdreed

So, removing the headers works, but... we keep the previous kernel around in case of problems. So the (n-1) headers are still going to be marked as autoremovable, and apt is still going to whine. I see a couple of solutions:
a) stop caring.
b) since we're now dealing with kernels and headers by hand, auto-update calls apt-mark, and marks all linux-headers packages as manually installed, and we rely on auto-update to clean up.
c) Punt our headers code, and run apt-get -s autoremove, and bail if it wants to remove anything that's not linux-headers.
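
Option (c) could be sketched roughly as follows. This is a hedged illustration, not what auto-update does: it assumes the simulated run reports each removal on a line of the form `Remv <package> [<version>]`, and the function names are hypothetical. The caller would capture the output of `apt-get -s autoremove` and pass it in as text.

```python
def packages_to_remove(simulated_output):
    """Extract package names from 'Remv <name> [<version>]' lines."""
    names = []
    for line in simulated_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "Remv":
            names.append(fields[1])
    return names


def autoremove_is_headers_only(simulated_output):
    """True if every package the simulation would remove is linux-headers-*."""
    names = packages_to_remove(simulated_output)
    return all(name.startswith("linux-headers-") for name in names)
```

If `autoremove_is_headers_only` returns False, the caller bails without running the real autoremove. An empty removal list counts as safe, since there is nothing to remove.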

I'd like to get this out soon, because /boot is starting to fill up again.

comment:10 Changed 12 years ago by jdreed

Thoughts on this? If nobody cares, I'm going with option (c)

comment:11 Changed 12 years ago by kaduk

(c) is probably fine for now, though we may get into trouble with some of the stuff we're planning for the future (in thirdparty land)?

comment:12 Changed 12 years ago by jdreed

  • Status changed from development to proposed

I'm going with option (a) instead.

comment:13 Changed 12 years ago by jdreed

Sigh. This doesn't catch old kernel headers lying around, only the ones we remove when we remove the image. So all the headers are still there for kernels removed by the recovery hook. We can try the autoremove thing. But for some reason I just noticed cluster machines want to autoremove libcloog-ppl0. I don't know why. I suspect the right answer is to suck it up and do autoremove, provided it won't remove any debathena metapackages. This puts us on a par with what aptitude used to do (since aptitude would in fact autoremove things).
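
A rough sketch of that guard, under loudly stated assumptions: it treats every package whose name starts with `debathena-` as protected (broader than just the metapackages), it assumes simulated removals appear as `Remv <package> [<version>]` lines, and `blocked_removals` is a hypothetical name rather than anything in the auto-update package.

```python
def blocked_removals(simulated_output, protected_prefix="debathena-"):
    """List protected packages that a simulated autoremove would remove.

    simulated_output: captured text of `apt-get -s autoremove`
    protected_prefix: assumption; stands in for "any debathena metapackage"
    """
    blocked = []
    for line in simulated_output.splitlines():
        fields = line.split()
        if (
            len(fields) >= 2
            and fields[0] == "Remv"
            and fields[1].startswith(protected_prefix)
        ):
            blocked.append(fields[1])
    return blocked
```

The real autoremove would run only when this returns an empty list, so something like libcloog-ppl0 would be removed but a debathena package never would.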

comment:14 Changed 12 years ago by jdreed

  • Status changed from proposed to development
  • Fixed in version changed from debathena-auto-update 1.42.1 to debathena-auto-update 1.42.2

TTL expired on r25859

comment:15 Changed 12 years ago by jdreed

(1.42.1 is still in proposed, it can sit there for now)

comment:16 in reply to: ↑ description Changed 12 years ago by jdreed

Replying to geofft:

Alternatively, since GRUB 2 does LVM just fine, we could just not create a /boot partition. This is how I set up machines and I've never had a problem with it.

Sigh. partman insists that you cannot boot LVM without /boot and yells loudly at you (even in Precise). We can set partman-auto-lvm/no_boot to 'true' to clobber the warning, and I've done that in r25880. I suspect we'd care if we cared about rescue CDs for cluster, but we don't.

comment:17 Changed 12 years ago by jdreed

  • Status changed from development to proposed

1.42.2 -> proposed

comment:18 Changed 12 years ago by jdreed

  • Status changed from proposed to closed
  • Resolution set to fixed