Ticket #1012 (closed defect: fixed)
Clean up old kernels in /boot
Reported by: | geofft | Owned by: | |
---|---|---|---|
Priority: | blocker | Milestone: | Current Semester |
Component: | -- | Keywords: | |
Cc: | | Fixed in version: | debathena-auto-update 1.42.2 |
Upstream bug: | | | |
Description
Lucid cluster machines seem to have used up 75% of /boot in the time since they were installed:
opus:~ geofft$ athinfo w20-575-30 df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/athena-root 234134408 12461768 209779292 6% /
/dev/sda1 233191 162536 58214 74% /boot
Alternatively, since GRUB 2 does LVM just fine, we could just not create a /boot partition. This is how I set up machines and I've never had a problem with it.
Change History
comment:1 Changed 12 years ago by jdreed
- Priority changed from normal to blocker
- Milestone changed from The Distant Future to Quantal Quetzal
comment:2 Changed 12 years ago by jdreed
- Status changed from new to committed
Recovery hook deployed today. Auto-update change committed in r25805. I think we want to do this by hand rather than playing with APT::NeverAutoRemove::, lest apt decide to leave you without any kernel package at all. This _shouldn't_ happen, but still, why tempt fate?
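For illustration only, here is a minimal Python sketch of what "doing this by hand" could look like: choose which kernel image packages are safe to drop while always keeping the running kernel and the newest few. It is not the code from r25805. The function names, the `keep` parameter, and the assumption that image packages are named `linux-image-<version>-<flavor>` are all hypothetical.

```python
import re

# Hypothetical helper; not taken from debathena-auto-update.
# Assumes package names of the form linux-image-<a.b.c-n>-<flavor>.
_IMAGE_RE = re.compile(r"^linux-image-(\d+\.\d+\.\d+-\d+)-(\S+)$")


def version_key(version):
    """Turn '2.6.32-38' into (2, 6, 32, 38) so versions sort numerically."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def removable_kernels(installed, running_release, keep=2):
    """Return kernel image packages that could be removed.

    installed: iterable of installed package names.
    running_release: what `uname -r` prints, e.g. '2.6.32-42-generic'.
    keep: how many of the newest kernel images to retain.
    """
    images = []
    for name in installed:
        match = _IMAGE_RE.match(name)
        if match:
            images.append((version_key(match.group(1)), name))
    images.sort()
    newest = {name for _, name in images[-keep:]} if keep > 0 else set()
    running = "linux-image-" + running_release
    # Never offer the running kernel or the newest `keep` images for removal.
    return [name for _, name in images
            if name not in newest and name != running]
```

Because the list is computed explicitly, the running kernel can never end up in it, which is the failure mode the comment above wants to avoid.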
comment:4 follow-up: ↓ 5 Changed 12 years ago by jdreed
So, this works. The only potential issue I see is that subsequent apt transactions tell you to go ahead and autoremove the various -headers packages that correspond to the no longer needed kernels. Do we care? Or do we trust autoremove enough to do it on cluster machines?
comment:5 in reply to: ↑ 4 Changed 12 years ago by kaduk
Replying to jdreed:
So, this works. The only potential issue I see is that subsequent apt transactions tell you to go ahead and autoremove the various -headers packages that correspond to the no longer needed kernels. Do we care? Or do we trust autoremove enough to do it on cluster machines?
I am leery of trusting autoremove. We should probably do another update to the package to make it remove headers along with kernels.
comment:7 Changed 12 years ago by jdreed
- Status changed from development to committed
- Fixed in version set to debathena-auto-update 1.42.1
comment:9 Changed 12 years ago by jdreed
So, removing the headers works, but... we keep the previous kernel around in case of problems. So the (n-1) headers are still going to be marked as autoremovable, and apt is still going to whine. I see a couple of solutions:
a) stop caring.
b) since we're now dealing with kernels and headers by hand, auto-update calls apt-mark, and marks all linux-headers packages as manually installed, and we rely on auto-update to clean up.
c) Punt our headers code, and run apt-get -s autoremove, and bail if it wants to remove anything that's not linux-headers.
I'd like to get this out soon, because /boot is starting to fill up again.
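Option (c) could be sketched roughly as follows in Python. This is an illustration, not the auto-update implementation: the function names are hypothetical, and it assumes the simulation prints one `Remv <package> [<version>]` line per package it would remove.

```python
import subprocess


def packages_to_remove(simulation_output):
    """Extract package names from 'Remv <pkg> [<version>]' lines."""
    names = []
    for line in simulation_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "Remv":
            names.append(fields[1])
    return names


def unexpected_removals(simulation_output, allowed_prefix="linux-headers-"):
    """Return packages autoremove would drop that are not kernel headers."""
    return [name for name in packages_to_remove(simulation_output)
            if not name.startswith(allowed_prefix)]


def simulate_autoremove():
    """Run the simulation; needs a Debian or Ubuntu system with apt-get."""
    result = subprocess.run(["apt-get", "-s", "autoremove"],
                            stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout
```

A caller would bail out whenever `unexpected_removals(simulate_autoremove())` is non-empty, and only then run the real autoremove.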
comment:10 Changed 12 years ago by jdreed
Thoughts on this? If nobody cares, I'm going with option (c)
comment:11 Changed 12 years ago by kaduk
(c) is probably fine for now, though we may get into trouble with some of the stuff we're planning for the future (in thirdparty land)?
comment:12 Changed 12 years ago by jdreed
- Status changed from development to proposed
I'm going with option (a) instead.
comment:13 Changed 12 years ago by jdreed
Sigh. This doesn't catch old kernel headers lying around, only ones we remove when we remove the image. So all the headers are still there for kernels removed by the recovery hook. We can try the autoremove thing. But for some reason I just noticed cluster machines want to autoremove libcloog-ppl0. I don't know why. I suspect the right answer is to suck it up and do autoremove, provided it won't remove any debathena metapackages. This puts us on a par with what aptitude used to do (since aptitude would in fact autoremove things).
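To make the gap concrete, here is a hedged Python sketch that finds headers packages whose kernel image is no longer installed, i.e. the ones left behind by the recovery hook. It is purely illustrative. The function name is hypothetical, and it assumes the naming pattern `linux-headers-<version>` and `linux-headers-<version>-<flavor>` alongside `linux-image-<version>-<flavor>`.

```python
import re

# Assumed naming patterns; both are illustrative, not from auto-update.
_HEADERS_RE = re.compile(r"^linux-headers-(\d+\.\d+\.\d+-\d+)(?:-\S+)?$")
_IMAGE_RE = re.compile(r"^linux-image-(\d+\.\d+\.\d+-\d+)-\S+$")


def orphaned_headers(installed):
    """Return headers packages with no installed image of the same version."""
    installed = list(installed)
    image_versions = set()
    for name in installed:
        match = _IMAGE_RE.match(name)
        if match:
            image_versions.add(match.group(1))
    orphans = []
    for name in installed:
        match = _HEADERS_RE.match(name)
        if match and match.group(1) not in image_versions:
            orphans.append(name)
    return orphans
```

Metapackages such as `linux-headers-generic` carry no version in their name, so the pattern skips them.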
comment:14 Changed 12 years ago by jdreed
- Status changed from proposed to development
- Fixed in version changed from debathena-auto-update 1.42.1 to debathena-auto-update 1.42.2
TTL expired on r25859
comment:15 Changed 12 years ago by jdreed
(1.42.1 is still in proposed, it can sit there for now)
comment:16 in reply to: ↑ description Changed 12 years ago by jdreed
Replying to geofft:
Alternatively, since GRUB 2 does LVM just fine, we could just not create a /boot partition. This is how I set up machines and I've never had a problem with it.
Sigh. partman insists that you cannot boot LVM without /boot and yells loudly at you (even in Precise). We can set partman-auto-lvm/no_boot to 'true' to clobber the warning, and I've done that in r25880. I suspect we'd care if we cared about rescue CDs for cluster, but we don't.
comment:17 Changed 12 years ago by jdreed
- Status changed from development to proposed
1.42.2 -> proposed
comment:18 Changed 12 years ago by jdreed
- Status changed from proposed to closed
- Resolution set to fixed
This is now becoming a problem, because Precise will yell at you. I'm seeing workstations in the field with 95% /boots. Is there a simple cleanup-old-kernels thing we can do in a recovery hook? Or do we get to roll our own?