Ticket #98 (closed defect: fixed)

Opened 13 years ago

Last modified 13 years ago

Build server schroots get progressively slower

Reported by: ghudson
Owned by:
Priority: normal
Milestone:
Component:
Keywords:
Cc:
Fixed in version:
Upstream bug:

Description

Over time, creating and destroying LVM snapshots appears to get progressively slower. The slowdown is very gradual; on debuild the problem is likely never to become noticeable, but on linux-build-10 the presence of autodebathenify has made it visible. Currently it takes about 105 seconds to create and tear down an schroot; it used to take only a few seconds.

Unfortunately, cleaning up leaked snapshots (there were about six as of yesterday) does not help, nor does rebooting. Backups of the LVM metadata were piling up in /etc/lvm/archive, but this was not the problem; nuking that directory and turning off metadata archiving did not help either. Presumably there is some state inside the volume group which is piling up; I have not found any tools which will let me analyze the internal data structures at that level or perform any kind of garbage collection on them. "lvscan" is also very slow to complete.

It's conceivable that upgrading the kernel on the build server would help (the machine has not been kept up to date with Ubuntu). Barring that, the only temporary remediation I'm aware of is to blow away and recreate the volume group and re-make all the build chroots, a process which would be risky since I believe make-build-chroot does not work with current debootstrap (a separate issue).

A radical solution would be to use the unionfs schroot patch, as proposed in #97. That would eliminate the issue entirely, perhaps in exchange for other issues.

Change History

comment:1 Changed 13 years ago by ghudson

  • Status changed from new to closed
  • Resolution set to fixed

The problem turns out to be with /etc/lvm/cache/.cache, which is added to each time a snapshot is created and not cleaned up when a snapshot is destroyed. The fix is to remove the cache file (which is safe) and turn off write_cache_state in /etc/lvm/lvm.conf.

Snapshots are back to being created and destroyed in under a second.
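
For reference, a minimal sketch of the lvm.conf half of the fix. It assumes write_cache_state appears as a plain "write_cache_state = N" line (in the stock file it sits in the devices section); the helper name disable_write_cache_state is made up for illustration and is not part of any LVM tooling.

```python
import re


def disable_write_cache_state(conf_text: str) -> str:
    """Return conf_text with every active write_cache_state setting forced to 0.

    Commented-out lines (starting with '#') are left untouched.
    """
    pattern = re.compile(r"^([ \t]*write_cache_state[ \t]*=[ \t]*)\d+", re.MULTILINE)
    return pattern.sub(r"\g<1>0", conf_text)


# Illustrative excerpt only, not a complete lvm.conf.
sample = """devices {
    cache_dir = "/etc/lvm/cache"
    write_cache_state = 1
}
"""

print(disable_write_cache_state(sample))
```

Applying this to the real /etc/lvm/lvm.conf needs root, and the stale /etc/lvm/cache/.cache file still has to be removed separately, as described above.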
