Ticket #493 (closed defect: fixed)
/proc/mounts! It's over sixty-five thousand!
Reported by: | geofft | Owned by: | |
---|---|---|---|
Priority: | blocker | Milestone: | |
Component: | -- | Keywords: | |
Cc: | | Fixed in version: | |
Upstream bug: | | | |
Description
I logged in to a cluster machine today. /proc/mounts had 65571 entries in it, including 63356 instances of what appear to be bind-mounts of /media. My xterm that I launch from .startup.X started after about two minutes at a black screen. GNOME took a while longer, and I noticed this issue because stracing gnome-terminal indicated it was trying to read /proc/mounts and that was taking a long time. I think this is what people are reporting when they say some cluster machines take them several minutes to log in...
See /mit/geofft/Public/proc-mounts-uniq-c for the output of cat /proc/mounts | uniq -c and /mit/geofft/Public/debathena-over-65000 for the relevant kernel logs wherein I pressed alt-sysrq-T and -W a bunch.
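For anyone checking another machine for the same symptom, here is a minimal sketch that tallies how often each mount point appears in a mounts table. The helper name count_mountpoints is made up for illustration and is not part of any debathena package; it assumes the standard /proc/mounts layout, where the second whitespace-separated field is the mount point.

```shell
# Hypothetical helper: count how many times each mount point appears.
# Reads /proc/mounts by default; pass another file to inspect a saved copy.
count_mountpoints() {
    mounts_file="${1:-/proc/mounts}"
    # Field 2 of each line is the mount point; tally it, then print
    # "COUNT MOUNTPOINT" pairs sorted with the most-repeated first.
    awk '{ count[$2]++ } END { for (m in count) print count[m], m }' "$mounts_file" |
        sort -rn
}
```

Running `count_mountpoints | head -n 5` on an affected machine should put /media at the top of the list. Unlike plain `uniq -c`, this also groups entries that are not adjacent in the file.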
Change History
comment:2 Changed 15 years ago by jdreed
Unmounting /media in /usr/lib/debathena-reactivate/reactivate "fixes" the problem.
Rather than preparing /media in the init script, should we move this:
    # Enable subtree operations on /media by making it a mount point,
    # then share it.
    if ! mountpoint -q /media; then
        mount --bind /media /media
        mount --make-shared /media
    fi
into snapshot-run? Then in "reactivate" we can just unmount /media (keeping in mind we may need to do so multiple times). Note that "mountpoint -q" says /media is not a mountpoint when it's bind-mounted, so I don't know what a good way to check is, other than something like "grep -ic media /proc/mounts".
I like this idea better in general, since all preparations for the chroot happen before it's run, not at boot time.
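One possible shape for the "unmount multiple times" step, as a sketch only: count the entries for the exact mount point in /proc/mounts and keep unmounting until none are left. The function names and the retry cap are assumptions for illustration; this is not the code in reactivate, and it has not been tried against a mount table in the state described in this ticket.

```shell
# Hypothetical helper: print how many entries in a mounts table have
# exactly the given mount point (field 2). Defaults to /proc/mounts.
mount_count() {
    awk -v target="$1" '$2 == target { n++ } END { print n + 0 }' \
        "${2:-/proc/mounts}"
}

# Hypothetical helper: unmount a path repeatedly until /proc/mounts no
# longer lists it. Must run as root. The cap (default 100, an arbitrary
# choice) keeps a misbehaving mount table from looping forever.
unmount_all_layers() {
    path="$1"
    max_tries="${2:-100}"
    tries=0
    while [ "$(mount_count "$path")" -gt 0 ] && [ "$tries" -lt "$max_tries" ]; do
        umount "$path" || return 1
        tries=$((tries + 1))
    done
    # Succeed only if nothing is left mounted at the path.
    [ "$(mount_count "$path")" -eq 0 ]
}
```

Matching on the exact second field avoids the false positives a bare grep for "media" would give, for example a device or directory that merely contains that string.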
comment:3 Changed 15 years ago by jdreed
At release-team, it was suggested that putting a tmpfs on /media instead of bind-mounting it to itself would be an improvement. It is not.
We should just umount everything under /media (and /media itself) at logout time, and re-mount and re-share it, and move on.
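The logout-time sequence proposed above could look roughly like the following dry-run sketch. It only prints the commands it would run, so it is safe to try without root. The function name media_reset_plan is hypothetical, and the ordering (later mounts unmounted first) is an assumption about what is safest, not something established in this ticket. Mount points containing spaces, which /proc/mounts escapes as \040, are not handled.

```shell
# Hypothetical dry-run helper: print the commands that would tear down
# everything mounted at or under /media and then set it up again.
media_reset_plan() {
    mounts_file="${1:-/proc/mounts}"
    awk -v root="/media" '
        # Remember every mount point that is /media itself or below it.
        $2 == root || index($2, root "/") == 1 { mp[++n] = $2 }
        # Emit them in reverse table order so later mounts go first.
        END { for (i = n; i >= 1; i--) print "umount " mp[i] }
    ' "$mounts_file"
    # Re-create the shared bind-mount, as the init script does.
    echo "mount --bind /media /media"
    echo "mount --make-shared /media"
}
```

Once the printed plan looks right for a given machine, the same lines could be run as root instead of echoed.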
comment:4 Changed 15 years ago by kcarnold
quickstation-2 just took 2+ minutes to log in, fans roaring. It was this problem. (See /mit/kcarnold/Public/mounts-uniq-c, but it's like geofft's.)
comment:5 Changed 15 years ago by rbasch
The reason the /media mounts double is in fact that the mountpoint test in the init script (which is invoked at the end of a session) doesn't work for a bind-mount. So the bind-mount is repeated after each session, and, since we set the mountpoint as shared, that is also propagated to the peer.
I addressed this in r24332 and r24333, by parsing "mount" output in the init script to determine whether the bind-mount has been done; this is now in -proposed (debathena-reactivate 2.0.8). Affected machines will need to be rebooted to clear out their mount tables.
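For readers without access to those revisions, the general idea of guarding the bind-mount by parsing "mount" output can be sketched as below. This is an illustration under assumptions, not the code from r24332 or r24333: the function names are invented, and it relies on mount printing lines of the form "SOURCE on MOUNTPOINT type FSTYPE (OPTIONS)".

```shell
# Hypothetical helper: succeed if mount-style output on stdin contains
# an entry whose mount point (the word after "on") is exactly $1.
has_mount_entry() {
    awk -v target="$1" '
        $2 == "on" && $3 == target { found = 1 }
        END { exit !found }
    '
}

# Hypothetical wrapper: do the bind-mount and sharing only once, even
# if the init script runs again at the end of every session.
ensure_media_shared() {
    if ! mount | has_mount_entry /media; then
        mount --bind /media /media
        mount --make-shared /media
    fi
}
```

Because the check looks at the mount table rather than relying on "mountpoint -q", a second invocation finds the existing /media entry and does nothing.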
It's not quite clear when it starts doubling and when it starts simply adding an extra /media line.
/mit/jdreed/Public/debathena contains copies of /proc/mounts from both inside and outside a chroot.
However, on the previous machine, I saw /proc/mounts double from 4k to 8k to 16k after each login, so it's unclear at what point that begins happening.