Ticket #656 (closed defect: fixed)

Opened 11 years ago

Last modified 10 years ago

clusters should provide more local disk

Reported by: kaduk Owned by: jdreed
Priority: normal Milestone: Natty Beta
Component: login chroot Keywords: cluster chroot disk tmp
Cc: mitchb Fixed in version:
Upstream bug:

Description

We have traditionally recommended that data-intensive simulations write to /tmp or other local disk, as this is both faster than AFS and does not eat up quota. Unfortunately, it appears that the current chroot configuration provides only 1G on any particular local filesystem.
geofft seemed to think that it might be okay to mount /tmp as an actual disk; if that is not a useful option, we should at least be able to give a few more gigs on one of the devices in the chroot.

Change History

comment:1 Changed 11 years ago by geofft

I think the best option is to start by bind-mounting /tmp, /var/tmp, etc. through.

Past that, we should be dynamically calculating how big we can safely make the chroot tmpfs, and dynamically making a lot of swap or something. Alternatively, we can go back to backing the writable half of the chroot directly on disk, but I fear this might get us back to our slowness issues from LVM snapshots, by having both branches be on the same disk.

comment:2 Changed 11 years ago by jdreed

  • Priority changed from normal to low
  • Milestone changed from IAP 2011 to Natty Alpha

To be clear, we have always recommended /var/tmp, not /tmp (since if you get logged out, your data is not completely lost).

comment:3 Changed 11 years ago by mitchb

  • Keywords tmp added
  • Priority changed from low to normal
  • Cc mitchb added

A user got bitten by #844 in a cluster today, and nearly lost
a paper they'd been actively editing at the time. They got
very lucky, as it happened to be getting autosaved in /var/tmp,
but there was a long period of concern that it was in /tmp
and lost forever.

While we can recommend /var/tmp to users, some programs don't
give you the choice, and this episode has underscored that the
lack of a /tmp store that persists long enough to allow recovery
right after a reboot may eat your academic work, so we're bumping
the priority again.

comment:4 Changed 10 years ago by jdreed

  • Milestone changed from Natty Alpha to Natty Beta

The solution here is to ensure that /tmp and /var/tmp are bind-mounted.

comment:5 Changed 10 years ago by jdreed

In response to Geoff's initial comment, unless I'm reading something wrong, /tmp and /var/tmp have been bind-mounted since reactivate-2.0. The issue is the size of the tmpfs we create in reactivate's init script, right? Should we consider a separate filesystem for /var/tmp and just mount the actual disk itself?

comment:6 Changed 10 years ago by jdreed

So, these are in fact bind-mounted. We suspect the problem may be that /etc/mtab has pre-chroot paths, and some things are getting confused as a result. Geoff successfully created a 9GB file in /tmp.

So when setting up the chroot, we should either use sed to strip out the /var/lib/mumble prefixes, or symlink /etc/mtab to /proc/mounts.
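
For reference, the symlink option might look roughly like the sketch below. It runs against a scratch directory standing in for the chroot root (the `root` variable and the sample mtab line are hypothetical, not taken from reactivate), so it can be tried safely outside a cluster machine. It also assumes /proc is mounted inside the chroot, since otherwise the link would dangle.

```shell
#!/bin/sh
# Sketch only: "$root" is a throwaway directory standing in for the
# chroot's root filesystem; it is not a path used by reactivate.
root=$(mktemp -d)
mkdir -p "$root/etc"

# Simulate a stale mtab that still carries a pre-chroot path.
echo "/dev/sda1 /var/lib/mumble/tmp ext4 rw 0 0" > "$root/etc/mtab"

# Replace the static file with a symlink to the kernel's live mount table.
ln -sf /proc/mounts "$root/etc/mtab"

# Record and show where the link now points.
mtab_target=$(readlink "$root/etc/mtab")
echo "$mtab_target"

# Clean up the scratch directory.
rm -rf "$root"
```

The trade-off against the sed approach is that the symlink needs no knowledge of the session path, but it depends on /proc being available in the chroot.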

comment:7 Changed 10 years ago by jdreed

  • Owner set to jdreed
  • Status changed from new to accepted

comment:8 Changed 10 years ago by jdreed

This should DTRT, right?

Index: snapshot-run
===================================================================
--- snapshot-run	(revision 25173)
+++ snapshot-run	(working copy)
@@ -39,6 +39,9 @@
 # by punting it
 schr rm -rf /home
 
+# Fix up mtab so that df and friends work correctly
+schr -i 's|/var/lib/schroot/mount/$session||' /etc/mtab
+
 # Run the session
 #
 # We wrap the target command in sudo because it runs initgroups(3)

comment:9 follow-up: ↓ 10 Changed 10 years ago by jdreed

Uh, let's try that again:

Index: snapshot-run
===================================================================
--- snapshot-run	(revision 25173)
+++ snapshot-run	(working copy)
@@ -39,6 +39,9 @@
 # by punting it
 schr rm -rf /home
 
+# Fix up mtab so that df and friends work correctly
+schr sed -i 's|/var/lib/schroot/mount/$session||' /etc/mtab
+
 # Run the session
 #
 # We wrap the target command in sudo because it runs initgroups(3)

comment:10 in reply to: ↑ 9 Changed 10 years ago by kaduk

Replying to jdreed:

Uh, let's try that again:

Index: snapshot-run
===================================================================
--- snapshot-run	(revision 25173)
+++ snapshot-run	(working copy)
@@ -39,6 +39,9 @@
 # by punting it
 schr rm -rf /home
 
+# Fix up mtab so that df and friends work correctly
+schr sed -i 's|/var/lib/schroot/mount/$session||' /etc/mtab
+

I don't think you want single quotes there?

comment:11 Changed 10 years ago by jdreed

In fact.

+# Fix up mtab so that df and friends work correctly
+schr sed -i "s|/var/lib/schroot/mount/$session||" /etc/mtab
+
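
The quoting difference between comments 9 and 11 can be reproduced on a scratch file. This is a minimal sketch: the session ID and the mount lines are made up for illustration, and sed is run directly on a temporary file rather than through schr.

```shell
#!/bin/sh
# Sketch only: the session ID and mtab contents are invented for illustration.
session="example-session-1234"
mtab=$(mktemp)
printf '%s\n' \
    "/dev/sda1 /var/lib/schroot/mount/$session/tmp ext4 rw 0 0" \
    "/dev/sda1 /var/lib/schroot/mount/$session/var/tmp ext4 rw 0 0" > "$mtab"
orig=$(cat "$mtab")

# Single quotes: the shell passes the text "$session" to sed literally,
# so the pattern never matches and the content comes back unchanged.
single=$(sed 's|/var/lib/schroot/mount/$session||' "$mtab")

# Double quotes: the shell expands $session first, so sed strips the prefix.
double=$(sed "s|/var/lib/schroot/mount/$session||" "$mtab")

printf 'single-quoted:\n%s\n\ndouble-quoted:\n%s\n' "$single" "$double"
rm -f "$mtab"
```

Note that a session ID containing regex metacharacters or a `|` would need escaping before being spliced into the sed expression; the sketch does not handle that.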

comment:12 Changed 10 years ago by lizdenys

  • Status changed from accepted to committed

Tested, works. Committed in r25187.

comment:13 Changed 10 years ago by jdreed

  • Status changed from committed to development

This made it into -dev.

comment:14 Changed 10 years ago by jdreed

  • Status changed from development to closed
  • Resolution set to fixed