Ticket #217 (new defect)

Opened 12 years ago

Last modified 10 years ago

athena-auto-update should use flock(1)

Reported by: andersk Owned by:
Priority: low Milestone: The Distant Future
Component: -- Keywords:
Cc: Fixed in version:
Upstream bug:

Description

Since aptitude --download-only schedules the packages for installation, this could cause the login chroot to have a bunch of scheduled updates:

if ! v aptitude --quiet --assume-yes --download-only full-upgrade; then
  …
fi
…
  echo "** Creating root snapshot"
  /usr/sbin/athena-login-snapshot update-start
…
v aptitude --quiet --assume-yes keep-all
v aptitude --quiet --assume-yes --download-only dist-upgrade

(By the way, why do we use both aptitude --download-only full-upgrade and aptitude --download-only dist-upgrade?)

This could be a 6.033 case study:

# Avoid confusing the system by running two updates at once.
pidfile=/var/run/athena-update.pid
if [ -e $pidfile ]; then
  if ! kill -0 "$(cat $pidfile)" 2>/dev/null; then
    rm -f $pidfile
  fi
fi
(set -o noclobber; echo $$ > $pidfile) 2>/dev/null || exit

trap 'rm -f $pidfile' EXIT

Change History

comment:1 Changed 12 years ago by geofft

Also, after a sufficiently unclean shutdown, I got tyger (my awesome -cluster laptop) into a state where you needed to run dpkg --configure -a just to make aptitude update not complain (and permit upgrades to happen). It's not implausible that some punk could decide they want another power outlet in the cluster at just the right time, so we should run this command.

I'm a little annoyed that aptitude only tells me to run the command rather than actually running it, even with --assume-yes, but...

comment:2 Changed 11 years ago by jdreed

  • Milestone set to IAP 2010

comment:3 Changed 11 years ago by broder

For those of you who have enough classwork from classes without Anders assignment more, the race condition in his second snippet is that

  1. Process 1 opens $pidfile
  2. Process 2 checks that the file exists
  3. Process 2 attempts to kill -0 "$(cat $pidfile)", which is equivalent to kill -0 "", which returns an error
  4. Process 2 removes the $pidfile, and writes its own
  5. Process 1 writes its PID to the now unlinked file

Anders comments:

Now, I’d like to emphasize that my point is not that the code is wrong. My point is that the code is not obviously right. And there’s no excuse for that when we have perfectly good utilities like flock that are designed to do this job correctly for us.

comment:4 Changed 11 years ago by jdreed

  • Milestone changed from Summer 2010 (Lucid Deploy) to The Distant Future
  • Summary changed from athena-auto-update bugs to Potential race condition in athena-auto-update

The first part is moot in the reactivate-2.0 world, I believe, so I'm changing the description to reflect this.

comment:5 Changed 10 years ago by jdreed

  • Summary changed from Potential race condition in athena-auto-update to athena-auto-update should use flock(1)
Note: See TracTickets for help on using tickets.