Ticket #217 (new defect)
athena-auto-update should use flock(1)
Reported by: | andersk | Owned by: | |
---|---|---|---|
Priority: | low | Milestone: | The Distant Future |
Component: | -- | Keywords: | |
Cc: | Fixed in version: | ||
Upstream bug: |
Description
Since aptitude --download-only schedules the packages for installation, this could cause the login chroot to have a bunch of scheduled updates:
if ! v aptitude --quiet --assume-yes --download-only full-upgrade; then … fi … echo "** Creating root snapshot" /usr/sbin/athena-login-snapshot update-start … v aptitude --quiet --assume-yes keep-all v aptitude --quiet --assume-yes --download-only dist-upgrade
(By the way, why do we use both aptitude --download-only full-upgrade and aptitude --download-only dist-upgrade?)
This could be a 6.033 case study:
# Avoid confusing the system by running two updates at once. pidfile=/var/run/athena-update.pid if [ -e $pidfile ]; then if ! kill -0 "$(cat $pidfile)" 2>/dev/null; then rm -f $pidfile fi fi (set -o noclobber; echo $$ > $pidfile) 2>/dev/null || exit trap 'rm -f $pidfile' EXIT
Change History
comment:3 Changed 15 years ago by broder
For those of you who have enough classwork from classes without Anders assignment more, the race condition in his second snippet is that
- Process 1 opens $pidfile
- Process 2 checks that the file exists
- Process 2 attempts to kill -0 "$(cat $pidfile)", which is equivalent to kill -0 "", which returns an error
- Process 2 removes the $pidfile, and writes its own
- Process 1 writes its PID to the now unlinked file
Anders comments:
Now, I’d like to emphasize that my point is not that the code is wrong. My point is that the code is not obviously right. And there’s no excuse for that when we have perfectly good utilities like flock that are designed to do this job correctly for us.
comment:4 Changed 15 years ago by jdreed
- Milestone changed from Summer 2010 (Lucid Deploy) to The Distant Future
- Summary changed from athena-auto-update bugs to Potential race condition in athena-auto-update
The first part is moot in the reactivate-2.0 world, I believe, so I'm changing the description to reflect this.
Also, after a sufficiently unclean shutdown, I got tyger (my awesome -cluster laptop) into a state where you needed to run dpkg --configure -a just to make aptitude update not complain (and permit upgrades to happen). It's not implausible that some punk could decide they want another power outlet in the cluster at just the right time, so we should run this command.
I'm a little annoyed that aptitude only tells me to run the command rather than actually running it, even with --assume-yes, but...