source: trunk/doc/maintenance @ 12069

Revision 12069, 17.2 KB checked in by ghudson, 26 years ago (diff)
Update the description of the wash.
This file contains notes about the care and feeding of the Athena
source repository.  It is intended primarily for the administrators of
the source tree, not for developers (except perhaps for the first
section, "mailing lists").  See the file "procedures" in this
directory for information about procedures relevant to developers.

The areas covered in this file are:

        Mailing lists
        Permissions
        The wash process
        Imake templates
        Release notes
        Release cycles
        Patch releases
        Rel-eng machines
        Cluster information

Mailing lists

Here are descriptions of the mailing lists related to the source tree:

        * source-developers

                For discussion of the policy and day-to-day
                maintenance of the repository.  This is a public list,
                and there is a public discuss archive on menelaus.

        * source-reviewers

                For review of changes to be checked into the
                repository.  To be a member of this mailing list, you
                must have read access to the non-public parts of the
                source tree, but you do not need to be a staff member.
                There is a non-public discuss archive on menelaus.

        * source-commits

                This mailing list receives commit logs for all
                commits to the repository.  This is a public mailing
                list.  There is a public discuss archive on menelaus.

        * source-diffs

                This mailing list receives commit logs with diffs for
                all commits to the repository.  To be on this mailing
                list, you must have read access to the non-public
                parts of the source tree.  There is no discuss archive
                for this list.

        * source-wash

                This mailing list receives mail when the wash process
                blows out.  This is a public mailing list.  There is
                no discuss archive for this list.

        * rel-eng

                The release engineering mailing list.  Mail goes here
                about patch releases and other release details.  There
                is a public archive on menelaus.

        * release-team

                The mailing list for the release team, which sets
                policy for releases.  There is a public archive on
                menelaus (currently, it has the name "release-77").

Permissions

Following are descriptions of the various groups found on the acls of
the source tree:

        * read:source
          read:staff

                These two groups have identical permissions in the
                repository, but read:source contains artificial
                constructs (the builder user and service principals)
                while read:staff contains people.  In the future,
                highly restricted source could have access for
                read:source and not read:staff.

                Both of these groups have read access to non-public
                areas of the source tree.

        * write:staff

                Contains developers with commit access to the source
                tree.  This group has write access to the repository,
                but not to the checked-out copy of the mainline
                (/mit/source).

        * write:update

                Contains the service principal responsible for
                updating /mit/source.  This group has write access to
                /mit/source but not to the repository.

        * adm:source

                This group has administrative access to the repository
                and to /mit/source.

system:anyuser has read access to public areas of the source tree and
list access to the rest.  system:authuser occasionally has read access
to areas that system:anyuser does not (synctree is the only current
example).

The script CVSROOT/ in the repository makes sure the
permissions are correct in the repository or in a working directory.
Run it from the top level of the repository or of /mit/source, giving
it the argument "repository" or "wd".

The wash process

The wash process is a nightly rebuild of the source repository from
scratch, intended to alert the source tree maintainers when someone
checks in a change which causes the source tree to stop building.  The
general architecture of the wash process is:

        * Each night at midnight, a machine (currently small-gods)
          performs a cvs update of the checked-out tree in
          /afs/  If the cvs update
          fails, the update script sends mail to
          This machine is on read:source and write:update.

        * Each night at 4:30am, a machine of each architecture
          (currently whirlpool, kenmore, and maytag) performs a build
          of the tree into /var/, using the build directory
          /var/build.  If the build fails, the wash script copies the
          log of the failed build into AFS and sends mail to
          with the last few lines of the log.  If
          the build succeeds, the wash script moves /var/ to
          /var/srvd, so that /var/srvd is always the last successful
          build of the source tree.

        * Each Sunday at 1:00am, the wash machines make a copy of
          their last successful builds into a "srvd-current" directory
          in AFS.  The copy is done without system:administrator
          privileges, so the file permissions on srvd-current are all
          wrong, but the current srvd is useful for development work.
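
The build-and-promote step in the second item above can be sketched
roughly as follows.  This is not the actual wash script; it is a
minimal Python illustration of the idea that a build goes into a
staging directory and replaces the "last successful build" directory
only on success.  The function names, the build callable, and the
notify hook are all hypothetical.

```python
import shutil
from pathlib import Path


def tail(path, count=20):
    """Return the last `count` lines of a text file."""
    lines = Path(path).read_text(errors="replace").splitlines()
    return lines[-count:]


def wash_once(build, staging, srvd, log_path, failed_log_dir, notify):
    """Run one wash build and promote the result only if it succeeds.

    build(staging, log_path) -> bool is a hypothetical callable that
    builds the tree into `staging` and writes its output to `log_path`.
    notify(subject, body) is a hypothetical hook for sending mail.
    """
    staging, srvd = Path(staging), Path(srvd)
    if staging.exists():
        shutil.rmtree(staging)      # the wash always builds from scratch
    staging.mkdir(parents=True)

    if not build(staging, log_path):
        # Keep the full log somewhere shared; mail only the last lines.
        Path(failed_log_dir).mkdir(parents=True, exist_ok=True)
        shutil.copy(log_path, failed_log_dir)
        notify("wash failed", "\n".join(tail(log_path)))
        return False

    # Promote: srvd only ever holds a complete, successful build.
    if srvd.exists():
        shutil.rmtree(srvd)
    staging.rename(srvd)
    return True
```

Because a failed build never touches the promoted directory, anything
that reads it (such as the weekly srvd-current copy) always sees the
last build that completed.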

Source for the wash scripts lives in /afs/
They are installed in /usr/local on the wash machines.  Logs of the
start and end times of the wash processes on each machine live in

Imake templates

We don't like imake, but we maintain two sets of imake templates:

        * packs/build/config

                These templates are the legacy Athena build system.
                They are specific to software in the athena hierarchy,
                and one glorious day in the future they will no longer
                be necessary.

                For these templates, you should define TOPDIR to the
                top-level source directory.

        * packs/build/xconfig

                These templates are used for building software which
                uses X-style Imakefiles.  They may need periodic
                updating as new versions of X are released.  These
                templates are full of a lot of hacks, mostly because
                the imake model isn't really adequate for dealing with
                third-party software and local site customizations.

                For these templates, you should define TOPDIR to "."
                and SRCDIR to the top-level source directory.

Release notes

There are two kinds of release notes, the system release notes and the
user release notes.  The system release notes are more comprehensive
and assume a higher level of technical knowledge, and are used in the
construction of the user release notes.  It is the job of the release
engineer to produce a set of system release notes for every release,
with early versions towards the beginning of the release cycle.  The
best way to make sure this happens is to maintain the system release
notes throughout the entire development cycle.

Thus, it is the job of the release engineer to watch the checkins to
the source tree and enter a note about all user-visible changes in the
system release notes, which live in /afs/
Highly visible changes should appear near the beginning of the file,
and less visible changes should appear towards the end.  Changes to
particular subsystems should be grouped together when possible.

Release cycles

Release cycles have five phases: crash and burn, alpha, beta, early,
and the public release.  The release team has a set of criteria for
entering and exiting each phase, which won't be covered here.  The
following guidelines should help the release go smoothly:

        * Crash and burn

          This phase is for rel-eng internal testing.  The crash and
          burn machines should be identified and used to test the
          install and update.  System packs may be generated at will
          by taking snapshots from the wash machine.  The system packs
          volume does not need any replication.

          System release notes should be prepared during this phase.

          Before the transition from crash and burn to alpha, the
          release engineer should do a sanity check on the new packs
          by comparing a file listing of the new packs to a file
          listing of the previous release's packs.  The release
          engineer should also check the list of configuration files
          for each platform (in packs/update/platform/*/configfiles)
          and make sure that any configuration files which have
          changed are listed as changed in the version script.
          Finally, the release should be checked to make sure it won't
          overflow partitions on any client machines; currently, SGIs
          are not a problem (because they have one big partition) and
          the most restrictive sizes on Solaris clients are 27713K and
          51903K of useable space for the root and /usr partitions.

        * Alpha

          The alpha phase is for internal testing by the release team.
          System packs may still be regenerated at will by taking
          snapshots, but the system packs volume (and os volume)
          should be read-only so it can be updated by a vos release.
          Changes to the packs do not need to be propagated in patch
          releases; testers are expected to be able to ensure
          consistency by forcing repeat updates or reinstalling their
          machines.

          A draft of the system release notes should be ready by the
          beginning of this phase.  User release notes should be
          prepared during this phase.

          Before the transition from alpha to beta, doc/third-party
          should be checked to see if miscellaneous third-party files
          (the ones not under the "third" hierarchy) should be
          updated.

        * Beta

          The beta phase involves outside testers.  System packs and
          os volumes should be replicated on multiple servers, and
          permissions should be set to avoid accidental changes
          (traditionally this means giving write access to
          system:packs, a normally empty group).  Changes to the packs
          must be propagated by patch releases.

          User release notes should be essentially finished by the end
          of this phase.  System release notes may continue to be
          updated as bug fixes occur.  Ideally, no new features should
          be committed to the source tree during the beta phase.

          At the end of the beta phase, a release branch should
          be created with a name of the form athena-8_1, and tagged
          with athena-8_1-early.  A checked-out tree should be made in
          /afs/ for the release branch, with a name
          of the form src-8.1.  It should have a locker with a name of
          the form source-8.1.  A final full build of the system packs
          should be done from the release branch, with the build tree
          located in /afs/  The new
          release build machines should be set up for incremental
          changes to the new release at this point (which means
          turning off the wash).

        * Early

          The early release involves more outside testers and some
          cluster machines.  The release should be considered ready
          for public consumption.

          The release branch should be tagged with a name of the form
          athena-8_1-early.

        * Release

          The release branch should be tagged with a name of the form
          athena-8_1-release.
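
The branch, tag, tree, and locker names mentioned in the beta, early,
and release items above all derive from the release's major and minor
numbers.  As a small illustration (the function and dictionary keys
are made up for this sketch and are not part of any existing tool),
the naming pattern can be written as:

```python
def release_names(major, minor):
    """Derive the names used for one release from its version numbers.

    CVS branch and tag names use an underscore between major and
    minor; the checked-out tree and locker names use a dot.
    """
    dotted = f"{major}.{minor}"
    branch = f"athena-{major}_{minor}"
    return {
        "branch": branch,                    # e.g. athena-8_1
        "early_tag": f"{branch}-early",      # e.g. athena-8_1-early
        "release_tag": f"{branch}-release",  # e.g. athena-8_1-release
        "tree": f"src-{dotted}",             # e.g. src-8.1
        "locker": f"source-{dotted}",        # e.g. source-8.1
    }
```

For release 8.1, release_names(8, 1) yields exactly the example names
used in the text.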

One thing that needs to happen externally during a release cycle, if
there is an OS upgrade involved, is the addition of compatibility
symlinks under the arch directories of various lockers.  All of the
lockers listed in packs/glue/specs definitely need to be hit, and the
popular software lockers need to be hit as well.  Here is a reasonable
list of popular lockers to get in addition to the glue ones:

        consult
        games
        gnu
        graphics
        outland
        sipb
        tcl
        watchmaker
        windowmanagers
        /afs/sipb/project/tcsh

In addition, the third-party software lockers need to be updated; the
third-party software group keeps their own list.
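
The crash and burn sanity checks described earlier in this section
(comparing the new packs' file listing against the previous release's,
and checking for partition overflow) are easy to script.  The
following is only a sketch: the function names are hypothetical, the
listings are assumed to be plain lists of path names, and the usage
figures are assumed to have been gathered separately.  The Solaris
limits are the ones quoted above.

```python
# Most restrictive usable sizes on Solaris clients, in kilobytes,
# as quoted in the crash and burn notes.
SOLARIS_LIMITS_KB = {"/": 27713, "/usr": 51903}


def compare_listings(old_listing, new_listing):
    """Return (added, removed) path names between two pack listings."""
    old, new = set(old_listing), set(new_listing)
    return sorted(new - old), sorted(old - new)


def overflowing_partitions(usage_kb, limits_kb):
    """Return the partitions whose projected usage exceeds the limit.

    usage_kb maps a mount point to the projected size in kilobytes;
    partitions with no known limit are ignored.
    """
    return {
        part: used
        for part, used in usage_kb.items()
        if part in limits_kb and used > limits_kb[part]
    }
```

Unexpected entries in either the added or the removed list are the
things worth a closer look before moving on to alpha.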

Patch releases

Once a release has hit beta test, all changes to the release must be
propagated through patch releases.  The steps to performing a patch
release are:

        * Check in the changes on the mainline (if they apply) and on
          the release branch and update the relevant sections of the
          source tree in /afs/

        * If the update needs to do anything other than track against
          the system packs, you must prepare a version script which
          deals with any transition issues, specifies whether to track
          the OS volume, specifies whether to deal with a kernel
          update, and specifies which if any configuration files need
          to be updated.  See the update script
          (packs/update/ for details.  See
          packs/build/update/platform/*/configfiles for a list of
          configuration files for a given platform.  The version
          script should be checked in on the mainline and on the
          release branch.

        * Make sure to add symlinks in the build tree for any files
          you have added.  Note that you probably added a build script
          if the update needs to do anything other than track against
          the system packs.

        * In the build tree, bump the version number in
          packs/build/version (the symlink should be broken for this
          file to avoid having to change it in the source tree).

        * If you are going to need to update binaries that users run
          from the packs, go into the packs and move (don't copy) them
          into a .deleted directory at the root of the packs.  This is
          especially important for binaries like emacs and dash which
          people run for long periods of time, to avoid making the
          running processes dump core when the packs are released.

        * Update the read-write volume of the packs to reflect the
          changes you've made.  You can use the script to
          build and install specific packages, or you can use the
          script to build the package and then install specific
          files (cutting and pasting from the output of "make -n
          install DESTDIR=/srvd" is the safest way); updating the
          fewest number of files is preferable.  Remember to install
          the version script.

        * Use the script to build and install
          packs/build/finish.  This will fix ownerships and update the
          track lists and the like.

        * It's a good idea to test the update from the read-write
          packs by symlinking the read-write packs to /srvd on a test
          machine and taking the update.  Note that when the machine
          comes back up with the new version, it will probably
          re-attach the read-write packs, so you may have to re-make
          the symlink if you want to test stuff that's on the packs.

        * At some non-offensive time, release the packs in the dev
          cell.

        * Send mail to rel-eng saying that the patch release went out,
          and what was in it.  (You can find many example pieces of
          mail in the discuss archive.)  Include instructions
          explaining how to propagate the release to the athena cell.
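
The "move, don't copy" step for long-running binaries can be
illustrated with a short sketch.  The helper below is hypothetical
(it is not one of the release scripts) and assumes the packs root is
an ordinary directory on a single filesystem.  The point it shows is
that a rename keeps the original file intact under a new name, so the
new binary can be installed at the old path without overwriting the
file that running processes are still using.

```python
import os
from pathlib import Path


def retire_binary(packs_root, relative_path):
    """Move a binary into <packs_root>/.deleted and return its new path.

    The file is renamed rather than copied, so the data already in use
    by running processes is left untouched.  Slashes in the relative
    path are flattened to underscores to keep .deleted a flat
    directory (a naming choice made up for this sketch).
    """
    packs_root = Path(packs_root)
    src = packs_root / relative_path
    deleted = packs_root / ".deleted"
    deleted.mkdir(exist_ok=True)
    dest = deleted / str(relative_path).replace("/", "_")
    os.rename(src, dest)    # same filesystem: the inode is preserved
    return dest
```

After the rename, the new version of the binary is installed at the
original path as a fresh file.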

Rel-eng machines

There are three rel-eng machines for each platform:

        * A current release build machine, for doing incremental
          updates to the last public release.  This machine may also
          be used by developers for building software.

        * A new release build machine, for building and doing
          incremental updates to releases which are still in testing.
          Before a new release goes into testing, this machine should
          perform the wash.  This machine may also be used by
          developers for building software, or if they want a snapshot
          of the new system packs to build things against.

        * A crash and burn machine, usually located in the release
          engineer's office for easy physical access.

Here is a list of the rel-eng machines for each platform:

                                Sun             Indy            O2

Current release build           downy           snuggle         bounce
New release build               whirlpool       kenmore         maytag
Crash and burn                  sourcery        pyramids        reaper-man

For reference, here are some names that fit various laundry and
construction naming schemes:

        * Washing machines: kenmore, whirlpool, ge, maytag
        * Laundry detergents: fab, calgon, era, cheer, woolite,
                tide, ultra-tide
        * Bleaches: clorox, ajax
        * Fabric softeners: downy, final-touch, snuggle, bounce
        * Heavy machinery: steam-shovel, pile-driver, dump-truck,
                wrecking-ball, crane
        * Construction kits: lego, capsela, technics, k-nex, playdoh,
                construx
        * Construction materials: rebar, two-by-four, plywood,
                sheetrock
        * Heavy machinery companies: caterpillar, daewoo, john-deere,
                sumitomo
        * Buildings: empire-state, prudential, chrysler

Cluster information

The getcluster(8) man page explains how clients interpret cluster
information.  This section documents the clusters related to the
release cycle, and how they should be managed.

There are five clusters for each platform, each of the form
PHASE-PLATFORM, where PHASE is a phase of the release cycle (crash,
alpha, beta, early, public) and PLATFORM is the machtype name of the
platform.  There are two filsys entries for each platform and release
pointing to the athena cell and dev cell system packs for the release;
they have the form athena-PLATFORMsys-XY and dev-PLATFORMsys-XY, where
X and Y are the major and minor numbers of the release.  For the SGI,
we currently also have athena-sgi-inst-XY and dev-sgi-inst-XY.

At the crash and burn, alpha, and beta phases of the release cycle,
the appropriate cluster (PHASE-PLATFORM) should be updated to include
data records of the form:

        Label: syslib           Data: dev-PLATFORMsys-XY X.Y t
(SGI)   Label: instlib          Data: dev-sgi-inst-XY X.Y t

This change will cause console messages to appear on the appropriate
machines informing their maintainers of a new testing release which
they can take manually.

At the early and public phases of the release cycle, the 't' should be
removed from the new syslib records in the crash, alpha, and beta
clusters, and the appropriate cluster (early-PLATFORM or
public-PLATFORM) should be updated to include data records:

        Label: syslib           Data: athena-PLATFORMsys-XY X.Y
(SGI)   Label: instlib          Data: athena-sgi-inst-XY X.Y

This change will cause AUTOUPDATE machines in the appropriate cluster
(as well as the crash, alpha, and beta clusters) to take the new
release; console messages will appear on non-AUTOUPDATE machines.
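
The record formats above are regular enough to generate.  The sketch
below is illustrative only: the function is hypothetical, it models
just the records added to a cluster when a release enters a phase
(not the later removal of the 't' flag from the testing clusters), and
it assumes the SGI platform's machtype name is "sgi", as the filsys
names above suggest.

```python
PHASES = ("crash", "alpha", "beta", "early", "public")
TESTING_PHASES = ("crash", "alpha", "beta")


def cluster_records(phase, platform, major, minor):
    """Return (cluster_name, [(label, data), ...]) for a new release.

    Testing phases point at the dev cell packs and carry the 't'
    flag; the early and public phases point at the athena cell packs
    and carry no flag.
    """
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase}")
    testing = phase in TESTING_PHASES
    cell = "dev" if testing else "athena"
    xy = f"{major}{minor}"
    version = f"{major}.{minor}"
    flag = " t" if testing else ""

    records = [("syslib", f"{cell}-{platform}sys-{xy} {version}{flag}")]
    if platform == "sgi":
        # The SGI additionally has an inst filsys for each release.
        records.append(("instlib", f"{cell}-sgi-inst-{xy} {version}{flag}"))
    return f"{phase}-{platform}", records
```

For example, cluster_records("beta", "sgi", 8, 1) names the cluster
beta-sgi and produces a syslib record with data "dev-sgisys-81 8.1 t"
and an instlib record with data "dev-sgi-inst-81 8.1 t".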