source: trunk/doc/maintenance @ 10296

Revision 10296, 17.7 KB checked in by ghudson, 27 years ago
Correct /source to /mit/source.
This file contains notes about the care and feeding of the Athena
source repository.  It is intended primarily for the administrators of
the source tree, not for developers (except perhaps for the first
section, "mailing lists").  See the file "procedures" in this
directory for information about procedures relevant to developers.

The areas covered in this file are:

        Mailing lists
        Permissions
        The wash process
        Imake templates
        Release notes
        Release cycles
        Patch releases
        Rel-eng machines
        Cluster information

Mailing lists
-------------

Here are descriptions of the mailing lists related to the source tree:

        * source-developers

                For discussion of the policy and day-to-day
                maintenance of the repository.  This is a public list,
                and there is a public discuss archive on menelaus.

        * source-reviewers

                For review of changes to be checked into the
                repository.  To be a member of this mailing list, you
                must have read access to the non-public parts of the
                source tree, but you do not need to be a staff member.
                There is a non-public discuss archive on menelaus.

        * source-commits

                This mailing list receives commit logs for all
                commits to the repository.  This is a public mailing
                list.  There is a public discuss archive on menelaus.

        * source-diffs

                This mailing list receives commit logs with diffs for
                all commits to the repository.  To be on this mailing
                list, you must have read access to the non-public
                parts of the source tree.  There is no discuss archive
                for this list.

        * source-wash

                This mailing list receives mail when the wash process
                blows out.  This is a public mailing list.  There is
                no discuss archive for this list.

        * rel-eng

                The release engineering mailing list.  Mail goes here
                about patch releases and other release details.  There
                is a public archive on menelaus.

        * release-team

                The mailing list for the release team, which sets
                policy for releases.  There is a public archive on
                menelaus (currently, it has the name "release-77").

Permissions
-----------

Following are descriptions of the various groups found on the acls of
the source tree:

        * read:source
          read:staff

                These two groups have identical permissions in the
                repository, but read:source contains artificial
                constructs (the builder user and service principals)
                while read:staff contains people.  In the future,
                highly restricted source could have access for
                read:source and not read:staff.

                Both of these groups have read access to non-public
                areas of the source tree.

        * write:staff

                Contains developers with commit access to the source
                tree.  This group has write access to the repository,
                but not to the checked-out copy of the mainline
                (/mit/source).

        * write:update

                Contains the service principal responsible for
                updating /mit/source.  This group has write access to
                /mit/source but not to the repository.

        * adm:source

                This group has administrative access to the repository
                and to /mit/source.

system:anyuser has read access to public areas of the source tree and
list access to the rest.  system:authuser occasionally has read access
to areas that system:anyuser does not (synctree is the only current
example).

The script CVSROOT/afs-protections.sh in the repository makes sure the
permissions are correct in the repository or in a working directory.
Run it from the top level of the repository or of /mit/source, giving
it the argument "repository" or "wd".
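
As a rough illustration of the intended end state, the group
descriptions above can be written as a small table.  This is not the
contents of CVSROOT/afs-protections.sh, which this file does not
reproduce.  The function name expected_acl and the choice of rights
strings are assumptions made for this sketch; "rl", "rlidwk", and
"rlidwka" are the usual AFS expansions of read, write, and all.

```python
# Hypothetical summary of the groups described above; the real
# afs-protections.sh may set different or additional entries.
READ, WRITE, ALL, LIST = "rl", "rlidwk", "rlidwka", "l"


def expected_acl(target, public):
    """Return {group: rights} for one directory.

    target is "repository" or "wd" (the checked-out /mit/source tree);
    public says whether the directory is in a public area of the tree.
    """
    if target not in ("repository", "wd"):
        raise ValueError("target must be 'repository' or 'wd'")
    acl = {
        "read:source": READ,
        "read:staff": READ,
        "adm:source": ALL,
        # Anonymous users may read public areas and only list the rest.
        "system:anyuser": READ if public else LIST,
    }
    if target == "repository":
        acl["write:staff"] = WRITE    # developers commit to the repository
    else:
        acl["write:update"] = WRITE   # only the update principal writes the wd
    return acl


print(expected_acl("wd", public=False))
```

The sketch leaves out system:authuser, since the text names only one
area (synctree) where it differs from system:anyuser.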

The wash process
----------------

The wash process is a nightly rebuild of the source repository from
scratch, intended to alert the source tree maintainers when someone
checks in a change which causes the source tree to stop building.  The
general architecture of the wash process is:

        * Each night at midnight, a machine (currently small-gods)
          performs a cvs update of the checked-out tree in
          /afs/dev.mit.edu/source/src-current.  If the cvs update
          fails, the update script sends mail to source-wash@mit.edu.
          This machine is on read:source and write:update.

        * Each night at 4:30am, a machine of each architecture
          (currently whirlpool and snuggle) recreates empty /build and
          /localsrvd filesystems and performs a build of the tree with
          /srvd pointed at /localsrvd.  If the build fails, the update
          script sends mail to source-wash@mit.edu with the last few
          lines of the wash log, and saves the wash log in /var/wash
          on the local disk.
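
The failure report in the second step can be sketched as follows.
This is not the actual wash script (its source lives in AFS, not in
this file); the function name wash_failure_mail, the subject wording,
and the default of 20 lines are assumptions.  Only the destination
address and the idea of mailing the tail of the log come from the
text above.

```python
from email.message import EmailMessage


def wash_failure_mail(hostname, log_text, tail=20):
    """Build a failure notice carrying the last `tail` lines of a wash log."""
    last_lines = log_text.splitlines()[-tail:]
    msg = EmailMessage()
    msg["To"] = "source-wash@mit.edu"
    msg["Subject"] = f"Wash failed on {hostname}"   # wording is made up
    msg.set_content(
        f"Last {len(last_lines)} lines of the wash log:\n\n"
        + "\n".join(last_lines)
        + "\n"
    )
    return msg


# A fake 100-line log stands in for the real wash log.
log = "\n".join(f"line {i}" for i in range(1, 101))
msg = wash_failure_mail("whirlpool", log, tail=5)
print(msg["To"])
print(msg.get_content())
```

Sending the message (for example with smtplib) and saving the full
log under /var/wash are left out to keep the sketch self-contained.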

Source for the wash scripts lives in /afs/dev.mit.edu/service/wash.
They are installed in /usr/local on the wash machines, along with a
copy of krbtgp from the net-tools locker in /usr/local/bin.  Logs of
the start and end times of the wash processes on each machine live in
/afs/dev.mit.edu/service/wash/status/`hostname`.

Imake templates
---------------

We don't like imake, but we maintain two sets of imake templates:

        * packs/build/config

                These templates are the legacy Athena build system.
                They are specific to software in the athena hierarchy,
                and one glorious day in the future they will no longer
                be necessary.

                For these templates, you should define TOPDIR to the
                top-level source directory.

        * packs/build/xconfig

                These templates are used for building software which
                uses X-style Imakefiles.  They may need periodic
                updating as new versions of X are released.  These
                templates are full of a lot of hacks, mostly because
                the imake model isn't really adequate for dealing with
                third-party software and local site customizations.

                For these templates, you should define TOPDIR to "."
                and SRCDIR to the top-level source directory.

Release notes
-------------

There are two kinds of release notes, the system release notes and the
user release notes.  The system release notes are more comprehensive
and assume a higher level of technical knowledge, and are used in the
construction of the user release notes.  It is the job of the release
engineer to produce a set of system release notes for every release,
with early versions towards the beginning of the release cycle.  The
best way to make sure this happens is to maintain the system release
notes throughout the entire development cycle.

Thus, it is the job of the release engineer to watch the checkins to
the source tree and enter a note about all user-visible changes in the
system release notes, which live in /afs/dev.mit.edu/project/relnotes.
Highly visible changes should appear near the beginning of the file,
and less visible changes should appear towards the end.  Changes to
particular subsystems should be grouped together when possible.

Release cycles
--------------

Release cycles have five phases: crash and burn, alpha, beta, early,
and the public release.  The release team has a set of criteria for
entering and exiting each phase, which won't be covered here.  The
following guidelines should help the release go smoothly:

        * Crash and burn

          This phase is for rel-eng internal testing.  The crash and
          burn machines should be identified and used to test the
          install and update.  System packs may be generated at will
          by taking snapshots from the wash machine.  The system packs
          volume does not need any replication.

          System release notes should be prepared during this phase.

          Before the transition from crash and burn to alpha, the
          release engineer should do a sanity check on the new packs
          by comparing a file listing of the new packs to a file
          listing of the previous release's packs.  The release
          engineer should also check the list of configuration files
          for each platform (in packs/update/platform/*/configfiles)
          and make sure that any configuration files which have
          changed are listed as changed in the version script.
          Finally, the release should be checked to make sure it won't
          overflow partitions on any client machines; currently, SGIs
          are not a problem (because they have one big partition) and
          the most restrictive sizes on Solaris clients are 27713K and
          51903K of useable space for the root and /usr partitions.

        * Alpha

          The alpha phase is for internal testing by the release team.
          System packs may still be regenerated at will by taking
          snapshots, but the system packs volume (and os volume)
          should be read-only so it can be updated by a vos release.
          Changes to the packs do not need to be propagated in patch
          releases; testers are expected to be able to ensure
          consistency by forcing repeat updates or reinstalling their
          machines.

          A draft of the system release notes should be ready by the
          beginning of this phase.  User release notes should be
          prepared during this phase.

          Before the transition from alpha to beta, doc/third-party
          should be checked to see if miscellaneous third-party files
          (the ones not under the "third" hierarchy) should be
          updated.

          At the end of the alpha phase, a release branch should
          be created with a name of the form athena-8_1, and tagged
          with athena-8_1-beta.  A checked-out tree should be made in
          /afs/dev.mit.edu/source for the release branch, with a name
          of the form src-8.1.  A final snapshot of the system packs
          should be constructed from the release branch, and the build
          tree copied into /afs/dev.mit.edu/project/release.  Build
          machines for the new release should be set up.

        * Beta

          The beta phase involves outside testers.  System packs and
          os volumes should be replicated on multiple servers, and
          permissions should be set to avoid accidental changes
          (traditionally this means giving write access to
          system:packs, a normally empty group).  Changes to the packs
          must be propagated by patch releases.

          User release notes should be essentially finished by the end
          of this phase.  System release notes may continue to be
          updated as bug fixes occur.

        * Early

          The early release involves more outside testers and some
          cluster machines.  The release should be considered ready
          for public consumption.

          The release branch should be tagged with a name of the form
          athena-8_1-early.

        * Release

          The release branch should be tagged with a name of the form
          athena-8_1-release.
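
The branch, tag, and tree names used in the phases above follow one
pattern, which can be spelled out as a small helper.  The function
name release_names is an assumption for this sketch; the name forms
themselves (athena-8_1, athena-8_1-beta, src-8.1) are the ones given
in the list.

```python
def release_names(major, minor):
    """Return the branch, tag, and source tree names for release major.minor."""
    branch = f"athena-{major}_{minor}"
    return {
        "branch": branch,
        # One tag per phase that the text says should be tagged.
        "tags": {
            phase: f"{branch}-{phase}"
            for phase in ("beta", "early", "release")
        },
        "tree": f"/afs/dev.mit.edu/source/src-{major}.{minor}",
    }


names = release_names(8, 1)
print(names["branch"])
print(names["tags"]["beta"])
print(names["tree"])
```

The helper only builds names; creating the branch and tags in CVS is
still a manual step.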

One thing that needs to happen externally during a release cycle, if
there is an OS upgrade involved, is the addition of compatibility
symlinks under the arch directories of various lockers.  All of the
lockers listed in packs/glue/specs definitely need to be hit, and the
popular software lockers need to be hit as well.  Here is a reasonable
list of popular lockers to get in addition to the glue ones:

        consult
        games
        gnu
        graphics
        outland
        sipb
        tcl
        watchmaker
        windowmanagers
        /afs/sipb/project/tcsh

In addition, the third-party software lockers need to be updated; the
third-party software group keeps their own list.

Patch releases
--------------

Once a release has hit beta test, all changes to the release must be
propagated through patch releases.  The steps to performing a patch
release are:

        * Check in the changes on the mainline (if they apply) and on
          the release branch and update the relevant sections of the
          source tree in /afs/dev.mit.edu/source.

        * If the update needs to do anything other than track against
          the system packs, you must prepare a version script which
          deals with any transition issues, specifies whether to track
          the OS volume, specifies whether to deal with a kernel
          update, and specifies which if any configuration files need
          to be updated.  See the update script
          (packs/update/do-update.sh) for details.  See
          packs/build/update/platform/*/configfiles for a list of
          configuration files for a given platform.  The version
          script should be checked in on the mainline and on the
          release branch.

        * Make sure to add symlinks in the build tree for any files
          you have added.  Note that you probably added a build script
          if the update needs to do anything other than track against
          the system packs.

        * In the build tree, bump the version number in
          packs/build/version (the symlink should be broken for this
          file to avoid having to change it in the source tree).

        * If you are going to need to update binaries that users run
          from the packs, go into the packs and move (don't copy) them
          into a .deleted directory at the root of the packs.  This is
          especially important for binaries like emacs and dash which
          people run for long periods of time, to avoid making the
          running processes dump core when the packs are released.

        * Update the read-write volume of the packs to reflect the
          changes you've made.  You can use the build.sh script to
          build and install specific packages, or you can use the
          do.sh script to build the package and then install specific
          files (cutting and pasting from the output of "make -n
          install DESTDIR=/srvd" is the safest way); updating the
          fewest number of files is preferable.  Remember to install
          the version script.

        * Use the build.sh script to build and install
          packs/build/finish.  This will fix ownerships and update the
          track lists and the like.

        * It's a good idea to test the update from the read-write
          packs by symlinking the read-write packs to /srvd on a test
          machine and taking the update.  Note that when the machine
          comes back up with the new version, it will probably
          re-attach the read-write packs, so you may have to re-make
          the symlink if you want to test stuff that's on the packs.

        * At some non-offensive time, release the packs in the dev
          cell.

        * Send mail to rel-eng saying that the patch release went out,
          and what was in it.  (You can find many example pieces of
          mail in the discuss archive.)  Include instructions
          explaining how to propagate the release to the athena cell.
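
The ".deleted" step above is the easiest one to get wrong, so here is
a minimal sketch of it.  The function name retire_binary, the
flattened destination name, and the usr/athena/bin/emacs path are all
assumptions for illustration; the sketch runs against a scratch
directory rather than real packs.  The point it shows is that the old
binary is renamed into .deleted, so the same file stays in the volume
for running processes, and the new binary is installed afterwards as
a separate file.

```python
import os
import tempfile


def retire_binary(packs_root, relpath):
    """Move (not copy) packs_root/relpath into packs_root/.deleted."""
    deleted = os.path.join(packs_root, ".deleted")
    os.makedirs(deleted, exist_ok=True)
    src = os.path.join(packs_root, relpath)
    # Flatten the path so usr/athena/bin/emacs becomes usr_athena_bin_emacs.
    dst = os.path.join(deleted, relpath.replace(os.sep, "_"))
    if os.path.exists(dst):
        raise FileExistsError(dst)
    os.rename(src, dst)   # a rename keeps the same file; a copy would not
    return dst


# Demonstration in a scratch directory standing in for the packs.
packs = tempfile.mkdtemp()
os.makedirs(os.path.join(packs, "usr", "athena", "bin"))
emacs = os.path.join(packs, "usr", "athena", "bin", "emacs")
with open(emacs, "w") as f:
    f.write("old binary\n")

moved = retire_binary(packs, os.path.join("usr", "athena", "bin", "emacs"))
print(os.path.exists(emacs), os.path.basename(moved))
```

Refusing to overwrite an existing entry in .deleted is a design choice
of the sketch: a binary retired by an earlier patch release may still
be in use.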

Rel-eng machines
----------------

There are six roles for rel-eng machines for each platform:

        * A wash machine, for nightly rebuilds of the source tree
          during the development cycle.

        * A crash and burn machine, for testing the release.

        * A current release build machine, for doing incremental
          updates to the last public release.

        * A new release build machine, for doing incremental updates
          to the new release during the beta and early phases.

        * A current release developer machine, for other developers to
          build and test software on under the current release.

        * A new release developer machine, for other developers to
          build and test software on under the next release.

Six machines for each platform is a lot, especially when two of them
are only needed during a release cycle.  The following modifications
can collapse the number of required machines to three:

        * During the beta and early phases of a release cycle, the
          wash can be shut down and the wash machines used as new
          release build engines.

        * The new release build machine can be used as a new release
          developer machine during the beta and early phases.  Having
          a separate machine is preferable since it is useful to have
          a new release developer machine during the entire
          development cycle, not just during the last two phases of
          the release cycle.  Sometimes the crash and burn machine may
          be useful for developers, although it cannot be treated as a
          reliable resource.

        * The current release build machine can be used as a current
          release developer machine.

Here is a list of the rel-eng machines for each platform, with repeat
machine names listed in parentheses:

                                        Sun             SGI

        Wash                            whirlpool       kenmore
        Current release build           downy           snuggle
        Crash and burn                  sourcery        pyramids
        New release build               (whirlpool)     (snuggle)
        Current release developer       (downy)         (kenmore)
        New release developer           (whirlpool)     (snuggle)
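
To check that the table really does collapse six roles onto three
machines per platform, it can be inverted into a machine-to-roles
map.  The names ROLES, MACHINES, and roles_by_machine are assumptions
for this sketch; the host assignments are copied from the table, with
the parentheses dropped.

```python
ROLES = [
    "Wash",
    "Current release build",
    "Crash and burn",
    "New release build",
    "Current release developer",
    "New release developer",
]

# One host per role, in the same order as ROLES.
MACHINES = {
    "Sun": ["whirlpool", "downy", "sourcery", "whirlpool", "downy", "whirlpool"],
    "SGI": ["kenmore", "snuggle", "pyramids", "snuggle", "kenmore", "snuggle"],
}


def roles_by_machine(platform):
    """Map each machine on a platform to the list of roles it fills."""
    result = {}
    for role, host in zip(ROLES, MACHINES[platform]):
        result.setdefault(host, []).append(role)
    return result


for platform in MACHINES:
    for host, roles in roles_by_machine(platform).items():
        print(f"{platform:4} {host:10} {', '.join(roles)}")
```

Both platforms come out at three machines, but the doubling differs:
on the Sun side the wash machine also serves as the new release build
and developer machine, while on the SGI side the current release
build machine takes those two roles.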

For reference, here are some names that fit various laundry and
construction naming schemes:

        * Washing machines: kenmore, whirlpool, ge, maytag
        * Laundry detergents: fab, calgon, era, cheer, woolite,
                tide, ultra-tide
        * Bleaches: clorox, ajax
        * Fabric softeners: downy, final-touch, snuggle, bounce
        * Heavy machinery: steam-shovel, pile-driver, dump-truck,
                wrecking-ball, crane
        * Construction kits: lego, capsela, technics, k-nex, playdoh,
                construx
        * Construction materials: rebar, two-by-four, plywood,
                sheetrock
        * Heavy machinery companies: caterpillar, daewoo, john-deere,
                sumitomo
        * Buildings: empire-state, prudential, chrysler

Clusters
--------

The getcluster(8) man page explains how clients interpret cluster
information.  This section documents the clusters related to the
release cycle, and how they should be managed.

There are five clusters for each platform, each of the form
PHASE-PLATFORM, where PHASE is a phase of the release cycle (crash,
alpha, beta, early, public) and PLATFORM is the machtype name of the
platform.  There are two filsys entries for each platform and release
pointing to the athena cell and dev cell system packs for the release;
they have the form athena-PLATFORMsys-XY and dev-PLATFORMsys-XY, where
X and Y are the major and minor numbers of the release.  For the SGI,
we currently also have athena-sgi-inst-XY and dev-sgi-inst-XY.

At the crash and burn, alpha, and beta phases of the release cycle,
the appropriate cluster (PHASE-PLATFORM) should be updated to include
data records of the form:

        Label: syslib           Data: dev-PLATFORMsys-XY X.Y t
(SGI)   Label: instlib          Data: dev-sgi-inst-XY X.Y t

This change will cause console messages to appear on the appropriate
machines informing their maintainers of a new testing release which
they can take manually.

At the early and public phases of the release cycle, the 't' should be
removed from the new syslib records in the crash, alpha, and beta
clusters, and the appropriate cluster (early-PLATFORM or
public-PLATFORM) should be updated to include data records:

        Label: syslib           Data: athena-PLATFORMsys-XY X.Y
(SGI)   Label: instlib          Data: athena-sgi-inst-XY X.Y

This change will cause AUTOUPDATE machines in the appropriate cluster
(as well as the crash, alpha, and beta clusters) to take the new
release; console messages will appear on non-AUTOUPDATE machines.
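
The record forms above can be generated mechanically.  The following
is a sketch only: the function name cluster_records is made up, and
"sun4" and "sgi" are assumed machtype names used as examples.  It
covers the records added to the cluster for the current phase; it
does not model removing the 't' from the earlier clusters at the
early and public phases.

```python
TESTING_PHASES = ("crash", "alpha", "beta")
RELEASE_PHASES = ("early", "public")


def cluster_records(phase, platform, major, minor):
    """Return (cluster name, [(label, data), ...]) for a new release."""
    if phase in TESTING_PHASES:
        cell, flag = "dev", " t"      # testing releases point at the dev cell
    elif phase in RELEASE_PHASES:
        cell, flag = "athena", ""     # real releases point at the athena cell
    else:
        raise ValueError(f"unknown phase: {phase}")
    xy = f"{major}{minor}"
    version = f"{major}.{minor}"
    records = [("syslib", f"{cell}-{platform}sys-{xy} {version}{flag}")]
    if platform == "sgi":
        # The SGI also needs an instlib record.
        records.append(("instlib", f"{cell}-sgi-inst-{xy} {version}{flag}"))
    return f"{phase}-{platform}", records


for phase, platform in (("beta", "sgi"), ("public", "sun4")):
    cluster, records = cluster_records(phase, platform, 8, 1)
    for label, data in records:
        print(f"{cluster}: Label: {label}  Data: {data}")
```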