1 | This file contains notes about the care and feeding of the Athena |
---|
2 | source repository. It is intended primarily for the administrators of |
---|
3 | the source tree, not for developers (except perhaps for the first |
---|
4 | section, "mailing lists"). See the file "procedures" in this |
---|
5 | directory for information about procedures relevant to developers. |
---|
6 | |
---|
7 | The areas covered in this file are: |
---|
8 | |
---|
9 | Mailing lists |
---|
10 | Permissions |
---|
11 | The wash process |
---|
12 | Imake templates |
---|
13 | Release notes |
---|
14 | Release cycles |
---|
15 | Patch releases |
---|
16 | Rel-eng machines |
---|
17 | Clusters |
---|
18 | |
---|
19 | Mailing lists |
---|
20 | ------------- |
---|
21 | |
---|
22 | Here are descriptions of the mailing lists related to the source tree: |
---|
23 | |
---|
24 | * source-developers |
---|
25 | |
---|
26 | For discussion of the policy and day-to-day |
---|
27 | maintenance of the repository. This is a public list, |
---|
28 | and there is a public discuss archive on menelaus. |
---|
29 | |
---|
30 | * source-reviewers |
---|
31 | |
---|
32 | For review of changes to be checked into the |
---|
33 | repository. To be a member of this mailing list, you |
---|
34 | must have read access to the non-public parts of the |
---|
35 | source tree, but you do not need to be a staff member. |
---|
36 | There is a non-public discuss archive on menelaus. |
---|
37 | |
---|
38 | * source-commits |
---|
39 | |
---|
40 | This mailing list receives commit logs for all |
---|
41 | commits to the repository. This is a public mailing |
---|
42 | list. There is a public discuss archive on menelaus. |
---|
43 | |
---|
44 | * source-diffs |
---|
45 | |
---|
46 | This mailing list receives commit logs with diffs for |
---|
47 | all commits to the repository. To be on this mailing |
---|
48 | list, you must have read access to the non-public |
---|
49 | parts of the source tree. There is no discuss archive |
---|
50 | for this list. |
---|
51 | |
---|
52 | * source-wash |
---|
53 | |
---|
54 | This mailing list receives mail when the wash process |
---|
55 | blows out. This is a public mailing list. There is |
---|
56 | no discuss archive for this list. |
---|
57 | |
---|
58 | * rel-eng |
---|
59 | |
---|
60 | The release engineering mailing list. Mail goes here |
---|
61 | about patch releases and other release details. There |
---|
62 | is a public archive on menelaus. |
---|
63 | |
---|
64 | * release-team |
---|
65 | |
---|
66 | The mailing list for the release team, which sets |
---|
67 | policy for releases. There is a public archive on |
---|
68 | menelaus (currently, it has the name "release-77"). |
---|
69 | |
---|
70 | Permissions |
---|
71 | ----------- |
---|
72 | |
---|
73 | Following are descriptions of the various groups found on the acls of |
---|
74 | the source tree: |
---|
75 | |
---|
76 | * read:source |
---|
77 | read:staff |
---|
78 | |
---|
79 | These two groups have identical permissions in the |
---|
80 | repository, but read:source contains artificial |
---|
81 | constructs (the builder user and service principals) |
---|
82 | while read:staff contains people. In the future, |
---|
83 | highly restricted source could have access for |
---|
84 | read:source and not read:staff. |
---|
85 | |
---|
86 | Both of these groups have read access to non-public |
---|
87 | areas of the source tree. |
---|
88 | |
---|
89 | * write:staff |
---|
90 | |
---|
91 | Contains developers with commit access to the source |
---|
92 | tree. This group has write access to the repository, |
---|
93 | but not to the checked-out copy of the mainline |
---|
94 | (/mit/source). |
---|
95 | |
---|
96 | * write:update |
---|
97 | |
---|
98 | Contains the service principal responsible for |
---|
99 | updating /mit/source. This group has write access to |
---|
100 | /mit/source but not to the repository. |
---|
101 | |
---|
102 | * adm:source |
---|
103 | |
---|
104 | This group has administrative access to the repository |
---|
105 | and to /mit/source. |
---|
106 | |
---|
107 | system:anyuser has read access to public areas of the source tree and |
---|
108 | list access to the rest. system:authuser occasionally has read access |
---|
109 | to areas that system:anyuser does not (synctree is the only current |
---|
110 | example). |
---|
111 | |
---|
112 | The script CVSROOT/afs-protections.sh in the repository makes sure the |
---|
113 | permissions are correct in the repository or in a working directory. |
---|
114 | Run it from the top level of the repository or of /mit/source, giving |
---|
115 | it the argument "repository" or "wd". |
---|
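As a quick reference, the group descriptions above can be collapsed
into one table per target. The following Python sketch is
illustrative only and is not a transcription of afs-protections.sh:
the dictionary layout and the function name are assumptions, it
covers only non-public directories, and it assumes that
"administrative access" corresponds to the AFS "all" shorthand. The
rights strings are the standard AFS shorthands (read = rl, write =
rlidwk, all = rlidwka).

```python
# Standard AFS ACL shorthands.
READ, WRITE, ADMIN, LIST = "rl", "rlidwk", "rlidwka", "l"

# Hypothetical summary of the groups described above, for a NON-PUBLIC
# directory. "repository" and "wd" mirror the two arguments accepted by
# afs-protections.sh; everything else here is an assumption.
NONPUBLIC_ACLS = {
    "repository": {
        "read:source": READ,
        "read:staff": READ,
        "write:staff": WRITE,    # commit access to the repository
        "adm:source": ADMIN,     # assumed to mean the "all" shorthand
        "system:anyuser": LIST,  # list-only outside public areas
    },
    "wd": {
        "read:source": READ,
        "read:staff": READ,
        "write:update": WRITE,   # the principal that updates /mit/source
        "adm:source": ADMIN,
        "system:anyuser": LIST,
    },
}


def expected_acl(target):
    """Return a copy of the expected ACL for "repository" or "wd"."""
    if target not in NONPUBLIC_ACLS:
        raise ValueError("target must be 'repository' or 'wd'")
    return dict(NONPUBLIC_ACLS[target])
```

Note that write:staff appears only for the repository and
write:update only for the working directory, matching the
descriptions of those two groups.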
116 | |
---|
117 | The wash process |
---|
118 | ---------------- |
---|
119 | |
---|
120 | The wash process is a nightly rebuild of the source repository from |
---|
121 | scratch, intended to alert the source tree maintainers when someone |
---|
122 | checks in a change which causes the source tree to stop building. The |
---|
123 | general architecture of the wash process is: |
---|
124 | |
---|
125 | * Each night at midnight, a machine (currently small-gods) |
---|
126 | performs a cvs update of the checked-out tree in |
---|
127 | /afs/dev.mit.edu/source/src-current. If the cvs update |
---|
128 | fails, the update script sends mail to source-wash@mit.edu. |
---|
129 | This machine is on read:source and write:update. |
---|
130 | |
---|
131 | * Each night at 4:30am, a machine of each architecture |
---|
132 | (currently whirlpool, kenmore, and maytag) performs a build |
---|
133 | of the tree into /var/srvd.new, using the build directory |
---|
134 | /var/build. If the build fails, the wash script copies the |
---|
135 | log of the failed build into AFS and sends mail to |
---|
136 | source-wash@mit.edu with the last few lines of the log. If |
---|
137 | the build succeeds, the wash script moves /var/srvd.new to |
---|
138 | /var/srvd, so that /var/srvd is always the last successful |
---|
139 | build of the source tree. |
---|
140 | |
---|
141 | * Each Sunday at 1:00am, the wash machines make a copy of |
---|
142 | their last successful builds into a "srvd-current" directory |
---|
143 | in AFS. The copy is done without system:administrator |
---|
144 | privileges, so the file permissions on srvd-current are all |
---|
145 | wrong, but the current srvd is useful for development work. |
---|
146 | |
---|
147 | Source for the wash scripts lives in /afs/dev.mit.edu/service/wash. |
---|
148 | They are installed in /usr/local on the wash machines. Logs of the |
---|
149 | start and end times of the wash processes on each machine live in |
---|
150 | /afs/dev.mit.edu/service/wash/status/`hostname`. |
---|
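The build step of the wash can be sketched as follows. This is a
minimal Python illustration of the logic described above, not the
actual wash script: the function name, its parameters, and the
default number of log lines are assumptions, and the caller is
expected to supply the build function and to mail the returned text.

```python
import shutil
from pathlib import Path


def wash_once(build, new_dir, last_good_dir, log_path, tail_lines=20):
    """Run one wash build.

    `build(dest, log)` is a caller-supplied function that builds into
    `dest`, writes its output to `log`, and returns True on success.
    Returns None on success, or the last few log lines on failure so
    the caller can mail them.
    """
    new_dir, last_good_dir = Path(new_dir), Path(last_good_dir)
    log_path = Path(log_path)

    # Always start from an empty destination (the /var/srvd.new role).
    if new_dir.exists():
        shutil.rmtree(new_dir)
    new_dir.mkdir(parents=True)

    if not build(new_dir, log_path):
        # Failure: leave the last good build untouched, report the tail.
        lines = log_path.read_text().splitlines()
        return "\n".join(lines[-tail_lines:])

    # Success: the new build replaces the last good one (/var/srvd).
    if last_good_dir.exists():
        shutil.rmtree(last_good_dir)
    new_dir.rename(last_good_dir)
    return None
```

The point of building into a separate directory is that a failed
build never disturbs the last successful one.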
151 | |
---|
152 | Imake templates |
---|
153 | --------------- |
---|
154 | |
---|
155 | We don't like imake, but we maintain two sets of imake templates: |
---|
156 | |
---|
157 | * packs/build/config |
---|
158 | |
---|
159 | These templates are the legacy Athena build system. |
---|
160 | They are specific to software in the athena hierarchy, |
---|
161 | and one glorious day in the future they will no longer |
---|
162 | be necessary. |
---|
163 | |
---|
164 | For these templates, you should define TOPDIR to the |
---|
165 | top-level source directory. |
---|
166 | |
---|
167 | * packs/build/xconfig |
---|
168 | |
---|
169 | These templates are used for building software which |
---|
170 | uses X-style Imakefiles. They may need periodic |
---|
171 | updating as new versions of X are released. These |
---|
172 | templates are full of a lot of hacks, mostly because |
---|
173 | the imake model isn't really adequate for dealing with |
---|
174 | third-party software and local site customizations. |
---|
175 | |
---|
176 | For these templates, you should define TOPDIR to "." |
---|
177 | and SRCDIR to the top-level source directory. |
---|
178 | |
---|
179 | Release notes |
---|
180 | ------------- |
---|
181 | |
---|
182 | There are two kinds of release notes, the system release notes and the |
---|
183 | user release notes. The system release notes are more comprehensive |
---|
184 | and assume a higher level of technical knowledge, and are used in the |
---|
185 | construction of the user release notes. It is the job of the release |
---|
186 | engineer to produce a set of system release notes for every release, |
---|
187 | with early versions towards the beginning of the release cycle. The |
---|
188 | best way to make sure this happens is to maintain the system release |
---|
189 | notes throughout the entire development cycle. |
---|
190 | |
---|
191 | Thus, it is the job of the release engineer to watch the checkins to |
---|
192 | the source tree and enter a note about all user-visible changes in the |
---|
193 | system release notes, which live in /afs/dev.mit.edu/project/relnotes. |
---|
194 | Highly visible changes should appear near the beginning of the file, |
---|
195 | and less visible changes should appear towards the end. Changes to |
---|
196 | particular subsystems should be grouped together when possible. |
---|
197 | |
---|
198 | Release cycles |
---|
199 | -------------- |
---|
200 | |
---|
201 | Release cycles have five phases: crash and burn, alpha, beta, early, |
---|
202 | and the public release. The release team has a set of criteria for |
---|
203 | entering and exiting each phase, which won't be covered here. The |
---|
204 | following guidelines should help the release go smoothly: |
---|
205 | |
---|
206 | * Crash and burn |
---|
207 | |
---|
208 | This phase is for rel-eng internal testing. The crash and |
---|
209 | burn machines should be identified and used to test the |
---|
210 | install and update. System packs may be generated at will |
---|
211 | by taking snapshots from the wash machine. The system packs |
---|
212 | volume does not need any replication. |
---|
213 | |
---|
214 | System release notes should be prepared during this phase. |
---|
215 | |
---|
216 | Before the transition from crash and burn to alpha, the |
---|
217 | release engineer should do a sanity check on the new packs |
---|
218 | by comparing a file listing of the new packs to a file |
---|
219 | listing of the previous release's packs. The release |
---|
220 | engineer should also check the list of configuration files |
---|
221 | for each platform (in packs/update/platform/*/configfiles) |
---|
222 | and make sure that any configuration files which have |
---|
223 | changed are listed as changed in the version script. |
---|
224 | Finally, the release should be checked to make sure it won't |
---|
225 | overflow partitions on any client machines; currently, SGIs |
---|
226 | are not a problem (because they have one big partition) and |
---|
227 | the most restrictive sizes on Solaris clients are 27713K and |
---|
228 | 51903K of useable space for the root and /usr partitions. |
---|
229 | |
---|
230 | * Alpha |
---|
231 | |
---|
232 | The alpha phase is for internal testing by the release team. |
---|
233 | System packs may still be regenerated at will by taking |
---|
234 | snapshots, but the system packs volume (and os volume) |
---|
235 | should be read-only so it can be updated by a vos release. |
---|
236 | Changes to the packs do not need to be propagated in patch |
---|
237 | releases; testers are expected to be able to ensure |
---|
238 | consistency by forcing repeat updates or reinstalling their |
---|
239 | machines. |
---|
240 | |
---|
241 | A draft of the system release notes should be ready by the |
---|
242 | beginning of this phase. User release notes should be |
---|
243 | prepared during this phase. |
---|
244 | |
---|
245 | Before the transition from alpha to beta, doc/third-party |
---|
246 | should be checked to see if miscellaneous third-party files |
---|
247 | (the ones not under the "third" hierarchy) should be |
---|
248 | updated. |
---|
249 | |
---|
250 | * Beta |
---|
251 | |
---|
252 | The beta phase involves outside testers. System packs and |
---|
253 | os volumes should be replicated on multiple servers, and |
---|
254 | permissions should be set to avoid accidental changes |
---|
255 | (traditionally this means giving write access to |
---|
256 | system:packs, a normally empty group). Changes to the packs |
---|
257 | must be propagated by patch releases. |
---|
258 | |
---|
259 | User release notes should be essentially finished by the end |
---|
260 | of this phase. System release notes may continue to be |
---|
261 | updated as bug fixes occur. Ideally, no new features should |
---|
262 | be committed to the source tree during the beta phase. |
---|
263 | |
---|
264 | At the end of the beta phase, a release branch should |
---|
265 | be created with a name of the form athena-8_1, and tagged |
---|
266 | with athena-8_1-early. A checked-out tree should be made in |
---|
267 | /afs/dev.mit.edu/source for the release branch, with a name |
---|
268 | of the form src-8.1. It should have a locker with a name of |
---|
269 | the form source-8.1. A final full build of the system packs |
---|
270 | should be done from the release branch, with the build tree |
---|
271 | located in /afs/dev.mit.edu/project/release. The new |
---|
272 | release build machines should be set up for incremental |
---|
273 | changes to the new release at this point (which means |
---|
274 | turning off the wash). |
---|
275 | |
---|
276 | * Early |
---|
277 | |
---|
278 | The early release involves more outside testers and some |
---|
279 | cluster machines. The release should be considered ready |
---|
280 | for public consumption. |
---|
281 | |
---|
282 | The release branch should be tagged with a name of the form |
---|
283 | athena-8_1-early. |
---|
284 | |
---|
285 | * Release |
---|
286 | |
---|
287 | The release branch should be tagged with a name of the form |
---|
288 | athena-8_1-release. |
---|
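Two of the checks recommended before leaving crash and burn (the
file listing comparison and the partition size check) are easy to
automate. The following Python sketch shows one possible shape; the
function names and input formats are assumptions, and the limits are
the Solaris figures quoted above.

```python
def compare_listings(old_listing, new_listing):
    """Return (added, removed) paths between two file listings."""
    old, new = set(old_listing), set(new_listing)
    return sorted(new - old), sorted(old - new)


# Most restrictive Solaris client sizes quoted above, in kilobytes.
SOLARIS_LIMITS_KB = {"/": 27713, "/usr": 51903}


def overflowing_partitions(usage_kb, limits_kb=SOLARIS_LIMITS_KB):
    """Return the partitions whose projected usage exceeds the limit.

    `usage_kb` maps a partition name to the space, in kilobytes, that
    the new release would occupy there.
    """
    return sorted(
        part for part, limit in limits_kb.items()
        if usage_kb.get(part, 0) > limit
    )
```

For example, a release projected to use 27000K of root and 52000K of
/usr would be flagged for /usr only.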
289 | |
---|
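The branch, tag, tree, and locker names used in the phases above all
derive from the release's major and minor numbers. A small Python
helper (hypothetical; the function name and dictionary keys are not
part of any existing tool) makes the pattern explicit:

```python
def release_names(major, minor):
    """Derive the names used for a release, following the forms above."""
    underscored = "athena-%d_%d" % (major, minor)
    dotted = "%d.%d" % (major, minor)
    return {
        "branch": underscored,
        "early_tag": underscored + "-early",
        "release_tag": underscored + "-release",
        "tree": "/afs/dev.mit.edu/source/src-" + dotted,
        "locker": "source-" + dotted,
    }
```

Calling release_names(8, 1) yields the athena-8_1 examples used in
this section.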
290 | One thing that needs to happen externally during a release cycle, if |
---|
291 | there is an OS upgrade involved, is the addition of compatibility |
---|
292 | symlinks under the arch directories of various lockers. All of the |
---|
293 | lockers listed in packs/glue/specs definitely need to be hit, and the |
---|
294 | popular software lockers need to be hit as well. Here is a reasonable |
---|
295 | list of popular lockers to get in addition to the glue ones: |
---|
296 | |
---|
297 | consult |
---|
298 | games |
---|
299 | gnu |
---|
300 | graphics |
---|
301 | outland |
---|
302 | sipb |
---|
303 | tcl |
---|
304 | watchmaker |
---|
305 | windowmanagers |
---|
306 | /afs/sipb/project/tcsh |
---|
307 | |
---|
308 | In addition, the third-party software lockers need to be updated; the |
---|
309 | third-party software group keeps their own list. |
---|
310 | |
---|
311 | Patch releases |
---|
312 | -------------- |
---|
313 | |
---|
314 | Once a release has hit beta test, all changes to the release must be |
---|
315 | propagated through patch releases. The steps for performing a patch |
---|
316 | release are: |
---|
317 | |
---|
318 | * Check in the changes on the mainline (if they apply) and on |
---|
319 | the release branch and update the relevant sections of the |
---|
320 | source tree in /afs/dev.mit.edu/source. |
---|
321 | |
---|
322 | * If the update needs to do anything other than track against |
---|
323 | the system packs, you must prepare a version script which |
---|
324 | deals with any transition issues, specifies whether to track |
---|
325 | the OS volume, specifies whether to deal with a kernel |
---|
326 | update, and specifies which if any configuration files need |
---|
327 | to be updated. See the update script |
---|
328 | (packs/update/do-update.sh) for details. See |
---|
329 | packs/build/update/platform/*/configfiles for a list of |
---|
330 | configuration files for a given platform. The version |
---|
331 | script should be checked in on the mainline and on the |
---|
332 | release branch. |
---|
333 | |
---|
334 | * Make sure to add symlinks in the build tree for any files |
---|
335 | you have added. Note that you probably added a version script |
---|
336 | if the update needs to do anything other than track against |
---|
337 | the system packs. |
---|
338 | |
---|
339 | * In the build tree, bump the version number in |
---|
340 | packs/build/version (the symlink should be broken for this |
---|
341 | file to avoid having to change it in the source tree). |
---|
342 | |
---|
343 | * If you are going to need to update binaries that users run |
---|
344 | from the packs, go into the packs and move (don't copy) them |
---|
345 | into a .deleted directory at the root of the packs. This is |
---|
346 | especially important for binaries like emacs and dash which |
---|
347 | people run for long periods of time, to avoid making the |
---|
348 | running processes dump core when the packs are released. |
---|
349 | |
---|
350 | * Update the read-write volume of the packs to reflect the |
---|
351 | changes you've made. You can use the build.sh script to |
---|
352 | build and install specific packages, or you can use the |
---|
353 | do.sh script to build the package and then install specific |
---|
354 | files (cutting and pasting from the output of "make -n |
---|
355 | install DESTDIR=/srvd" is the safest way); updating the |
---|
356 | fewest number of files is preferable. Remember to install |
---|
357 | the version script. |
---|
358 | |
---|
359 | * Use the build.sh script to build and install |
---|
360 | packs/build/finish. This will fix ownerships and update the |
---|
361 | track lists and the like. |
---|
362 | |
---|
363 | * It's a good idea to test the update from the read-write |
---|
364 | packs by symlinking the read-write packs to /srvd on a test |
---|
365 | machine and taking the update. Note that when the machine |
---|
366 | comes back up with the new version, it will probably |
---|
367 | re-attach the read-write packs, so you may have to re-make |
---|
368 | the symlink if you want to test stuff that's on the packs. |
---|
369 | |
---|
370 | * At some non-offensive time, release the packs in the dev |
---|
371 | cell. |
---|
372 | |
---|
373 | * Send mail to rel-eng saying that the patch release went out, |
---|
374 | and what was in it. (You can find many example pieces of |
---|
375 | mail in the discuss archive.) Include instructions |
---|
376 | explaining how to propagate the release to the athena cell. |
---|
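The step about moving binaries into .deleted can be sketched in
Python as follows. The function name and the naming scheme inside
.deleted are assumptions made for illustration. The key property is
that a rename within one filesystem keeps the existing file intact
under a new name, so the replacement binary can then be installed at
the old path as a separate file.

```python
import os
from pathlib import Path


def retire_binary(packs_root, relative_path):
    """Move packs_root/relative_path into packs_root/.deleted.

    `relative_path` is a string such as "bin/emacs". The file is
    renamed, never copied, so it remains the same file on disk.
    Returns the new location.
    """
    root = Path(packs_root)
    source = root / relative_path
    graveyard = root / ".deleted"
    graveyard.mkdir(exist_ok=True)

    # Flatten the path into a single name (a hypothetical convention)
    # and add a numeric suffix if an earlier patch release already
    # retired a file under that name.
    base = relative_path.replace("/", "_")
    target = graveyard / base
    counter = 0
    while target.exists():
        counter += 1
        target = graveyard / ("%s.%d" % (base, counter))

    os.rename(source, target)
    return target
```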
377 | |
---|
378 | Rel-eng machines |
---|
379 | ---------------- |
---|
380 | |
---|
381 | There are three rel-eng machines for each platform: |
---|
382 | |
---|
383 | * A current release build machine, for doing incremental |
---|
384 | updates to the last public release. This machine may also |
---|
385 | be used by developers for building software. |
---|
386 | |
---|
387 | * A new release build machine, for building and doing |
---|
388 | incremental updates to releases which are still in testing. |
---|
389 | Before a new release goes into testing, this machine should |
---|
390 | perform the wash. This machine may also be used by |
---|
391 | developers for building software, or if they want a snapshot |
---|
392 | of the new system packs to build things against. |
---|
393 | |
---|
394 | * A crash and burn machine, usually located in the release |
---|
395 | engineer's office for easy physical access. |
---|
396 | |
---|
397 | Here is a list of the rel-eng machines for each platform: |
---|
398 | |
---|
399 |                        Sun        Indy      O2         |
---|
400 | |
---|
401 | Current release build  downy      snuggle   bounce     |
---|
402 | New release build      whirlpool  kenmore   maytag     |
---|
403 | Crash and burn         sourcery   pyramids  reaper-man |
---|
404 | |
---|
405 | For reference, here are some names that fit various laundry and |
---|
406 | construction naming schemes: |
---|
407 | |
---|
408 | * Washing machines: kenmore, whirlpool, ge, maytag |
---|
409 | * Laundry detergents: fab, calgon, era, cheer, woolite, |
---|
410 | tide, ultra-tide |
---|
411 | * Bleaches: clorox, ajax |
---|
412 | * Fabric softeners: downy, final-touch, snuggle, bounce |
---|
413 | * Heavy machinery: steam-shovel, pile-driver, dump-truck, |
---|
414 | wrecking-ball, crane |
---|
415 | * Construction kits: lego, capsela, technics, k-nex, playdoh, |
---|
416 | construx |
---|
417 | * Construction materials: rebar, two-by-four, plywood, |
---|
418 | sheetrock |
---|
419 | * Heavy machinery companies: caterpillar, daewoo, john-deere, |
---|
420 | sumitomo |
---|
421 | * Buildings: empire-state, prudential, chrysler |
---|
422 | |
---|
423 | Clusters |
---|
424 | -------- |
---|
425 | |
---|
426 | The getcluster(8) man page explains how clients interpret cluster |
---|
427 | information. This section documents the clusters related to the |
---|
428 | release cycle, and how they should be managed. |
---|
429 | |
---|
430 | There are five clusters for each platform, each of the form |
---|
431 | PHASE-PLATFORM, where PHASE is a phase of the release cycle (crash, |
---|
432 | alpha, beta, early, public) and PLATFORM is the machtype name of the |
---|
433 | platform. There are two filsys entries for each platform and release |
---|
434 | pointing to the athena cell and dev cell system packs for the release; |
---|
435 | they have the form athena-PLATFORMsys-XY and dev-PLATFORMsys-XY, where |
---|
436 | X and Y are the major and minor numbers of the release. For the SGI, |
---|
437 | we currently also have athena-sgi-inst-XY and dev-sgi-inst-XY. |
---|
438 | |
---|
439 | At the crash and burn, alpha, and beta phases of the release cycle, |
---|
440 | the appropriate cluster (PHASE-PLATFORM) should be updated to include |
---|
441 | data records of the form: |
---|
442 | |
---|
443 | Label: syslib Data: dev-PLATFORMsys-XY X.Y t |
---|
444 | (SGI) Label: instlib Data: dev-sgi-inst-XY X.Y t |
---|
445 | |
---|
446 | This change will cause console messages to appear on the appropriate |
---|
447 | machines informing their maintainers of a new testing release which |
---|
448 | they can take manually. |
---|
449 | |
---|
450 | At the early and public phases of the release cycle, the 't' should be |
---|
451 | removed from the new syslib records in the crash, alpha, and beta |
---|
452 | clusters, and the appropriate cluster (early-PLATFORM or |
---|
453 | public-PLATFORM) should be updated to include data records: |
---|
454 | |
---|
455 | Label: syslib Data: athena-PLATFORMsys-XY X.Y |
---|
456 | (SGI) Label: instlib Data: athena-sgi-inst-XY X.Y |
---|
457 | |
---|
458 | This change will cause AUTOUPDATE machines in the appropriate cluster |
---|
459 | (as well as the crash, alpha, and beta clusters) to take the new |
---|
460 | release; console messages will appear on non-AUTOUPDATE machines. |
---|
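The record formats above follow a simple rule, which the following
Python sketch spells out. It is an illustration only: the function
name and return format are assumptions, as are the example machtype
strings ("sun4", "sgi"); the sketch does not talk to any cluster
database. It covers the records to add to the cluster for a given
phase, not the removal of the 't' from older records.

```python
TESTING_PHASES = ("crash", "alpha", "beta")
PUBLIC_PHASES = ("early", "public")


def cluster_records(phase, platform, major, minor):
    """Return the cluster name and data records for one phase.

    Testing phases point at the dev cell and carry the 't' flag;
    early and public point at the athena cell with no flag.
    """
    if phase in TESTING_PHASES:
        cell, flag = "dev", " t"
    elif phase in PUBLIC_PHASES:
        cell, flag = "athena", ""
    else:
        raise ValueError("unknown phase: %r" % (phase,))

    xy = "%d%d" % (major, minor)          # e.g. "81"
    version = "%d.%d" % (major, minor)    # e.g. "8.1"
    records = [
        ("syslib", "%s-%ssys-%s %s%s" % (cell, platform, xy, version, flag)),
    ]
    if platform == "sgi":  # assumed machtype name for the SGI platform
        records.append(
            ("instlib", "%s-sgi-inst-%s %s%s" % (cell, xy, version, flag))
        )
    return {"cluster": "%s-%s" % (phase, platform), "records": records}
```

For instance, cluster_records("beta", "sgi", 8, 1) produces the
syslib and instlib records for the beta-sgi cluster, both pointing
at the dev cell and both flagged with 't'.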