Ticket #1258 (closed defect: fixed)

Opened 9 years ago

Last modified 9 years ago

Stop sending afs byte-range lock messages to wslogger

Reported by: jdreed
Owned by:
Priority: high
Milestone: Quantal Quetzal
Component: --
Keywords:
Cc: nthaler@…, jweiss@…
Fixed in version: debathena-syslog-config 1.9
Upstream bug:

Description

Firefox is now doing a lot more byte-range locking than it used to, and AFS likes to spew about this every single time to kern.warning. This causes urania's root partition to fill up. We should drop these client-side, since server-side filtering can't be done.

Change History

comment:1 Changed 9 years ago by jdreed

  • Cc nthaler@… added

comment:2 Changed 9 years ago by jdreed

  • Status changed from new to committed
  • Fixed in version set to debathena-syslog-config 1.7

Tested and committed in r25771.

comment:3 Changed 9 years ago by jdreed

  • Status changed from committed to development

comment:4 Changed 9 years ago by jdreed

  • Status changed from development to proposed

Uploaded to proposed. You should be able to un-block m56-129-{1..3} and w20-575-{1..7} and see no more messages after the machines take the new package (desync'd 6 hours). If nobody complains, this will go to production on Monday.

comment:5 Changed 9 years ago by jdreed

  • Status changed from proposed to closed
  • Resolution set to fixed

Uploaded to production, and machines will update over the next 6 hours.

comment:6 Changed 9 years ago by jweiss

  • Status changed from closed to reopened
  • Resolution fixed deleted

We're still seeing these logs arrive at urania. The first two machines I checked appear to already have the patch:

Oct 2 15:30:58 w20-575-16 W20-575-16 kernel: [953274.620183] afs: byte-range locks only enforced for processes on this machine (pid 28657 (firefox), user 90713).
Oct 2 15:30:58 lib-ldew-17 LIB-LDEW-17 kernel: [79091.246684] afs: byte-range locks only enforced for processes on this machine (pid 18166 (firefox), user 109293).

speaker-for-the-dead:~$ athinfo w20-575-16 packages | grep -i syslog
debathena-syslog 1.1~ubuntu12.04
debathena-syslog-config 1.7~ubuntu12.04
rsyslog 5.8.6-1ubuntu8
speaker-for-the-dead:~$ athinfo lib-ldew-17 packages | grep -i syslog
debathena-syslog 1.1~ubuntu12.04
debathena-syslog-config 1.7~ubuntu12.04
rsyslog 5.8.6-1ubuntu8

comment:7 Changed 9 years ago by jdreed

  • Status changed from reopened to proposed
  • Fixed in version changed from debathena-syslog-config 1.7 to debathena-syslog-config 1.8

OK, well, I give up then. I don't have time to debug this before I leave, so I'm just going to drop everything below err. If _that_ still doesn't work, then I have no idea what is going on. We can re-investigate this later. It's unclear what we're losing by dropping *.warning, but it's clearly an improvement over dropping *.*.

comment:8 follow-up: ↓ 9 Changed 9 years ago by jdreed

  • Cc jweiss@… added

OK, I've built 1.9 (1.7 with "restart" instead of "reload" in the postinst) to proposed. Please unblock all the beta-linux cluster machines and let me know if the logs from them start disappearing over the next 6 hours.

comment:9 in reply to: ↑ 8 Changed 9 years ago by jweiss

Replying to jdreed:

OK, I've built 1.9 (1.7 with "restart" instead of "reload" in the postinst) to proposed. Please unblock all the beta-linux cluster machines and let me know if the logs from them start disappearing over the next 6 hours.

Tentatively, things look good. I don't see any byte-range logs from w20-575-(1|2|3|4|5|6|7) or m56-129-(1|2|3) since Oct 3 15:08:38. That said, as noted yesterday, we're trying to prove a negative here. Unfortunately, my back-of-the-envelope calculations say we should expect byte-range messages from 25-60% of machines in a 24-hour period, so I don't think our 10-machine beta test is guaranteed to be statistically significant yet. Still, it doesn't look like we've accidentally started dropping all logs from these machines or anything, so I don't think it is wrong to push this patch to production and then see if it works, especially given your schedule.
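For anyone who wants to redo the back-of-the-envelope estimate, here is a rough sketch. It assumes each machine independently logs at least one byte-range message in a 24-hour window with probability p (the 25-60% range above); the independence assumption and the function name are mine, not anything measured on urania.

```python
def prob_all_silent(p_per_machine: float, n_machines: int) -> float:
    """Chance that none of n machines logs a byte-range message in the
    window, assuming the fix did nothing and machines are independent."""
    return (1.0 - p_per_machine) ** n_machines


if __name__ == "__main__":
    # 10 beta machines: w20-575-{1..7} and m56-129-{1..3}.
    for p in (0.25, 0.60):
        chance = prob_all_silent(p, 10)
        print(f"p={p:.2f}: P(all 10 silent by chance) = {chance:.6f}")
```

The smaller this number, the more a quiet window counts as evidence that the patch works. The figures only apply to a full 24-hour window; over a shorter observation period the per-machine probability is lower, so silence proves less.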

comment:10 Changed 9 years ago by jdreed

  • Status changed from proposed to closed
  • Fixed in version changed from debathena-syslog-config 1.8 to debathena-syslog-config 1.9
  • Resolution set to fixed

Third time's the charm.

comment:11 follow-up: ↓ 12 Changed 9 years ago by jweiss

Well, so far today, we've seen byte-range log messages from two athena9.4 machines, two cluster workstations that haven't taken the new package yet (E51-075-5 and w20-575-19) and one machine that appears to have updated on Friday, and I have no idea why it is still sending messages (LIB-LMUS-03), but I believe we've more or less solved the problem. I've released all of the blocks on urania (and am leaving this ticket marked as closed).

comment:12 in reply to: ↑ 11 Changed 9 years ago by jweiss

Replying to jweiss:

Well, so far today, we've seen byte-range log messages from two athena9.4 machines, two cluster workstations that haven't taken the new package yet (E51-075-5 and w20-575-19) and one machine that appears to have updated on Friday, and I have no idea why it is still sending messages (LIB-LMUS-03), but I believe we've more or less solved the problem. I've released all of the blocks on urania (and am leaving this ticket marked as closed).

I think I've figured out the LIB-LMUS-03 problem. It appears that firefox/AFS can log so many of these messages together that they get garbled, and only parts of the text appear in the syslog. I hypothesize that this is happening on the workstation itself: the workstation was punting things that contained "afs: byte-range lock", so some garbled messages were still making it to urania, and I was finding them because I was searching with the looser regexp "byte-range". We'll have to see if we've eliminated enough of the logs not to fill urania's disk any more. I'm adding some additional punt regexps to the daily gazette to try to deal with the garbled logs (which, since there are many ways to garble them, don't get rolled together well by the gazette).
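To illustrate the mismatch, here is a small sketch. The intact line is taken from comment:6; the garbled line is invented for illustration (I have not pasted a real one), and the function names are hypothetical. It only models the two patterns mentioned above, not the actual rsyslog or gazette configuration.

```python
import re

# Substring the workstation-side filter is believed to punt on.
STRICT = "afs: byte-range lock"
# Looser pattern used when searching the logs on urania.
LOOSE = re.compile(r"byte-range")

intact = ("kernel: [953274.620183] afs: byte-range locks only enforced for "
          "processes on this machine (pid 28657 (firefox), user 90713).")

# Hypothetical garbled line: two messages interleaved so the "afs: " prefix
# is no longer directly in front of "byte-range lock".
garbled = ("kernel: [953274.620190] byte-range locks only enforced for "
           "procafs: byte-ra")


def dropped_on_workstation(line: str) -> bool:
    """True if the strict client-side match would punt this line."""
    return STRICT in line


def found_on_urania(line: str) -> bool:
    """True if the looser search would turn this line up."""
    return LOOSE.search(line) is not None


if __name__ == "__main__":
    for name, line in (("intact", intact), ("garbled", garbled)):
        print(name, dropped_on_workstation(line), found_on_urania(line))
```

A line like the garbled one slips past the strict match but still shows up in the looser search, which would explain seeing messages from a machine that already has the new package.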
