Ticket #516 (closed enhancement: fixed)

Opened 14 years ago

Last modified 14 years ago

syslog if auto-update fails

Reported by: geofft
Owned by:
Priority: blocker
Milestone:
Component: --
Keywords:
Cc:
Fixed in version:
Upstream bug:

Description

Syslog when auto-update fails, and coordinate with ops so they pick up the logs and report on them usefully, so that we know in advance when cluster machines haven't updated in a month.

Change History

comment:1 Changed 14 years ago by jdreed

  • Status changed from new to closed
  • Resolution set to fixed

auto-update already syslogs on user.notice if something bad happens.

Mail sent to ops about collecting failure notices, RT #1179647.

I think we can close this.

comment:2 Changed 14 years ago by jdreed

  • Status changed from closed to reopened
  • Resolution fixed deleted

Hrm, or not. Camilla notes that ops is moving to Nagios and away from the gazette/reports for this sort of thing. Having Nagios parse update.log is suckful and prone to false positives, and we lose historical data.

I propose we have auto-update create a flag file somewhere which contains the following:
date-of-last-update date-of-last-attempt up-to-date|failed msg

and expose that over athinfo.

So something like:
$ athinfo w20-575-1 update-status
3/12/10 2:00pm 3/12/10 2:00pm up-to-date
$ athinfo w20-575-1 update-status
10/3/09 1:00pm 3/12/10 2:00pm failed unable to download packages
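
For illustration only, here is a rough sketch of how the flag file proposed above might be written. It is not the auto-update implementation. The function names, the "|" separator, and the timestamp layout are assumptions made for this sketch (a separator other than a space keeps dates that contain spaces parseable), and the file is made world-readable on the assumption that whatever serves the athinfo query runs as a different user.

```python
import os
import tempfile
from datetime import datetime


def format_status(last_update, last_attempt, ok, msg=""):
    """Build one status line: last update, last attempt, state, message.

    The "|" separator and the timestamp layout are assumptions of this
    sketch, not part of the proposal in the ticket.
    """
    state = "up-to-date" if ok else "failed"
    stamp = "%Y-%m-%d %H:%M"
    # Keep the record on a single line even if the message is multi-line.
    clean_msg = msg.replace("\n", " ")
    return "|".join([
        last_update.strftime(stamp),
        last_attempt.strftime(stamp),
        state,
        clean_msg,
    ])


def write_status(path, line):
    """Write the line to a temporary file, then rename it into place.

    The rename means a reader never sees a half-written status file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(line + "\n")
        # mkstemp creates the file as 0600; open it up for other readers.
        os.chmod(tmp, 0o644)
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temporary file if anything above failed.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Calling format_status with the dates from the failed example above and ok=False would produce a single "failed" record carrying the "unable to download packages" message.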

comment:3 Changed 14 years ago by jdreed

  • Status changed from reopened to proposed

comment:4 Changed 14 years ago by broder

  • Status changed from proposed to closed
  • Resolution set to fixed

This has all been moved to production:

fanty:~ evan$ athinfo w20-575-1 update-status
Fri Apr  2 03:42:27 EDT 2010|Sun Apr  4 09:42:14 EDT 2010|ok|No updates
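
A monitoring check could consume that output along the following lines. This is a sketch under assumptions, not the check ops actually runs: the field order (last update, last attempt, state, message) is inferred from the earlier proposal, the names UpdateStatus, parse_update_status, and is_stale are hypothetical, and the 30-day threshold is taken from the "haven't updated in a month" wording in the description. The timezone abbreviation is dropped rather than interpreted, which is good enough for day-level staleness.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class UpdateStatus(NamedTuple):
    last_update: datetime
    last_attempt: datetime
    state: str
    message: str


def _parse_date(text):
    """Parse a date(1)-style timestamp, ignoring the zone abbreviation."""
    parts = text.split()
    if len(parts) != 6:
        raise ValueError(f"unexpected timestamp: {text!r}")
    weekday, month, day, clock, _zone, year = parts
    return datetime.strptime(
        f"{weekday} {month} {day} {clock} {year}",
        "%a %b %d %H:%M:%S %Y",
    )


def parse_update_status(line):
    """Split one '|'-separated status line into its four fields."""
    fields = line.strip().split("|", 3)
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    last_update, last_attempt, state, message = fields
    return UpdateStatus(
        _parse_date(last_update),
        _parse_date(last_attempt),
        state,
        message,
    )


def is_stale(status, now, max_age=timedelta(days=30)):
    """Report whether the last successful update is older than max_age."""
    return now - status.last_update > max_age
```

A check would pass the line returned by the update-status query to parse_update_status and alert when is_stale returns True or the state is anything other than "ok".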