Ticket #153 (closed defect: fixed)

Opened 6 years ago

Last modified 5 years ago

Do something clever with ~/.xsession-errors so that the dialog box is useful

Reported by: jdreed
Owned by:
Priority: normal
Milestone: Summer 2009 Deployment
Component: dotfiles
Keywords:
Cc:
Fixed in version:
Upstream bug: https://bugs.launchpad.net/bugs/382879

Description

Currently, when logins fail, gdm will happily write the error log to ~/.xsession-errors and then refuse to open it to show it to the user, because gdm has no tokens. While it is possible to log in to another machine remotely and view the file, that kind of sucks, especially for login debugging. Selecting a Failsafe GNOME session won't necessarily help, either, if .xsession-errors is complaining about the login volume lying around.

In /etc/X11/Xsession, $ERRFILE is set to $HOME/.xsession-errors. I wonder if in /etc/gdm/Xsession or the PreSession, we could change this variable to point somewhere else, somewhere that gdm would be able to read the file.

Presumably it's also a bug that gdm opens that error window in a session that doesn't have tokens, and we should report that, but we should also try to fix this before summer.

Change History

comment:1 Changed 6 years ago by ghudson

Here's a walkthrough of the current gdm implementation (you can follow along with "apt-get source gdm"):

  • GDM separates its work into the daemon process which runs as root, the greeter process which runs as user gdm, and the slave login process which starts out running as root and then switches to the user being logged in. The GUI code is a separate program (gdmgreeter) while the slave login code is part of gdm itself. Only the slave code ever gets access to the user home directory.
  • The slave redirects output to $HOME/.xsession-errors; this is hardcoded. (The line you found in /etc/X11/Xsession does not apply to the gdm world; see the comments in /etc/gdm/Xsession, which is the operable script.) If the session fails, the slave sends a little RPC-like request to the daemon asking it to show the .xsession-errors file; the filename is again hardcoded with no configuration. This is all in daemon/slave.c; search for "xsession-errors".
  • Back in daemon/gdm.c (search for GDM_SOP_SHOW_ERROR_DIALOG), the daemon receives the message and filename from the slave. It then displays an error dialog directly to the display--although the greeter process normally handles GUI interactions, it is not involved here. This is convenient for gdm because the greeter process wouldn't have permissions to read user homedirs anyway.

So, unfortunately, there is probably no configuration change to alleviate the problem. I can see two avenues for a code fix which might make it upstream:

(1) Make the xsession-errors filename configurable.

(2) Pass the contents of xsession-errors from the slave to the daemon, and not just the filename. It's just going to get read into memory by the daemon process anyway.

comment:2 Changed 6 years ago by wdc

This bug is very important to resolve, so I've also added ATN-61, which points at this Trac ticket:
https://jira.mit.edu/jira/browse/ATN-61

comment:3 Changed 6 years ago by ghudson

I suppose we could stop relying on gdm to display the contents of xsession-errors, and change reactivate's version of /etc/gdm/Xsession to display that file if the session fails.

(Arguably this applies to -login-graphical and not just -cluster, so reactivate isn't the best integration point, but reactivate currently eats all of the good integration points.)
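
For illustration only, the idea of having the session script show the error file itself could be sketched roughly as below. This is not the contents of reactivate's /etc/gdm/Xsession; the function name run_session_with_report and its three arguments are hypothetical, and the sketch assumes the script (rather than the gdm slave) controls where the session output goes.

```shell
#!/bin/sh
# Hypothetical sketch: run a session command and, if it fails, display its
# error log. None of these names are defined by gdm or reactivate.

run_session_with_report() {
    session_cmd=$1      # command that starts the session
    errfile=$2          # file that collects the session's output
    show_cmd=$3         # program that displays a file to the user

    # Run the session with stdout and stderr collected in the error file.
    sh -c "$session_cmd" >"$errfile" 2>&1
    session_rc=$?

    # On failure, show the log from here, where the file is still readable.
    if [ "$session_rc" -ne 0 ]; then
        $show_cmd "$errfile"
    fi
    return "$session_rc"
}
```

A call might look like `run_session_with_report "gnome-session" "$HOME/.xsession-errors" "xmessage -file"`, with the viewer chosen to taste. The point of the design is that the viewer runs in the same context as the session, so it does not depend on gdm being able to read the home directory.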

comment:4 Changed 6 years ago by jdreed

Here's a terrible idea (although it doesn't involve modifying any upstream files):

We add the following at the top of /usr/athena/lib/init/xsession.tcsh

xconsole -daemon -file $HOME/.xsession-errors 
echo "******  Running Athena Dotfiles ******"

and then right before the exec at the bottom:

pkill xconsole

This makes xconsole pop up in time for the user to see any output in it before they acknowledge GNOME's "Your session lasted less than 10 seconds" dialog. Of course, it makes xconsole pop up for all tcsh users, but it quickly vanishes if all is well.

Basically, the problem I'm trying to solve is tcsh users who reference undefined variables in their dotfiles. If we're going to drastically change dotfile and shell behavior, we have to present the users with useful error information. I'm less concerned about bash users because undefined variables aren't fatal in bash. I'm also not that concerned about all tcsh users seeing xconsole briefly pop up and vanish -- that could be part of our "Here's why you want to switch to bash" campaign.

comment:5 Changed 6 years ago by tabbott

If we're going this route, we should record the pid from the call to xconsole and just kill that process later, rather than using pkill.
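
A rough sketch of that pid-recording pattern follows. It is written in POSIX sh for illustration, whereas the real file is tcsh, and VIEWER_CMD is a hypothetical stand-in for something like "xconsole -file $HOME/.xsession-errors". It assumes the viewer is started with "&" and without xconsole's -daemon flag, since a program that forks itself into the background would leave $! pointing at a parent that has already exited.

```shell
#!/bin/sh
# Hypothetical sketch of "record the pid, kill that pid later".
# VIEWER_CMD is an illustrative placeholder; "sleep 60" just keeps the
# example runnable on a machine without X.
VIEWER_CMD=${VIEWER_CMD:-"sleep 60"}

# Start the viewer in the background and remember which process it is.
$VIEWER_CMD &
viewer_pid=$!

echo "******  Running Athena Dotfiles ******"
# ... dotfile processing would happen here ...

# Right before handing off to the session, stop only the process we started.
kill "$viewer_pid" 2>/dev/null || true
wait "$viewer_pid" 2>/dev/null || true
```

Compared with pkill, this cannot take down an unrelated xconsole that the user happens to be running.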

comment:6 Changed 5 years ago by jdreed

  • Priority changed from minor to major
  • Milestone set to Fall Release

At release-team, we agreed to go with the annoying xconsole-based solution. Tim's suggestion of recording the pid and killing it later needs to be implemented, and then we should package this.

This should get deployed either before or at the same time as the dotfile cleanup (#151), because the whole point of this is to allow users to see what caused weird login failures.

comment:7 Changed 5 years ago by jdreed

  • Component changed from -- to dotfiles
  • Milestone changed from Fall Release to Summer Deployment

Moving to Summer milestone to be on par with #256.

These should happen for the Summer Milestone so that we can update our training docs to reflect things like the new prompt for tcsh users.

comment:8 Changed 5 years ago by jdreed

  • Upstream bug set to https://bugs.launchpad.net/bugs/382879

comment:9 Changed 5 years ago by broder

  • Status changed from new to proposed

Should be fixed in r23841. That fix has been uploaded to proposed. Other people should test it before thinking about moving it into production.

comment:10 Changed 5 years ago by jdreed

The current version in proposed (r23852) has been tested.

comment:11 Changed 5 years ago by broder

  • Status changed from proposed to closed
  • Resolution set to fixed

Fix moved to production.
