Ticket #928 (closed defect: fixed)
cluster logins don't get tokens (and fail) on natty
Reported by: | kaduk | Owned by: | |
---|---|---|---|
Priority: | blocker | Milestone: | Natty Beta |
Component: | -- | Keywords: | |
Cc: | | Fixed in version: | |
Upstream bug: | | | |
Description
In a stock install of cluster on natty, logins fail.
This seems to be largely because when we run the session in the schroot, schroot's pam stack gets run, and it gives the user a new PAG (and keyring entry), but no tokens. Without tokens, access to the homedir fails, so the login bails pretty quickly.
Russ said that this may be because KRB5CCNAME is not in the PAM environment. However, the suggested possible workaround of adding always_aklog to the pam_afs_session arguments did not help. This may be because schroot is not fully running the session stack?
It will probably be useful to instrument a cluster login on lucid and see what has changed, so as to have a better sense of where to poke to fix things.
Change History
comment:1 Changed 13 years ago by jdreed
- Priority changed from normal to blocker
- Milestone changed from The Distant Future to Natty Beta
comment:2 Changed 13 years ago by kaduk
I got to poke at a lucid cluster machine today; it also gets a new keyring entry and PAG in the chroot. However, KRB5CCNAME is passed into the pam environment for schroot's pam stack, which allows pam_afs_session to actually obtain tokens. (I am slightly confused that I only get athena-cell tokens for the 'sipbtest' user, since the 'always_aklog' option is passed to pam_afs_session in pam.d/common-session, and I want to say that this had gotten me tokens for all cells listed in .xlog on lola-granola. Could be the newer pam_afs_session, I suppose.)
So, we need to figure out why pam_krb5 (?) is not introducing KRB5CCNAME into the pam environment, though I fear this may be a "feature" of an updated PAM.
comment:3 Changed 13 years ago by kaduk
Following a tip from Russ, libpam-afs-session_2.4-1 seems to provide a workaround, by always checking the full environment for KRB5CCNAME rather than just checking the PAM environment.
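As a rough illustration of the lookup behavior described above, here is a minimal Python model (not the module's actual C code). The function name `find_ccache`, the use of plain mappings to stand in for the PAM environment and the process environment, and the order of the checks (PAM environment first) are all assumptions made for the sketch.

```python
import os
from typing import Mapping, Optional


def find_ccache(pam_env: Mapping[str, str],
                process_env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Model of a KRB5CCNAME lookup that falls back to the full environment.

    pam_env stands in for the PAM environment; process_env stands in for
    the calling process's environment (defaults to os.environ).
    Hypothetical helper, for illustration only.
    """
    if process_env is None:
        process_env = os.environ
    # Assumed order: prefer the PAM environment when the variable is set there.
    value = pam_env.get("KRB5CCNAME")
    if value:
        return value
    # Fallback: check the ordinary process environment as well.
    return process_env.get("KRB5CCNAME") or None
```

With an empty PAM environment, as seen on natty, `find_ccache({}, {"KRB5CCNAME": "FILE:/tmp/krb5cc_20922_JIjcZi"})` still finds the credential cache, which is what lets tokens be obtained.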
comment:4 follow-up: ↓ 5 Changed 13 years ago by kaduk
I was being silly -- the
D(2): pam_putenv: set KRB5CCNAME=FILE:/tmp/krb5cc_20922_JIjcZi
line I pasted to zephyr that shows KRB5CCNAME being set in the PAM environment on a lucid cluster machine is clearly debugging output from schroot, where schroot itself is setting up a pam environment. In natty's schroot (1.4.17), this is a "minimal environment" which includes just HOME, LOGNAME, PATH, SHELL, and USER; the version 1.4.0 in lucid must have been less minimal, despite the lack of a changelog entry.
So, we can blame the schroot version as the source of the problem, but will probably still need to use the workaround of inserting KRB5CCNAME into the PAM environment ourselves.
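To make the effect of the "minimal environment" concrete, here is a small Python model of such an allowlist filter. It is not schroot's code; the names `MINIMAL_VARS` and `minimal_environment` are hypothetical, and the sample values are only illustrative.

```python
# Variables named in this comment as making up the "minimal environment".
MINIMAL_VARS = ("HOME", "LOGNAME", "PATH", "SHELL", "USER")


def minimal_environment(env):
    """Return a copy of env that keeps only the allowlisted variables."""
    return {name: env[name] for name in MINIMAL_VARS if name in env}


# Illustrative login environment (sample values, not taken from a real session).
login_env = {
    "HOME": "/mit/sipbtest",
    "USER": "sipbtest",
    "PATH": "/usr/bin:/bin",
    "KRB5CCNAME": "FILE:/tmp/krb5cc_20922_JIjcZi",
}

filtered = minimal_environment(login_env)
# KRB5CCNAME is not on the allowlist, so it does not survive the filter.
print("KRB5CCNAME" in filtered)
```

Under this model, anything downstream that only consults the filtered environment never sees the credential cache location, which matches the missing tokens.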
comment:5 in reply to: ↑ 4 Changed 13 years ago by jdreed
Replying to kaduk:
So, we can blame the schroot version as the source of the problem, but will probably still need to use the workaround of inserting KRB5CCNAME into the PAM environment ourselves.
I naively tried this using /etc/security/pam_env.conf and could not convince PAM to copy KRB5CCNAME into the environment. I did succeed in setting KRBNAME=$KRB5CCNAME in snapshot-run, and then setting KRB5CCNAME=$KRBNAME in pam_env.conf, but I still don't end up with tokens. Not sure what I'm missing. And the "debug" arg for pam_env appears to be full of lies. Adding an explicit aklog in Xsession.debathena-orig works, of course, but is wrong.
comment:6 Changed 13 years ago by jdreed
The consensus on zephyr is that we should just debathenify libpam-afs-session and add the code to look in the environment for KRB5CCNAME. Before I put effort into that, I'd love it if someone double-checked to make sure we can't do this with pam_env.
comment:7 Changed 13 years ago by jdreed
- Status changed from new to committed
Committed a hack for this in r25223. Barring a better idea in bounded time, we should move forward with this. It's unclear what we should conditionalize on (if anything). Certainly this will work on Lucid as well. I'm not interested in hearing about distributions other than Lucid and Natty, since we don't support cluster on them.
comment:8 Changed 13 years ago by jdreed
We should figure out whose fault this is and file bugs against the packages. Testing on maverick will give us a better idea of when this broke.
comment:9 Changed 13 years ago by jdreed
- Status changed from committed to development
reactivate 2.0.22 ->dev
comment:11 Changed 13 years ago by jdreed
- Status changed from proposed to closed
- Resolution set to fixed
Remainder of this tracked as #982