Ticket #476 (closed task: fixed)
Follow-up with Kernel on ENOEXEC/ENOENT for libc5 binaries
Reported by: | jdreed | Owned by: | jdreed |
---|---|---|---|
Priority: | normal | Milestone: | Current Semester |
Component: | -- | Keywords: | |
Cc: | | Fixed in version: | debathena-ldso1-stub 1.1, debathena-workstation 1.10 |
Upstream bug: | http://lkml.org/lkml/2009/7/9/74 |
Description
This continues #219 and ATN-41.
Geoff sent a patch to kernel.org, which was rejected. Relevant threads are http://lkml.org/lkml/2009/7/9/74 and http://lkml.org/lkml/2009/7/30/211
It's fairly clear from the POSIX spec (see the "ERRORS" section of http://www.opengroup.org/onlinepubs/9699919799/functions/exec.html) that the current kernel behavior of ENOENT when /lib/ld-linux.so.1 doesn't exist is wrong, and that it should be ENOEXEC or EINVAL.
Note the paragraph under RATIONALE which begins "One common historical implementation is that the execl(), execv(), execle(), and execve() functions return an [ENOEXEC] error...".
We should follow up and point out that the current behavior does not conform to POSIX. They may or may not decide to care.
Change History
comment:3 Changed 14 years ago by geofft
... huh. Does Fedora/RHEL have such a kernel patch? Compare:
shining-armor:~/web_scripts geofft$ /mit/mime/bin/richtext
bash: /mit/mime/bin/richtext: /lib/ld-linux.so.1: bad ELF interpreter: No such file or directory
dr-wily:~ geofft$ /mit/mime/bin/richtext
bash: /mit/mime/bin/richtext: No such file or directory
I bet one or more of "Here, LKML, take this Red Hat patch" or "Here, Debian/Ubuntu, take this Red Hat patch" might be easier.
comment:4 Changed 14 years ago by geofft
- Summary changed from Follow-up with Kernel on ENOEXEC/ENONENT for libc5 binaries to Follow-up with Kernel on ENOEXEC/ENOENT for libc5 binaries
Quentin points out that this is bash doing the check, not the kernel, and strace confirms that the kernel is still returning ENOENT, but bash is then going and opening the binary, noticing that it exists, and faking up the "bad ELF interpreter" message.
(I was able to confirm this by racing against while true; do ln -s /usr/bin/id; rm id; done &; after a few successes and a few "No such file or directory"s, I got "bash: ./id: /lib64/ld-linux-x86-64.so.2: bad ELF interpreter: No such file or directory".)
So maybe we want to patch upstream bash and tcsh to do this?
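For anyone who wants to see what that shell-side check amounts to, here is a minimal sketch in Python. It is not the code bash actually uses; the function names `elf_interpreter` and `explain_enoent` are made up for illustration. It assumes the caller has already received ENOENT from exec, then reads the PT_INTERP entry out of the ELF program headers to decide whether the binary or its interpreter is the thing that is missing.

```python
import os
import struct

PT_INTERP = 3  # program header type that names the requested interpreter


def elf_interpreter(path):
    """Return the PT_INTERP path recorded in an ELF file, or None."""
    with open(path, "rb") as f:
        ident = f.read(16)
        if len(ident) < 16 or ident[:4] != b"\x7fELF":
            return None
        is64 = ident[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
        end = "<" if ident[5] == 1 else ">"   # EI_DATA: 1 = little-endian
        f.seek(0)
        ehdr = f.read(64 if is64 else 52)
        if len(ehdr) < (64 if is64 else 52):
            return None
        if is64:
            (e_phoff,) = struct.unpack_from(end + "Q", ehdr, 32)
            e_phentsize, e_phnum = struct.unpack_from(end + "HH", ehdr, 54)
        else:
            (e_phoff,) = struct.unpack_from(end + "I", ehdr, 28)
            e_phentsize, e_phnum = struct.unpack_from(end + "HH", ehdr, 42)
        for i in range(e_phnum):
            f.seek(e_phoff + i * e_phentsize)
            ph = f.read(e_phentsize)
            if len(ph) < (40 if is64 else 20):
                return None  # truncated or malformed program header
            (p_type,) = struct.unpack_from(end + "I", ph, 0)
            if p_type != PT_INTERP:
                continue
            if is64:
                (p_offset,) = struct.unpack_from(end + "Q", ph, 8)
                (p_filesz,) = struct.unpack_from(end + "Q", ph, 32)
            else:
                (p_offset,) = struct.unpack_from(end + "I", ph, 4)
                (p_filesz,) = struct.unpack_from(end + "I", ph, 16)
            f.seek(p_offset)
            raw = f.read(p_filesz)
            # The interpreter path is stored as a NUL-terminated string.
            return os.fsdecode(raw.split(b"\0", 1)[0])
    return None


def explain_enoent(path):
    """Build a message for an exec that failed with ENOENT (hypothetical helper)."""
    if os.path.exists(path):
        interp = elf_interpreter(path)
        if interp and not os.path.exists(interp):
            return f"{path}: {interp}: bad ELF interpreter: No such file or directory"
    return f"{path}: No such file or directory"
```

Because this re-opens the file after the failed exec, it has the same race as the one demonstrated above: the binary can appear or vanish between the exec and the check.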
comment:5 Changed 14 years ago by amu
An alternative user-space workaround would be to supply a stub /lib/ld-linux.so.1 (which would presumably need to be statically linked) that would simply print a reasonably clear error message and exit with status 127.
comment:6 Changed 12 years ago by jdreed
- Status changed from new to accepted
- Owner set to jdreed
- Milestone changed from Upstream Utopia to Precise Release
Upstream told us to go away. I suspect we should just implement the stub Geoff described in the LP patch and Aaron described here. Probably the right way to do this is debathena-libc5-stub which drops in /lib/ld-linux.so.1.debathena, and symlinks /lib/ld-linux.so.1 to it. That will allow people to still go install libc5 from feisty or something if they want to. I think this is _NOT_ a c-p-d package, since we only want to symlink if it doesn't already exist on the end user's machine, and it's easy enough to clean up the symlink if readlink -f shows it points back at us.
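The symlink policy proposed here can be sketched as follows. This is only an illustration in Python of the install and removal logic; real maintainer scripts would more likely be shell, and the helper names `install_link` and `remove_link` are hypothetical. `os.path.realpath` plays the role of `readlink -f`.

```python
import os


def install_link(link, stub):
    """Create link -> stub only if nothing already exists at link."""
    if os.path.lexists(link):
        # Something is already there (a real libc5 loader, or a dangling
        # symlink someone else owns), so leave it alone.
        return False
    os.symlink(stub, link)
    return True


def remove_link(link, stub):
    """Remove link only if it is a symlink that resolves back to our stub."""
    if os.path.islink(link) and os.path.realpath(link) == os.path.realpath(stub):
        os.unlink(link)
        return True
    return False
```

With the paths from this comment, the calls would be `install_link("/lib/ld-linux.so.1", "/lib/ld-linux.so.1.debathena")` at install time and `remove_link` with the same arguments at removal time.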
comment:7 Changed 12 years ago by geofft
Yeah, that's not a c-p-d thing, but I'm hesitant to make a program interpreter be a possibly-dangling symlink. Can we just have debathena-libc5-stub Provide/Conflict/Replace libc5, and depend on debathena-libc5-stub | libc5?
I also really doubt that libc5 from Feisty will even work, so I'm happy to just declare that you don't get to do that and the local sysadmin can play tricks with equivs if they think they know what they're doing.
comment:8 Changed 12 years ago by jdreed
This sort of works. Using /mit/graphics/bin as a testbed, "xli" correctly prints the error. "mpack" and "mtv.real" segfault (whereas they previously also displayed ENOENT).
comment:9 Changed 12 years ago by jdreed
See zephyr discussion today. This works if built on everything except Precise. (Quantal not tested)
comment:10 Changed 12 years ago by jdreed
Quantal also fails, but wheezy works. So Ubuntu did something starting with Precise that makes building this not work.
comment:11 Changed 12 years ago by jdreed
We should figure out why this segfaults sometimes on Precise and above, or we should wontfix this.
comment:12 Changed 12 years ago by jdreed
- Status changed from accepted to committed
r25862 rewrites it in assembly because why not, and avoids the segfault issue. Someone who is not me should sanity-check it.
comment:14 Changed 11 years ago by jdreed
People should test this. It's not in a metapackage yet.
comment:15 Changed 11 years ago by jdreed
- Fixed in version set to debathena-ldso1-stub 1.1, debathena-workstation 1.10
comment:17 Changed 11 years ago by jdreed
- Status changed from proposed to closed
- Resolution set to fixed
Part of this should be updating geofft's kernel patch to return ENOEXEC if there's /any/ problem executing the interpreter (e.g. EACCES).