[OT-Semi] On Linux, Permissions and Madness
andre at andregarzia.com
Wed Dec 1 16:00:09 EST 2010
Here goes a cautionary tale about madness, Linux, PHP and a huge web
development support task. As many here know, I've got a part-time
job writing PHP code for a company here in Brazil. The product I work on is a
web application for email marketing (a.k.a. spam) and is used by many of the
biggest companies in Brazil to send their newsletters, promotions and
general stuff you'd rather not be receiving (by the way, our OptOut works).
Our main web interface machine was an old Slackware (slax64) box but we were
doing all our development under CentOS. There are only my boss and me here on
this mail-sending coding business. We agreed that this made no sense at all
and decided to upgrade the main interface machine to CentOS. The process was
done over a weekend and everything worked fine until midday on
Wednesday, when lots of features started breaking.
After some investigation, we noticed that our PHP code was receiving a
"permission denied" error whenever it tried to execute an external
command, much like calling shell() in LiveCode. We use a lot of shell commands:
to call the Tidy utility to validate HTML, to call SpamAssassin to evaluate the
spam potential of what would be sent, and more. Our first thought was to check if
we had PHP safe_mode on, which is like LiveCode secure mode. It was off.
Then we checked suEXEC but it was OK, and we started checking everything
possible under the sun. I am familiar with Linux and know my way around, and
I could see nothing wrong. All the binaries had the correct permissions set
and were owned by the correct groups and users. We lost two days searching
for it, and this includes going to sleep at 2:00 AM trying to find why
"permission denied" was there. The clients were going crazy too.
Today I came to the office early and started working on this trouble but
could not find why the problem was happening. We started trying heroic
solutions. Heroic solutions are the ones you are sure will never work but
you try them out of desperation, with faith in the remote possibility they
might solve something. No heroes here. We even upgraded the development box
to run the exact same version of CentOS to try to replicate the problem, but
we could not.
I launched the Apache server under strace to trace all the system
calls being made. I did this on a production server under heavy traffic. I
went crazy reading the logs and seeing those silly:
stat("/usr/sbin/sh", 0x7fff52c83180) = -1 EACCES (Permission denied)
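For anyone facing a similar pile of strace output, here is a minimal Python sketch of how the failing calls could be filtered out of a log. It is not what we ran at the time. The function name denied_paths and the sample lines are made up for illustration, and the pattern only assumes the usual strace line shape shown above, optionally prefixed by a PID.

```python
import re

# Matches strace lines such as:
#   stat("/usr/sbin/sh", 0x7fff52c83180) = -1 EACCES (Permission denied)
# An optional "[pid 1234] " or "1234 " prefix (strace -f / -o output) is allowed.
EACCES_LINE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\("([^"]+)".*=\s*-1\s+EACCES\b'
)

def denied_paths(lines):
    """Return (syscall, path) pairs for every call that failed with EACCES."""
    hits = []
    for line in lines:
        match = EACCES_LINE.search(line.strip())
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits

# Hypothetical sample: the first line is from the log above, the second is invented.
sample = [
    'stat("/usr/sbin/sh", 0x7fff52c83180) = -1 EACCES (Permission denied)',
    'open("/etc/hosts", O_RDONLY) = 3',
]
print(denied_paths(sample))
```

The list of denied paths tells you which files the process could not reach, but not which directory above them is to blame, which is exactly the trap I fell into.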
I spent days and hours on this one. Then at 6:30 PM, out of desperation, I
decided to re-check all the permissions while trying to execute
/usr/bin/tidy. Tidy was 755, so it was correct. The bin folder was correct as
well. So I went on to check "/", which would be madness if it had the wrong
perms: 755, right. So I went to check the "usr" folder...
OH HOLY HACK!!!!!
The "usr" folder had perms 644, which means that the whole system was
somewhat compromised: without the execute (search) bit on a directory, no
ordinary user can reach anything below it, let alone execute it. The root
user is not bound by those bits, so root could still run
stuff. That is why the system was still answering me through ssh and things
appeared normal. It only failed for the other users, and since there were no
other users but root, apache and nobody, we never thought of that.
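To make the 644 versus 755 difference concrete, here is a small Python sketch that decodes a directory mode using the standard stat module. The helper name describe_dir_mode is made up for this example. Note that it only reads the permission bits; on Linux, root normally gets past directory search checks regardless of them.

```python
import stat

def describe_dir_mode(mode):
    """Return the symbolic form of a directory mode and who may traverse it."""
    symbolic = stat.filemode(stat.S_IFDIR | mode)
    return {
        "symbolic": symbolic,
        # On a directory the "x" bit means "may search / pass through".
        "owner_can_traverse": bool(mode & stat.S_IXUSR),
        "group_can_traverse": bool(mode & stat.S_IXGRP),
        "others_can_traverse": bool(mode & stat.S_IXOTH),
    }

print(describe_dir_mode(0o644))  # the broken "usr" folder
print(describe_dir_mode(0o755))  # what it should have been
```

With 644 nobody has the traverse bit set, so any process not running as root is stopped at that folder no matter how permissive the files below it are.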
Somehow, something, something EVIL changed the permissions there and it took
us three days to figure out that up in the chain a folder had no execute
permission.
So the next time something does not execute on your Linux box, check all
the perms, from the root / folder down to the binary. Somewhere in the
middle there might be a 644, and you will not lose time like I did.
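The check described above can be sketched in Python roughly like this. The function name check_path_chain is invented for the example, and it makes a simplifying assumption: it only looks at the execute bit for "others", so it ignores ownership, group membership and ACLs. The util-linux command "namei -l /usr/bin/tidy" prints a similar listing if you have it installed.

```python
import os
import stat

def check_path_chain(path):
    """List every component from / down to path with its mode, flagging
    directories that lack the execute (search) bit for 'others'."""
    path = os.path.abspath(path)
    chain = [os.sep]
    current = os.sep
    for part in (p for p in path.split(os.sep) if p):
        current = os.path.join(current, part)
        chain.append(current)

    report = []
    for component in chain:
        try:
            st = os.stat(component)
        except OSError as exc:
            # Typically means a parent directory already blocked us.
            report.append((component, None, f"cannot stat: {exc.strerror}"))
            break
        mode = stat.filemode(st.st_mode)
        blocked = stat.S_ISDIR(st.st_mode) and not (st.st_mode & stat.S_IXOTH)
        note = "NOT traversable by others" if blocked else "ok"
        report.append((component, mode, note))
    return report

if __name__ == "__main__":
    # /usr/bin/tidy is the binary from the story; use any path you like.
    for component, mode, note in check_path_chain("/usr/bin/tidy"):
        print(f"{mode or '?':<10} {component:<20} {note}")
```

Run it as the user that fails (apache in our case) or as root; either way the folder missing its execute bit shows up in the listing instead of hiding three levels above the binary you were staring at.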
This was really hard to find....
PS: I *NEED* a vacation...
http://www.andregarzia.com All We Do Is Code.