The sweet taste of victory
Jul. 16th, 2009 11:21 am
Oh man, after spending 2 days with Wireshark and Dunkin' Donuts coffee, I have tasted the sweet thrill of victory. There are very few things as satisfying as staring at a 2GB pile of SMTP packets and randomly trying out filters when suddenly things click and the lighthouse turns on, complete with angelic choir (all playing tubas, of course).
A user hears from his contacts that the From address is coming out garbled. All users of this system use the same web interface, but this guy is the only one having difficulty. He sends that off to the programmer/dba/site owner, who sends it off to me, his sysadmin. I've never seen anything quite like it before in an email, so I fire up tcpdump and leave it there for a week.
The only thing I have to go off of is:
Return-path: =?iso-8859-1?B?PGV4YW1wbGVAY29vbHNpdGUuY29tOz4=?=
In between desktop support tickets, I try to figure out what this could be. I know the iso-8859-1 part is a character encoding scheme, but it should be plaintext here. Maybe the ?B is a formatting thing, and it's trying to bold the user's email address within the headers?
Google and I came up empty during that time, so I wiresharked around without any useful clues. It did tell me that the mail server wasn't mangling anything. I hadn't thought to tcpdump from the web server too, and it's already been most of a week. It's too late to wait another couple of days for more tcpdump data.
IRC rocks. Someone commented about how it looks like a hash or crypt output. That got me thinking... After judicious application of clever filters, I locate a small collection of these broken emails. There's an odd break in one of the other headers, a Windows-style CRLF newline, even though all these servers run CentOS. So I backtrack a bit, and it turns out that the break is in that same header in all the broken emails. A little more effort and what do you know, the garbage in each field is the same garbage in every borked email.
At this point all the pieces clicked together. Only the headers are messed up, only one user has a problem, the headers are messed up the same way each time, the "crypt" bit... hmm... decrypt, de... de... dec... decode! Copy the relevant bit, paste into a base64 decoder, and I had my answer. The code wasn't sanitizing inputs properly! Ha! Victory dance indeed.
I recounted the "story thus far" (pre-solution) to my girlfriend over dinner last night and she almost fell asleep. I wanted to post this somewhere where it would at least be understood, because this was way too much fun.
Also, I forgot to mention this, and I'm sure at least half of you have already decoded it for yourselves, but PGV4YW1wbGVAY29vbHNpdGUuY29tOz4=
decodes out to
<example@coolsite.com;>
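For anyone who would rather not paste things into a web decoder, here is a minimal Python sketch of that last step. It assumes the payload is plain base64 and that the bytes are iso-8859-1, as the header itself claims; the variable names are mine, not anything from the actual system.

```python
import base64

# The payload copied out of the Return-path header, between "?B?" and "?=".
encoded = "PGV4YW1wbGVAY29vbHNpdGUuY29tOz4="

# base64 gives back raw bytes; the header says they are iso-8859-1 text.
decoded = base64.b64decode(encoded).decode("iso-8859-1")

print(decoded)
```

The stray semicolon inside the angle brackets is the unsanitized input showing through.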
no subject
Date: 2009-07-16 06:06 pm (UTC)
Good when we fix the problems.
Bad when no one understands the fixes but us...
I have this posted in my cubicle - it fits perfectly
Date: 2009-07-16 06:08 pm (UTC)
no subject
Date: 2009-07-16 06:10 pm (UTC)
The worst part is that it seems like the most exciting ones are the ones we have the hardest time explaining, not to mention the "Wait, you're excited because you spent all day looking at numbers?" factor.
You hit the nail on the head right there.
Re: I have this posted in my cubicle - it fits perfectly
Date: 2009-07-16 06:11 pm (UTC)
no subject
Date: 2009-07-16 06:20 pm (UTC)
I've been bitten by the "Bare CR" thing before - Qmail will actively REFUSE email with bare CRs in it. (There's an RFC that specifically prohibits bare CRs in SMTP mail.) Normally, this isn't a problem... but in the realty world, they use lockboxes made by GE that send emails when they're opened. Judging from the headers, GE is apparently running the network that sends the emails on some godforsaken Windows 2000 box running a script that bounces out a plaintext template containing... you guessed it, bare CRs.
The annoying thing? After spending fucking WEEKS trying to figure out what the hell was going on (any MTA that accepts mail with bare CRs also silently sanitizes them to CR/LF pairs - I had to capture a session with tcpdump and view it in a hex editor to see what was going on!), I politely let them know what the problem was, and that they should edit the template their script was using to replace the CRs with CR/LF... I got no response whatsoever.
Six months later, that particular division of General Electric is STILL sending out hundreds of thousands of emails with bare CRs in them. Schmucks.
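A rough Python sketch of the check described above, for anyone who wants to skip the hex editor: it scans raw captured bytes for a CR that is not followed by an LF, and shows the CR-to-CR/LF rewrite the comment suggests. The function names are hypothetical and the sample bytes are made up for illustration, not taken from a real capture.

```python
import re

# A "bare CR" is a carriage return that is not immediately followed by a line feed.
BARE_CR = re.compile(rb"\r(?!\n)")

def find_bare_crs(raw: bytes) -> list:
    """Return the byte offsets of every bare CR in a raw message."""
    return [m.start() for m in BARE_CR.finditer(raw)]

def fix_bare_crs(raw: bytes) -> bytes:
    """Rewrite each bare CR as a CR/LF pair, leaving existing CR/LF pairs alone."""
    return BARE_CR.sub(b"\r\n", raw)

# Made-up sample: two bare CRs and one proper CR/LF pair.
sample = b"Subject: lockbox opened\rBody line\r\nEnd\r"
offsets = find_bare_crs(sample)
cleaned = fix_bare_crs(sample)
print(offsets)
print(cleaned)
```

This has to run on the bytes as they came off the wire (for example, a stream exported from the tcpdump capture), since a message that has already passed through a sanitizing MTA will not show the problem.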
Re: I have this posted in my cubicle - it fits perfectly
Date: 2009-07-16 06:22 pm (UTC)
no subject
Date: 2009-07-16 06:49 pm (UTC)
http://www.isi.edu/in-notes/rfc2047.txt
It is all explained.
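To spell out what the RFC explains: an encoded word has the shape `=?charset?encoding?encoded-text?=`, where an encoding of `B` means base64 (nothing to do with bold) and `Q` means a quoted-printable variant. A minimal sketch using Python's standard `email.header` module on the Return-path value from the post; the variable names are my own.

```python
from email.header import decode_header

raw = "=?iso-8859-1?B?PGV4YW1wbGVAY29vbHNpdGUuY29tOz4=?="

# Splitting on "?" exposes the three fields between the "=?" and "?=" markers.
_, charset, encoding, payload, _ = raw.split("?")

# decode_header does the real work: it returns (value, charset) pairs,
# where value is bytes for an encoded word.
parts = decode_header(raw)
text = "".join(
    value.decode(cs or "ascii") if isinstance(value, bytes) else value
    for value, cs in parts
)

print(charset, encoding, text)
```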
Re: I have this posted in my cubicle - it fits perfectly
Date: 2009-07-16 07:55 pm (UTC)
no subject
Date: 2009-07-16 07:56 pm (UTC)
no subject
Date: 2009-07-16 11:27 pm (UTC)
Re: I have this posted in my cubicle - it fits perfectly
Date: 2009-07-17 12:01 am (UTC)
And sometimes one has to bring up the geek level of those around, so one can then explain one's triumph. And it's not bragging (much), it's sharing the high.
no subject
Date: 2009-07-17 01:07 am (UTC)
Back in 1998 I got a job interview that included an RFC quiz.
Re: I have this posted in my cubicle - it fits perfectly
Date: 2009-07-17 07:00 am (UTC)
Re: I have this posted in my cubicle - it fits perfectly
Date: 2009-07-17 07:12 am (UTC)