You are viewing phreakhead

Previous Entry | Next Entry

no grass for the oxen!

After much debugging and research, it turns out that you cannot actually count bounces (soft or hard) using the SMTP log, like this script, or this guy. The reason is that the SMTP responses are not recorded in a linear sequence. Direct your attention to the following snippet of an SMTP log (names and addresses changed of course):

-------------
01 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 MAIL - FROM:<mailing4_contact10144@yourservername.yoursite.com>+SIZE=31870 0 0 4 0 0 SMTP - - - -
02 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.1.5+sender+OK 0 0 19 0 0 SMTP - - - -
03 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RCPT - TO:<address10030@mail2.yoursite.com> 0 0 4 0 0 SMTP - - - -
04 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 552+5.2.2+Not+enough+space,+only+0+bytes+available+for+address10030@mail2.yoursite.com 0 0 88 0 32 SMTP - - - -
05 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 32 SMTP - - - -
06 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.0.0+connection+reset 0 0 26 0 32 SMTP - - - -
07 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 552+5.2.2+Not+enough+space,+only+0+bytes+available+for+address1003@mail2.yoursite.com 0 0 87 0 47 SMTP - - - -
08 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 47 SMTP - - - -
09 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.0.0+connection+reset 0 0 26 0 47 SMTP - - - -
10 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 552+5.2.2+Not+enough+space,+only+0+bytes+available+for+address10029@mail2.yoursite.com 0 0 88 0 78 SMTP - - - -
11 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 78 SMTP - - - -
12 2009-01-23 23:29:57 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.0.0+connection+reset 0 0 26 0 78 SMTP - - - -
13 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 1281 SMTP - - - -
14 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.0.0+connection+reset 0 0 26 0 1281 SMTP - - - -
15 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 MAIL - FROM:<mailing4_contact10147@yourservername.yoursite.com>+SIZE=31870 0 0 4 0 1281 SMTP - - - -
16 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.1.5+sender+OK 0 0 19 0 1281 SMTP - - - -
17 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RCPT - TO:<address10033@mail2.yoursite.com> 0 0 4 0 1281 SMTP - - - -
18 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 1297 SMTP - - - -
19 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 1266 SMTP - - - -
20 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.0.0+connection+reset 0 0 26 0 1297 SMTP - - - -
21 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 MAIL - FROM:<mailing4_contact10145@yourservername.yoursite.com>+SIZE=31870 0 0 4 0 1297 SMTP - - - -
22 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.0.0+connection+reset 0 0 26 0 1266 SMTP - - - -
23 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 MAIL - FROM:<mailing4_contact10146@yourservername.yoursite.com>+SIZE=31870 0 0 4 0 1266 SMTP - - - -
24 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.1.5+sender+OK 0 0 19 0 1282 SMTP - - - -
25 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RCPT - TO:<address10032@mail2.yoursite.com> 0 0 4 0 1282 SMTP - - - -
26 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.1.5+sender+OK 0 0 19 0 1313 SMTP - - - -
27 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RCPT - TO:<address10031@mail2.yoursite.com> 0 0 4 0 1313 SMTP - - - -
28 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 552+5.2.2+Not+enough+space,+only+0+bytes+available+for+address10033@mail2.yoursite.com 0 0 88 0 1312 SMTP - - - -
29 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 1312 SMTP - - - -
30 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 250+2.0.0+connection+reset 0 0 26 0 1312 SMTP - - - -
31 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 552+5.2.2+Not+enough+space,+only+0+bytes+available+for+address10031@mail2.yoursite.com 0 0 88 0 1375 SMTP - - - -
32 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 1375 SMTP - - - -
33 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionResponse SMTPSVC1 YOURSERVERNAME - 25 - - 552+5.2.2+Not+enough+space,+only+0+bytes+available+for+address10032@mail2.yoursite.com 0 0 88 0 1344 SMTP - - - -
34 2009-01-23 23:29:58 xx.xx.xxx.xx OutboundConnectionCommand SMTPSVC1 YOURSERVERNAME - 25 RSET - - 0 0 4 0 1344 SMTP - - - -
---------------

As you can see on lines 15-17, an email is being requested to be sent to "address10033@mail2.yoursite.com". To deduce which mailing this is and the ID of which contact it was sent to, we look at the previous "MAIL" command which encodes that information in the FROM address (line 15: the mailing ID is "4", the contact ID is "10147"). Then, if there is a bounce (such as on line 28), we look at the last "MAIL" command to determine which contact bounced which mailing.

Unfortunately, since the SMTP server log is not a correct linear progression, the last "MAIL" command is on line 23, which is giving us the mailing and contact IDs for the INCORRECT contact (this is the info for the contact related to the address on either line 27 or 25 - there is no way to tell). The correct "MAIL" command (with the correct mailing and contact IDs gleaned from the FROM address) is actually on line 15. Also notice that the bounce error messages on lines 31 and 33 do not follow the same sequence as they were sent on lines 25 and 27. Thus there is no way of knowing which "MAIL" command matches up with any error message indicating a bounce.

Therefore, there is no foolproof way to match an error message indicating a bounce to a certain mailing and contact ID reliably. It can work occasionally, but we wouldn't want to erroneously count bounces for an address that didn't actually bounce.

The most we can do is to ignore the possibly erroneous last "MAIL" command's mailing and contact IDs, and instead match the email address mentioned in the bounce error message to an address in the contacts database table. This will allow us to at least indicate if a certain address is bouncing, even if we don't know which mailing it came from. Then, a workaround would be to increment the bounce count for ALL sent mailings that include that bounced address in their send-to list, even though it may have very well bounced for one mailing but not the others.

However, most SMTP servers do not even include the email address in the bounce error message. More often than not they will give an extremely short message, like "552+5.2.2+Mailbox+full", assuming that if you wanted to know WHICH email's mailbox was full you could simply look at the previous "RCPT" command in the log. As the snippet above demonstrates, that assumption is false, so actually there is no reliable way to count bounces from the SMTP log.

This leaves us with no choice but to rely mainly on Delivery Reciept Notifications (DSNs): the emails sent to the sender's inbox indicating a bounce. Many email providers have discontinued sending these DSNs and force you to observe the errors in the SMTP log as your only notification of a bounce, however as just demonstrated this is usually impossible. We can try to use the info from the few bounce error messages that actually mention the problem email address, but besides that we are stuck with the DSNs.

I am very interested in the solution these ISPs implemented when they decided not to send out DSNs anymore and forced everyone to use the SMTP log, especially since a lot of whitelists will make sure that you do this before they will whitelist your mailing list servers.

Comments

( 2 comments — Leave a comment )
(Anonymous)
Feb. 25th, 2011 02:49 pm (UTC)
Not always possible
In my SMTP log there are also 2 different lines for FROM and TO. And there is also no temporily order between all lines.

But the error message of the answering server is in the same line as the TO. Additionally there is a unique ID connecting the TO and the FROM line.

So i suppose the problem you mention does not apply in my case? Which means SMTP logs are not everywhere the same format.

(Anonymous)
Jan. 5th, 2012 11:11 pm (UTC)
I am looking into options for my company to get most bounced email addresses.

Log entries are ordered in a linear sequence by the time they happened. What you are missing in your log is a session ID. The sessions are linear but leapfrog other session entries as they occur.

In my logs we have a session ID that identifies the session for each mailing attempt. Using a merge sort I believe I can sort based on session ID and time. So they would be ordered by session and then time within each session. Since many entries have the same time, merge sort preserves the original order so I don't have to worry about them being out of order within each session. Session IDs can be reused but since entries are in order I believe it will be manageable to not bleed into a previous or later session with the same ID.

Based on the logs I still can't get all the addresses of bounced emails. For example, if an email is sent to an email address with a bad domain name, it fails before the TO email address is even mentioned. So it will never be perfect especially with non-standard responses and formats as well. But logs with session IDs in them help to get around the main issue you were pointing out.

As far as not whitelisting due to not sending a DSN response, a lot of places don't worry about a white list. In the case of a whitelist a spammer can send emails to a server and forge the emails causing "Backscatter" to look like it is coming from the whitelisted domain. Spammers and other malevolent emailers have caused server administrators to not bother with a lot of protocols and features.
( 2 comments — Leave a comment )

Latest Month

August 2009
S M T W T F S
      1
2345678
9101112131415
16171819202122
23242526272829
3031     

Page Summary

Powered by LiveJournal.com
Designed by Tiffany Chow