I’ve learned so much in the last 24 hours.
I opened a support ticket and included two sets of email headers that, I thought, showed email being delayed by a bottleneck within Dreamhost.
The first set of headers came from mail sent from Dreamhost to a Comcast recipient. The second set came from email output by a script running on my Dreamhost webserver and sent to a Dreamhost mail account, which was also delayed.
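For anyone who wants to read delays out of headers the same way, here is a rough Python sketch. It assumes each Received header ends with a timestamp after its last ";", which is the usual layout but not something every server guarantees, and the function name hop_delays is just something I made up for illustration. It shows how long the message sat between each pair of hops, which is how a stall at one server stands out.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def hop_delays(raw_headers):
    """Return (hop_text, seconds_since_previous_hop) pairs, oldest hop first."""
    msg = message_from_string(raw_headers)
    hops = []
    for received in msg.get_all("Received", []):
        # Assumption: the timestamp follows the last ';' in the header.
        info, _, stamp = received.rpartition(";")
        if not info:
            continue  # no ';' found, so there is no timestamp to parse
        try:
            when = parsedate_to_datetime(stamp.strip())
        except (TypeError, ValueError):
            continue  # unparseable date, skip this hop
        if when.tzinfo is None:
            # A "-0000" offset parses as a naive datetime; treat it as UTC.
            when = when.replace(tzinfo=timezone.utc)
        # Collapse folded whitespace so the hop text prints on one line.
        hops.append((" ".join(info.split()), when))

    # Servers prepend Received headers, so the newest hop is listed first.
    hops.reverse()

    delays = []
    previous = None
    for info, when in hops:
        seconds = 0.0 if previous is None else (when - previous).total_seconds()
        delays.append((info, seconds))
        previous = when
    return delays
```

Paste the full header block in as one string and print the pairs it returns. A hop showing thousands of seconds is where the mail was sitting. Keep in mind that server clocks are not always in sync, so small or even negative gaps don't mean much.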
Support finally responded to my ticket and told me:
The funny thing is that the reply was not blocked or delayed on my Comcast account. It was received immediately, essentially at the same time it was sent (taking into account timezone corrections, of course). No mention was made of the second set of email headers I had sent, which showed a Dreamhost-to-Dreamhost scenario.
I replied to the ticket, and when the second reply came it was much more complete. It showed lines pasted from the outbound mail log and explained that with example 1 (Comcast) there was an attempt to send that Comcast refused, then later the mail server tried again and succeeded.
The explanation for the Dreamhost-to-Dreamhost scenario was vaguer, and in fact I still don’t understand it (I have questioned it further, but they haven’t replied yet).
I did understand one thing, however: the problem I had thought was a global network bottleneck was actually two separate, more isolated problems.
I remembered the Dreamhoststatus.com post about the TrendMicro RBL and read it again, but was confused, because until this past weekend, I didn’t have problems with dreamhost -> Comcast emails. In fact, I previously tested it without problem after reading some of the comments under that post.
Next I picked up the phone and called a contact in the Comcast Corporate Office in Philly whom I had met at a conference in San Francisco a few years ago. My contact previously worked in the “president’s office customer service” dept (I’m not sure that’s what they actually call the dept, but that’s the function). It was a holiday, which I had forgotten, so I got voicemail. He returned my call from his cellphone a few minutes later. He remembered me, we exchanged pleasantries, and then I got to the point of my call.
It turns out his job function has changed, and he was unwilling to get involved directly because of that change and corporate politics at One Comcast Way (the address of the corporate office in Philly). He did, however, spend about 30 minutes talking to me in an informative way about the problem and how Comcast handles such issues, which is largely automated, with little human intervention. Users report spam, and that information is used to build an automated reputation system, IP by IP or sometimes host by host. He further said that even if he were willing to get involved in that dept's functions, little would be done anyway because the system is so automated today. The message was that it's fairly industry standard that the sending server's admins must monitor and control spam output, and if they become blocked it is the sending admin's responsibility to get unblocked. In the case of Comcast and many other providers, that is now done by responding to automated warnings and using online tools for mail admins.
He told me that he didn’t believe Comcast used the TrendMicro RBL directly, but that a block might be picked up from it indirectly by vendors Comcast did use directly. He pointed me to http://postmaster.comcast.net/index.html for additional information. That page also includes links for mail server admins to use for blocklist removal or, even more importantly, to sign up for Comcast’s Feedback Loop, an automated service meant to prevent such blocks in the first place. He also specifically mentioned this very informative FAQ: http://postmaster.comcast.net/avoidblocks.html
On further discussion I asked about conditional blocks, or why some mail seemed to be getting through while other mail did not or was delayed. At that point he shifted gears, said that didn’t sound like a block, and asked if it was an Exchange Server. I told him it was not. He explained that there was something new they were doing that he didn’t know very much about: DNSSEC, which can produce intermittent delivery issues on improperly configured Exchange Servers. In fact, they had documented an Exchange Server setting that needed to be changed as a result of the DNSSEC implementation (read the top right column on http://postmaster.comcast.net/index.html). He apologized but said he was out of the loop on the topic and did not know if there was an equivalent change needed on *nix servers. He also pointed out what's stated on the page he referred to: Comcast is one of the first major carriers to implement DNSSEC, but many more would be following soon.
I spent another half hour after we hung up reading various parts of http://postmaster.comcast.net and then wrote this post. That’s what I’ve learned, and I’m much more informed than I was 24 hours ago. But the question remains whether Dreamhost is in fact enrolled in the Comcast FBL program and/or whether DNSSEC is part of the cause of this issue.