SpamAssassin Changes for Dreamhost
Posted October 26th, 2008 at 12:19 PM in the Projects category; there are 11 comments

When hosting companies accommodate increased traffic, they’re likely to make changes to the network architecture, and Dreamhost is no exception. Most recently, they separated the mail and hosting servers for better performance. On the upside, my e-mail is much faster. On the downside, my SpamAssassin (SA) installation no longer works.

One solution for this problem is to use Dreamhost’s installation of SA. However, there’s no support for Bayesian rules, because those rules require a per-user database. Apart from whitelists, blacklists, and rule sensitivity, no customization is allowed. Another annoyance is having to log in to webmail to do anything meaningful with false positives; it’s explained further on the DH wiki.

Another option, and the best in my opinion, is using an IMAP client of my choice. This requires some modifications to my previous tutorial which takes you through the entire process of installing SA. Assuming that first tutorial is complete, you can now begin this tutorial. Yes, you will undo some of the work in the original tutorial. That’s the way it goes.

Because mail is being forwarded, moved, and synchronized all over the place, I thought a basic flowchart could simplify the end objective of this tutorial. As you can see, everything is set up to move e-mail between the mail server and the hosting server.
Tutorial Overview

A few assumptions before we get started…

  1. Create a new fully hosted e-mail account. All of your e-mail aliases should be re-routed to this account’s address; this is now the gateway account through which everything must pass. You’ll have to create a similar account for each user with a SpamAssassin installation. The address doesn’t have to be user friendly, but choose something you can remember. In this example, I’ve created a new fully hosted address called spam_checker@unsaturated.com.

    If you have an alias called my.cool.address@unsaturated.com which forwards to your main user account, it will have to forward to spam_checker@unsaturated.com. Do not update all your aliases now because we want the transition to be seamless. Let’s keep e-mail flowing until all steps are complete.
    Fully hosted e-mail
  2. Create a new forward-only e-mail address. After your mail is checked by SpamAssassin it will be forwarded to this new address, which then passes the e-mail along to your main user account, usually your_user_name@yourdomain.com. This step allows your e-mail to find its way back to the mail servers from the hosting servers. In this example, I’ve created a new forward-only address called spam_checker_passed@unsaturated.com.
    Forward-only
  3. Move the spam.rc file to your home folder. The spam.rc file was originally in a folder called procmail but it was the only file in there. This is a matter of preference but it’s important to keep this in mind for the rest of the tutorial. If nothing else is stored in your procmail folder, then delete it. Type the following at the command prompt:
    %> cd ~
    %> mv ~/procmail/spam.rc ~
    %> rmdir procmail/
    
  4. Update the .procmailrc file in your home directory. You’ll notice several changes to this file, most notably that all remaining e-mail is forwarded to the address created in step 2 and then deleted locally. Remember, the spam has already been filtered to the .Spam folder according to the spam.rc file. Don’t forget to update the forwarding address! Type the following at the command prompt and insert the text shown:
    %> pico .procmailrc
    
    #=======================================
    # ~/.procmailrc
    #
    # Uses Maildir format mail directory.
    
    # Uncomment the following three lines to debug
    #LOGFILE=$HOME/procmail.log
    #VERBOSE=yes
    #LOGABSTRACT=all
    
    ## Directory for storing procmail-related files 
    PMDIR=$HOME 
    
    # Message directory (Courier IMAP and mutt)
    MAILDIR=$HOME/MaildirSync 
    
    # Spam filtering rules run first; spam is filed into .Spam before the forward below
    INCLUDERC=$HOME/spam.rc 
    
    # Forward non-spam mail to validated address
    # REMEMBER TO UPDATE THIS ADDRESS!!!
    :0c
    ! spam_checker_passed@unsaturated.com
    
    # Everything should be filtered to the local .Spam folder 
    # or forwarded to the new mail server, so go ahead and 
    # delete whatever remains 
    :0
    /dev/null
    #=======================================
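To make the delivery order concrete, here is a rough shell sketch of the decision this .procmailrc ends up making. It assumes that spam.rc (not shown here) files any message carrying an X-Spam-Flag: YES header into .Spam; your spam.rc may test a different header. The function name route_message is hypothetical, and the sketch only prints the outcome rather than delivering anything.

```shell
#!/bin/sh
# Hypothetical helper: report where a message would end up.
# Assumption: spam.rc delivers anything tagged "X-Spam-Flag: YES" to .Spam.
route_message() {
  msg_file=$1
  # Only inspect the header block (everything up to the first blank line).
  if sed '/^$/q' "$msg_file" | grep -qi '^X-Spam-Flag: *YES'; then
    echo "spam-folder"
  else
    echo "forward"
  fi
}

# Example usage:
#   route_message ~/some-saved-message.eml
```

Anything reported as "forward" is what the :0c recipe sends on to the forward-only address before the final recipe discards the local copy.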
    
  5. Rename the Maildir directory. Again, you might be wondering why this is necessary. In my opinion, the typical name for Maildir is okay if you’re actually using it for reading mail. However, this directory is now used exclusively for synchronizing mail accounts. Type the following at the command prompt:
    %> mv Maildir MaildirSync
    %> chmod -R 700 MaildirSync
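If you want to confirm the rename worked, a quick sanity check is possible because a Maildir always contains the three subdirectories cur, new, and tmp. The helper name is_maildir below is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the directory looks like a Maildir,
# i.e. it contains the standard cur, new, and tmp subdirectories.
is_maildir() {
  for sub in cur new tmp; do
    [ -d "$1/$sub" ] || return 1
  done
  return 0
}

# Example usage:
#   is_maildir "$HOME/MaildirSync" && echo "MaildirSync looks fine"
```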
    
  6. Download and extract offlineimap. The features and speed of offlineimap looked compelling, so I tried it and decided to keep it. You can configure the script for many different scenarios but we’re keeping the steps basic. This step deletes the default configuration files, but don’t worry because I’ll provide one later. Type the following at the command prompt:
    %> cd ~
    %> wget <Valid URL for: offlineimap_6.0.3.tar.gz>
    %> tar xvfz offlineimap_6.0.3.tar.gz
    %> cd offlineimap
    %> rm offlineimap.conf
    %> rm offlineimap.conf.minimal
    
  7. Create an .offlineimaprc configuration file. Offlineimap needs to know some basic information like where to find your remote and local mail, folders to ignore, and more. To complete this you’ll need to know your mail server’s name, which can be accessed in the panel. You might wonder why this program is necessary. Consider a false positive e-mail, which is incorrectly marked spam. You could move that message to the correct folder but that change needs to be reflected on the hosting server where SpamAssassin can learn from the change. Remember to update all the values marked YOUR_. Type the following at the command prompt and insert the text shown:
    %> cd ~
    %> pico .offlineimaprc
    
    #=======================================
    # ~/.offlineimaprc
    
    [general]
    accounts = MainAccount
    metadata = ~/.offlineimap
    ignore-readonly = no
    ui = Noninteractive.Quiet
    
    [Account MainAccount]
    localrepository = Local
    remoterepository = Remote
    
    [Repository Local]
    type = Maildir
    localfolders = ~/MaildirSync
    
    [Repository Remote]
    type = IMAP
    remotehost = YOUR_MAIL_SERVER_NAME.mail.dreamhost.com
    ssl = true
    remoteuser = main_account@YOUR_DOMAIN.com
    remotepass = YOUR_PASSWORD
    nametrans = lambda foldername: re.sub('^INBOX\.*', '.', foldername)
    folderfilter = lambda foldername: foldername in ['INBOX.Spam']
    maxconnections = 1
    holdconnectionopen = no
    #=======================================
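The two lambda lines do most of the work, so it may help to see their effect in isolation. The sketch below mimics them in shell with sed; it is only an approximation for illustration (offlineimap evaluates the real expressions in Python), and the function names are hypothetical.

```shell
#!/bin/sh
# Approximates nametrans: replace a leading "INBOX" plus any dots with ".".
nametrans() {
  printf '%s\n' "$1" | sed 's/^INBOX\.*/./'
}

# Approximates folderfilter: only INBOX.Spam is synchronized.
folderfilter() {
  [ "$1" = "INBOX.Spam" ]
}

nametrans "INBOX.Spam"   # prints .Spam
nametrans "INBOX"        # prints .
```

In other words, the remote INBOX.Spam folder ends up as .Spam inside ~/MaildirSync, and every other remote folder is ignored.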
    
  8. Update the salearn.bat script. The old Bayes update script is obsolete because we have to ensure spam is synchronized between your hosting account and your mail account. This script synchronizes your .Spam folders, moves all messages (read and unread) to a single directory, tells SpamAssassin to update the Bayesian database, purges all the spam, then does a final synchronization. Open salearn.bat in your editor (for example, pico ~/salearn.bat) and replace its contents with the text shown:
    #!/bin/sh
    echo '========================================================'
    TESTED=false
    if [ "$1" ]
    then
      if [ "$1" = "spam" ]
      then
        TESTED=true
        echo '--------------------------------------------------------'
        echo Synchronizing Spam folder...
    
        ~/offlineimap/offlineimap.py
        mv ~/MaildirSync/.Spam/new/* ~/MaildirSync/.Spam/cur/
    
        echo Messages synchronized and ready for processing.
        ~/sausr/bin/sa-learn -V
        echo Learning what is spam...
        ~/sausr/bin/sa-learn --spam ~/MaildirSync/.Spam/cur
        rm -f ~/MaildirSync/.Spam/cur/*
        echo Learning complete.  All spam messages were deleted.
        echo '--------------------------------------------------------'
        echo Resynchronizing your spam folder...
        ~/offlineimap/offlineimap.py
        echo All folders are synchronized. 
      elif [ "$1" = "ham" ]
      then
        TESTED=true
        ~/sausr/bin/sa-learn -V
        echo '--------------------------------------------------------'
        echo Learning what is ham...
        ~/sausr/bin/sa-learn --ham ~/MaildirSync/cur  
      fi
      
      if [ "$TESTED" = true ]
      then
        echo '--------------------------------------------------------'
        echo Summary statistics of Bayes database...
        ~/sausr/bin/sa-learn --dump magic
        echo '--------------------------------------------------------'
      fi
    else
      echo Enter one argument:  [ham | spam]
    fi
    echo '========================================================'
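One rough edge in the script: the mv of ~/MaildirSync/.Spam/new/* prints an error when the new folder is empty, because the shell hands the unmatched pattern to mv literally. A minimal sketch of a guard follows. The function name move_new_to_cur is hypothetical, and it assumes message file names have no leading dot, which holds for normal Maildir deliveries.

```shell
#!/bin/sh
# Hypothetical helper: move messages from new/ to cur/ inside one Maildir
# folder, doing nothing (and printing no error) when new/ is empty.
move_new_to_cur() {
  folder=$1
  for msg in "$folder"/new/*; do
    # With no matches the pattern stays literal, so skip non-existent paths.
    [ -e "$msg" ] || continue
    mv "$msg" "$folder/cur/"
  done
  return 0
}

# Example usage:
#   move_new_to_cur "$HOME/MaildirSync/.Spam"
```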
    
  9. Update your cron jobs. Because some e-mails are incorrectly marked as spam (false positives), we need to ensure all mail is synchronized. This gives you a chance to correct those false positives and move valid e-mail back into your inbox. The first cron task runs hourly, at 30 minutes past the hour, to keep your folders up-to-date. Some spam systems would do this once daily but I prefer to know sooner if an important e-mail got trashed. The second task runs the Bayes update script weekly, at 12:10 AM on Sunday. Type the following at the command prompt, acknowledge prompts for e-mail according to your preferences, and enter the text shown:
    %> crontab -e
    
    30 * * * * ~/offlineimap/offlineimap.py
    10 0 * * 7 ~/salearn.bat spam
    
  10. Create an e-mail filter through the panel. Remember that e-mail account you created back in step 1 (spam_checker@unsaturated.com)? It’s time to give it a trivial filter using the panel.
    E-mail Filter
    This will create a .procmailrc on the mail server, which you can’t see or edit via the shell. By trivial, I mean an obvious pass. Your rule would go something like this.
    Trivial filter

    Once the trivial filter is established, select the “Forward to shell account” option. Be sure to select the matching user account from the drop-down menu.
    Forward to shell

  11. Update all e-mail aliases. Remember that e-mail account you created back in step 1 (spam_checker@unsaturated.com)? It’s time to update all your other aliases to point to that address. With each updated alias, the system goes “live” so you might try it on an infrequently used address, then send a test message.
    Update aliases
  12. Send a test message. If everything is working properly, you can run the salearn.bat script and you should be error-free. Send a test message and look for SA headers (X-Spam-Level, etc). Lastly, if you find an error in my tutorial please post a comment.
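For the header check in step 12, a small sketch like the following can save some squinting. It assumes you have saved the test message to a file. The name has_sa_headers is hypothetical, and the check simply looks for any header that starts with X-Spam-.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the message's header block contains at
# least one X-Spam-* header (X-Spam-Level, X-Spam-Status, and so on).
has_sa_headers() {
  sed '/^$/q' "$1" | grep -qi '^X-Spam-'
}

# Example usage:
#   has_sa_headers ~/test-message.eml && echo "SpamAssassin processed this message"
```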

11 Comments on “SpamAssassin Changes for Dreamhost”

  1. KRKeegan

    Nice tutorial. Some of it may be a bit brief for more novice users, and you really don’t explain that the user would need to check their email using the spam_checker_passed email address now, but I am sure people would figure it out.

    I had the same idea as you a while back and I even did a similar flow chart to explain it all.

    However, I thought up a solution that allows users to use the same account without the separate “spam_checker_passed” and “spam_checker” email accounts like you have. Also it still allows users to sort into more distinct folders rather than just inbox. It wasn’t possible before the new filter by headers feature recently added, but it should work perfectly now.

    I will have to migrate my email to email only accounts soon, so I should be able to update this with my results and some example files to help others.

    http://krkeegan.com/archives/89-How-to-Resurrect-Procmail-and-Spamassassin-on-Dreamhost.html

  2. Anders Liljeqvist

    Great tutorial, I just used it to get my own spamfilters working again.

    One thing worth pointing out is that users might want to avoid setups with excessive forwarding between mail accounts. Dreamhost has an SMTP quota that allows no more than 100 emails per hour. If users are not careful, it is easy to end up doing more than 100 forwards an hour, with a bounce loop as a result. The SMTP server will reject the 101st email going back to your spam_checker_passed@ and send it back to the spam_checker@ account, which in turn tries to forward the bounce to spam_checker_passed@, creating a loop.

    To break such forwarding loops, users might want to add an X-Loop header before the email forward in .procmailrc. In my case, I added this recipe to .procmailrc just before the forward to spam_checker_passed@.

    :0fw
    | formail -A "X-Loop: loop_alert@liljeqvist.com"

    I search all incoming emails for that X-Loop header to detect and break any loops. Here is one of my first procmail recipes:

    # If we find an X-loop, then it is a bounce.
    :0
    * ^X-Loop: loop_alert@liljeqvist.com
    $MAILDIR/.LoopProblems/

    I want to create backups of all emails but do not want to forward the backup copies to my main account. Instead I create a backup copy of each email in a folder called ShortTermArchive. I then sync this to my mail server using offlineimap. I also want to copy any bounce emails across, so must therefore sync several folders using offlineimap. To synchronise more than one folder, write something like this in .offlineimaprc:

    folderfilter = lambda foldername: foldername in ['INBOX.ProbablySpam','INBOX.ShortTermArchive','INBOX.LoopProblems']

    (Here, I sync three folders called ProbablySpam, ShortTermArchive and LoopProblems).

    Again, nice tutorial.

    Cheers,

  3. Dan

    Thanks for posting this up, a nice tutorial. I understand Dreamhost’s reason for breaking up email, but what I don’t understand is getting 72 hours’ notice. While this is pretty easy to understand for someone who has been doing this for a while, it’s definitely not for the beginner. It is also not very clean. The right answer is to give us accounts on both mail and web hosts. Dreamhost SUCKS.

  4. matthew

    Thanks for the tips, Anders. I didn’t know about the SMTP quota. I’ll try your suggestions and update my tutorial.

  5. STEREO

    Email became faster?
    Couldn’t that be due to a massive base of custom SA installations being disabled?

  6. Dallas

    @Dan Giving users shell accounts on the mail servers would be a big step backwards for us. Getting user-run processes OFF of the mail servers is one of the main reasons for this change, in addition to the improved data storage isolation.

    @STEREO The new servers perform better due to less random stuff running on them as well as better performance from the data storage due to less resource contention from other servers (ie the web servers).

  7. Dan

    Dallas, I’m sure you have your reasons, but the day I can’t run procmail, I’m no longer a dreamhost customer. Your mail filters are fine for my grandma, but they don’t cut it for me.

  8. KRKeegan

    There is a major flaw with this setup and any setup which uses a user account to process mail and then forward back to a mail account.

    The procmail install on the user machines blocks forwarding for large institutional domain names, such as paypal, ebay, schwab, etc. This is done to prevent phishers from using Dreamhost to solicit information. However, it has the very unfortunate side effect of bouncing all mail from these sites.

    I have been haggling with Dreamhost for weeks and they just finally figured it out and sent me this response:

    —–

    Hello,

    I’m a little surprised that Andrea missed this, the problem here, is that you have “ebay” in a “from” header. The server then responds:

    Error: No mail with this sender address allowed (in reply to end of DATA command)

    Basically, we have a pretty blunt instrument on the web server side, that blocks mail with any “from” headers that contain known domains like “ebay”, “paypal”, “bankofamerica”, stuff like that.

    Since we’re now asking users to send through the web servers, we may need to look into lifting those restrictions, however, that’s what’s happening at this time, and why you’re getting that bounce.

    Again, I’ll bring this to the attention of our administration, but for now, we can’t remove that restriction without review.

    Sorry about that!

    Thanks!
    Brian H

    ——-

    Well thanks Brian for telling me. But hey could you be a little more cavalier about bouncing what is essentially my most important emails. (sarcasm)

    So what I did in the meantime is to set up the following procmail filter:

    :0
    * ^From: MAILER-DAEMON@olds.dreamhost.com
    $HOME/Maildir/

    And then offlineimap syncs these emails over eventually. So much for instant email.

    This is obviously less than ideal. And I hope Dreamhost fixes this ASAP. I suggested they look into allowing forwarding of these domains to DH servers only. This would still provide protection while allowing procmail to function.

    Ugh, thanks DH for dropping the ball . . . again.(also sarcasm)

  9. Dan

    Brian/Others:

    I just ran into this issue too. I WAS getting emails from ebay until 12/4, however I am not getting these now. I’m now getting bounces for anything sent from ebay when I try to procmail and forward. This is seriously getting out of control. I am month to month now so I may just have to bail. DH you need to fix this.

    Dan

  10. Dan

    So here is my response:

    “It looks like that our system will not allow messages sent from ebay,
    paypal to be resent because our system shows that the message can be
    considered “phishing”.

    We have specific restrictions such as this to prevent phishing scams.

    I would suggest setting up a new email address for paypal and ebay and
    don’t forward that address to the shell just add a separate filter from
    the panel to discard everything that doesn’t come from Ebay/Paypal.

    Or, you can even add a new filter that comes before the forward to shell
    and that would send any ebay/paypal messages to another folder and select
    the “execute and stop” for that filter.”

    I’m not giving up yet…

  11. Dan

    I got a response saying that they don’t care if I don’t like it and that I’m out of luck if I want to see their list of domains they are blocking. Since this is a serious issue, and I’m sure one of many to come, I’ve created a mailing list to discuss this and other issues, since there seems to be an assortment of blogs that may or may not be watched by all those interested. If you want to subscribe, the list is here:

    http://lists.datasmuggler.com/listinfo.cgi/dhprocmailusers-datasmuggler.com
