Spam Links

Spam Filtering Research

There is a lot more to filtering spam than simply blocking IP addresses or separating out email that contains the word “viagra”.

Spam Filtering Mechanisms

Specific descriptions and examples of how to filter spam at the server level.

Spam Filtering Techniques

Spam Filtering for Mail Exchangers - tldp.org/HOWTO/Spam-Filtering-for-MX/
Filtering Standards Anti Spam Research Group (ASRG) Subgroup - asrg.sp.am/subgroups/filtering.shtml
Server Index Query (SIQ) (draft) - www.milter.info/sendmail/milter-siq/
Notes on stopping Unsolicited Commercial Email - www.sprocket.com/security/stopping-uce.php
Second-Generation Anti-Spam Solutions - overcomeemailoverload.com/advice/AntiSpamTools.html
Spam filtering techniques - www-128.ibm.com/developerworks/linux/library/l-spamf.html?t=gr,lnxw15=SFT
E-Mail Spamming countermeasures - www.ciac.org/ciac/bulletins/i-005c.shtml
How to effectively block spam and junk mail - www.redearthsoftware.com/spam-filter-article.htm
Reverse Spam Filtering - www.ii.com/internet/messaging/spam/
Bloqueando, Filtrando - www.absoluta.org/seguranca/seg_spam.htm - Portuguese
Filtering Unsolicited E-mail - ist.uwaterloo.ca/security/howto/2000-09-27/
Anti-Spam Methods & Checks - www.pivotalveracity.com/NewsRes/AntiSpam.php
Technical approaches to spam - www.taugh.com/spamtech.pdf
Technologies to Combat Spam - www.sans.org/rr/whitepapers/email/1130.php
URL filtering - www.sophos.com/pressoffice/news/articles/2004/02/sa_cutsspam.html
Anti-Spam Solutions and Security - www.securityfocus.com/infocus/1763
Stopping Email Abuse - en.wikipedia.org/wiki/Anti-spam_techniques_(e-mail)
SpamGuru Overview - www.research.ibm.com/spam/papers/spamguru-overview.pdf
A Multifaceted Approach to Spam Reduction - www.research.ibm.com/spam/papers/multifaceted-approach.pdf
Tutorial on Junk E-mail Filtering - research.microsoft.com/%7Ejoshuago/ICMLTutorialAnnounce.htm
Spam Filtering Techniques - www.outblaze.com/main.php?id=antispam&page=anti_filter
Effective Filtering - www.spamhaus.org/effective_filtering.html
Technical Standards for E-mail Delivery - postmaster.aol.com/guidelines/standards.html
Technical and Blocklist Restrictions - www.tuffmail.com/mx-restrictions.php
Mail Filtering - www.acme.com/mail_filtering/
MX+ - mxplus.org/
The Effect of Filters on Spam Mail - www.kellogg.northwestern.edu/research/math/papers/1402.pdf
Second defense: Further validation - outblaze.com/main.php?id=antispam&page=anti_filter02
What is Anti-Spam? - www.circleid.com/posts/what_is_anti_spam/
How this system filters mail - www.sput.nl/spam/filter-mail.html
Anti-Spam Technologies - www.oecd-antispam.org/article.php3?id_article=241
Computer Tyme Spam and Virus Filter: How it Works! - www.junkemailfilter.com/spam/how_it_works.html
Understanding the Network-Level Behavior of Spammers - www.nanog.org/mtg-0606/pdf/nick-feamster.pdf
Separating Wheat from the Chaff: A Deployable Approach to Counter Spam - www.cs.indiana.edu/~minaxi/pubs/sruti06.pdf
Kaboom — email filtering II - www.cyberdelix.net/tech/kaboom.htm
Filtering Spam At Your Leisure: post delivery filtering - www.uoregon.edu/~joe/maawg7/maawg7.ppt
FortiGuard AntiSpam Technology Overview - www.fortiguardcenter.com/antispam/antispam_info.html#spamtech
e-scribe Antispam: Technical Details - e-scribe.com/antispam/
“Default Deny” — A Paradigm Shift for E-Mail - matthias.leisi.net/
Spam Control: The Current Landscape - www.ferris.com/2007/01/02/the-commodity-status-of-spam-control/
The minimum antispam features of a modern SMTP server - utcc.utoronto.ca/~cks/space/blog/spam/MinimumSMTPFeatures
How PerfectMail works - www.xpmsoftware.com/index.php/xpm/howItWorks
Keeping Spam Out of the Network - www.avertlabs.com/research/blog/?p=194
Validating the sender domain - www.avertlabs.com/research/blog/?p=241
Email Relay Detection - mel.byu.edu/spam/
Sieve: A Mail Filtering Language - www.faqs.org/rfcs/rfc3028.html
SIEVE Email Filtering: Spamtest and VirusTest Extensions - www.faqs.org/rfcs/rfc3685.html
SPAM or NOT - spamornot.org/
Anti-Spam: The MagicMail Philosophy - www.linuxmagic.com/opensource/anti_spam/philosophy
Fake MX - www.fakemx.org/
High Speed Image Part Recognition (IPR) - www.comdomsoft.com/en/antispam/white-papers/high-speed-image-part-recognition-ipr.html
Sendmail Best Practices for Combating Spam - www.sendmail.com/sm/wp/spam_best_practices/
Proofpoint MLX Technology Whitepaper - www.proofpoint.com/id/mlxwp/
Fighting Back Against the Spam-Zombie Hordes - research.microsoft.com/news/featurestories/publish/SpamFighting.aspx
How Dynamic Are IP Addresses? - research.microsoft.com/projects/sgps/sigcomm2007.pdf
Spamming Botnets: Signatures and Characteristics -

Spam Filtering Case Studies

Local Mail Blocking Mechanisms - www.er6.eng.ohio-state.edu/mail_blocking.html
Anti-Spam Mechanisms on our Mail Servers - www.ultradesign.com/support/email/spamfilters.html
Mails rejected by anti-spam rules - web.ccr.jussieu.fr/anti-spam/rejet/rejet.html#english
Spam Filtering in a Small Business Environment, a Case Study - www.sans.org/rr/whitepapers/email/1213.php
Controlling Spam in a Small Business - www.sans.org/rr/whitepapers/email/1248.php
How to filter unsolicited e-mail on your mail server - www.sans.org/rr/whitepapers/email/582.php
Anti spam software – Spam Filter vs. Spam Block - spameater.com/anti-spam-software.html
Spam server details - gconnor.livejournal.com/97154.html
Spam Filtering Survey - ist-socrates.berkeley.edu:7309/public/spam_survey.html
A Study of Supervised Spam Detection - plg.uwaterloo.ca/~gvcormac/spamcormack.html
Review of Gordon Cormack's Study of Spam Detection - www.zdziarski.com/papers/cormack.html
Comparing SpamAssassin with CBDF email filtering - www.cs.bham.ac.uk/~mgl/cluk/papers/obrian.pdf
Bayesian Filtering, a review - freshmeat.net/articles/view/964/comparative
More on Spam - www.cookco.us/more_on_spam.htm
Traveler, a Spam-resistant E-mail System - www.vsta.org/spam/Traveler.html
Spam filtering best practice and how we filter spam - www.antespam.co.uk/how-we-filter-spam/
Spam Fighting at CERN - mmmtf.web.cern.ch/mmmtf/Minutes/2003-02-18/spamkiller.pdf
CERN AntiSpam Server side - https://websvc06.cern.ch/mmmservices/Antispam/ActionServer.aspx
Deployment Experience: Rolling Out a New Antispam Solution in a Large Corporation - www.ceas.cc/2006/2.pdf
Solving big problems with Open Source: e-mail - www.potentialtech.com/wmoran/spam.pdf
Email filtering with MIMEDefang - www.xs4all.nl/~johnpc/mimedefang-modular/yapceu2005.pdf
Using PostFix To Reject Spam - honeypot.net/filtering-spam-postfix
Greylisting, SpamAssassin, SpamProbe, Image Spam, DNSWL, and Viruses - www.chaosreigns.com/spam/
Fighting Spam in an ISP Environment - www.roaringpenguin.com/files/isp-spam.pdf
ASSP — extracting the ham from the spam - www.uniforum.chi.il.us/slides/assp.ppt

Spam Filtering References

Internet Standards

Verisign's SiteFinder and Spam Filtering

Verisign's Wildcard Service Deployment - www.icann.org/general/wildcard-history.htm

Email and Spam Research Groups

IBM Anti-Spam Research - domino.research.ibm.com/comm/research_projects.nsf/pages/spam.index.html
DoI: Denial of Information - www-static.cc.gatech.edu/projects/doi/
Collaborative Center for Internet Epidemiology and Defenses (CCIED) - www.ccied.org/
PARC email research group - www.parc.com/research/projects/email/
Max-Planck-Institut Informatik Machine Learning Group - www.mpi-inf.mpg.de/departments/rg2/
Microsoft Research Machine Learning and Applied Statistics (MLAS) group - research.microsoft.com/mlas/
Microsoft Research S-GPS: Spammer Global Positioning System - research.microsoft.com/research/sv/sgps/
Cloudmark Research Group - www.cloudmark.com/research/
Verisign Security Research Anti-spam Schema - www.verisign.com/research/Security_Research/037091.html
Knowledge Discovery and Data Mining Laboratory at UAB - www.cis.uab.edu/kddm/ - papers
Alek Kolcz et al. - ir.iit.edu/~alek/publications.html

Spammer Techniques

The Spammers' Compendium - www.jgc.org/tsc/
Observed Trends in Spam Construction Techniques: A Case Study of Spam Evolution - www.ceas.cc/2006/4.pdf
Spammer Tricks - www.rickconner.net/spamweb/tricks.html
How to spot a spam website - www.rickconner.net/spamweb/spamwebsites.html
Tricks for protecting spam websites - www.rickconner.net/spamweb/web-dns-tricks.html
Spammer Tricks - gregsearle.tripod.com/spam_tech.html
The Effects of AntiSpam Methods on Spam Mail - www.ceas.cc/2006/24.pdf
Spam Techniques - st.do.homeunix.org/
A day in the life of a spammer - matthias.leisi.net/archives/126-A-day-in-the-life-of-a-spammer.html
Pathological Study of Junk Mails - junkmatcher.sourceforge.net/Pathology/
Round robin DNS - www.spamtrackers.eu/wiki/index.php?title=Round_robin
Pharmacy Alert Security Team - pharmalert.zoomshare.com/
Image spam by the numbers - www.csoonline.com/article/221254
ISP Spam Issues - www.spamhaus.org/faq/answers.lasso?section=ISP%20Spam%20Issues
Host cloaking technique used by spammers - thespamdiaries.blogspot.com/2006/02/new-host-cloaking-technique-used-by.html
Know Your Enemy: Fast-Flux Service Networks - www.honeynet.org/papers/ff/
Spamscatter: Characterizing Internet Scam Hosting Infrastructure - www.cs.ucsd.edu/~voelker/pubs/spamscatter-security07.pdf
Anatomy of Spam - anatomyofspam.spaces.live.com/

Spam Filtering Benchmarks and Reviews

Testing the effectiveness of spam filtering.

Spam Filter Benchmarking and Testing

VeriTest Anti-Spam Benchmark Service - www.lionbridge.com/lionbridge/en-US/services/outsourced-testing.htm
TREC Spam Filter Evaluation Tool Kit - plg.uwaterloo.ca/~gvcormac/jig/
Discovery Challenge - www.ecmlpkdd2006.org/challenge.html
Generic Test for Unsolicited Bulk Email (GTUBE) - spamassassin.apache.org/gtube/
Spirent Avalanche - www.spirentcom.com/

Spam Filter Reviews

Anti-spam Tool League Table - www.jgc.org/astlt/
Security appliances keep mail stream clean - www.gcn.com/print/24_7/35399-1.html
Spam Filters - freshmeat.net/articles/view/964/comparative
SC Magazine antispam - www.scmagazineus.com/Topic/Spam-Techniques/tag/41/0/
SC Magazine content security awards - www.scmagazineuk.com/Awards/section/341/
SC Magazine US Email Content Filtering 2007 - www.scmagazineus.com/Email-Content-Filtering-2007/GroupTest/47/
Spam Filter Reviews - spam-filter-review.toptenreviews.com/
Winning the War on spam: Comparison of Bayesian spam filters - home.dataparty.no/kristian/reviews/bayesian/
WhichSpamFilter - www.whichspamfilter.com/
PCMAG antispam software - www.pcmag.com/category2/0,1874,4795,00.asp
PCWorld spam reviews - www.pcworld.com/browse/1353/topic.html?page=1&typeId=3
Spam Filtering II - sam.holden.id.au/writings/spam2/
pi's Bogofilter page - piology.org/bogofilter/
Four Cans of Anti-Spam - sartryck.idg.se/Art/Antispamboxar_1_NOK182005e.html
Email & Spam Filtering Stats: Jan 12, 2006 - daggle.com/060113-114819.html
A Last Go At Spam Filtering Before Whitelisting - daggle.com/060119-123011.html
Stats Say: Sticking With Gmail! - daggle.com/060120-112335.html
DNS Blocklist Accuracy Figures (as of July 2005) - wiki.apache.org/SpamAssassin/DnsblAccuracy082005
Enterprise Spam Filters Review - www.networkcomputing.com/showArticle.jhtml?articleId=173602950
Connection scoring beats spam filtering - windowssecrets.com/comp/060126/#story1
Wait a minute Mr. Postman! - www.gcn.com/print/25_14/40896-1.html
Anti-Spam State of the Art - spam.ani.univie.ac.at/files/FA384018-1.pdf
Email Classification - www.massey.ac.nz/~tameyer/research/spambayes/
Network World anti-spam buyer's guide - www.networkworld.com/buyersguides/guide.php?cat=865463
Network World: Spam in the Wild, The Sequel - www.networkworld.com/reviews/2004/122004spampkg.html

More Than Just Mail Filtering

Firewalls, routers, DNS and playing it slow.

Network Based Spam Filtering

Cutting off IP connectivity to spam sources - spam.abuse.net/adminhelp/ip.shtml
Spam Blocking with a Dynamically Updated Firewall Ruleset - deny-spammers.sourceforge.net/
Packetbl - wiki.duskglow.com/tiki-index.php?page=Packetbl
MAPS RBL BGP Feed Configuration FAQ for Cisco Routers - www.pch.net/documents/tutorials/maps-rbl-bgp-cisco-config-faq.html

Source Device Fingerprinting

OpenBSD's fingerprinting and shaping used for evil^Wgood - use.perl.org/~merlyn/journal/17094
Some p0f Data - taint.org/2006/10/03/193930a.html
Passively OS Fingerprinting Email with PF - blog.insidesystems.net/articles/2006/06/06/OS-Fingerprinting-Email
p0f analyzer - www.ijs.si/software/p0f-analyzer.pl

Playing it Slow

Spam Traffic Management

Greylisting

Nolisting

Nolisting - www.joreybump.com/code/howto/nolisting.html

Unlisting

Unlisting: Port Knocking for SMTP - www.joreybump.com/code/howto/unlisting.html

Blacklisting vs. Content Filters

Some views about the benefits or otherwise of “blacklists” or content filters.

Filters vs. Blacklists - www.paulgraham.com/falsepositives.html
Who Runs The Blocklists? - www.linxnet.com/misc/spam/blocklists.html
Why Content Blocking Does Not Work - www.knujon.com/contentblock.html

Distributed Spam Filtering

Using shared data to fine-tune filtering algorithms.

Distributed [spam] Early Warning System - www.radagast.org/~dplatt/dews/dews-design-sketch.txt
Spam Agent Architecture - linux.ucla.edu/~larva/spam-agent/
Attack Resistant Trust Metric Metadata HOWTO - www.levien.com/free/tmetric-HOWTO.html
Spam Inoculation Messages - www.zdziarski.com/papers/draft-spamfilt-inoculation-03.txt
Personalised, Collaborative Spam Filtering (CASSANDRA) - https://www.cs.tcd.ie/publications/tech-reports/reports.04/TCD-CS-2004-36.pdf
Reputation Network Analysis for Email Filtering - trust.mindswap.org/papers/emailPaper/
Complement Set Filtering - en.wikipedia.org/wiki/Complement_set_email_filtering
SpamWatch - www.cs.berkeley.edu/~zf/spamwatch/
Personalised, collaborative spam filtering - www.ceas.cc/papers-2004/132.pdf
Berkeley Workshop on Collaborative Filtering - www2.sims.berkeley.edu/resources/collab/
Collaborative Filtering Research Papers - jamesthornton.com/cf/
Collaborative Filtering - pespmc1.vub.ac.be/COLLFILT.html

Text Classification Spam Filtering

Most text classification spam filters use some form of machine learning to learn how to filter, rather than relying on manually built rules. First implemented in several client-side spam filters, the approach now holds much more potential as part of a server-side spam filter, or as a feed for a DNSBL with less serious collateral damage. Bayes' theorem is the buzzword of the day.
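
As a rough illustration of what "learning how to filter" means, here is a minimal naive Bayes sketch in Python. It is not taken from any of the filters linked on this page. The class name NaiveBayesFilter, the tokenizer, and the four training messages are all made up for the example, and it assumes at least one spam and one ham message have been trained before scoring.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class NaiveBayesFilter:
    """Hypothetical toy filter: word counts per class plus Bayes' theorem."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the tokens of one message under 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return P(spam | text) under the naive independence assumption."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Log prior: fraction of training messages with this label.
            score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                # Add-one (Laplace) smoothing so unseen words never give log(0).
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = score
        # Normalize the two class scores; subtracting the max avoids underflow.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


# Tiny made-up training set, for illustration only.
nb = NaiveBayesFilter()
nb.train("cheap pills buy now", "spam")
nb.train("buy cheap watches now", "spam")
nb.train("meeting agenda for monday", "ham")
nb.train("lunch on monday", "ham")

print(round(nb.spam_probability("cheap pills now"), 3))
print(round(nb.spam_probability("monday meeting"), 3))
```

The first score should come out well above 0.5 and the second well below it. Real filters differ mainly in tokenization, in how per-word probabilities are combined, and in how much mail they are trained on; the papers listed below cover those choices.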

Spam Classification Overviews

Introduction to Bayesian Filtering - www.process.com/precisemail/bayesian_filtering.htm
SpamBayes Background Reading - spambayes.sourceforge.net/background.html
Why Bayesian filtering is the most effective - www.gfi.com/whitepapers/why-bayesian-filtering.pdf
Machine Learning for Text Classification - www.daviddlewis.com/publications/slides/lewis-2003-0117-spamconf.html
Filtering Research - www.paulgraham.com/bayeslinks.html
Spam Detection - radio.weblogs.com/0101454/stories/2002/09/16/spamDetection.html
Statistics and the war on spam. In Statistics, A Guide to the Unknown
Stopping spam with statistics - research.microsoft.com/~joshuago/significance-spam_edited2-times.pdf
A Plan for Spam - www.paulgraham.com/spam.html
Better Bayesian Filtering - www.paulgraham.com/better.html
About Bayesian Spam Filtering - email.about.com/cs/bayesianfilters/a/bayesian_filter.htm

Spam Classification Bibliographies

Bibliography on Machine Learning for Spam Detection - liinwww.ira.uka.de/bibliography/Ai/MLSpamBibliography.html
www.iis.sinica.edu.tw/~jhwang/spam-paper.html
research.microsoft.com/~joshuago/spambibliography.mht

Spam Classification Research

Spam Classification Research 2007

Technology Impact Assessment: Fingerprinting versus Bayesian Filtering
Intent Based Filtering of Spam
Improving Spam Filtering by Detecting Gray Mail
Hardening Fingerprinting by Context
Dirichlet-Enhanced Spam Filtering based on Biased Samples

Spam Classification Research 2006

Spam Filtering with Naive Bayes — Which Naive Bayes?
Online Discriminative Spam Filter Training
Batch and Online Spam Filter Comparison
Learning at Low False Positive Rates
Fast Uncertainty Sampling for Labeling Large E-mail Corpora
Breaking Anti-Spam Systems with Parasitic Spam
An Adaptive, Semi-Structured Language Model Approach to Spam Filtering on a New Corpus
An Empirical Study of Clustering Behavior of Spammers and Group-based Anti-Spam Strategies
The challenges of service-side personalized spam filtering: scalability and beyond
Topic Models Based Personalized Spam Filter - see: PDF, slides

Spam Classification Research 2005

SMTP Path Analysis
Spam Corpus Creation for TREC
Spamato – An Extendable Spam Filter System
GoodWord Attacks on Statistical Spam Filters
Naive Bayes Spam Filtering Using Word-Position-Based Attributes
Scalable and Reliable Collaborative Spam Filters: Harnessing the Global Social Email Networks
Stopping Outgoing Spam by Examining Incoming Server Logs
Comparative Graph Theoretical Characterization of Networks of Spam and Legitimate Email
Spam Deobfuscation using a Hidden Markov Model
Let Your CyberAlter Ego Share Information and Manage Spam

Spam Classification Research 2004

Canning more than SPAM
A Unified Model Of Spam Filtration
Scalable Centralized Bayesian Spam Mitigation with Bogofilter
Improving spam filtering by combining Naïve Bayes with simple k-nearest neighbor searches
Chung Kwei
The Impact of Feature Selection on Signature-Driven Spam Detection
Word Stemming to Enhance Spam Filtering
Exploring Support Vector Machines and Random Forests for Spam Detection
Filtron: A Learning-Based Anti-Spam Filter
On Attacking Statistical Spam Filters
SpamBayes: Effective open-source email classification system
Trends in Spam Products and Methods
Spamguru: An enterprise anti-spam filtering system
How to beat a bayesian spam filter
Advanced language classification using chained tokens
The plateau at 99.9
The more things change: Volatility and stability in spam features
Behavior based spam detection
Email Mining Toolkit
An artificial neural network spam classifier
Spam filtering using contextual network graphs
Spam, damn spam, and statistics: Using statistical analysis to locate spam web pages
Characterizing Spam Traffic
Bayesian Noise Reduction: Contextual Symmetry Logic Utilizing Pattern Consistency Analysis
Personal Email Networks: An Effective Anti-Spam Tool
Learning to Filter Junk E-Mail from Positive and Unlabeled Examples

Spam Classification Research 2003

A comparison of event models for naive bayes anti-spam e-mail filtering
On memory-bound functions for fighting spam
Moderately Hard, Memory-bound Functions
A case-based approach to spam filtering that can track concept drift
'in vivo' spam filtering: A challenge problem for data mining
Using latent semantic indexing to filter spam
Parameterization of Naïve Bayes for Spam Filters
A memory-based approach to anti-spam filtering for mailing lists
Spam filters: Bayes vs. chi-squared; letters vs. words
Bayesian spam filtering tweaks
Sparse binary polynomial hash message filtering and the crm114 discriminator
Automatic feature induction for text classification

Spam Classification Research 2002

Robust Feature Selection by Mutual Information Distributions
Evaluating cost-sensitive unsolicited bulk email categorization

Spam Classification Research 2001

Boosting Trees for Anti-Spam Email Filtering [extended version]
Stacking classifiers for anti-spam filtering of e-mail
A Memory-Based Approach to Anti-Spam Filtering for Mailing Lists
SVM-based filtering of e-mail spam with content-specific misclassification costs

Spam Classification Research 2000

An evaluation of Naïve Bayesian anti-spam filtering
ifile: An application of machine learning to mail filtering
Learning to filter spam e-mail: A comparison of a naïve Bayesian and a memory-based approach
A comparative study of classification-based personal e-mail filtering
An experimental comparison of naive bayesian and keyword-based anti-spam filtering with personal e-mail messages
Combining text and heuristics for cost-sensitive spam filtering

Spam Classification Research 1999

Naïve-Bayes vs. Rule-Learning in Classification of Email
Performance Comparison between Genetic Programming & Naïve Bayes

Spam Classification Research 1998 and earlier

A Bayesian Approach to Filtering Junk E-mail
SpamCop: A Spam Classification & Organization Program
Learning Rules that classify Email

everything you didn't want to have to know about spam
Hosted by spam.abuse.net, with help from Neil Schwartzman. Domain registration by Gregg DesElms. Logo by Art101.
This work is licensed under a Creative Commons License. SPAM is a trademark of Hormel Foods.
Page last updated: 05-Jul-2008