Cybersecurity
New method to rid inboxes of unsolicited e-mail

Published 26 November 2012

Spam used to be text-based, but has recently turned high-tech, using layers of images to fool automatic filters; thanks to some sophisticated new cyber-sleuthing, researchers at Concordia University are working toward a cure

These days spam plagues e-mail inboxes around the world, hawking miracle pills and enticing the gullible with tales of offshore bank accounts containing untold fortunes.

These once-text-based e-mail infiltrators have recently turned high-tech, using layers of images to fool automatic filters. Thanks to some sophisticated new cyber-sleuthing, researchers at Concordia University’s Institute of Information Systems Engineering are working toward a cure.

A Concordia University release reports that Ph.D. candidate Ola Amayri and her thesis supervisor, Nizar Bouguila, conducted a comprehensive study of several spam filters while developing a new and efficient one. They have now proposed a new statistical framework for spam filtering that quickly and efficiently blocks unwanted messages.

“The majority of previous research has focused on the textual content of spam e-mails, ignoring visual content found in multimedia content, such as images. By considering patterns from text and images simultaneously, we’ve been able to propose a new method for filtering out spam,” says Amayri, who recently published her findings online in a series of international conferences and peer-reviewed journals.

Amayri explains that new spam messages often employ sophisticated tricks, such as deliberately obscuring text, obfuscating words with symbols, and using batches of the same images with different backgrounds and colors that might contain random text from the Web.

Until now, however, the majority of research in the domain of e-mail spam filtering has focused on the automatic extraction and analysis of the textual content of spam e-mails and has ignored the rich nature of image-based content. When these tricks are used in combination, traditional spam filters are powerless to stop the messages, because they normally focus on either text or images but rarely both.
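To make the word-obfuscation trick concrete, here is a minimal Python sketch of how a filter might undo symbol substitutions before looking at the text. It is an illustration only and is not taken from the Concordia researchers' work; the substitution table and the function names `normalize_token` and `normalize_text` are assumptions made up for this example.

```python
import re

# Hypothetical substitution table (an assumption for this sketch):
# symbols and digits that are often swapped in for look-alike letters.
LOOKALIKES = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "@": "a", "$": "s", "!": "i",
})


def normalize_token(token: str) -> str:
    """Lowercase a token, map look-alike symbols to letters,
    and drop any remaining non-letter characters."""
    token = token.lower().translate(LOOKALIKES)
    # Removes separators inserted between letters, e.g. "m.o.n.e.y".
    return re.sub(r"[^a-z]", "", token)


def normalize_text(text: str) -> list[str]:
    """Split on whitespace and normalize each token, skipping empties."""
    normalized = (normalize_token(tok) for tok in text.split())
    return [tok for tok in normalized if tok]


if __name__ == "__main__":
    print(normalize_text("Fr33 p!lls and m.o.n.e.y"))
```

A table like this is crude: it also rewrites legitimate numbers and punctuation, and it does nothing about text hidden inside images, which is why the article's point about combining text and image analysis matters.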

So how do we stop spam before it inundates our inboxes?

“Our new method for spam filtering is able to adapt to the dynamic nature of spam emails and accurately handle spammers’ tricks by carefully identifying informative patterns, which are automatically extracted from both text and images content of spam emails,” says Amayri.

After conducting extensive experiments on traditional spam-filtering methods, which were general-purpose and limited to patterns found in either text or images, she developed a much stronger approach to filtering out unwanted e-mails, based on techniques used in pattern recognition and data mining. Although the new method has been tested on English-language spam e-mails, Amayri says it can easily be extended to other languages.
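The release does not describe the statistical framework in detail, so the following Python sketch should not be read as Amayri and Bouguila's method. It swaps in a plain naive Bayes classifier, a standard textbook technique, simply to show what scoring text and image evidence together can look like. The image descriptors (for example `noisy_background`) are hypothetical labels assumed to come from some earlier image-analysis step, and the class and function names are invented for this example.

```python
import math
from collections import Counter


def features(words, image_tags):
    """Merge text tokens and (hypothetical) image descriptors
    into a single list of prefixed features."""
    return [f"txt:{w.lower()}" for w in words] + [f"img:{t}" for t in image_tags]


class JointNaiveBayes:
    """Multinomial naive Bayes over text and image features together.
    Assumes at least one training message per class before classifying."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.counts = {label: Counter() for label in self.LABELS}
        self.docs = Counter()

    def train(self, label, feats):
        self.counts[label].update(feats)
        self.docs[label] += 1

    def _log_score(self, label, feats, vocab_size):
        total = sum(self.counts[label].values())
        # Class prior, then Laplace-smoothed likelihood of each feature.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for f in feats:
            score += math.log((self.counts[label][f] + 1) / (total + vocab_size))
        return score

    def classify(self, feats):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        vocab_size = len(vocab) + 1  # +1 reserves mass for unseen features
        return max(self.LABELS, key=lambda c: self._log_score(c, feats, vocab_size))


if __name__ == "__main__":
    nb = JointNaiveBayes()
    nb.train("spam", features(["cheap", "pills"], ["noisy_background", "embedded_text"]))
    nb.train("ham", features(["meeting", "agenda"], []))
    nb.train("ham", features(["project", "report"], ["photo"]))
    # Unfamiliar wording, but the image evidence resembles known spam.
    print(nb.classify(features(["hello"], ["noisy_background", "embedded_text"])))
```

In this toy setup, a message whose wording the classifier has never seen can still be flagged because its image features match earlier spam, which is the kind of gap a text-only filter would leave open.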

The release notes that while this new spam-detecting approach is still in the development stage, Amayri and Bouguila are currently working on a plug-in for SpamAssassin, the world’s most widely used open-source spam filter. Amayri hopes that this plug-in will allow other researchers to perform further tests and make more progress in the field of spam detection.

“Spammers keep adapting their methods so that they can trick the spam filters,” says Amayri.

“Researchers in this field need to band together to keep adapting our methods too, so that we can keep spam out and focus on those messages that are really important.”

The completion of this research was made possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC).