Unsolicited bulk email, commonly known as spam, represents a significant problem on the Internet. The seriousness of the situation is reflected by the fact that approximately 97% of all email traffic at the time of writing (2009) is spam. To fight this problem, various anti-spam methods have been proposed and implemented to filter out spam before it is delivered to recipients, but none of these methods is entirely satisfactory. In this project we investigate the effectiveness of several spam filtering techniques and technologies by analyzing email traffic under different conditions. We show that genetic-algorithm-based spam filters perform best at the server level and that Bayesian filters are the most appropriate for filtering at the user level.


Spam (Monty Python, 1989) is probably the greatest single nuisance for users of the Internet. As well as being annoying, spam introduces serious security risks: it is often used to conduct fraud, as a conduit for malicious software, and to carry out denial-of-service attacks on mail servers. Despite significant anti-spam efforts, the development of powerful spam filtering technologies, and even new legislation in many countries, the incidence of spam remains stubbornly high (Geerthik, 2013).

The main current anti-spam techniques focus on filtering incoming email, whether by parsing message content (as in Bayesian filtering), maintaining Domain Name System (DNS) block lists, or using collaborative filtering databases (Schwartz, 1998). Spammers tend to be resourceful, though, and quickly find ways around most countermeasures. There have also been various attempts to make anti-spam systems more adaptive, with reporting services that allow spam filters and block lists to keep up to date with new spam techniques. There is a feeling, though, that spammers are always a step ahead, with the anti-spam community following with countermeasures some time afterwards (Simson, 1998).
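To make the Bayesian filtering mentioned above concrete, the sketch below scores a message by combining per-word spam likelihoods with Laplace smoothing. The tiny training corpus, the whitespace tokenizer, and the smoothing constants are illustrative assumptions, not this project's actual implementation.

```python
import math

def train(messages):
    """messages: list of (text, is_spam) pairs.
    Returns per-class word counts and per-class word totals."""
    counts = {"spam": {}, "ham": {}}
    totals = {"spam": 0, "ham": 0}
    for text, is_spam in messages:
        label = "spam" if is_spam else "ham"
        for word in text.lower().split():
            counts[label][word] = counts[label].get(word, 0) + 1
            totals[label] += 1
    return counts, totals

def spam_probability(text, counts, totals, prior_spam=0.5):
    """Naive Bayes score: accumulate log-odds over words, with
    Laplace (add-one) smoothing so unseen words do not zero out."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    log_odds = math.log(prior_spam / (1 - prior_spam))
    for word in text.lower().split():
        p_w_spam = (counts["spam"].get(word, 0) + 1) / (totals["spam"] + len(vocab) + 1)
        p_w_ham = (counts["ham"].get(word, 0) + 1) / (totals["ham"] + len(vocab) + 1)
        log_odds += math.log(p_w_spam / p_w_ham)
    return 1 / (1 + math.exp(-log_odds))  # convert log-odds back to a probability
```

Trained on even a handful of labeled messages, the score separates spam-like from ham-like text; a deployment would add a decision threshold and continuous retraining as users mark messages.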

In addition to technical countermeasures, several governments have introduced legislation outlawing spam and providing for stiff penalties. Despite some prosecutions, such legislation has had little or no effect, for reasons such as the inter-jurisdictional nature of email traffic and spam.

Internet email is transferred using the Simple Mail Transfer Protocol (SMTP), first introduced in 1982 (Postel, 1982), at a time when the total number of Internet hosts was just a few hundred and trust between them could be assumed; the protocol was therefore designed to be lightweight and open. Attempts to introduce authentication, via extensions or with new protocols that sit on top of SMTP (such as Pretty Good Privacy), have proven useful in some situations but are very far from universal adoption.
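The openness described above is visible in the SMTP envelope itself: nothing in the base protocol verifies that the sender address a client claims actually belongs to it. The sketch below (with hypothetical host and mailbox names) merely builds that command sequence; a base-SMTP server would accept a forged MAIL FROM without complaint.

```python
def smtp_envelope(helo_host, mail_from, rcpt_to):
    """Return the command sequence an SMTP client sends before the
    message body. Base SMTP performs no authentication, so mail_from
    can be any address the client chooses."""
    return [
        f"HELO {helo_host}",
        f"MAIL FROM:<{mail_from}>",  # unauthenticated: any address is accepted
        f"RCPT TO:<{rcpt_to}>",
        "DATA",
    ]

# A forged envelope sender is indistinguishable from a genuine one:
cmds = smtp_envelope("mail.example.org", "ceo@bank.example", "victim@example.com")
```

This absence of sender verification is precisely what spammers exploit, and what motivates layering trust information on top of the protocol rather than replacing it.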

In this project, we explore the use of decentralized trust to provide a more robust spam filtering system. A well-known definition of trust is “a particular level of the subjective probability with which an agent will perform a particular action” (Gambetta, 1998). Trust is primarily a social concept and, by the above definition, is personalized by the subject. In this project, we make use of the socially inspired notion of trust based on experience and recommendation.

The peer-to-peer nature of the Internet mail infrastructure suggests that it should be well suited to a distributed approach to trust management.

This project proposes an architecture and protocol for establishing and maintaining trust between mail servers. The architecture is effectively a closed-loop control system used to adaptively improve spam filtering. In this approach, mail servers dynamically record trust scores for other mail servers; trust by one mail server in another is influenced by direct experience of that server (i.e. based on mail relayed by it) as well as by recommendations issued by collaborating mail servers. By modeling trust interactions between mail servers, we explore how trust values can be combined with existing mail filtering techniques.
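The closed-loop trust update described above can be sketched as follows. The blend between direct experience and peer recommendations, and the smoothing factor against the current score, are illustrative parameter choices, not values fixed by the project.

```python
def update_trust(current, experience, recommendations, alpha=0.7, beta=0.3):
    """One iteration of the trust loop for a remote mail server.

    current:         the server's existing trust score in [0, 1]
    experience:      fraction of mail recently relayed by it that was ham
    recommendations: trust scores reported by collaborating servers
    alpha, beta:     illustrative weights favoring direct experience
    """
    # Fall back to direct experience when no recommendations arrive.
    rec = sum(recommendations) / len(recommendations) if recommendations else experience
    evidence = alpha * experience + beta * rec
    # Exponential smoothing keeps scores stable against short bursts.
    return 0.5 * current + 0.5 * evidence
```

A filter could then, for example, relax its spam threshold for mail relayed by highly trusted servers and tighten it for untrusted ones, closing the loop between filtering outcomes and trust.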


In 1978, seven years after the first email, what is considered to be the first spam email was sent (Templeton, 2012). It was sent by a marketer at Digital Equipment Corporation (DEC) to several hundred recipients. This may not sound like a lot, but since the ARPANET on the west coast, where the spam was sent out, had a grand total of 1,200 possible targets, the action was quite grievous indeed (Tomlinson, 2012). The marketing ploy got a lot of attention, most of it negative, much like reactions to spam today, although the number of spam messages is now closing on 200 billion (Commtouch, 2012).

Every spam message sent through the system uses resources that could be better employed elsewhere. For a single message this might not be a big problem, but given the sheer amount of spam in circulation, a great deal of computing resource is consumed filtering out unsolicited mail. This concerns only the resources used by the spam itself and does not take into account the malware or scams that may come with it. In short, just as in 1978, spam is still very much unwanted and unwelcome.


The aim of this project is to analyze threats and attacks in an open, decentralized, distributed spam filtering system using a Bayesian filter framework.

The main objectives are:

·         To use the Bayesian filter framework and related techniques to analyze the effectiveness of anti-spam filtering of threats on email servers.

·         To show how spam emails are generated by attackers.

·         To determine whether an email is spam.

·         To identify a secure way of preventing spam email and protecting users' information.