Automated Test Case Generator for Phishing Prevention Using Generative Grammars and Discriminative Methods
Date
2015
Authors
Palka, Sean
Abstract
This research details a methodology for creating content in support of phishing prevention tasks, including live exercises and detection algorithm research. Our system uses probabilistic context-free grammars (PCFGs) and variable interpolation as part of a multi-pass method to create diverse and consistent phishing email content on a scale not achieved in previous research. This system, which we have named PhishGen, can generate a large amount of unique content for use in live exercises or for building training datasets for phishing detection methods and filter settings. PhishGen is a web-based application that implements our underlying methodology and provides a user interface for building and modifying PCFG rules and weights. The system is released as an open-source tool so that other researchers can access it. PhishGen has already been used in support of live commercial phishing exercises and is being used to develop content for commercial frameworks.
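The multi-pass idea described in the abstract (weighted grammar expansion followed by variable interpolation) can be illustrated with a minimal sketch. This is not PhishGen's actual code: the grammar, the rule weights, the variable names (name, service, sender) and the functions expand and generate are all hypothetical, chosen only to show how a PCFG pass can produce varied templates while a second interpolation pass keeps the substituted values consistent within one message.

```python
import random
import string

# Hypothetical toy grammar (an assumption for illustration, not PhishGen's rules).
# Each nonterminal maps to a list of (weight, expansion) pairs.
GRAMMAR = {
    "EMAIL": [(1.0, ["GREETING", "BODY", "CLOSING"])],
    "GREETING": [(0.7, ["Dear $name,"]), (0.3, ["Hello $name,"])],
    "BODY": [
        (0.5, ["Your $service account requires verification."]),
        (0.5, ["We detected unusual activity on your $service account."]),
    ],
    "CLOSING": [(0.6, ["Regards, $sender"]), (0.4, ["Thank you, $sender"])],
}


def expand(symbol, rng):
    """Pass 1: recursively expand a symbol using weighted random choice."""
    if symbol not in GRAMMAR:
        return [symbol]  # not a nonterminal, so treat it as terminal text
    weights = [weight for weight, _ in GRAMMAR[symbol]]
    expansions = [expansion for _, expansion in GRAMMAR[symbol]]
    chosen = rng.choices(expansions, weights=weights, k=1)[0]
    parts = []
    for child in chosen:
        parts.extend(expand(child, rng))
    return parts


def generate(variables, seed=None):
    """Expand the grammar into a template, then interpolate variables."""
    rng = random.Random(seed)
    template = " ".join(expand("EMAIL", rng))
    # Pass 2: every $placeholder is filled from the same mapping, so a value
    # such as the sender is identical wherever it appears in the message.
    return string.Template(template).substitute(variables)


if __name__ == "__main__":
    values = {"name": "Alice", "service": "ExampleMail", "sender": "IT Support"}
    for i in range(3):
        print(generate(values, seed=i))
```

Separating the two passes means the grammar controls structural variety through its weights, while the variable mapping controls per-target details; changing either one does not require touching the other.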
Keywords
Information technology, Cyber Security, Generative Grammars, Natural Language Processing, Phishing