This is a non-IMPACT record, meaning that access to the data is not controlled by IMPACT. For access, see the directions below.

This Resource is offered and provided outside of the IMPACT mediation framework. IMPACT and the IMPACT Coordination Council/Blackfire Technology, Inc. expressly disclaim all conditions, representations and warranties including but not limited to Resource availability, quality, accuracy, non-infringement, and non-interference. All Resource information and access is controlled by entities and under terms that are external to the IMPACT legal framework.


Phishing Website Data Set
External Dataset
External Data Source
University of California, Irvine
56 (lowest rank is 56)

Category & Restrictions

cyber crime, phishing


This dataset highlights the features that have proved sound and effective in predicting phishing websites.

Although many articles on predicting phishing websites have been published, no reliable training dataset has previously been made publicly available. This may be because the literature does not agree on the definitive features that characterize phishing webpages, which makes it difficult to shape a dataset that covers all possible features. This dataset was collected mainly from the PhishTank archive, the MillerSmiles archive, and Google's search operators.
Data Set Characteristics: N/A
Number of Instances: 2456
Area: Computer Security
Attribute Characteristics: Integer
Number of Attributes: 30
Date Donated: 2015-03-26
Associated Tasks: Classification
Missing Values: N/A
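
The characteristics above (30 integer attributes, a classification task) suggest a simple sanity check when reading the data. The following is a minimal sketch, not the provider's own loader. It assumes that each data row is a comma-separated line of 30 integer attributes followed by one integer class label, and that header or comment lines start with "@" or "%" as in ARFF files. This record does not state the file layout, so treat both points as assumptions. The name parse_rows is hypothetical.

```python
import csv
import io

NUM_ATTRIBUTES = 30  # "Number of Attributes" from the record above


def parse_rows(text):
    """Parse comma-separated integer rows into (features, label) pairs.

    Assumptions (not stated in the record): each data row holds 30
    integer attributes followed by one integer class label; blank lines
    and lines starting with '@' or '%' are header/comment lines.
    """
    samples = []
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines and ARFF-style header or comment lines.
        if not any(cell.strip() for cell in row):
            continue
        if row[0].lstrip().startswith(("@", "%")):
            continue
        values = [int(cell) for cell in row]
        if len(values) != NUM_ATTRIBUTES + 1:
            raise ValueError(
                f"expected {NUM_ATTRIBUTES + 1} columns, got {len(values)}"
            )
        # Last column is assumed to be the class label.
        samples.append((values[:-1], values[-1]))
    return samples
```

To use it, read the downloaded file into a string and pass it to parse_rows. Each returned pair holds the list of 30 attribute values and the assumed class label, ready to hand to a classifier of your choice.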
