This is a non-IMPACT record, meaning that access to the data is not controlled by IMPACT. For access, see the directions below.

Disclaimer:
This Resource is offered and provided outside of the IMPACT mediation framework. IMPACT and the IMPACT Coordination Council/Blackfire Technology, Inc. expressly disclaim all conditions, representations and warranties including but not limited to Resource availability, quality, accuracy, non-infringement, and non-interference. All Resource information and access is controlled by entities and under terms that are external to the IMPACT legal framework.

Summary

DS-0945
The CTU-13 Dataset. A Labeled Dataset with Botnet, Normal and Background Traffic
External Dataset
External Data Source
Stratosphere Lab
01/01/2011
11/11/2007
56 (lowest rank is 56)

Category & Restrictions

Other
malware, malicious traffic, botnet
Unrestricted
Unknown

Description


The goal of the dataset was to have a large capture of real botnet traffic mixed with normal traffic and background traffic. The CTU-13 dataset consists of thirteen captures (called scenarios) of different botnet samples. In each scenario we executed a specific malware, which used several protocols and performed different actions.

Each scenario was captured in a pcap file that contains all the packets of the three types of traffic. These pcap files were processed to obtain other types of information, such as NetFlows, WebLogs, etc. The first analysis of the CTU-13 dataset, which was described and published in the paper "An empirical comparison of botnet detection methods", used unidirectional NetFlows to represent the traffic and to assign the labels. These unidirectional NetFlows should not be used because they were outperformed by our second analysis of the dataset, which used bidirectional NetFlows. The bidirectional NetFlows have several advantages over the unidirectional ones: first, they solve the issue of differentiating between the client and the server; second, they include more information; and third, they include much more detailed labels. The second analysis of the dataset with the bidirectional NetFlows is the one published here.
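
To illustrate why bidirectional NetFlows make the client and server roles explicit, the following is a minimal sketch of pairing two unidirectional flow records into one bidirectional record. It is not the tool or method used to build CTU-13. The record types (UniFlow, BiFlow), their field names, and the rule "the direction seen first is the client" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class UniFlow:
    """Hypothetical unidirectional flow record (one direction only)."""
    src: str
    sport: int
    dst: str
    dport: int
    proto: str
    start: float
    packets: int
    total_bytes: int


@dataclass
class BiFlow:
    """Hypothetical bidirectional flow record with explicit roles."""
    client: str
    client_port: int
    server: str
    server_port: int
    proto: str
    start: float
    packets: int
    total_bytes: int
    client_bytes: int  # bytes sent in the client-to-server direction


def merge_unidirectional(flows):
    """Pair each flow with its reverse direction.

    Assumption for this sketch: the direction observed first is
    treated as the client side of the conversation.
    """
    merged = {}
    for f in sorted(flows, key=lambda fl: fl.start):
        fwd = (f.src, f.sport, f.dst, f.dport, f.proto)
        rev = (f.dst, f.dport, f.src, f.sport, f.proto)
        if rev in merged:
            # Reply traffic: server-to-client direction.
            b = merged[rev]
            b.packets += f.packets
            b.total_bytes += f.total_bytes
        elif fwd in merged:
            # More traffic in the client-to-server direction.
            b = merged[fwd]
            b.packets += f.packets
            b.total_bytes += f.total_bytes
            b.client_bytes += f.total_bytes
        else:
            # First time this conversation is seen.
            merged[fwd] = BiFlow(f.src, f.sport, f.dst, f.dport, f.proto,
                                 f.start, f.packets, f.total_bytes,
                                 f.total_bytes)
    return list(merged.values())
```

With unidirectional records, a request and its reply appear as two unrelated rows. After merging they form a single row in which the initiating host is identified, and that row can then carry one label for the whole conversation.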

The relationship between the duration of the scenario, the number of packets, the number of NetFlows and the size of the pcap file is shown in Table 3. The table also shows the malware used to create the capture and the number of infected computers in each scenario.

Contact: STRATOSPHEREIPS@AGENTS.FEL.CVUT.CZ

Additional Details

1.8GB
false
Unknown
dataset, traffic, botnet, ctu, background, normal, labeled, 945, the ctu-13 dataset. a labeled dataset with botnet, normal and background traffic, inferlink corporation, external, corporation, 2011, inferlink, external data source, source, scenario, capture, malware, called, specific, actions, thirteen, captures, performed, scenarios, executed, protocols, samples, goal, consists, mixed, normaltraffic, real, netflows, bidirectional, analysis, pcap, table, include, file, packets, published, unidirectional, labels, advantages, cz, infected, types, cvut, represent, detailed, weblogs, processed, agents, assign, detection, computers, solve, size, issue, comparison, stratosphereips, create, differentiating, fel, type, files, server, relationship, client, captured, directional, outperformed, paper, duration, methods, other, empirical