This is a non-IMPACT record, meaning that access to the data is not controlled by IMPACT. For access, see the directions below.

Disclaimer:
This Resource is offered and provided outside of the IMPACT mediation framework. IMPACT and the IMPACT Coordination Council/Blackfire Technology, Inc. expressly disclaim all conditions, representations and warranties including but not limited to Resource availability, quality, accuracy, non-infringement, and non-interference. All Resource information and access is controlled by entities and under terms that are external to the IMPACT legal framework.

Summary

DS-0929
VPN-nonVPN dataset
External Dataset
External Data Source
University of New Brunswick
Unknown
07/08/1905
56 (lowest rank is 56)

Category & Restrictions

Other
simulated attacks, network data
Unrestricted
Unknown

Description


The UNB ISCX Network Traffic (VPN-nonVPN) dataset consists of labeled network traffic. Both the full packet captures in pcap format and the flows in csv format (generated by ISCXFlowMeter) are publicly available to researchers.

To generate a representative dataset of real-world traffic in ISCX, we defined a set of tasks, ensuring that our dataset is rich enough in diversity and quantity. We created accounts for users Alice and Bob in order to use services like Skype, Facebook, etc. Below we provide the complete list of traffic types (VoIP, P2P, etc.) and the applications considered in our dataset for each type.

For each traffic type we captured a regular session and a session over VPN, so we have a total of 14 traffic categories: VOIP, VPN-VOIP, P2P, VPN-P2P, etc. We also give a detailed description of the different types of traffic generated:

Browsing: Under this label we have HTTPS traffic generated by users while browsing or performing any task that includes the use of a browser. For instance, when we captured voice calls using Hangouts, even though browsing is not the main activity, we captured several browsing flows.

Email: The traffic samples were generated using a Thunderbird client and Alice's and Bob's Gmail accounts. The clients were configured to deliver mail through SMTP/S, and to receive it using POP3/SSL in one client and IMAP/SSL in the other.

Chat: The chat label identifies instant-messaging applications. Under this label we have Facebook and Hangouts via web browsers, Skype, and AIM and ICQ using an application called Pidgin [14].

Streaming: The streaming label identifies multimedia applications that require a continuous and steady stream of data. We captured traffic from YouTube (HTML5 and Flash versions) and Vimeo services using Chrome and Firefox.

File Transfer: This label identifies traffic applications whose main purpose is to send or receive files and documents. For our dataset we captured Skype file transfers, FTP over SSH (SFTP) and FTP over SSL (FTPS) traffic sessions.

VoIP: The Voice over IP label groups all traffic generated by voice applications. Within this label we captured voice calls using Facebook, Hangouts and Skype.

P2P: This label is used to identify file-sharing protocols like BitTorrent. To generate this traffic we downloaded different .torrent files from a public repository and captured traffic sessions using the uTorrent and Transmission applications.
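
The category count follows from the list above: seven traffic types, each captured in a regular session and in a session over VPN. A minimal sketch of that enumeration is given here; the exact label spellings (for example "VPN-VoIP") are an assumption for illustration and may differ from the labels used in the dataset files.

```python
from itertools import product

# The seven traffic types described in this record.
TRAFFIC_TYPES = ["Browsing", "Email", "Chat", "Streaming",
                 "File Transfer", "VoIP", "P2P"]

# Each type was captured in a regular session and in a session over VPN.
# The "VPN-" prefix spelling is assumed, following "VPN-VOIP" in the text.
SESSION_PREFIXES = ["", "VPN-"]


def traffic_categories():
    """Return one label per (traffic type, session kind) combination."""
    return [prefix + name
            for name, prefix in product(TRAFFIC_TYPES, SESSION_PREFIXES)]


categories = traffic_categories()
print(len(categories))  # 14
```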

The traffic was captured using Wireshark and tcpdump, generating a total of 28 GB of data. For the VPN, we used an external VPN service provider and connected to it using OpenVPN (UDP mode). To generate SFTP and FTPS traffic we also used an external service provider, with FileZilla as the client.

To facilitate the labeling process, all unnecessary services and applications were closed while capturing the traffic. (The only application executed was the objective of the capture, e.g., Skype voice call, SFTP file transfer, etc.) We used a filter to capture only the packets whose source or destination IP address was that of the local client (Alice or Bob).
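
The record does not give the filter expression that was used. As a hedged illustration only, a filter of this kind can be written in pcap filter syntax as `host <address>`, which matches packets that have the address as either source or destination. In the sketch below, the client address, interface name, and output file name are hypothetical placeholders, not values from the dataset.

```python
import ipaddress
from typing import List


def client_capture_filter(client_ip: str) -> str:
    """Build a pcap filter matching packets sent to or from one host."""
    # ip_address raises ValueError if client_ip is not a valid address.
    address = ipaddress.ip_address(client_ip)
    # In pcap filter syntax, "host X" matches X as source or destination.
    return f"host {address}"


def tcpdump_command(interface: str, client_ip: str, out_file: str) -> List[str]:
    """Assemble (but do not run) a tcpdump invocation using that filter."""
    return ["tcpdump", "-i", interface, "-w", out_file,
            client_capture_filter(client_ip)]


ALICE_IP = "192.0.2.10"  # hypothetical address from a documentation range
print(client_capture_filter(ALICE_IP))
print(tcpdump_command("eth0", ALICE_IP, "alice_skype_voip.pcap"))
```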

The full research paper outlining the details of the dataset and its underlying principles:

Gerard Draper-Gil, Arash Habibi Lashkari, Mohammad Mamun, Ali A. Ghorbani, "Characterization of Encrypted and VPN Traffic Using Time-Related Features", in Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP 2016), pages 407-414, Rome, Italy.
ISCXFlowMeter is written in Java. It reads the pcap files and creates the csv file based on selected features.
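
Once the flows are available as csv, a common first step is to count the flows per traffic label. The sketch below assumes a header row and a label column named `label`; both the column names and the sample rows are invented for illustration and are not taken from the ISCXFlowMeter output.

```python
import csv
import io
from collections import Counter


def count_flows_by_label(csv_text: str, label_column: str = "label") -> Counter:
    """Count rows (flows) per value of the label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


# Hypothetical sample: column names and values are made up.
SAMPLE = """duration,total_packets,label
1.52,40,VoIP
0.31,12,VPN-VoIP
2.07,55,VoIP
"""

counts = count_flows_by_label(SAMPLE)
print(counts["VoIP"])      # 2
print(counts["VPN-VoIP"])  # 1
```

For a file on disk, the same function can be applied to the text returned by reading the file, with `label_column` set to whatever the actual header uses.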

For more information contact cic@unb.ca.

The UNB ISCX Network Traffic Dataset content
Traffic: Content
Web Browsing: Firefox and Chrome
Email: SMTPS, POP3S and IMAPS
Chat: ICQ, AIM, Skype, Facebook and Hangouts
Streaming: Vimeo and Youtube
File Transfer: Skype, FTPS and SFTP using Filezilla and an external service
VoIP: Facebook, Skype and Hangouts voice calls (1h duration)
P2P: uTorrent and Transmission (Bittorrent)

Additional Details

N/A
false
Unknown
vpn, dataset, nonvpn, 929, vpn-nonvpn dataset, external, source, external data source, inferlink, inferlink corporation, corporation, traffic, network, generated, unb, iscx, pcap, csv, flows, iscxflowmeter, researchers, publicly, including, labeled, format, consists, packet, skype, captured, label, applications, file, voice, hangouts, browsing, facebook, voip, p2p, sftp, client, bob, transfer, generate, ftps, services, alice, chat, streaming, ssl, service, identifies, files, calls, firefox, receive, session, provider, web, utorrent, accounts, content, vimeo, application, main, types, bittorrent, users, transmission, features, total, youtube, sessions, email, ftp, filezilla, icq, cic, chrome, capture, italy, stream, pidgin, imaps, regular, instant, quantity, categories, 414, transfers, complete, 2nd, multimedia, filter, labeling, mail, facilitate, instance, purpose, gil, security, public, paper, browser, amount, process, performing, unnecessary, mohammad, icissp, lashkari, encrypted, udp, send, habibi, mode, continuous, create, international, representative, selected, mamun, ali, configured, thunderbird, 28gb, created, includes, real, task, principles, gerard, html5, imap, smtp, connected, call, closed, characterization, documents, contact, capturing, details, written, protocols, rome, list, sharing, browsers, downloaded, 1h, systems, ssh, trap2p, objective, wireshark, java, defined, based, require, pop3, flash, ghorbani, conference, activity, executed, outlining, diversity, iam, drapper, destination, description, packets, https, proceedings, tasks, torrent, arash, underlying, samples, messaging, provide, versions, 407, gmail, other, clients, aim, deliver, repository, type, tcpdump, privacy, generating, duration, smpts, steady, openvpn, identify, assuring, rich, reading, pop3s, 2016, detailed, time, called, considered, local