To request access to this dataset, you will need to log in with an IMPACT account. Accounts are free. If you don't have one, please register.
This dataset is no longer available and has a current status of 'Withdrawn'.
Please see the catalog for a listing of currently available datasets.

Summary

DS-0520
GT Malware Passive DNS Data Daily Feed
Dataset
Georgia Tech
Georgia Tech
07/01/2015
Data collection is ongoing
1 (lowest rank is 56)

Category & Restrictions

DNS Data
dns data, malware, threat intelligence
Quasi-Restricted
true

Description


GT Malware Passive DNS Data Daily Feed

This dataset contains a daily feed of passive DNS data produced by the Georgia Tech Information Security Center's malware analysis system. It is produced by executing suspect Windows executables in a sterile, isolated environment, with limited access to the Internet, for a short period of time. Each sample's use of the DNS is recorded and made available in both raw (packet capture, or PCAP) and plaintext formats. The plaintext format, which contains a subset of information present in the PCAP files, is represented as a series of CSV files named according to the date on which a given set of executables was processed. Each file comprises a series of 3-tuples that provide the executable's MD5 hash, the qname (domain name) of the DNS query, and (if the query was of type A) a resolution IP address for the domain name. Note that in the plaintext format, for a given MD5 and qname there is at most one resolution IP address provided, even if the query resulted in a response record set that contains multiple resolution addresses.
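
As a rough illustration of the plaintext format described above, the following Python sketch reads 3-tuples of MD5, qname, and optional resolution IP address. It is not part of the dataset or its tooling. It assumes comma-delimited rows with no header line and an empty or missing third field when no address was recorded; none of these details are confirmed by the description. The names `DnsRecord`, `parse_feed`, and `domains_by_sample` are hypothetical.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, Iterator, NamedTuple, Optional, Set


class DnsRecord(NamedTuple):
    """One 3-tuple from a daily CSV file (hypothetical helper type)."""
    md5: str            # MD5 hash of the executable
    qname: str          # domain name that was queried
    ip: Optional[str]   # resolution address, or None if none was recorded


def parse_feed(lines: Iterable[str]) -> Iterator[DnsRecord]:
    """Yield records from CSV text lines.

    Assumption: columns appear in the order MD5, qname, IP, with no header.
    """
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        md5 = row[0].strip().lower()
        qname = row[1].strip()
        # Assumption: a missing or empty third field means "no address".
        ip = row[2].strip() if len(row) > 2 and row[2].strip() else None
        yield DnsRecord(md5, qname, ip)


def domains_by_sample(records: Iterable[DnsRecord]) -> Dict[str, Set[str]]:
    """Group the queried domain names by executable MD5."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for record in records:
        grouped[record.md5].add(record.qname)
    return dict(grouped)
```

To read a real file, an open file handle can be passed directly, for example `parse_feed(open(path, newline=""))`. Because the plaintext format carries at most one address per MD5 and qname pair, the PCAP files remain the place to look for complete response record sets.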

This dataset is structured as a set of archives, each corresponding to a single day of DNS data collected during sample processing. Each archive decompresses to a top-level folder containing a CSV file (the plaintext format) and a PCAP subdirectory (the raw format) for that day. The contents of the CSV file are sorted by executable MD5, qname, and resolution IP address. The PCAP subdirectory contains a set of PCAP files, each named according to the MD5 of the sample that generated the DNS traffic it contains.

Measurement and data collection for this dataset are ongoing, so the data grows continuously. Researchers who are granted access will be able to download updates for a period of one year after their request.
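
The archive layout described above (a top-level folder holding one CSV file and a PCAP subdirectory with files named by sample MD5) suggests a simple lookup from a hash to its raw capture. The sketch below is one way to do that on an already decompressed day folder. The exact subdirectory name and the PCAP file extension are not stated in the description, so the code searches the whole folder for any non-CSV file whose name starts with the hash. `find_csv` and `find_pcaps` are hypothetical helper names.

```python
from pathlib import Path
from typing import List, Optional


def find_csv(day_dir: Path) -> Optional[Path]:
    """Return the CSV file in the top-level day folder, if one exists."""
    matches = sorted(
        p for p in day_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )
    return matches[0] if matches else None


def find_pcaps(day_dir: Path, md5: str) -> List[Path]:
    """Return capture files under day_dir whose name starts with the MD5.

    Assumption: files are named "<md5>" or "<md5>.<extension>"; the real
    naming scheme may differ in case or extension.
    """
    wanted = md5.strip().lower()
    return sorted(
        p for p in day_dir.rglob("*")
        if p.is_file()
        and p.suffix.lower() != ".csv"
        and p.name.lower().startswith(wanted)
    )
```

Searching recursively rather than hard-coding a subdirectory name keeps the sketch independent of details the description leaves open, at the cost of a slower scan on large day folders.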

Additional Details

N/A
Size is growing as more data is collected
false
true