This is a non-IMPACT record, meaning that access to the data is not controlled by IMPACT. For access, see the directions below.

This Resource is offered and provided outside of the IMPACT mediation framework. IMPACT and the IMPACT Coordination Council/Blackfire Technology, Inc. expressly disclaim all conditions, representations and warranties including but not limited to Resource availability, quality, accuracy, non-infringement, and non-interference. All Resource information and access is controlled by entities and under terms that are external to the IMPACT legal framework.


HTTPS Ecosystem Scans
External Dataset
External Data Source
Internet-Wide Scan Data Repository
52 (lowest rank is 52)

Category & Restrictions

address space status data


Regular and continuing scans of the HTTPS Ecosystem from 2012 and 2013 including parsed and raw X.509 certificates, temporal state of scanned hosts, and the raw ZMap output of scans on port 443. The dataset contains approximately 43 million unique certificates from 108 million hosts collected via 100+ scans.

This dataset is composed of four parts: parsed certificates, raw certificates, individual scans (the status of each responsive host in a single complete scan of the IPv4 address space), and raw ZMap output of TCP SYN scans on port 443. While we have split these into individual parts, the data is optimized for use in a relational database such as PostgreSQL or MySQL.

The files certificates.csv.gz, public_keys.csv.gz, and extraneous_extensions.csv.gz contain parsed data from all certificates we have encountered over the course of our scanning. The certificates relation contains all common data found in a certificate (e.g., subject and issuer). The relation is keyed on "id" and is also unique on the SHA-1 fingerprint. The issuer_id attribute is a self-referential attribute that points back to the parent certificate's id.

Certificates are validated using OpenSSL and recently downloaded root stores. We attempt to validate each certificate against the browser store along with any previously seen intermediate certificates in order to account for missing certificate chains. The result of this validation is represented in the is-*-trusted attributes. We also check each certificate for other issues (e.g., expiration or an invalid signature), independent of the trust chain; these results are stored in the is-valid and validation-error attributes.

The keys relation contains unique parsed RSA and DSA keys and is linked to by certificates.public_key_id. Other types of keys are noted in the certificates relation but are not parsed further. All other non-binary X.509 extensions are stored in the extraneous extensions relation.

The scan files we provide contain data about every host that completed a successful TLS handshake on port 443 during a single comprehensive scan of the IPv4 address space. For each host we include the host IP address, the certificate ID, the SHA-1 fingerprint of the certificate, and the timestamp at which the TLS handshake was completed.
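The self-referential issuer_id attribute means a certificate's chain can be followed by repeatedly looking up the parent's id. The following is a minimal sketch of that walk, not the authors' tooling. It assumes the certificates relation has already been loaded into a dictionary keyed on id, and that a root either references itself or has an empty issuer_id; how roots are actually encoded should be checked against schema.txt. The function name issuer_chain and the sample rows are hypothetical.

```python
def issuer_chain(certs, cert_id, max_depth=32):
    """Return certificate ids from cert_id upward by following issuer_id.

    certs is assumed to map id -> row, where each row is a dict that has
    an "issuer_id" entry (None when no parent is recorded).
    """
    chain = []
    seen = set()
    current = cert_id
    # Stop on a missing parent, a repeated id (self-signed root or cycle),
    # or when the hypothetical max_depth safety limit is reached.
    while current is not None and current not in seen and len(chain) < max_depth:
        row = certs.get(current)
        if row is None:
            break
        chain.append(current)
        seen.add(current)
        current = row["issuer_id"]
    return chain


# Made-up rows for illustration only: 3 is a leaf, 2 an intermediate,
# 1 a root that references itself, 4 a certificate with no known parent.
sample_certs = {
    1: {"issuer_id": 1},
    2: {"issuer_id": 1},
    3: {"issuer_id": 2},
    4: {"issuer_id": None},
}
leaf_chain = issuer_chain(sample_certs, 3)
```

In a relational database the same walk would normally be written as a recursive query over the certificates relation; the dictionary version is used here only to keep the sketch self-contained.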
The data originates from a PostgreSQL 9.2 database, whose schema is available in schema.txt; we recommend PostgreSQL for hosting this dataset. Strings are delimited with double quotes, and newlines are replaced with \n. Information about specific fields can be found in schema.txt.
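For readers who want to inspect the files outside a database, the quoting convention described above can be handled with Python's standard csv and gzip modules. This is a sketch under stated assumptions: the field separator is taken to be a comma (implied by the .csv extension), the text encoding is assumed to be UTF-8, and every literal \n inside a field is treated as an encoded newline. The helper names read_rows and read_table and the sample rows are hypothetical; column order and the presence of a header row should be taken from schema.txt.

```python
import csv
import gzip
import io


def read_rows(text_stream):
    """Yield each CSV row as a list of strings, restoring encoded newlines."""
    reader = csv.reader(text_stream, delimiter=",", quotechar='"')
    for row in reader:
        # The files store newlines as the two characters backslash + n.
        yield [field.replace("\\n", "\n") for field in row]


def read_table(path):
    """Stream rows from a gzip-compressed CSV file such as certificates.csv.gz."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        yield from read_rows(handle)


# Made-up two-column sample; the real column layout is defined in schema.txt.
sample = io.StringIO('1,"CN=example.org\\nO=Example"\n2,"CN=test"\n')
rows = list(read_rows(sample))
```

Streaming rows with a generator avoids loading a whole file into memory, which matters for tables of this size.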
