urlscan.io Search API Reference v1
Our Search API and UI allow you to find historical scans of URLs on urlscan.io. This page is a reference for the fields that can be used to query the Search API. See the explanations of field types and visibility below.
Query String Syntax and General Instructions
- Search requests (through the UI or API) are subject to your individual Search API quotas. Make sure to use your API key.
- The query field uses the Elasticsearch Query String syntax to search for results.
- All queries are run in filter mode, sorted by date with the most recent scans first. There is no scoring of search results.
- You can group and concatenate search terms with brackets `( )`, `AND`, `OR`, and `NOT`. The default operator is `AND`.
- You can concatenate terms within a group, e.g., `page.domain:(foo.com OR bar.com)`.
- Always use the field names of the fields you want to search. Wildcards for the field name are not supported! Field names are case sensitive!
- Always escape reserved characters with a backslash: `+ - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /`
- Limit the time range if possible using `date`, e.g., `date:>now-7d` or `date:>now-1y`.
- The date allows relative queries like `date:>now-7d`, range queries like `date:[2020-01-01 TO 2020-02-01]`, or a combination of both.
- You can only use leading wildcard searches and regular expression searches on supported fields, and only as a signed-in user.
- Everything is indexed as lowercase, even if the Search API returns values in a case-preserving manner.
- Regular expressions are always anchored to the beginning and end of the tokens (implicit `^` and `$`). Make sure to prefix/suffix with `.*` to match infix strings.
- Domain fields contain the whole domain and each smaller domain component, e.g., `domain` can be searched by `google.com`, which will find hits for `www.google.com`.
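The escaping and grouping rules above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: `escape_term` and `any_of` are hypothetical helper names, and escaping every single `&` and `|` is a conservative reading of the `&&` and `||` entries in the reserved-character list.

```python
import re

# Reserved characters from the list above. Single "&" and "|" are escaped
# too, which is an assumption (a conservative reading of "&&" and "||").
_RESERVED = re.compile(r'([+\-=&|><!(){}\[\]^"~*?:\\/])')


def escape_term(value: str) -> str:
    """Backslash-escape every reserved character in a literal search value."""
    return _RESERVED.sub(r"\\\1", value)


def any_of(field: str, values: list[str]) -> str:
    """Build a grouped term such as field:(a OR b) from literal values."""
    joined = " OR ".join(escape_term(v) for v in values)
    return f"{field}:({joined})"


# Combine a grouped term with a relative date filter.
# AND is the default operator, but spelling it out keeps the query readable.
query = f"{any_of('page.domain', ['foo.com', 'bar.com'])} AND date:>now-7d"
print(query)
print(escape_term("https://example.com/a?b=1"))
```

Only literal values are passed through `escape_term`. Operators, brackets, and the `field:` prefix are added afterwards so that they keep their special meaning.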
Field Type Legend
The available search fields can have different types, which dictate how they can be searched. It is important to understand the limits of each type to fully utilize our Search API.
- keyword: Field is analyzed as one keyword. Search it with an exact value or a trailing wildcard. Indexed as lowercase.
- keywordRE: Field can be searched by regular expression or leading wildcard. Indexed as lowercase.
- text: Field is analyzed as text, broken into multiple tokens (e.g., split on slash in the URL). Phrase search with quotes possible. Indexed as lowercase.
- date: Field is analyzed as date, allowing range queries and date math, e.g., `date:>now-24h`.
- ip: Search by an IPv4 or IPv6 address, either by using an exact IP or a subnet definition like `ip:8.8.8.8\/24`.
- domain: Search by a domain or parent domain. Searching for www.foobar.com or just foobar.com will both find scans for www.foobar.com.
- integer: Allows searching by exact value, range, or threshold, e.g., `stats.uniqIPs:>5`.
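As a rough sketch of how the typed fields translate into query terms, the helpers below build a date range, an IPv4 subnet term, and an integer threshold. The function names are hypothetical, and the subnet helper is deliberately limited to IPv4 because IPv6 addresses contain colons, which would need additional escaping.

```python
import ipaddress
from datetime import date


def date_range(start: date, end: date) -> str:
    """Absolute range on the date field, e.g. date:[2020-01-01 TO 2020-02-01]."""
    return f"date:[{start.isoformat()} TO {end.isoformat()}]"


def ipv4_subnet(field: str, cidr: str) -> str:
    """Subnet term for an ip-typed field, with the reserved slash escaped."""
    # Validation only: raises ValueError if cidr is not a valid IPv4 network.
    ipaddress.IPv4Network(cidr, strict=False)
    escaped = cidr.replace("/", "\\/")
    return f"{field}:{escaped}"


def greater_than(field: str, threshold: int) -> str:
    """Threshold term for an integer-typed field, e.g. stats.uniqIPs:>5."""
    return f"{field}:>{threshold}"


print(date_range(date(2020, 1, 1), date(2020, 2, 1)))
print(ipv4_subnet("ip", "8.8.8.8/24"))
print(greater_than("stats.uniqIPs", 5))
```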
Searchable Fields
Field Name | Type | Field semantics, features, & notes |
---|---|---|
date | date | Datetime of when the scan was performed |
asn | keyword | Any of the AS numbers that were contacted (e.g., AS123) |
asnname | text | Any of the AS names that were contacted |
asnname.keyword | keyword | Any of the AS names that were contacted (analyzed as keyword) |
country | keyword | ISO 3166-1 2-letter country code of any country that was contacted |
domain | domain | Any domain or subdomain that was contacted |
domain.keyword | keywordRE | Any domain or subdomain that was contacted |
filename | text | Any URL that was requested |
filename.keyword | keywordRE | Any URL that was requested |
hash | keyword | Any SHA256 hash of any HTTP response |
ip | ip | Any IP that was contacted |
server | keyword | Any HTTP "Server" header of subrequests |
page.apexDomain | keyword | Apex domain of the page |
page.asn | keyword | AS number of the website |
page.asnname | text | Name of the main AS of the website |
page.country | keyword | Primary IP GeoIP Country (ISO 3166-1 2-letter country code) |
page.domain | domain | Primary Domain (analyzed as all levels of parent domains) |
page.domain.keyword | keywordRE | Primary domain (analyzed as keyword) |
page.ip | ip | Primary IP |
page.mimeType | keyword | MIME type of the primary HTTP response |
page.ptr | domain | DNS PTR record of primary IP |
page.redirected | keyword | Whether the page was redirected from task.url |
page.server | text | HTTP "Server" header of primary request |
page.status | keyword | HTTP status code of primary request response |
page.title | text | Title of the page |
page.title.keyword | keywordRE | Title of the page |
page.tlsAgeDays | date | Age of TLS certificate when the page was scanned (in days) |
page.tlsIssuer | keyword | Issuer of the page TLS certificate |
page.tlsValidDays | date | TLS certificate validity period in days |
page.tlsValidFrom | date | TLS certificate Valid-From date |
page.umbrellaRank | integer | Cisco Umbrella Top 1 Million rank of page domain |
page.url | text | URL of the primary page (after redirection) |
page.url.keyword | keywordRE | URL of the primary page (after redirection, analyzed as keyword) |
stats.dataLength | integer | Data size of all subresources |
stats.encodedDataLength | integer | Transfer size of all subresources |
stats.requests | integer | Number of subrequests |
stats.uniqCountries | integer | Number of unique countries contacted |
stats.uniqIPs | integer | Number of unique IPs contacted |
task.domain | domain | Domain of the tasked URL |
task.domain.keyword | keywordRE | Domain of the tasked URL (analyzed as keyword) |
task.method | keyword | Can be manual, api, or automatic |
task.source | keyword | Examples: phishtank or certstream-suspicious |
task.tags | keyword | User-defined tags supplied during scan submission |
task.url | text | The original URL that was tasked |
task.url.keyword | keywordRE | The original URL that was tasked (analyzed as keyword) |
task.uuid | keyword | The unique UUID of the scan |
task.visibility | keyword | Can be one of public, unlisted, or private |
user | virtual | Scans submitted by yourself (can only be me) |
team | virtual | Scans submitted by any of your teams (can only be me) |
apikey | virtual | Scans submitted using one of your API keys (can only be me) |
files.sha256 | keyword | SHA256 of file downloaded by the website |
files.filename | keywordRE | Filename of file downloaded by the website |
files.filesize | integer | Filesize of file downloaded by the website |
files.mimeType | keyword | MIME type description of file downloaded by the website |
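To show how fields from the table combine into a single search, here is a hedged sketch that prepares (but does not send) an HTTP request. The endpoint URL, the `q` parameter, and the `API-Key` header are assumptions not stated on this page, so check them against the API documentation before use. `build_search_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; not stated on this page.
SEARCH_URL = "https://urlscan.io/api/v1/search/"


def build_search_request(query: str, api_key: str) -> Request:
    """Prepare a GET request for a search query without sending it."""
    params = urlencode({"q": query})  # URL-encodes ':', spaces, '>' and so on
    # The header name "API-Key" is an assumption as well.
    return Request(f"{SEARCH_URL}?{params}", headers={"API-Key": api_key})


# Field names are case sensitive, so they are copied exactly from the table.
query = "page.domain:example.com AND task.method:api AND stats.uniqIPs:>5"
request = build_search_request(query, api_key="YOUR-API-KEY")
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would execute the search, which counts against your Search API quota.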
urlscan Professional, Enterprise, Ultimate
The following fields can only be searched on the Professional, Enterprise, and Ultimate plans.
Field Name | Type | Field semantics, features, & notes |
---|---|---|
brand.country | keyword | ISO 3166-1 2-letter country code of the brand |
brand.key | keyword | Unique key of the brand |
brand.name | text | Name of the brand |
brand.vertical | text | Industry vertical of the brand, e.g., "banking" |
content.cookieNames | keyword | Names of cookies set by page |
content.globalNames | keyword | Names of non-default JavaScript global variables |
content.inputNames | keyword | Name attributes of input fields on page |
content.inputTypes | keyword | Type attributes of input fields on page |
content.technologies | keyword | Names of technologies detected according to Wappalyzer |
dom.hash | keyword | SHA256 hash of the DOM before truncation |
dom.size | integer | Size of the DOM before truncation |
frames.domains | domain | Domains of frames |
frames.length | integer | Number of frames |
frames.urls | keywordRE | URLs of frames |
links.domains | domain | Domains of outgoing links |
links.length | integer | Number of outgoing links |
links.urls | keywordRE | URLs of outgoing links (to different domains than page.domain) |
scanner.country | keyword | Scanner IP exit location (2-letter country code) |
submitter.country | keyword | GeoIP country of the submitter (2-letter country code) |
text.content | text | Visible text on the website and inline JS (first 20kB) |
text.hash | keyword | SHA256 hash of the text before truncation |
text.size | integer | Size of the text content before truncation |
verdicts.malicious | boolean | Whether the page is considered malicious |
verdicts.score | integer | Maliciousness score of page from -100 (benign) to 100 (malicious) |
verdicts.lastVerdict | date | Date the latest verdict for this scan was added |