
urlscan.io Search API Reference v1

The Search API and UI allow you to find historical scans of URLs on urlscan.io. This page is a reference for the fields that can be used in Search API queries. See the explanations of field types and visibility below.

Query String Syntax and General Instructions

  • Search requests (through the UI or API) are subject to your individual Search API quotas. Make sure to use your API key.
  • The query field uses the Elasticsearch Query String syntax to search for results.
  • All queries are run in filter mode, sorted by date with the most recent scans first. There is no scoring of search results.
  • You can group and concatenate search terms with brackets ( ), AND, OR, and NOT. The default operator is AND.
  • You can concatenate terms within a group, e.g., page.domain:(foo.com OR bar.com).
  • Always use the field names of the fields you want to search. Wildcards for the field name are not supported! Field names are case sensitive!
  • Always escape reserved characters with backslash: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
  • Limit the time range where possible using the date field, e.g., date:>now-7d or date:>now-1y.
  • The date field allows relative queries like date:>now-7d, range queries like date:[2020-01-01 TO 2020-02-01], or a combination of both.
  • You can only use leading wildcard searches and regular expression searches on supported fields, and only as a signed-in user.
  • Everything is indexed as lowercase, even if the Search API returns values in a case-preserving manner.
  • Regular expressions are always anchored to the beginning and end of each token (implicit ^ and $). Prefix and suffix the pattern with .* to match infix strings.
  • Domain fields index the whole domain and each of its parent domains, e.g., searching domain:google.com also finds hits for www.google.com.
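
The escaping, grouping, and date rules above can be sketched in Python. This is a minimal illustration rather than an official client: the helper names escape_term, any_of, and all_of are hypothetical, and the reserved-character set is copied from the bullet above.

```python
import re

# Reserved characters from the list above; "&&" and "||" are matched as pairs
# before the single-character class is tried.
_RESERVED = re.compile(r'(&&|\|\||[+\-=><!(){}\[\]^"~*?:\\/])')


def escape_term(value: str) -> str:
    """Backslash-escape reserved characters in a single search value."""
    return _RESERVED.sub(r"\\\1", value)


def any_of(field: str, values: list[str]) -> str:
    """Group alternatives for one field, e.g. page.domain:(foo.com OR bar.com)."""
    return f"{field}:({' OR '.join(escape_term(v) for v in values)})"


def all_of(*clauses: str) -> str:
    """Join clauses with an explicit AND (AND is also the default operator)."""
    return " AND ".join(clauses)


query = all_of(
    any_of("page.domain", ["foo.com", "bar.com"]),
    "date:>now-7d",  # limit the time range where possible
)
print(query)
print(escape_term("https://example.com/a?b=1"))
```

Field names are passed through unchanged because they are case sensitive and must not contain wildcards; only the values are escaped.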

Field Type Legend

The available search fields can have different types, which dictate how they can be searched. It is important to understand the limits of each type to fully utilize our Search API.

  • keyword: Field is analyzed as one keyword. Search it with an exact value or a trailing wildcard. Indexed as lowercase.
  • keyword RE: Field can be searched by regular expression or leading wildcard. Indexed as lowercase.
  • text: Field is analyzed as text and broken into multiple tokens (e.g., a URL is split on slashes). Phrase search with quotes is possible. Indexed as lowercase.
  • date: Field is analyzed as a date, allowing range queries and date math, e.g., date:>now-24h.
  • ip: Search by an IPv4 or IPv6 address, using either an exact IP or a subnet definition like ip:8.8.8.8\/24.
  • domain: Search by a domain or parent domain. Searching for either www.foobar.com or just foobar.com finds scans for www.foobar.com.
  • integer: Allows searching by exact value, range, or threshold, e.g., stats.uniqIPs:>5.
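
The type-specific query forms in the legend can be illustrated with a few small builders. This is a sketch under assumptions: the function names are hypothetical, the subnet helper handles IPv4 only, the /regex/ delimiters follow the general Elasticsearch query string convention, and the set of regex metacharacters escaped here is an assumption about Lucene-style regular expressions rather than something this page specifies.

```python
import ipaddress
import re
from datetime import date

# Assumed set of regex metacharacters, plus "/" which delimits the pattern.
_REGEX_META = re.compile(r'([.?+*|{}\[\]()"\\#@&<>~/])')


def subnet_clause(field: str, cidr: str) -> str:
    """ip type: validate an IPv4 CIDR string and escape its slash."""
    network = ipaddress.IPv4Network(cidr, strict=False)  # ValueError if malformed
    return f"{field}:{str(network.network_address)}\\/{network.prefixlen}"


def date_range_clause(field: str, start: date, end: date) -> str:
    """date type: absolute range such as date:[2020-01-01 TO 2020-02-01]."""
    return f"{field}:[{start.isoformat()} TO {end.isoformat()}]"


def threshold_clause(field: str, minimum: int) -> str:
    """integer type: greater-than threshold such as stats.uniqIPs:>5."""
    return f"{field}:>{minimum}"


def infix_regex_clause(field: str, literal: str) -> str:
    """keyword RE type: match a literal anywhere inside the token."""
    # Values are indexed as lowercase, so the literal is lowercased first.
    escaped = _REGEX_META.sub(r"\\\1", literal.lower())
    # Regular expressions are implicitly anchored, hence the surrounding .*
    return f"{field}:/.*{escaped}.*/"


print(subnet_clause("ip", "8.8.8.8/24"))
print(date_range_clause("date", date(2020, 1, 1), date(2020, 2, 1)))
print(threshold_clause("stats.uniqIPs", 5))
print(infix_regex_clause("page.url.keyword", "WP-Login.php"))
```

Note that subnet_clause normalizes the host address to the network address of the subnet, so 8.8.8.8/24 becomes 8.8.8.0 followed by the escaped /24.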

Searchable Fields

Field Name               Type        Field semantics, features, & notes
date                     date        Datetime of when the scan was performed
asn                      keyword     Any of the AS numbers that were contacted (e.g., AS123)
asnname                  text        Any of the AS names that were contacted
asnname.keyword          keyword     Any of the AS names that were contacted (analyzed as keyword)
country                  keyword     ISO 3166-1 2-letter country code of any country that was contacted
domain                   domain      Any domain or subdomain that was contacted
domain.keyword           keyword RE  Any domain or subdomain that was contacted
filename                 text        Any URL that was requested
filename.keyword         keyword RE  Any URL that was requested
hash                     keyword     Any SHA256 hash of any HTTP response
ip                       ip          Any IP that was contacted
server                   keyword     Any HTTP "Server" header of subrequests
page.apexDomain          keyword     Apex domain of the page
page.asn                 keyword     AS number of the website
page.asnname             text        Name of the main AS of the website
page.country             keyword     Primary IP GeoIP Country (ISO 3166-1 2-letter country code)
page.domain              domain      Primary Domain (analyzed as all levels of parent domains)
page.domain.keyword      keyword RE  Primary domain (analyzed as keyword)
page.ip                  ip          Primary IP
page.mimeType            keyword     MIME type of the primary HTTP response
page.ptr                 domain      DNS PTR record of primary IP
page.redirected          keyword     Whether the page was redirected from task.url
page.server              text        HTTP "Server" header of primary request
page.status              keyword     HTTP status code of primary request response
page.title               text        Title of the page
page.title.keyword       keyword RE  Title of the page
page.tlsAgeDays          date        Age of TLS certificate when the page was scanned (in days)
page.tlsIssuer           keyword     Issuer of the page TLS certificate
page.tlsValidDays        date        TLS certificate validity period in days
page.tlsValidFrom        date        TLS certificate Valid-From date
page.umbrellaRank        integer     Cisco Umbrella Top 1 Million rank of page domain
page.url                 text        URL of the primary page (after redirection)
page.url.keyword         keyword RE  URL of the primary page (after redirection, analyzed as keyword)
stats.dataLength         integer     Data size of all subresources
stats.encodedDataLength  integer     Transfer size of all subresources
stats.requests           integer     Number of subrequests
stats.uniqCountries      integer     Number of unique countries contacted
stats.uniqIPs            integer     Number of unique IPs contacted
task.domain              domain      Domain of the tasked URL
task.domain.keyword      keyword RE  Domain of the tasked URL (analyzed as keyword)
task.method              keyword     Can be manual, api, or automatic
task.source              keyword     Examples: phishtank or certstream-suspicious
task.tags                keyword     User-defined tags supplied during scan submission
task.url                 text        The original URL that was tasked
task.url.keyword         keyword RE  The original URL that was tasked (analyzed as keyword)
task.uuid                keyword     The unique UUID of the scan
task.visibility          keyword     Can be one of public, unlisted, or private
user                     virtual     Scans submitted by yourself (Can only be me)
team                     virtual     Scans submitted by any of your teams (Can only be me)
apikey                   virtual     Scans submitted using one of your API keys (Can only be me)
files.sha256             keyword     SHA256 of file downloaded by the website
files.filename           keyword RE  Filename of file downloaded by the website
files.filesize           integer     Filesize of file downloaded by the website
files.mimeType           keyword     MIME type description of file downloaded by the website
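
As a rough sketch of how fields from this table might be combined into an API call, the Python below prepares (but does not send) a search request. The endpoint URL, the q and size parameters, and the API-Key header name are assumptions not stated on this page, and build_search_request is a hypothetical helper; check them against the API documentation before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; not given in this reference.
SEARCH_ENDPOINT = "https://urlscan.io/api/v1/search/"


def build_search_request(query: str, api_key: str, size: int = 100) -> Request:
    """Prepare a GET request carrying the query string and the API key."""
    params = urlencode({"q": query, "size": size})  # percent-encodes ':' and '>'
    # "API-Key" is an assumed header name for authenticating the request.
    return Request(f"{SEARCH_ENDPOINT}?{params}", headers={"API-Key": api_key})


# Combine a domain field, a keyword field, an integer field, and a date limit.
query = (
    "page.domain:example.com AND task.method:api "
    "AND stats.requests:>50 AND date:>now-7d"
)
request = build_search_request(query, api_key="YOUR-API-KEY")
print(request.full_url)
# To send it, pass `request` to urllib.request.urlopen() and decode the JSON body.
```

Building the request separately from sending it keeps the query construction easy to inspect and test without spending any of your Search API quota.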

urlscan Professional, Enterprise, Ultimate

The following fields can only be searched on the Professional, Enterprise, and Ultimate plans.

Field Name               Type        Field semantics, features, & notes
brand.country            keyword     ISO 3166-1 2-letter country code of the brand
brand.key                keyword     Unique key of the brand
brand.name               text        Name of the brand
brand.vertical           text        Industry vertical of the brand, e.g., "banking"
content.cookieNames      keyword     Names of cookies set by page
content.globalNames      keyword     Names of non-default JavaScript global variables
content.inputNames       keyword     Name attributes of input fields on page
content.inputTypes       keyword     Type attributes of input fields on page
content.technologies     keyword     Names of technologies detected according to Wappalyzer
dom.hash                 keyword     SHA256 hash of the DOM before truncation
dom.size                 integer     Size of the DOM before truncation
frames.domains           domain      Domains of frames
frames.length            integer     Number of frames
frames.urls              keyword RE  URLs of frames
links.domains            domain      Domains of outgoing links
links.length             integer     Number of outgoing links
links.urls               keyword RE  URLs of outgoing links (to different domains than page.domain)
scanner.country          keyword     Scanner IP exit location (2-letter country code)
submitter.country        keyword     GeoIP country of the submitter (2-letter country code)
text.content             text        Visible text on the website and inline JS (first 20kB)
text.hash                keyword     SHA256 hash of the text before truncation
text.size                integer     Size of the text content before truncation
verdicts.malicious       boolean     Whether the page is considered malicious
verdicts.score           integer     Maliciousness score of page from -100 (benign) to 100 (malicious)
verdicts.lastVerdict     date        Date the latest verdict for this scan was added
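
Because these fields are plan-restricted, a client might check which fields of a planned query need a paid plan before sending it. The following is a minimal sketch: PAID_PLAN_FIELDS simply mirrors the table above, and paid_plan_fields is a hypothetical helper. The lookup is exact-match, since field names are case sensitive.

```python
# Field names copied from the Professional/Enterprise/Ultimate table above.
PAID_PLAN_FIELDS = frozenset({
    "brand.country", "brand.key", "brand.name", "brand.vertical",
    "content.cookieNames", "content.globalNames", "content.inputNames",
    "content.inputTypes", "content.technologies",
    "dom.hash", "dom.size",
    "frames.domains", "frames.length", "frames.urls",
    "links.domains", "links.length", "links.urls",
    "scanner.country", "submitter.country",
    "text.content", "text.hash", "text.size",
    "verdicts.malicious", "verdicts.score", "verdicts.lastVerdict",
})


def paid_plan_fields(fields: list[str]) -> list[str]:
    """Return the fields (sorted) that appear in the plan-restricted table."""
    return sorted(f for f in fields if f in PAID_PLAN_FIELDS)


wanted = ["page.domain", "verdicts.score", "Verdicts.Score", "text.content"]
print(paid_plan_fields(wanted))
```

In this example "Verdicts.Score" is not reported because it differs in case from the listed field name and so would not be a valid search field at all.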