Custom logic that runs after every successful response belongs in the onResponse function.
Use $ and $$ to select specific DOM elements.
While similar to jQuery or Cheerio, these methods are standard JavaScript that work if you paste them in your browser console.
$
The $ function is shorthand for document.querySelector(), and is available when the response content-type starts with text/html.
$$
The $$ function is shorthand for Array.from(document.querySelectorAll()), and is available when the response content-type starts with text/html.
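Because both helpers are described as shorthand for standard DOM calls, their behavior can be approximated outside Crawlspace. The sketch below is an assumption about equivalent behavior, not Crawlspace's source. `makeSelectors` is a hypothetical helper name; it takes a document-like object so the same code can be tried in a browser console by passing `document`.

```javascript
// Hypothetical helper: builds $ and $$ from any object that exposes
// querySelector and querySelectorAll (for example, the browser's `document`).
function makeSelectors(doc) {
  // $ returns the first matching element, or null when nothing matches.
  const $ = (selector) => doc.querySelector(selector);
  // $$ returns a real array, so map and filter work without conversion.
  const $$ = (selector) => Array.from(doc.querySelectorAll(selector));
  return { $, $$ };
}

// In a browser console:
//   const { $, $$ } = makeSelectors(document);
//   const title = $("h1")?.textContent;
//   const links = $$("a[href]").map((a) => a.href);
```

The practical difference is the return type: `$` yields a single element or `null`, while `$$` always yields an array, possibly empty.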
ai.extract
meta/llama-3.1-8b: Small inference
meta/llama-3.3-70b: XL inference
ai.summarize
ai.sentiment
Returns POSITIVE or NEGATIVE.
This method is very fast and inexpensive.
ai.embed
baai/bge-small-en-v1.5: Small embedding
baai/bge-base-en-v1.5: Base embedding
baai/bge-large-en-v1.5: Large embedding
enqueue
method.
This synchronous function appends requests to the queue.
You can call enqueue() multiple times in your response handler.
Requests are automatically normalized and deduplicated based on URL.
You can pass in URLs (strings), request objects, or HTMLElements.
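To make the normalization and deduplication behavior concrete, here is a minimal sketch of a queue keyed by URL. It is not Crawlspace's implementation: `createQueue`, `normalize`, and `pending` are hypothetical names, dropping the URL fragment is an assumed normalization rule, and HTMLElement inputs are left out so the example runs outside a browser.

```javascript
// Illustrative only: a tiny queue that normalizes and deduplicates by URL.
// This is not Crawlspace's implementation; all names here are hypothetical.
function createQueue() {
  const seen = new Set();
  const pending = [];

  function normalize(input) {
    // Accept a URL string or a request-like object with a `url` property.
    const raw = typeof input === "string" ? input : input.url;
    const url = new URL(raw); // lowercases the host and drops default ports
    url.hash = ""; // assumption: fragments do not identify distinct pages
    return url.href;
  }

  // Synchronously appends a request unless its normalized URL was seen before.
  function enqueue(input) {
    const key = normalize(input);
    if (seen.has(key)) return false; // duplicate, skip
    seen.add(key);
    pending.push(key);
    return true;
  }

  return { enqueue, pending };
}
```

With this sketch, calling `enqueue` with `https://Example.com/a#top` and then with `{ url: "https://example.com:443/a" }` leaves a single entry in `pending`, because both inputs normalize to the same URL.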
enqueue
response
Avoid calling response.text() or response.json(). Use the html, json, and xml properties instead.
html
The html string is available when the response content-type starts with text/html.
json
The json object is available when the response content-type starts with application/json. Use this object instead of awaiting response.json().
xml
The xml object is available when the response content-type starts with application/xml. Use this object instead of awaiting response.text().
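The content-type rules for html, json, and xml can be summarized as a prefix check. The function below is a hedged sketch of that mapping; `bodyPropertyFor` is a hypothetical name, and lowercasing the header value before matching is an assumption rather than documented behavior.

```javascript
// Hypothetical helper mirroring the content-type rules described above.
// Returns the name of the property expected to be populated, or null.
function bodyPropertyFor(contentType) {
  const type = (contentType || "").toLowerCase();
  if (type.startsWith("text/html")) return "html";
  if (type.startsWith("application/json")) return "json";
  if (type.startsWith("application/xml")) return "xml";
  return null; // none of the three properties applies
}
```

A prefix match means that parameters such as `; charset=utf-8` after the media type do not affect which property is available.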
getMarkdown
env
Variables in your .env file are available inside the env object.
Env vars must begin with SECRET_ to be included in your crawler’s deployment.
All env vars are considered secret and are not stored on Crawlspace’s servers.
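The SECRET_ prefix rule amounts to filtering variables by name. The sketch below only illustrates that idea and is not Crawlspace's deployment code; `pickDeployableEnv` and the sample variable names are hypothetical.

```javascript
// Hypothetical sketch of prefix-based filtering; not Crawlspace's deploy code.
// Keeps only the variables whose names start with the given prefix.
function pickDeployableEnv(vars, prefix = "SECRET_") {
  return Object.fromEntries(
    Object.entries(vars).filter(([name]) => name.startsWith(prefix))
  );
}

// Example with made-up variables: only SECRET_API_KEY would be kept.
const exampleEnv = pickDeployableEnv({ SECRET_API_KEY: "abc123", DEBUG: "1" });
```

Under this rule, a variable such as `DEBUG` defined in the .env file would not be included in the deployment.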
request
With request, you have access to the properties of the request that was made.
z