Canonical Firecrawl Java source of truth for agents. Generated from SDK source and the v2 OpenAPI spec.

Install

Maven:
<dependency>
  <groupId>com.firecrawl</groupId>
  <artifactId>firecrawl-java</artifactId>
  <version>1.2.0</version>
</dependency>
Gradle:
implementation("com.firecrawl:firecrawl-java:1.2.0")

Authenticate

import com.firecrawl.client.FirecrawlClient;

FirecrawlClient client = FirecrawlClient.builder()
    .apiKey(System.getenv("FIRECRAWL_API_KEY"))
    .build();

When To Use What

  • search: use when you start with a query and need discovery.
  • scrape: use when you already have a URL and want page content.
  • interact: use when the page needs clicks, forms, or post-scrape browser actions.
  • support/ask: use when a Firecrawl API call fails or returns unexpected results and you need a diagnosis.
  • support/docs-search: use when you need to look up Firecrawl documentation.

Search

Why use it

Use search to discover relevant pages from a query, then pick URLs to scrape or interact with. You can constrain results to a site with site:, for example site:docs.firecrawl.dev crawl webhooks.
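
As a rough illustration of the site: pattern described above, the sketch below builds a site-restricted query string before it is passed to client.search. SiteQuery and forSite are hypothetical names introduced here; they are not part of the Firecrawl SDK.

```java
import java.util.Objects;

/** Hypothetical helper (not part of the Firecrawl SDK): builds a site-restricted query string. */
final class SiteQuery {
    private SiteQuery() {}

    /** Prefixes the search terms with a site: operator for the given domain. */
    static String forSite(String domain, String terms) {
        Objects.requireNonNull(domain, "domain");
        Objects.requireNonNull(terms, "terms");
        String d = domain.trim();
        if (d.isEmpty()) {
            throw new IllegalArgumentException("domain must not be blank");
        }
        String t = terms.trim();
        // With no terms, the query is just the site: operator.
        return t.isEmpty() ? "site:" + d : "site:" + d + " " + t;
    }

    public static void main(String[] args) {
        // Prints: site:docs.firecrawl.dev crawl webhooks
        System.out.println(forSite("docs.firecrawl.dev", "crawl webhooks"));
    }
}
```

The resulting string can be passed as the query argument, for example client.search(SiteQuery.forSite("docs.firecrawl.dev", "crawl webhooks")).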

Preferred SDK methods

  • client.search(query)
  • client.search(query, options)

Return value

search returns SearchData. Read result buckets with getWeb(), getNews(), and getImages() (each is List<Map<String, Object>> and may be null). Do not treat SearchData as a directly iterable list of hits.

Simple Example

import com.firecrawl.models.SearchData;
import java.util.List;
import java.util.Map;

SearchData results = client.search("site:docs.firecrawl.dev webhook retries");
List<Map<String, Object>> web = results.getWeb();
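
Because each bucket may be null, it can help to read it defensively. The following is a minimal sketch: it treats a null bucket as empty and collects one string field from each hit. SearchBuckets and strings are hypothetical names, not SDK methods, and the "url" key is an assumption about the shape of a web hit, so check the keys in your own responses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of the Firecrawl SDK) for reading one result bucket safely. */
final class SearchBuckets {
    private SearchBuckets() {}

    /**
     * Collects the string values stored under the given key across all hits.
     * A null bucket yields an empty list; hits without a string value for the key are skipped.
     */
    static List<String> strings(List<Map<String, Object>> bucket, String key) {
        List<String> out = new ArrayList<>();
        if (bucket == null) {
            return out;
        }
        for (Map<String, Object> hit : bucket) {
            if (hit == null) {
                continue;
            }
            Object value = hit.get(key);
            if (value instanceof String) {
                out.add((String) value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in data shaped like results.getWeb(); the "url" key is an assumption.
        List<Map<String, Object>> web = List.of(Map.of("url", "https://docs.firecrawl.dev"));
        System.out.println(strings(web, "url"));  // [https://docs.firecrawl.dev]
        System.out.println(strings(null, "url")); // []
    }
}
```

With the SDK, the same helper would be called as SearchBuckets.strings(results.getWeb(), "url").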

Complex Example

import com.firecrawl.models.SearchOptions;
import com.firecrawl.models.ScrapeOptions;
import com.firecrawl.models.JsonFormat;
import com.firecrawl.models.LocationConfig;

SearchOptions options = SearchOptions.builder()
    .sources(List.of("web", "news"))
    .categories(List.of("research"))
    .limit(10)
    .tbs("qdr:m")
    .location("San Francisco,California,United States")
    .ignoreInvalidURLs(true)
    .timeout(60000)
    .scrapeOptions(
        ScrapeOptions.builder()
            .formats(List.of(
                "markdown",
                "links",
                JsonFormat.builder().prompt("Extract title and key endpoints.").build()
            ))
            .onlyMainContent(true)
            .waitFor(1000)
            .build()
    )
    .build();

SearchData results = client.search("site:docs.firecrawl.dev crawl webhooks", options);

Parameters

  • query
    • Type: String
    • Use when: always (required); this is the text to search for.
    • Notes: use site:example.com to limit results to a domain.
  • options.sources
    • Type: List of source strings or maps
    • Use when: you want to control which sources are searched.
    • Confirmed values:
      • "web": web index results
      • "news": news results
      • "images": image results
      • {type: "web" | "news" | "images"}: typed source map form
  • options.categories
    • Type: List of category strings or maps
    • Use when: you want to filter results by category.
    • Confirmed values:
      • "github": GitHub-focused results
      • "research": research and academic results
      • "pdf": PDF-focused results
      • {type: "github" | "research" | "pdf"}: typed category map form
  • options.limit
    • Type: Integer
    • Use when: you want to cap results.
  • options.tbs
    • Type: String
    • Use when: you need a time-based filter (for example qdr:d, qdr:w, sbd:1,qdr:m).
  • options.location
    • Type: String
    • Use when: you want localized results.
  • options.ignoreInvalidURLs
    • Type: Boolean
    • Use when: you want to exclude result URLs that other Firecrawl endpoints cannot scrape.
  • options.timeout
    • Type: Integer
    • Use when: you need a request timeout in milliseconds.
  • options.scrapeOptions
    • Type: ScrapeOptions
    • Use when: you want to scrape each search result (see Scrape parameters for fields).
  • options.integration
    • Type: String
    • Use when: the API expects an integration identifier on the request.

Scrape

Why use it

Use scrape when you already have a URL and want structured content in one or more formats.

Preferred SDK methods

  • client.scrape(url)
  • client.scrape(url, options)

Return value

scrape returns Document. Typical getters include getMarkdown(), getHtml(), getRawHtml(), getJson(), getMetadata(), getLinks(), getAudio(), getVideo(), and additional fields when the corresponding formats are requested.

Simple Example

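import com.firecrawl.models.Document;
import com.firecrawl.models.ScrapeOptions;
import java.util.List;

Document doc = client.scrape(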
    "https://docs.firecrawl.dev",
    ScrapeOptions.builder().formats(List.of("markdown")).build()
);

Complex Example

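import com.firecrawl.models.Document;
import com.firecrawl.models.ScrapeOptions;
import com.firecrawl.models.JsonFormat;
import com.firecrawl.models.LocationConfig;
import java.util.List;
import java.util.Map;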

List<Map<String, Object>> actions = List.of(
    Map.of("type", "click", "selector", "#accept"),
    Map.of("type", "wait", "milliseconds", 750),
    Map.of("type", "scrape")
);

LocationConfig location = LocationConfig.builder()
    .country("US")
    .languages(List.of("en-US"))
    .build();

ScrapeOptions options = ScrapeOptions.builder()
    .formats(List.of(
        "markdown",
        "links",
        JsonFormat.builder().prompt("Extract plan names and prices.").build(),
        Map.of("type", "screenshot", "fullPage", true, "quality", 80)
    ))
    .onlyMainContent(true)
    .waitFor(1000)
    .parsers(List.of(Map.of("type", "pdf", "maxPages", 5)))
    .actions(actions)
    .location(location)
    .removeBase64Images(true)
    .blockAds(true)
    .proxy("auto")
    .maxAge(86400000L)
    .storeInCache(true)
    .build();

Document doc = client.scrape("https://example.com/pricing", options);
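
The maxAge value in the example above is in milliseconds, and 86400000L corresponds to one day. As a small sketch using only java.time, the number can be derived from a Duration rather than hard-coded. CacheAge is a hypothetical helper name, not part of the SDK.

```java
import java.time.Duration;

/** Hypothetical helper (not part of the Firecrawl SDK): converts a Duration to a maxAge value. */
final class CacheAge {
    private CacheAge() {}

    /** Returns the duration in milliseconds, rejecting negative durations. */
    static long millis(Duration maxAge) {
        if (maxAge.isNegative()) {
            throw new IllegalArgumentException("maxAge must not be negative");
        }
        return maxAge.toMillis();
    }

    public static void main(String[] args) {
        System.out.println(millis(Duration.ofDays(1))); // 86400000
    }
}
```

The result could then be passed to the builder, for example .maxAge(CacheAge.millis(Duration.ofDays(1))).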

Parameters

  • url
    • Type: String
    • Use when: you want to scrape a specific page.
  • options.formats
    • Type: List of format strings or format maps
    • Use when: you want to choose one or more output formats.
    • Confirmed format strings:
      • "markdown": markdown content
      • "html": cleaned HTML
      • "rawHtml": raw HTML
      • "links": page links
      • "images": image URLs
      • "screenshot": screenshot output
      • "summary": summary output
      • "changeTracking": change tracking output
      • "json": JSON extraction
      • "attributes": attribute extraction
      • "branding": branding profile output
      • "audio": audio extraction
      • "video": video extraction
    • Format object fields:
      • type: one of the format strings above
      • prompt, schema: JSON extraction options for type: "json"
      • modes, schema, prompt, tag: change tracking options for type: "changeTracking"
      • fullPage, quality, viewport: screenshot options for type: "screenshot"
      • selectors: array of {selector, attribute} for type: "attributes"
  • options.headers
    • Type: Map<String, String>
    • Use when: you need custom request headers.
  • options.includeTags
    • Type: List<String>
    • Use when: you want to include only specific HTML tags.
  • options.excludeTags
    • Type: List<String>
    • Use when: you want to exclude specific HTML tags.
  • options.onlyMainContent
    • Type: Boolean
    • Use when: you want to strip nav, footer, and other boilerplate.
  • options.timeout
    • Type: Integer
    • Use when: you need a timeout in milliseconds.
  • options.waitFor
    • Type: Integer
    • Use when: you need to wait for the page to render (milliseconds).
  • options.mobile
    • Type: Boolean
    • Use when: you want a mobile viewport.
  • options.parsers
    • Type: List<Object>
    • Use when: you need file parsing controls.
    • Confirmed values:
      • "pdf"
      • {type: "pdf", maxPages: number}
  • options.actions
    • Type: List<Map<String, Object>>
    • Use when: you need lightweight pre-scrape actions.
    • Confirmed action types:
      • wait: milliseconds or selector required
      • screenshot: fullPage, quality, viewport optional
      • click: selector required
      • write: text required (click to focus the input first)
      • press: key required
      • scroll: direction (up or down) required, selector optional
      • scrape: no additional fields
      • executeJavascript: script required
      • pdf: format (A0, A1, A2, A3, A4, A5, A6, Letter, Legal, Tabloid, Ledger), landscape, scale optional
  • options.location
    • Type: LocationConfig
    • Use when: you need geo or language-aware scraping.
  • options.skipTlsVerification
    • Type: Boolean
    • Use when: you need to skip TLS verification.
  • options.removeBase64Images
    • Type: Boolean
    • Use when: you want to drop base64 images from markdown output.
  • options.blockAds
    • Type: Boolean
    • Use when: you want ad and cookie popup blocking.
  • options.proxy
    • Type: String
    • Use when: you need proxy control.
    • Confirmed values: "basic", "stealth", "enhanced", "auto", or a custom proxy URL string
  • options.maxAge
    • Type: Long
    • Use when: you will accept a cached copy no older than this maximum age (milliseconds).
  • options.storeInCache
    • Type: Boolean
    • Use when: you want Firecrawl to cache the result.
  • options.integration
    • Type: String
    • Use when: the API expects an integration identifier on the request.
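
Since options.actions takes plain maps, a typo in a field name is not caught at compile time. The sketch below is one possible pre-flight check based only on the required fields listed under options.actions above. ActionCheck and problem are hypothetical names, not part of the SDK, and the API may apply further validation that this sketch does not cover.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check (not part of the Firecrawl SDK) for options.actions entries. */
final class ActionCheck {
    private ActionCheck() {}

    /** Returns null when the action looks valid, otherwise a short description of the problem. */
    static String problem(Map<String, Object> action) {
        Object type = action.get("type");
        if (!(type instanceof String)) {
            return "missing type";
        }
        switch ((String) type) {
            case "wait":
                return action.containsKey("milliseconds") || action.containsKey("selector")
                        ? null : "wait needs milliseconds or selector";
            case "click":
                return action.containsKey("selector") ? null : "click needs selector";
            case "write":
                return action.containsKey("text") ? null : "write needs text";
            case "press":
                return action.containsKey("key") ? null : "press needs key";
            case "scroll":
                Object direction = action.get("direction");
                return "up".equals(direction) || "down".equals(direction)
                        ? null : "scroll needs direction up or down";
            case "executeJavascript":
                return action.containsKey("script") ? null : "executeJavascript needs script";
            case "screenshot":
            case "scrape":
            case "pdf":
                return null; // these types have no required fields
            default:
                return "unknown action type: " + type;
        }
    }

    public static void main(String[] args) {
        List<Map<String, Object>> actions = List.of(
                Map.of("type", "click", "selector", "#accept"),
                Map.of("type", "wait", "milliseconds", 750),
                Map.of("type", "scroll"));
        for (Map<String, Object> action : actions) {
            // A null problem means the action passed the check.
            System.out.println(action.get("type") + ": " + problem(action));
        }
    }
}
```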

Interact

Why use it

Use interact when a page requires browser actions or code execution after a scrape starts.

Preferred SDK methods

  • client.interact(jobId, code): uses the default language ("node") and the API default execution timeout
  • client.interact(jobId, code, language, timeout): timeout is in seconds (1–300); pass null to omit it and use the API default (30 seconds)
  • client.interact(jobId, code, language, timeout, origin): the optional origin string is sent only when non-null (request attribution)

Simple Example

import com.firecrawl.models.BrowserExecuteResponse;

BrowserExecuteResponse result = client.interact(
    "<scrapeJobId>",
    "console.log(await page.title());",
    "node",
    60
);

Complex Example

BrowserExecuteResponse result = client.interact(
    "<scrapeJobId>",
    "// Use Playwright page methods here",
    "node",
    120
);

Stop interactive browser

End the scrape-bound browser session when finished. Preferred SDK method: client.stopInteractiveBrowser(jobId)

import com.firecrawl.models.BrowserDeleteResponse;

BrowserDeleteResponse stopped = client.stopInteractiveBrowser("<scrapeJobId>");

Interact response (BrowserExecuteResponse)

  • isSuccess(): boolean
  • getStdout(), getStderr(), getResult(), getError(): String (may be null)
  • getExitCode(): Integer (may be null)
  • getKilled(): Boolean (may be null) — true when execution was stopped due to timeout
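
Several of these getters may return null, so a caller has to decide how to read them together. The sketch below shows one possible ordering: a timeout kill first, then the success flag, then the exit code. InteractOutcome and describe are hypothetical names, not part of the SDK, and treating a non-zero exit code as a failure is an assumption borrowed from common process conventions.

```java
/** Hypothetical helper (not part of the Firecrawl SDK): classifies the fields of an interact response. */
final class InteractOutcome {
    private InteractOutcome() {}

    /**
     * Pass the getter values in directly, for example
     * describe(r.isSuccess(), r.getKilled(), r.getExitCode(), r.getError()).
     * The boxed parameters may be null, matching the nullable getters.
     */
    static String describe(boolean success, Boolean killed, Integer exitCode, String error) {
        if (Boolean.TRUE.equals(killed)) {
            return "timed out";
        }
        if (!success) {
            return "failed: " + (error != null ? error : "no error message");
        }
        // Assumption: a non-zero exit code signals a failed run.
        if (exitCode != null && exitCode != 0) {
            return "exited with code " + exitCode;
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(describe(true, null, 0, null));            // ok
        System.out.println(describe(true, Boolean.TRUE, null, null)); // timed out
    }
}
```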

Stop response (BrowserDeleteResponse)

  • isSuccess(): boolean
  • getSessionDurationMs(): Long (may be null)
  • getCreditsBilled(): Integer (may be null)
  • getError(): String (may be null)

Parameters

  • jobId
    • Type: String
    • Use when: you have a scrape job ID.
  • code
    • Type: String
    • Use when: you want to run code in the browser session.
  • language
    • Type: String
    • Use when: you need a specific runtime.
    • Confirmed values: "python", "node", "bash"
    • Notes: defaults to "node" in the SDK if null.
  • timeout
    • Type: Integer
    • Use when: you need an execution timeout in seconds (1–300). Null omits the field and uses the API default.
  • origin
    • Type: String
    • Use when: you need an optional origin label on the request. Prefer omitting unless your integration requires it.
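
The timeout range above (1 to 300 seconds, or null to use the API default) can be checked before the request is sent. This is a minimal sketch under that stated range; InteractTimeout and checked are hypothetical names, not part of the SDK.

```java
/** Hypothetical helper (not part of the Firecrawl SDK): validates an interact timeout in seconds. */
final class InteractTimeout {
    static final int MIN_SECONDS = 1;
    static final int MAX_SECONDS = 300;

    private InteractTimeout() {}

    /** Returns the value unchanged when it is null or within range; throws otherwise. */
    static Integer checked(Integer seconds) {
        if (seconds == null) {
            return null; // null means: omit the field and use the API default
        }
        if (seconds < MIN_SECONDS || seconds > MAX_SECONDS) {
            throw new IllegalArgumentException(
                    "timeout must be between " + MIN_SECONDS + " and " + MAX_SECONDS
                            + " seconds, got " + seconds);
        }
        return seconds;
    }

    public static void main(String[] args) {
        System.out.println(checked(60));   // 60
        System.out.println(checked(null)); // null
    }
}
```

A call might then look like client.interact(jobId, code, "node", InteractTimeout.checked(60)).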

Notes

  • Deprecated aliases: scrapeExecute → interact, deleteScrapeBrowser → stopInteractiveBrowser (and the corresponding *Async helpers).
  • The Java SDK exposes code-based interactions only: there is no prompt parameter on interact (unlike some other language SDKs).

Source Of Truth

  • firecrawl/apps/java-sdk/build.gradle.kts
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/client/FirecrawlClient.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/SearchOptions.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/ScrapeOptions.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/SearchData.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/JsonFormat.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/LocationConfig.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/Document.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/BrowserExecuteResponse.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/BrowserDeleteResponse.java
  • firecrawl-docs/api-reference/v2-openapi.json