Node-fetch has become the de facto standard for making HTTP requests from Node.js applications. With its intuitive API mirroring the browser Fetch API and a lean promise-based interface, node-fetch makes working with external APIs and web scraping a breeze.
In this comprehensive guide, you'll learn how to fully leverage node-fetch to integrate external data sources in your Node.js apps.
An Introduction to Node-Fetch
The Fetch API provided a modern alternative to XMLHttpRequest (XHR) for making HTTP requests from browser-based JavaScript code. But this standard fetch API only worked in the browser – there was no native equivalent for Node.js.
This is where node-fetch comes in. Node-fetch is an implementation of the standard Fetch API spec for Node.js, enabling a familiar fetch() interface for HTTP requests.
Some key benefits of using node-fetch include:
- Simple and familiar API – The node-fetch API mirrors the browser Fetch API closely, making it easy to use if you've worked with fetch before
- Promise-based – It relies on modern promises instead of callbacks, avoiding callback hell
- Lightweight – Node-fetch has minimal dependencies and adds little overhead compared to heavier HTTP clients
- Active maintenance – The module is actively maintained and keeps up with the latest Fetch spec
Now let's explore some hands-on examples to see how you can use node-fetch effectively.
Making Requests with Node-Fetch
Let's look at making GET, POST and other HTTP requests with node-fetch. We'll also explore handling headers, cookies, query parameters and response data.
First, install node-fetch using npm:
npm install node-fetch
Then we can require() it in our code (this works with node-fetch v2; v3 is published as an ES module, so there you would use import instead):
const fetch = require('node-fetch');
GET Requests
Making a GET request is extremely simple:
const response = await fetch('https://api.example.com/items');
This will make a GET request to the URL and return a Response object.
According to npm statistics, node-fetch sees over 15 million weekly downloads – indicating strong adoption amongst Node.js developers.
To extract the response body, we can await text():
const body = await response.text();
Or for JSON data:
const json = await response.json();
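One gotcha worth knowing before going further: fetch() only rejects on network failures, not on HTTP error statuses like 404 or 500. A small sketch of a status guard (the helper name is mine, not part of node-fetch):

```javascript
// fetch() resolves even for 4xx/5xx responses and only rejects on network
// failures. checkStatus is an illustrative helper that turns HTTP error
// statuses into thrown errors.
function checkStatus(response) {
  if (!response.ok) {
    // status and statusText come from the Fetch spec's Response object
    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
  }
  return response;
}

// Usage with node-fetch:
// const response = checkStatus(await fetch('https://api.example.com/items'));
```

Calling this right after every fetch keeps error handling in one place instead of scattering status checks across your code.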
POST Requests
To make a POST request, we pass the method: 'POST' option:
const response = await fetch(url, {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({item: 'Computer'})
});
We stringify the data into JSON and set the Content-Type header so the server knows how to parse the body. Other methods like PUT and DELETE work the same way.
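Since only the verb and payload change between these requests, a small helper keeps the options consistent. This is an illustrative sketch (the function name is mine, not node-fetch's):

```javascript
// Illustrative helper: build a node-fetch options object for a JSON request.
function jsonRequest(method, data) {
  return {
    method,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(data)
  };
}

// Usage:
// await fetch(url, jsonRequest('PUT', { item: 'Computer' }));
// await fetch(url, { method: 'DELETE' }); // DELETE usually carries no body
```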
Headers, Cookies and Query Parameters
You can configure headers in the node-fetch options:
const response = await fetch(url, {
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${token}`
  }
});
Headers are especially useful for setting authentication credentials or changing Content-Types.
To send cookies, just set the Cookie header:
headers: {
  Cookie: 'sessionID=abcd1234'
}
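Multiple cookies go into the same header as a single semicolon-separated string. A tiny illustrative helper (the name is mine) keeps the serialization correct:

```javascript
// Illustrative helper: serialize a map of cookies into a Cookie header value.
function cookieHeader(cookies) {
  return Object.entries(cookies)
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join('; ');
}

// Usage:
// fetch(url, {
//   headers: { Cookie: cookieHeader({ sessionID: 'abcd1234', theme: 'dark' }) }
// });
```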
For query parameters, use the URLSearchParams API:
const params = new URLSearchParams({category: 'tech'});
const url = `https://api.example.com/items?${params}`;
fetch(url); // Has query string appended
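URLSearchParams also takes care of percent-encoding and repeated keys for you:

```javascript
// URLSearchParams handles encoding and repeated keys automatically.
const params = new URLSearchParams({ category: 'tech', q: 'node fetch' });
params.append('tag', 'http');
params.append('tag', 'scraping');

console.log(params.toString());
// category=tech&q=node+fetch&tag=http&tag=scraping
```

Note the space in the query value is encoded as `+`, following the form-urlencoded serialization the Fetch spec uses for query strings.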
Now let's explore some more advanced use cases with proxies, retries and debugging.
Advanced Node-Fetch Techniques
Node-fetch has some handy power user features that enable seamless integration.
Automatic Retries with Resilient HTTP Clients
Network issues can cause requests to fail. Rather than manually coding retries:
let response;
try {
  response = await fetch(url);
} catch (err) {
  response = await fetch(url); // Retry once
}
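That naive try/catch only retries once and hammers the server immediately. A reusable wrapper with backoff is a better pattern – this is an illustrative sketch (all names are mine):

```javascript
// Illustrative retry wrapper: retries any async operation with linear backoff.
async function withRetries(operation, retries = 3, delayMs = 200) {
  let lastError;
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt (skip the wait after the final failure)
      if (attempt < retries - 1) {
        await new Promise(resolve => setTimeout(resolve, delayMs * (attempt + 1)));
      }
    }
  }
  throw lastError;
}

// Usage:
// const response = await withRetries(() => fetch(url));
```

Passing a function (rather than a promise) lets the wrapper start a fresh request on every attempt.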
Alternatively, you can reach for a resilient client like got – a standalone HTTP library (built directly on Node's http module, not on node-fetch) that retries failed requests out of the box:
const got = require('got');
await got(url); // Retries transient failures automatically
These clients also have methods like .post() and .put() for requests.
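Timeouts pair naturally with retries. node-fetch supports the same AbortController mechanism as the browser (built into Node 15+), so a request can be cancelled when a deadline passes. A hedged sketch, with names of my choosing:

```javascript
// Abort an operation that takes longer than timeoutMs. The helper wraps any
// signal-aware async operation; with node-fetch you pass the signal through
// the options object.
async function withTimeout(startOperation, timeoutMs) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await startOperation(controller.signal);
  } finally {
    clearTimeout(timer); // Always clean up, whether we finished or aborted
  }
}

// Usage:
// const response = await withTimeout(signal => fetch(url, { signal }), 5000);
```

When the signal fires, node-fetch rejects the pending request with an AbortError, which you can catch and handle like any other failure.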
Proxy Support
To route requests through a proxy for privacy or geo-targeting:
const { HttpProxyAgent } = require('http-proxy-agent'); // v5+; older versions export the class directly
const agent = new HttpProxyAgent('http://127.0.0.1:8080');
fetch(url, {agent});
The agent sends all requests through the configured proxy URL. Corporate proxies can also be configured using this technique.
Capturing Network Traces
Debug network issues by capturing traces with an HTTP debugging proxy such as Fiddler. Start Fiddler on a machine you control and note the host and port it listens on (it defaults to port 8888 locally). Then point node-fetch at that endpoint:
const fetch = require('node-fetch');
const { HttpsProxyAgent } = require('https-proxy-agent');

const url = 'https://www.example.com';
fetch(url, {
  agent: new HttpsProxyAgent('http://remotehost.example.com:12345')
});
Fiddler will give insights into request failures.
There are many more advanced options available too – refer to the node-fetch docs for specifics.
Next let's explore using node-fetch for web scraping…
Scrape Websites by Mixing Node-Fetch + Puppeteer
While node-fetch provides simple HTTP requests, for complex JavaScript webpages, visual rendering tools like Puppeteer are more robust.
However, we can still leverage node-fetch for basic scraping. Let's walk through some examples using cheerio HTML parsing as well.
Scrape Server-Rendered Pages
For scraping traditional webpages, we can fetch the HTML:
const fetch = require('node-fetch');
const cheerio = require('cheerio');

const res = await fetch('https://example.com');
const html = await res.text();

const $ = cheerio.load(html);
$('h1').text(); // Get data
Cheerio enables easy data extraction using CSS selectors.
This approach works for static pages or server-rendered content.
Scrape Modern Single Page Apps
For client-rendered SPAs, additional steps are needed:
const puppeteer = require('puppeteer');
const cheerio = require('cheerio');

const browser = await puppeteer.launch();
const page = await browser.newPage();

// Navigate, then wait for the client-side JS to render
await page.goto('https://app.example.com');
await page.waitForSelector('div.results');

// Grab the rendered HTML and extract data with Cheerio
const html = await page.content();
const $ = cheerio.load(html);

await browser.close();
So Puppeteer provides the rendering context and the final HTML, while Cheerio handles extraction – plain node-fetch remains the better fit when no JavaScript rendering is needed.
Use Residential Proxies
To scrape sites that block datacenters, route requests through residential IPs:
// Provider SDKs differ – this Luminati-style snippet is illustrative
// (Luminati is now Bright Data); check your provider's docs for exact usage.
const { LuminatiAgent } = require('luminati-agent');

const agent = new LuminatiAgent({
  customer: 'lum-customer-key',
  password: 'password',
  zone: 'static' // Zone configured for residential IPs
});

fetch(url, {agent}); // Request goes out via a residential proxy
Services like Bright Data (formerly Luminati) and Oxylabs provide proxy APIs for rotating residential IPs.
So in summary – node-fetch for simpler scraping, Puppeteer for complex SPAs and proxies for blocked sites.
Next let's explore node-fetch tips for performance and scale…
Node-Fetch Performance Tips
Here are some handy tricks to help node-fetch run faster for high-scale web scraping and API access:
Use keep-alive Connections with an Agent
Creating new TCP connections can slow things down. Use an Agent to reuse sockets:
const fetch = require('node-fetch');
const http = require('http');

// node-fetch accepts any standard http.Agent – enable keep-alive to reuse sockets
const agent = new http.Agent({keepAlive: true});

fetch(url, {agent}); // Keeps the socket open between calls
Connection reuse improves latency over numerous requests.
Configure DNS Caching
DNS lookups can add latency. Use cacheable-lookup for caching:
const http = require('http');
const CacheableLookup = require('cacheable-lookup');

const cacheable = new CacheableLookup();
const agent = new http.Agent({keepAlive: true});

// Route the agent's DNS lookups through the in-memory cache
cacheable.install(agent);

fetch(url, {agent});
This prevents excessive DNS requests.
Compare Node-Fetch to Other HTTP Clients
While node-fetch keeps things simple, alternatives like got and axios have more built-in features.
Some key differences:
| Feature | node-fetch | Got | Axios |
|---|---|---|---|
| Automatic retries | No | Yes | Via plugin (axios-retry) |
| Cookie jar support | No | Yes (tough-cookie) | Via plugin |
| Progress events | No | Yes | Yes |
| Streaming responses | Yes | Yes | Yes (responseType: 'stream') |
So evaluate clients based on your specific requirements.
Use Fetch in Serverless Apps
Serverless platforms like Cloudflare Workers ship their own native Fetch implementation, so node-fetch itself isn't needed there – but code written against the Fetch API ports over cleanly:
// workers.dev deployment – uses the platform's built-in fetch
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  // Same Fetch API surface you use with node-fetch in Node.js
  const response = await fetch('https://api.example.com/items');
  return new Response(await response.text());
}
Such event-driven platforms let fetch-based code scale without managing servers.
This covers some performance best practices – now let's wrap up with some debugging tips.
Debugging and Troubleshooting Node-Fetch
Node-fetch has some nuances around error handling – here are quick troubleshooting tricks.
Enable Detailed Logging
node-fetch does not ship its own logging API, but you can surface the underlying HTTP activity with Node's built-in debug flag:
NODE_DEBUG=http node app.js
For application-level logging, wrap fetch in a small helper:
const fetch = require('node-fetch');

async function loggedFetch(url, options = {}) {
  console.log('->', options.method || 'GET', url);
  const response = await fetch(url, options);
  console.log('<-', response.status, url);
  return response;
}
Logs like these provide visibility into failing requests.
Inspecting Intermediary Proxy Traffic
When fetching via a proxy server, a tool like Fiddler Classic can sniff the traffic – the same HttpsProxyAgent setup shown in the network-tracing section earlier applies here, and all communication becomes visible in Fiddler for troubleshooting.
This covers the basics of debugging node-fetch requests.
Conclusion
We've explored a variety of techniques to leverage node-fetch for accessing web APIs and scraping sites using cheerio and Puppeteer. Some key takeaways:
- Node-fetch provides a simple promise-based mechanism for HTTP requests from Node.js
- It mirrors the standard browser Fetch API closely
- You can make GET, POST, PUT, DELETE requests seamlessly
- For complex sites, integrate with tools like cheerio and Puppeteer
- Implement retries, proxies, caching for improved reliability
- Make sure to handle errors and timeouts appropriately
Node-fetch usage continues to grow thanks to its simple, efficient HTTP handling. Evaluate tools like got and axios when you need more built-in features.
To dig deeper, refer to:
- Node-fetch documentation
- Using Puppeteer for JavaScript Web Scraping
- My web scraping handbook for more examples
I hope this gives you a comprehensive base for your web integration and scraping adventures with Node. Fetch on!


