How to Paginate Through Large Datasets with Cursor-Based Pagination

When you request a list of items from the Sorsa API, the results are returned in pages. To retrieve the full dataset, you iterate through pages using cursors until no more data is available.

How cursor-based pagination works

Sorsa uses cursor-based pagination instead of traditional page numbers. This approach is more reliable for social media data, where new content is constantly being added and offset-based pagination would skip or duplicate results. The flow is the same for every paginated endpoint:
  1. Send your initial request without a cursor.
  2. The response includes your data and a next_cursor field.
  3. Pass the next_cursor value in your next request to fetch the next page.
  4. When next_cursor is null or absent from the response, you have reached the end.
Not every endpoint uses pagination. Single-object endpoints like /info, /tweet-info, /score, and /about return one result and have no cursor. Pagination only applies to endpoints that return lists of users or tweets.
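The four-step flow can be sketched as a small driver loop. This is an illustrative sketch, not part of the API: the `fetch_page` callable and the simulated pages stand in for real requests, whose exact shapes are shown in the sections below.

```python
def paginate(fetch_page):
    """Drive the four-step cursor flow.

    `fetch_page(cursor)` performs one API request (cursor=None for the
    first page) and returns the parsed JSON dict.
    """
    items = []
    cursor = None                         # step 1: first request has no cursor
    while True:
        data = fetch_page(cursor)
        items.extend(data.get("users") or data.get("tweets") or [])
        cursor = data.get("next_cursor")  # steps 2-3: carry the cursor forward
        if not cursor:                    # step 4: null/absent cursor = done
            return items

# Simulated responses standing in for real API calls:
pages = {
    None:  {"users": [{"id": 1}, {"id": 2}], "next_cursor": "abc"},
    "abc": {"users": [{"id": 3}], "next_cursor": None},
}
result = paginate(lambda cursor: pages[cursor])
```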

Response structure

Paginated responses always follow one of these two patterns:
{
  "users": [ ... ],
  "next_cursor": "DAABCgABF7Y..."
}
{
  "tweets": [ ... ],
  "next_cursor": "DAABCgABF7Y..."
}
The data is always in the users or tweets array. The next_cursor field is a string when more pages exist, and null or absent when you have reached the end. For full details on response wrappers and object schemas, see Response Format.
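Because the wrapper key differs by endpoint but the cursor handling is identical, a small helper can normalize both shapes. This is a convenience sketch, not part of the API:

```python
def extract_page(data):
    """Pull the item list and cursor out of either response pattern.

    Works for both {"users": [...]} and {"tweets": [...]} wrappers; the
    cursor comes back as None whether it is null or missing entirely.
    """
    items = data.get("users", data.get("tweets", []))
    return items, data.get("next_cursor")

items, cursor = extract_page({"tweets": [{"id": 7}], "next_cursor": None})
```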

How cursors are passed

The way you send the cursor depends on whether the endpoint uses GET or POST. GET endpoints (like /followers, /follows, /list-tweets) accept the cursor as a query parameter called cursor:
# First page
curl --request GET \
  --url 'https://api.sorsa.io/v3/followers?username=elonmusk' \
  --header 'ApiKey: YOUR_API_KEY'

# Next page
curl --request GET \
  --url 'https://api.sorsa.io/v3/followers?username=elonmusk&cursor=DAABCgABF7Y...' \
  --header 'ApiKey: YOUR_API_KEY'
POST endpoints (like /search-tweets, /user-tweets, /comments) accept the cursor in the JSON body as next_cursor:
# First page
curl --request POST \
  --url 'https://api.sorsa.io/v3/search-tweets' \
  --header 'ApiKey: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{"query": "bitcoin"}'

# Next page
curl --request POST \
  --url 'https://api.sorsa.io/v3/search-tweets' \
  --header 'ApiKey: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{"query": "bitcoin", "next_cursor": "DAABCgABF7Y..."}'

Page sizes are not fixed

Due to the nature of X platform data, the number of items returned per page can vary. An endpoint that returns up to 20 items per page might return 18, 12, or even 5 on a given page, even when more data exists on the next page. Never use the item count to decide whether you have reached the end. A page with fewer items than expected does not mean there is no more data. Always check the next_cursor field. If it exists and is not null, there are more pages to fetch.
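To make the pitfall concrete, here is a comparison of the two termination checks against simulated pages (the page data is invented for illustration) where a short page still carries a cursor:

```python
pages = [
    {"tweets": [1, 2, 3, 4, 5], "next_cursor": "c1"},
    {"tweets": [6, 7],          "next_cursor": "c2"},  # short page, more data follows
    {"tweets": [8, 9, 10],      "next_cursor": None},
]

# Wrong: stopping when a page is smaller than the nominal size of 5
# silently drops everything after the short page.
wrong = []
for page in pages:
    wrong.extend(page["tweets"])
    if len(page["tweets"]) < 5:
        break                      # stops at page 2; items 8-10 are lost

# Right: only the cursor decides when to stop.
right = []
for page in pages:
    right.extend(page["tweets"])
    if not page["next_cursor"]:
        break
```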

Full pagination examples

Python: paginating through followers (GET endpoint)
import time
import requests

API_KEY = "YOUR_API_KEY"

def fetch_all_followers(username):
    all_users = []
    cursor = None

    while True:
        params = {"username": username}
        if cursor:
            params["cursor"] = cursor

        response = requests.get(
            "https://api.sorsa.io/v3/followers",
            params=params,
            headers={"ApiKey": API_KEY}
        )
        data = response.json()

        users = data.get("users", [])
        all_users.extend(users)
        print(f"Page fetched: {len(users)} users. Total so far: {len(all_users)}")

        cursor = data.get("next_cursor")
        if not cursor:
            break

        time.sleep(0.05)  # respect 20 req/s rate limit

    return all_users

followers = fetch_all_followers("elonmusk")
print(f"Done. {len(followers)} followers total.")
Python: paginating through search results (POST endpoint)
import time
import requests

API_KEY = "YOUR_API_KEY"

def search_all_tweets(query):
    all_tweets = []
    cursor = None

    while True:
        body = {"query": query}
        if cursor:
            body["next_cursor"] = cursor

        response = requests.post(
            "https://api.sorsa.io/v3/search-tweets",
            json=body,
            headers={"ApiKey": API_KEY}
        )
        data = response.json()

        tweets = data.get("tweets", [])
        all_tweets.extend(tweets)
        print(f"Page fetched: {len(tweets)} tweets. Total so far: {len(all_tweets)}")

        cursor = data.get("next_cursor")
        if not cursor:
            break

        time.sleep(0.05)  # respect 20 req/s rate limit

    return all_tweets

results = search_all_tweets("bitcoin")
print(f"Done. {len(results)} tweets total.")
JavaScript: paginating through followers (GET endpoint)
async function fetchAllFollowers(username) {
  const API_KEY = "YOUR_API_KEY";
  const allUsers = [];
  let cursor = null;

  while (true) {
    const params = new URLSearchParams({ username });
    if (cursor) params.append("cursor", cursor);

    const response = await fetch(
      `https://api.sorsa.io/v3/followers?${params}`,
      { headers: { "ApiKey": API_KEY } }
    );
    const data = await response.json();

    const users = data.users || [];
    allUsers.push(...users);
    console.log(`Page fetched: ${users.length} users. Total: ${allUsers.length}`);

    cursor = data.next_cursor;
    if (!cursor) break;

    await new Promise(r => setTimeout(r, 50));
  }

  return allUsers;
}
JavaScript: paginating through search results (POST endpoint)
async function searchAllTweets(query) {
  const API_KEY = "YOUR_API_KEY";
  const allTweets = [];
  let cursor = null;

  while (true) {
    const body = { query };
    if (cursor) body.next_cursor = cursor;

    const response = await fetch("https://api.sorsa.io/v3/search-tweets", {
      method: "POST",
      headers: {
        "ApiKey": API_KEY,
        "Content-Type": "application/json"
      },
      body: JSON.stringify(body)
    });
    const data = await response.json();

    const tweets = data.tweets || [];
    allTweets.push(...tweets);
    console.log(`Page fetched: ${tweets.length} tweets. Total: ${allTweets.length}`);

    cursor = data.next_cursor;
    if (!cursor) break;

    await new Promise(r => setTimeout(r, 50));
  }

  return allTweets;
}

Pagination with error handling

In production, combine pagination with retry logic so that a single failed page does not break your entire data collection run. For the full error handling reference, see Error Codes.
import time
import requests

API_KEY = "YOUR_API_KEY"

def paginate_with_retries(username, max_retries=3):
    all_users = []
    cursor = None

    while True:
        params = {"username": username}
        if cursor:
            params["cursor"] = cursor

        for attempt in range(max_retries):
            response = requests.get(
                "https://api.sorsa.io/v3/followers",
                params=params,
                headers={"ApiKey": API_KEY}
            )

            if response.status_code == 200:
                break
            elif response.status_code == 429:
                time.sleep(1)   # rate limited: back off briefly, then retry
                continue
            elif response.status_code >= 500:
                time.sleep(2)   # server error: wait longer before retrying
                continue
            else:
                raise Exception(f"Error {response.status_code}: {response.text}")
        else:
            # for/else: runs only if the loop never hit `break`
            raise Exception("Max retries exceeded")

        data = response.json()
        all_users.extend(data.get("users", []))

        cursor = data.get("next_cursor")
        if not cursor:
            break

        time.sleep(0.05)

    return all_users
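For datasets too large to hold in memory, the same loop can be written as a generator that yields one page at a time, so downstream code can stream each page to disk or a queue as it arrives. This is a design sketch: the `fetch_page` callable is a stand-in for one (retried) API request like the ones above, and the simulated pages are invented for illustration.

```python
def iter_pages(fetch_page):
    """Yield each page's items lazily instead of accumulating them all.

    `fetch_page(cursor)` performs one API request (with whatever retry
    logic you use) and returns the parsed JSON dict.
    """
    cursor = None
    while True:
        data = fetch_page(cursor)
        yield data.get("users") or data.get("tweets") or []
        cursor = data.get("next_cursor")
        if not cursor:
            return

# Example with simulated pages: record each page size as it arrives.
pages = {
    None: {"users": ["a", "b"], "next_cursor": "n1"},
    "n1": {"users": ["c"], "next_cursor": None},
}
seen = [len(page) for page in iter_pages(lambda c: pages[c])]
```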

Next steps

  • Rate Limits - Understand the 20 req/s limit when paginating through large datasets
  • Error Codes - Handle 429 and other errors during pagination loops
  • Response Format - Full reference for User and Tweet object schemas