Dataplex source SearchEntries method fetches more entries than the page size #3308

@theantagonist9509

Description

Prerequisites

  • I've searched the current open issues
  • I've updated to the latest version of Toolbox

Toolbox version

toolbox version 1.3.0+dev.linux.amd64

Environment

  1. OS type and version: Linux 6.18.14-1rodete3-amd64 #1 SMP PREEMPT_DYNAMIC Deb
  2. How are you running Toolbox:
  • Compiled from source: go build

Client

  1. Client: curl command
  2. Version: N/A
  3. Exact command:
curl -X POST http://localhost:5000/mcp \
      -H "Content-Type: application/json" \
      -d '{
        "jsonrpc": "2.0",
        "id": "1",
        "method": "tools/call",
        "params": {
          "name": "search_dataplex_entries",
          "arguments": {
            "query": "type:table"
          }
        }
      }'

Expected Behavior

The server should respond with at most pageSize entries (default 5). This matches the behavior of the Dataplex API served through One MCP.

Current Behavior

The server keeps requesting pages from the Dataplex API until the search is exhausted. For a broad query such as "type:table", this can mean a long run of sequential requests: 500 matching entries with a page size of 5 take 100 requests, which makes the server appear to have hung. A sufficiently large page size reduces the number of requests, but it is still not ideal, since the response may flood the client LLM's context with hundreds of entries.
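To make the request count concrete, here is a minimal sketch of the arithmetic. The helper name requestsNeeded is hypothetical and not part of Toolbox; it only assumes one API request per page and a search that is read to exhaustion.

```go
package main

import "fmt"

// requestsNeeded is a hypothetical helper: it returns how many sequential
// page fetches it takes to exhaust a search with totalResults matches,
// i.e. ceil(totalResults / pageSize).
func requestsNeeded(totalResults, pageSize int) int {
	if totalResults <= 0 || pageSize <= 0 {
		return 0
	}
	return (totalResults + pageSize - 1) / pageSize
}

func main() {
	fmt.Println(requestsNeeded(500, 5))   // 100
	fmt.Println(requestsNeeded(500, 100)) // 5
}
```

With the expected behavior (stop after one page), the count would be 1 regardless of how many entries match.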

Steps to reproduce?

tools.yaml:

kind: source
name: my-dataplex-source
type: dataplex
project: <my project>
---
kind: tool
name: search_dataplex_entries
type: dataplex-search-entries
source: my-dataplex-source
description: Search for metadata entries in Dataplex Catalog

(Client is as specified previously)

Additional Details

The SearchAspectTypes and SearchDataQualityScans methods have the same behavior and may also need to be addressed.
The fix should be as simple as adding the following conditional at the end of the loop in each of these methods:

		if len(results) >= pageSize {
			break
		}
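As a rough illustration of the effect of this conditional, the sketch below simulates a paginated search. Everything in it is an assumption made for illustration: fakeIterator and collect are hypothetical stand-ins, not the actual Toolbox or Dataplex client code, and the real loop may be structured differently.

```go
package main

import "fmt"

// fakeIterator is a hypothetical stand-in for a paginated search iterator.
// It "fetches" one page at a time and counts the fetches.
// It assumes pageSize > 0.
type fakeIterator struct {
	total    int // total matching entries on the simulated server
	pageSize int // entries returned per simulated request
	next     int // index of the next entry to hand out
	buffered int // entries left in the current page
	fetches  int // number of simulated API requests so far
}

// Next returns the next entry, fetching a new page when the current one
// is used up. ok is false once the search is exhausted.
func (it *fakeIterator) Next() (entry string, ok bool) {
	if it.buffered == 0 {
		remaining := it.total - it.next
		if remaining <= 0 {
			return "", false
		}
		it.fetches++
		it.buffered = it.pageSize
		if remaining < it.pageSize {
			it.buffered = remaining
		}
	}
	entry = fmt.Sprintf("entry-%d", it.next)
	it.next++
	it.buffered--
	return entry, true
}

// collect drains the iterator. When capped is true it applies the
// conditional proposed in this issue and stops at pageSize results.
func collect(it *fakeIterator, pageSize int, capped bool) []string {
	var results []string
	for {
		entry, ok := it.Next()
		if !ok {
			break
		}
		results = append(results, entry)
		if capped && len(results) >= pageSize {
			break
		}
	}
	return results
}

func main() {
	uncapped := &fakeIterator{total: 500, pageSize: 5}
	fmt.Println(len(collect(uncapped, 5, false)), uncapped.fetches) // 500 100

	capped := &fakeIterator{total: 500, pageSize: 5}
	fmt.Println(len(collect(capped, 5, true)), capped.fetches) // 5 1
}
```

In this simulation the check runs right after the append, so the loop exits before the iterator is asked for an entry from the next page, and only one request is made.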

Labels

type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)