The JSON_DATAGUIDE function in Oracle SQL is a powerful aggregate function used for data analysis. Its job is not to extract data, but to analyze a collection of JSON documents and automatically discover their structure.
Think of it as an X-ray for your JSON data. It scans many JSON documents (across multiple rows) and generates a report (a new JSON document) that describes all the unique properties (paths), their data types, and their maximum lengths. This "data guide" can then be used to create views or understand the schema of your JSON.
What is the JSON_DATAGUIDE Function in Oracle?
The JSON_DATAGUIDE(json_expr) function is an aggregate function, just like COUNT() or SUM(). It is typically used with a GROUP BY clause to analyze all the JSON documents within a group.
It returns a CLOB containing a JSON array, where each object in the array represents a unique "path" found in the source data.
This is essential for:
- Understanding the complete structure of JSON data that may be inconsistent across rows.
- Finding all possible keys and their data types.
- Automatically generating a "schema" from your data.
JSON_DATAGUIDE Function Syntax
The syntax for JSON_DATAGUIDE is:
JSON_DATAGUIDE(json_expr [, format] [, flag])
Let's break that down:
json_expr: The JSON column (of typeJSON,VARCHAR2,CLOB, orBLOB) that you want to analyze.[format](Optional): A constant that defines the output format. The most common is:dbms_json.format_flat(Default): Returns a flat list of all paths.dbms_json.format_hierarchical: Returns a nested JSON structure.
[flag](Optional): A constant to change the behavior. The most useful is:DBMS_JSON.PRETTY: Formats the outputCLOBwith indentation to make it human-readable.
Oracle JSON_DATAGUIDE Function Examples
Because JSON_DATAGUIDE is an aggregate function that works on multiple rows, we must first simulate a multi-row table. We can do this using the DUAL table with UNION ALL.
Example 1: Creating a "Flat" Dataguide for All Rows with JSON_DATAGUIDE
This example will analyze two "rows" of JSON data. Notice that the two rows have different structures. JSON_DATAGUIDE will find all unique paths from both rows.
Query:
-- First, we simulate a 'j_purchaseorders' table with 2 rows
WITH j_purchaseorders AS (
SELECT '{"PONumber": 1600, "Requestor": "A. Bull", "LineItems": [{"Part": "A"}]}'
AS po_document FROM DUAL
UNION ALL
SELECT '{"PONumber": 1700, "Requestor": "B. King", "Special_Instructions": "Rush"}'
AS po_document FROM DUAL
)
-- Now, we aggregate ALL rows into one dataguide.
-- We use DBMS_JSON.PRETTY to make the CLOB output readable.
SELECT
JSON_DATAGUIDE(
po_document,
dbms_json.format_flat,
DBMS_JSON.PRETTY
) AS "Data_Guide"
FROM j_purchaseorders;
Result: (The function scans both rows and finds 5 unique paths, describing their type and length.)
Data_Guide
--------------------------------------------------------------------------------
[
{
"o:path" : "$.PONumber",
"type" : "number",
"o:length" : 4
},
{
"o:path" : "$.Requestor",
"type" : "string",
"o:length" : 7
},
{
"o:path" : "$.LineItems",
"type" : "array",
"o:length" : 8
},
{
"o:path" : "$.LineItems.Part",
"type" : "string",
"o:length" : 1
},
{
"o:path" : "$.Special_Instructions",
"type" : "string",
"o:length" : 4
}
]
Example 2: Using GROUP BY with JSON_DATAGUIDE
This example shows how JSON_DATAGUIDE works as a true aggregate function. We will create two groups ('A' and 'B') and generate a separate dataguide for each one.
Query:
-- Simulate a table with categories and JSON documents
WITH my_data AS (
SELECT 'A' AS category, '{"id": 1, "name": "Widget"}' AS doc FROM DUAL UNION ALL
SELECT 'A' AS category, '{"id": 2, "name": "Gadget"}' AS doc FROM DUAL UNION ALL
SELECT 'B' AS category, '{"id": 3, "status": "Active"}' AS doc FROM DUAL
)
-- Group by the category and generate a dataguide for each group
SELECT
category,
JSON_DATAGUIDE(
doc,
dbms_json.format_flat,
DBMS_JSON.PRETTY
) AS "Data_Guide"
FROM my_data
GROUP BY category;
Result: (The query returns two rows, one for each category, each with its own specific dataguide.)
CAT Data_Guide
--- -------------------------------------------------------------------
A [
{
"o:path" : "$.id",
"type" : "number",
"o:length" : 1
},
{
"o:path" : "$.name",
"type" : "string",
"o:length" : 6
}
]
B [
{
"o:path" : "$.id",
"type" : "number",
"o:length" : 1
},
{
"o:path" : "$.status",
"type" : "string",
"o:length" : 6
}
]
