Skip to content

Consider using OPENJSON for our JSON scalar access #30981

@roji

Description

@roji

When using owned JSON entities, we currently use JSON_VALUE to extract scalars out of them. Since JSON_VALUE always returns nvarchar(4000), we apply a cast to the results based on the model type. For example:

_ = await ctx.Blogs.Where(b => b.Details.Foo == 8).ToArrayAsync();

... translates to:

SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
WHERE CAST(JSON_VALUE([b].[Details],'$.Foo') AS int) = 8

There are issues with this translation:

  • Binary data (varbinary) cannot be extracted in this way: the standard JSON representation for binary data is base64, but a regular relational CAST doesn't do that (see comment). OPENJSON with WITH does decode base64 data, since it applies a JSON-specific conversion. Binary data specifically is covered by #33435.
    • On the other hand, spatial data cannot be converted with OPENJSON with WITH, only with a regular cast. The same may be true of hierarchyid.
    • We need to go through all supported SQL Server types
  • JSON_VALUE returns null for strings larger than 4000 (or can throw in strict mode) (see #29477).
  • There's good reason to believe that this translation is inefficient compared to OPENJSON with WITH, where the query supplies more information to SQL Server.
    • See this and this. Both these comments are about OPENJSON without WITH (and not about JSON_VALUE), and are in the context of primitive collections; but the same principles should hold here as well - needs to be confirmed.

The alternative translation would be to use OPENJSON with WITH instead:

SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
CROSS APPLY OPENJSON([b].[Details]) WITH ([Foo] int '$.Foo') AS [d]
WHERE [d].[Foo] = 8

In other words, every scalar access adds a CROSS APPLY to an OPENJSON invocation, applying the JSON conversion with WITH.

When the property being extracted is known to be a short string in the model (fitting in nvarchar(4000)), we could keep the current JSON_VALUE translation. When the type being extracted is incompatible with OPENJSON with WITH (e.g. geometry), we could do OPENJSON without WITH and then apply a relational cast (we should avoid JSON_VALUE still it truncates).

Note: consider the indexability of this technique, compared to JSON_VALUE (with computed columns)

Metadata

Metadata

Assignees

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions