Skip to content

ESQL: Data sources: Azure plugin#143236

Merged
bpintea merged 23 commits intoelastic:mainfrom
bpintea:esql/ds/azure
Feb 28, 2026
Merged

ESQL: Data sources: Azure plugin#143236
bpintea merged 23 commits intoelastic:mainfrom
bpintea:esql/ds/azure

Conversation

@bpintea
Copy link
Copy Markdown
Contributor

@bpintea bpintea commented Feb 27, 2026

This adds support for the Azure Blob Storage.

🤖 Developed AI-assisted.

This adds support for the Azure Blob Storage.
@bpintea bpintea requested a review from costin February 27, 2026 13:13
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 27, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

Copy link
Copy Markdown
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left a few comments; in particular the close() one might address the thread leakage.

public StorageIterator listObjects(StoragePath prefix, boolean recursive) throws IOException {
validateAzureScheme(prefix);
ParsedPath parsed = parsePathForListing(prefix);
String account = extractAccountFromHost(parsed.host);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close() only nulls the reference but never calls blobServiceClient.close(). BlobServiceClient uses reactor-netty event loops under the hood, so this leaks threads and connections. GCS calls storage.close(), S3 calls s3Client.close(). Should snapshot the reference, null it, then close it in a try/catch like GCS does.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, addressed.

|| (sasToken != null && sasToken.isEmpty() == false);
}

@Override
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a Java record, which auto-generates equals() and hashCode() from all components. The custom implementations here are identical to what the compiler generates. Can just remove them.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😊

java.net.URL resourceUrl = AzureFixtureUtils.class.getResource(FIXTURES_RESOURCE_PATH);
if (resourceUrl == null) {
logger.warn("Fixtures resource path not found: {}", FIXTURES_RESOURCE_PATH);
return;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use imports instead of FQN for java.io.ByteArrayInputStream and java.net.URL. Codebase convention is to always import.

private final StoragePath path;

private Long cachedLength;
private Instant cachedLastModified;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cached fields are written in fetchMetadata() without synchronization or volatile. If a StorageObject is ever accessed from multiple threads there could be visibility issues. Worth either making them volatile or documenting that instances are single-threaded.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made them volatile, though the resolution of the resource needs to happen uni-threaded first and unlikely that'll change, but it's even safer now.

@@ -0,0 +1,10 @@
ALL-UNNAMED:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repository-azure scopes entitlements to specific modules (io.netty.common, reactor.core, etc.) rather than ALL-UNNAMED. Worth aligning with that pattern so entitlements are narrower.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@bpintea bpintea requested a review from a team as a code owner February 27, 2026 14:25
@bpintea bpintea enabled auto-merge (squash) February 27, 2026 15:34
elasticsearchmachine and others added 6 commits February 27, 2026 15:42
The failure was caused by the esql-datasource-azure entitlement policy
using module-scoped entitlements (io.netty.common, reactor.core, etc.)
while the plugin layer only exposes ALL-UNNAMED.
@bpintea bpintea merged commit d280b32 into elastic:main Feb 28, 2026
35 checks passed
@bpintea bpintea deleted the esql/ds/azure branch February 28, 2026 11:15
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
This adds support for the Azure Blob Storage.

🤖 Developed AI-assisted.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants