Question: Can arrow be used to load a parquet file from an Azure blob store? #1510

@rjrussell77

I'm looking into whether a parquet file stored in an Azure blob store can be opened from a Jupyter notebook running a Python 3 kernel.

Specifically, I do not want to use a PySpark kernel. With PySpark I can already read the file using the following code:
salesData = spark.read.parquet("wasbs://product-abc-data@myproduct.blob.core.windows.net/part-r-....gz.parquet")

The question is: can Arrow be used to connect to a blob store from pure Python code? The following attempt:
import pandas as pd
pd.read_parquet('wasbs://product-abc-data@myproduct.blob.core.windows.net/part-r-blah-gz.parquet')
Results in:
ArrowIOError                              Traceback (most recent call last)
... ... ...
ArrowIOError: Failed to open local file: wasbs://product-abc-data@myproduct.blob.core.windows.net/part-r-blah-gz.parquet , error: No such file or directory
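
The error suggests the wasbs:// URL is being treated as a local file path. One possible workaround, sketched here under assumptions and not confirmed anywhere in this issue, is to download the blob's bytes first and hand Arrow an in-memory buffer. The sketch assumes the `azure-storage-blob` (v12) and `pyarrow` packages are installed; `split_wasbs_url`, `read_parquet_from_blob`, and the `credential` argument (an account key or SAS token) are hypothetical names introduced for illustration.

```python
import io
from urllib.parse import urlsplit


def split_wasbs_url(url):
    """Split wasbs://<container>@<account>.blob.core.windows.net/<blob>
    into (account_url, container_name, blob_name)."""
    parts = urlsplit(url)
    if parts.scheme != "wasbs":
        raise ValueError(f"not a wasbs URL: {url!r}")
    if not parts.username or not parts.hostname:
        raise ValueError(f"expected <container>@<account host> in {url!r}")
    # The container sits in the "user" position; the host is the account endpoint.
    account_url = f"https://{parts.hostname}"
    return account_url, parts.username, parts.path.lstrip("/")


def read_parquet_from_blob(url, credential):
    """Download one blob into memory and parse it as parquet (hypothetical helper)."""
    # Imported lazily so the URL helper above works without these packages.
    from azure.storage.blob import BlobClient
    import pyarrow.parquet as pq

    account_url, container, blob_name = split_wasbs_url(url)
    client = BlobClient(
        account_url=account_url,
        container_name=container,
        blob_name=blob_name,
        credential=credential,  # assumption: account key or SAS token string
    )
    data = client.download_blob().readall()  # whole blob is held in memory
    return pq.read_table(io.BytesIO(data)).to_pandas()
```

Calling `read_parquet_from_blob(url, credential=...)` with the wasbs URL above would then return a pandas DataFrame, provided the credential grants read access. Because the whole blob is held in memory, this is only suitable for files that fit in RAM.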
