I'm looking into how to open a Parquet file stored in Azure Blob Storage from a Jupyter notebook running a Python 3 kernel.
Specifically, I do not want to use a PySpark kernel. With PySpark I can already read the file using the following code:
salesData = spark.read.parquet("wasbs://product-abc-data@myproduct.blob.core.windows.net/part-r-....gz.parquet")
The question is: can Arrow be used to connect to a blob store from pure Python code? This attempt:
import pandas as pd
pd.read_parquet('wasbs://product-abc-data@myproduct.blob.core.windows.net/part-r-blah-gz.parquet')
Results in:
ArrowIOError                              Traceback (most recent call last)
... ... ...
ArrowIOError: Failed to open local file: wasbs://product-abc-data@myproduct.blob.core.windows.net/part-r-blah-gz.parquet , error: No such file or directory
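The error suggests the wasbs:// URI is being treated as a local path rather than a remote location. Any non-Spark client would presumably need the container, storage account, and blob name separately, so here is a minimal sketch of splitting the URI into those parts using only the standard library. The helper name split_wasbs_uri is hypothetical, and this only parses the string; it does not connect to anything.

```python
from urllib.parse import urlparse


def split_wasbs_uri(uri):
    """Split wasbs://<container>@<account>.blob.core.windows.net/<blob>
    into (container, account, blob_name). Hypothetical helper, parsing only."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("wasb", "wasbs"):
        raise ValueError("not a wasb/wasbs URI: " + uri)
    # The container sits in the "user" position, before the '@'.
    container = parsed.username
    host = parsed.hostname
    if not container or not host:
        raise ValueError("missing container or host in: " + uri)
    # The storage account is the first label of the host name.
    account = host.split(".", 1)[0]
    # The blob name is the path without its leading slash.
    blob_name = parsed.path.lstrip("/")
    return container, account, blob_name


container, account, blob_name = split_wasbs_uri(
    "wasbs://product-abc-data@myproduct.blob.core.windows.net/part-r-blah-gz.parquet"
)
print(container, account, blob_name)
```

With those three pieces, the remaining step would be to fetch the blob's bytes with some Azure client and hand them to pd.read_parquet as an in-memory buffer, though that part is an assumption and not shown here.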