JSONDecodeError: Extra data when trying to read a directory for SPOOL #508

@d-chambers

Discussed in #507

Originally posted by pongchayut July 11, 2025
Hey everyone,

I’m running into a weird issue with dascore. I tried this simple code:

import dascore as dc
spool = dc.get_example_spool("random_directory_das")

But it throws this error:

JSONDecodeError: Extra data: line 1 column 999 (char 998)

The same error occurs with my real data, which are .tdms files from a Silixa system. This used to work fine, and I haven’t changed anything in the code or the data. What’s strange is that if I pass a single .tdms file directly to dc.spool(), it works perfectly. The error only appears when I try to load the entire directory.
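
For context on the message itself: Python's `json` module raises "Extra data" when a complete JSON document is followed by more non-whitespace text, and the reported character offset is where the first document ended. The sketch below reproduces that with the standard library only. The helper name `split_extra_data` and the sample string are made up for illustration, and two documents written back to back is only one assumed way a file could end up in this state; nothing here confirms that is what happened to the file in this report.

```python
import json


def split_extra_data(text):
    """Return (first_document, leftover_text) for a JSON string.

    Hypothetical helper for illustration; it is not part of DASCore.
    """
    try:
        return json.loads(text), ""
    except json.JSONDecodeError as err:
        if err.msg != "Extra data":
            raise  # some other kind of malformed JSON
        # err.pos is the 0-based offset where the first document ended,
        # i.e. the "char N" value in the error message.
        return json.loads(text[: err.pos]), text[err.pos :]


# Two complete objects written one after another (assumed scenario).
first, leftover = split_extra_data('{"a": 1}{"b": 2}')
print(first)     # the first, valid document
print(leftover)  # the trailing text that triggers "Extra data"
```

Applied to the contents of the file that fails to load, the leftover text shows what was written after the first valid document.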

I'm not sure what went wrong or what caused this. Has anyone seen this before or know how to fix it?

Thanks in advance!

Python Version: 3.11.13
DASCore Version: 0.1.7

Full Traceback:

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
Cell In[17], line 1
----> 1 spool = dc.get_example_spool("random_directory_das")

File ~/anaconda3/envs/das311/lib/python3.11/site-packages/dascore/examples.py:732, in get_example_spool(example_name, **kwargs)
    727     msg = (
    728         f"No example spool registered with name {example_name} "
    729         f"Registered example spools are {list(EXAMPLE_SPOOLS)}"
    730     )
    731     raise UnknownExampleError(msg)
--> 732 return EXAMPLE_SPOOLS[example_name](**kwargs)

File ~/anaconda3/envs/das311/lib/python3.11/site-packages/dascore/examples.py:570, in random_directory_spool(path, **kwargs)
    568 spool = random_spool(**kwargs)
    569 path = spool_to_directory(spool, path)
--> 570 return dc.spool(path)

File ~/anaconda3/envs/das311/lib/python3.11/functools.py:909, in singledispatch.<locals>.wrapper(*args, **kw)
    905 if not args:
    906     raise TypeError(f'{funcname} requires at least '
    907                     '1 positional argument')
--> 909 return dispatch(args[0].__class__)(*args, **kw)

File ~/anaconda3/envs/das311/lib/python3.11/site-packages/dascore/core/spool.py:711, in _spool_from_str(path, **kwargs)
    708 if path.is_dir():
    709     from dascore.clients.dirspool import DirectorySpool
--> 711     return DirectorySpool(path, **kwargs)
    712 # A single file was passed. If the file format supports quick scanning
    713 # Return a FileSpool (lazy file reader), else return DirectorySpool.
    714 elif path.exists():  # a single file path was passed.

File ~/anaconda3/envs/das311/lib/python3.11/site-packages/dascore/clients/dirspool.py:65, in DirectorySpool.__init__(self, base_path, index_path, preferred_format, select_kwargs, merge_kwargs)
     63     self.indexer = base_path
     64 elif isinstance(base_path, Path | str):
---> 65     self.indexer = DirectoryIndexer(base_path, index_path=index_path)
     66 assert hasattr(self, "indexer"), "indexer not set."
     67 self._preferred_format = preferred_format

File ~/anaconda3/envs/das311/lib/python3.11/site-packages/dascore/io/indexer.py:118, in DirectoryIndexer.__init__(self, path, cache_size, index_path)
    116 self.max_size = cache_size
    117 self.path = Path(path).absolute()
--> 118 self.index_path = Path(self._find_index_file(self.path, index_path))
    119 self._current_index = 0
    120 self._index_table = HDFPatchIndexManager(
    121     self.index_path,
    122     self._namespace,
    123 )

File ~/anaconda3/envs/das311/lib/python3.11/site-packages/dascore/io/indexer.py:142, in DirectoryIndexer._find_index_file(self, data_path, index_path)
    140         return expected_path
    141 # else load path map and see if it knows where the index is.
--> 142 path_map = _get_index_map(cache_path=str(self.index_map_path))
    143 if out := path_map.get(str(data_path)):
    144     return out

File ~/anaconda3/envs/das311/lib/python3.11/site-packages/dascore/io/indexer.py:42, in _get_index_map(cache_path)
     40 if path.exists():
     41     with path.open("r") as fi:
---> 42         out = json.load(fi)
     43 else:
     44     out = {}

File ~/anaconda3/envs/das311/lib/python3.11/json/__init__.py:293, in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    274 def load(fp, *, cls=None, object_hook=None, parse_float=None,
    275         parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
    276     """Deserialize ``fp`` (a ``.read()``-supporting file-like object containing
    277     a JSON document) to a Python object.
    278 
   (...)    291     kwarg; otherwise ``JSONDecoder`` is used.
    292     """
--> 293     return loads(fp.read(),
    294         cls=cls, object_hook=object_hook,
    295         parse_float=parse_float, parse_int=parse_int,
    296         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)

File ~/anaconda3/envs/das311/lib/python3.11/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    341     s = s.decode(detect_encoding(s), 'surrogatepass')
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:
    348     cls = JSONDecoder

File ~/anaconda3/envs/das311/lib/python3.11/json/decoder.py:340, in JSONDecoder.decode(self, s, _w)
    338 end = _w(s, end).end()
    339 if end != len(s):
--> 340     raise JSONDecodeError("Extra data", s, end)
    341 return obj

JSONDecodeError: Extra data: line 1 column 999 (char 998)
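
The last DASCore frame in the traceback is `_get_index_map`, which calls `json.load` on a cached path map and returns `{}` only when the file does not exist. A minimal sketch of a more forgiving loader follows, assuming it is acceptable to discard a map that cannot be parsed. The function name `load_index_map` and the `.corrupt` suffix are hypothetical, and this is not DASCore's actual code or fix. The location of the map file is whatever `index_map_path` resolves to, which the traceback does not show.

```python
import json
from pathlib import Path


def load_index_map(cache_path):
    """Return the cached path map, or {} if it is missing or unparseable.

    Hypothetical sketch modeled on the structure visible in the traceback.
    """
    path = Path(cache_path)
    if not path.exists():
        return {}
    try:
        with path.open("r") as fi:
            return json.load(fi)
    except json.JSONDecodeError:
        # Move the damaged file aside so it can be inspected later and
        # the next write starts from a clean file.
        backup = path.with_name(path.name + ".corrupt")
        path.replace(backup)
        return {}
```

With a fallback like this, a damaged map costs only a lookup of where the index lives rather than blocking every directory spool.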
