different number of partitions when indexing after reading dataframe #12193

@melonora

Description

Describe the issue:
When a dataframe has npartitions > 1, saving it to parquet and reading it back in changes the number of partitions once you index the resulting dataframe on one of its columns.
Minimal Complete Verifiable Example:

import dask.dataframe as dd
import pandas as pd
import numpy as np

n_partitions = 10
data = {
    'x': np.random.randn(1000),
    'y': np.random.randn(1000),
    'genes': np.random.choice(['GeneA', 'GeneB', 'GeneC', 'GeneD'], size=1000),
    'instance_id': np.arange(1000)
}

df = pd.DataFrame(data)
df['genes'] = df['genes'].astype('category')
ddf = dd.from_pandas(df, npartitions=n_partitions)

ddf.to_parquet('test.parquet')

test = dd.read_parquet("test.parquet")

Now when printing:

print(ddf.map_partitions(len).compute())
print(ddf['x'].map_partitions(len).compute())

Both return:

0    1000
0    1000
0    1000
0    1000
0    1000
0    1000
0    1000
0    1000
0    1000
0    1000
dtype: int64

Printing the following instead:

print(test.map_partitions(len).compute())
print(test['x'].map_partitions(len).compute())

The first statement gives the same result as before with ddf. The second print, however, gives:

0    3000
0    3000
0    4000
dtype: int64

Environment:

  • Dask version: dask from main
  • Python version: tested with multiple Python versions
  • Operating System: Windows, but reproduced on macOS
  • Install method (conda, pip, source): pip development installation
