[python] `to_dataframe` does not produce sparse data frames

Hi,

I noticed that the `pandas.SparseDataFrame` returned by `Table.to_dataframe` is not actually sparse. For instance, with the American Gut data:

```python3
In [15]: bm = load_table("deblur_125nt_no_blooms.biom")

In [16]: bm
Out[16]: 32954 x 9511 <class 'biom.table.Table'> with 1829490 nonzero entries (0% dense)

In [17]: tab = bm.to_dataframe()

In [19]: type(tab)
Out[19]: pandas.core.sparse.frame.SparseDataFrame

In [20]: tab.density
Out[20]: 1.0

In [21]: tab.info()
<class 'pandas.core.sparse.frame.SparseDataFrame'>
Index: 32954 entries, AACGTAGGGTGCAAGCGTTATCCGGATTTACTGGGTGTAAAGGGAGCGCAGGCGGAAGGCTAAGTCTGATGTGAAAGCCCGGGGCTCAACCCCGGTACTGCATTGGAAACTGGTCATCTAGAGTG to TACGGGGGATGCGAGCGTTATCCGGATTCATTGGGTTTAAAGGGTGCGCAGGCCGAGGTTCAAGTCAGCGGTGAAACCCCCGCGCTCAACGCGGGGCATGCCGTTGATACTGTATCTCTGGAGTA
Columns: 9511 entries, 10317.000012326 to 10317.000038478
dtypes: Sparse[float64, nan](9511)
memory usage: 2.3+ GB
``` 

This is basically the memory use of the full table including zeros (32954 x 9511 float64 values come to about 2.3 GiB). The densities of the original table and the `SparseDataFrame` also disagree completely (~0% vs 100%).
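The `Sparse[float64, nan]` dtype above suggests one possible cause: if the fill value is NaN, every zero counts as a stored value. Here is a minimal sketch of that effect on a toy table. It is not the biom code path, and it assumes the current `DataFrame.sparse` accessor, `pd.arrays.SparseArray`, and `scipy.sparse` instead of the older `SparseDataFrame` class. The toy array and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Toy stand-in for a count table (hypothetical data): mostly zeros.
dense = np.zeros((100, 50))
dense[::10, ::5] = 1.0  # a few nonzero entries

# Fill value NaN: nothing in the data equals NaN, so every entry,
# zeros included, is stored explicitly.
nan_filled = pd.DataFrame(
    {i: pd.arrays.SparseArray(dense[:, i], fill_value=np.nan)
     for i in range(dense.shape[1])}
)

# Fill value 0: built from a scipy sparse matrix, only nonzeros are stored.
zero_filled = pd.DataFrame.sparse.from_spmatrix(sparse.csr_matrix(dense))

nan_density = nan_filled.sparse.density
zero_density = zero_filled.sparse.density

print("density with NaN fill:", nan_density)
print("density with 0 fill:  ", zero_density)
```

If this is what happens inside `to_dataframe`, using 0 as the fill value should bring the density and memory use in line with the original table.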

[python] `to_dataframe` does not produce sparse data frames #808