@wesm wesm commented Dec 24, 2019

While investigating ARROW-6994 and ARROW-7305, I was shocked to find the following behavior from this test script: https://gist.github.com/wesm/193f644d10b5aee8c258b8f4f81c5161 (requires the data file attached to ARROW-7305).

This is what I see on macOS, on both master and 0.15.1:

$ python arrow7305.py
Starting RSS: 67297280
Read CSV RSS: 179224576
Wrote Parquet RSS: 645177344
Waited 1 second RSS: 645177344
Read CSV RSS: 703504384
Wrote Parquet RSS: 707674112
Waited 1 second RSS: 707674112
...
Waited 1 second RSS: 1147301888

This contrasts with Linux, where RSS stabilizes at around 170MB by the end. The macOS behavior on 0.14.1 is slightly better, though RSS still ends up at ~465MB after the script runs.

When the background thread option is disabled, this patch sets the decay time for unused memory pages to 0 so that they are released to the OS immediately. Without understanding more about what is going on with the current behavior on master, this seems better than releasing again with what is currently there. With this patch I have:

$ python arrow7305.py
Starting RSS: 68505600
Read CSV RSS: 179671040
Wrote Parquet RSS: 288759808
Waited 1 second RSS: 288759808
Read CSV RSS: 298987520
Wrote Parquet RSS: 301359104
Waited 1 second RSS: 301359104
Read CSV RSS: 308961280
Wrote Parquet RSS: 313081856
Waited 1 second RSS: 313081856
...
Read CSV RSS: 315908096
Wrote Parquet RSS: 315822080
Waited 1 second RSS: 315822080
Read CSV RSS: 315822080
Wrote Parquet RSS: 315944960
Waited 1 second RSS: 315944960
Waited 1 second RSS: 315944960
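
To make the comparison between the two logs concrete, here is a minimal sketch (not part of the patch or the linked gist) that parses lines of the form `<label> RSS: <bytes>` and reports how much RSS grew from the first entry to the last. The helper names `parse_rss_log` and `growth_mib` are hypothetical, and the excerpts contain only the opening lines and the final line of each log quoted above; the elided middle sections are left out.

```python
import re

# Excerpts from the two logs above (values in bytes). Only the opening
# lines and the final line of each run are included; elided lines are omitted.
BEFORE_PATCH = """\
Starting RSS: 67297280
Read CSV RSS: 179224576
Wrote Parquet RSS: 645177344
Waited 1 second RSS: 645177344
Waited 1 second RSS: 1147301888
"""

AFTER_PATCH = """\
Starting RSS: 68505600
Read CSV RSS: 179671040
Wrote Parquet RSS: 288759808
Waited 1 second RSS: 288759808
Waited 1 second RSS: 315944960
"""

LINE_PATTERN = re.compile(r"^(.*) RSS: (\d+)$")


def parse_rss_log(text):
    """Return (label, rss_bytes) pairs for every '<label> RSS: <bytes>' line."""
    entries = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            entries.append((match.group(1), int(match.group(2))))
    return entries


def growth_mib(entries):
    """RSS growth between the first and last entry, in MiB."""
    return (entries[-1][1] - entries[0][1]) / 2**20


if __name__ == "__main__":
    for name, log in (("before patch", BEFORE_PATCH), ("after patch", AFTER_PATCH)):
        entries = parse_rss_log(log)
        print(f"{name}: RSS grew by {growth_mib(entries):.0f} MiB")
```

On these numbers, RSS grows by roughly 1030 MiB over the run without the patch and by roughly 236 MiB with it.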

wesm commented Dec 24, 2019

cc @pitrou @xhochy -- I will be mostly unavailable from 26 Dec until 6 Jan, so if you need to take ownership of this patch in the meantime, please be my guest.

@xhochy xhochy left a comment

+1, sounds reasonable but would love to have a second pair of eyes to approve.

@pitrou pitrou left a comment

+1 as well

pitrou commented Jan 6, 2020
