Integration of new PFB IO functionality and xarray compatibility #365

arbennett · 2022-01-20T16:57:08Z

👋 Hi Parflow maintainers! This is my first PR here. I'm Andrew Bennett, working with Laura Condon (@lecondon) and Reed Maxwell (@reedmaxwell) to develop machine learning emulators of Parflow. While preparing the large datasets that feed the ML algorithms, reading files with the ParflowIO readers became a bottleneck. This pull request implements simple and fast pure-Python readers and writers of PFB files, written by myself and Bill Hasling (@wh3248), which eliminates the need for the external ParflowIO dependency. I have also implemented a new backend for the xarray package that lets you open both .pfb files and .pfmetadata files directly as xarray data structures, which are very useful for data wrangling and scientific analysis. This PR is a merge from my separate repo, pf-xarray.

I recognize this is a rather large pull request so I'm happy to meet to discuss the finer points of the implementation. Currently there are some tests, but they are minimal and might be expanded to make sure things interoperate with the rest of the parflow ecosystem. Also, I'm not sure if I should be PRing against master - but I didn't see a staging branch like develop or next. Let me know if this should be changed!

Some basic usage of the new functionality:

import parflow as pf
import xarray as xr

# Read a pfb file as a numpy array:
x = pf.read_pfb('/path/to/file.pfb')

# Read a pfb file as an xarray dataset:
ds = xr.open_dataset('/path/to/file.pfb', name='example')

# Write a pfb file along with its distfile
# (p, q, r are assumed to be the processor topology in x, y, z, set by the caller):
pf.write_pfb('/path/to/new_file.pfb', x,
             p=p, q=q, r=r, dist=True)
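To give a feel for what a pure-Python PFB reader involves, here is a minimal standard-library sketch. It is not the code in this PR. The byte layout it assumes (big-endian, a 64-byte file header, then a 36-byte subgrid header followed by that subgrid's doubles with x varying fastest) is my understanding of the format and should be checked against the ParFlow manual. The names `HEADER`, `SUBGRID`, `write_single_subgrid_pfb` and `read_single_subgrid_pfb` are hypothetical, and only a single subgrid is handled.

```python
import io
import struct

# ASSUMED layout (big-endian, no padding), not taken from this PR:
# file header: x, y, z (doubles), nx, ny, nz (ints), dx, dy, dz (doubles), n_subgrids (int)
HEADER = struct.Struct(">3d3i3di")
# subgrid header: ix, iy, iz, nx, ny, nz, rx, ry, rz (ints), then nx*ny*nz doubles
SUBGRID = struct.Struct(">9i")


def write_single_subgrid_pfb(stream, values, nx, ny, nz):
    """Write a flat sequence of floats (x varying fastest) as one subgrid."""
    if len(values) != nx * ny * nz:
        raise ValueError("values does not match nx * ny * nz")
    stream.write(HEADER.pack(0.0, 0.0, 0.0, nx, ny, nz, 1.0, 1.0, 1.0, 1))
    stream.write(SUBGRID.pack(0, 0, 0, nx, ny, nz, 0, 0, 0))
    stream.write(struct.pack(f">{len(values)}d", *values))


def read_single_subgrid_pfb(stream):
    """Read back a file produced by write_single_subgrid_pfb."""
    _x, _y, _z, nx, ny, nz, _dx, _dy, _dz, n_subgrids = HEADER.unpack(
        stream.read(HEADER.size)
    )
    if n_subgrids != 1:
        raise ValueError("this sketch only handles a single subgrid")
    _ix, _iy, _iz, sx, sy, sz, _rx, _ry, _rz = SUBGRID.unpack(
        stream.read(SUBGRID.size)
    )
    count = sx * sy * sz
    values = struct.unpack(f">{count}d", stream.read(8 * count))
    # Shape is reported as (nz, ny, nx) because x varies fastest in the data.
    return {"shape": (nz, ny, nx), "values": list(values)}


# Round trip through an in-memory buffer instead of a real file.
buffer = io.BytesIO()
write_single_subgrid_pfb(buffer, [float(i) for i in range(6)], nx=3, ny=2, nz=1)
buffer.seek(0)
result = read_single_subgrid_pfb(buffer)
```

A real reader would loop over all subgrids and would most likely hand the raw bytes to numpy rather than unpacking them with `struct`.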


smithsg84

There is a minor typo, and I would like to get rid of the hard-coded temp directory path.

smithsg84 · 2022-02-09T17:46:09Z

pftools/python/parflow/tools/tests/test_pf_xarray.py

+            header['p'] = '8'
+            header['q'] = '5'
+            header['z'] = '1'
+            if not os.path.exists(TEMP_DIRECTORY):


The TEMP_DIRECTORY should be replaced with a generated value.
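One way to do this, sketched with the standard library's `tempfile` module, is shown below. This is only an illustration of the suggestion, not the change that was eventually committed. The helper `write_header_file`, the file name `example.header` and the header values are hypothetical.

```python
import os
import tempfile


def write_header_file(directory, header):
    """Hypothetical helper: dump key=value pairs into a file in `directory`."""
    path = os.path.join(directory, "example.header")
    with open(path, "w") as handle:
        for key, value in header.items():
            handle.write(f"{key}={value}\n")
    return path


def run_example():
    header = {"p": "8", "q": "5", "r": "1"}  # illustrative values only
    # The directory gets a unique generated name and is removed when the block exits,
    # so no hard-coded path or os.path.exists() check is needed.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = write_header_file(tmp_dir, header)
        existed_inside = os.path.exists(path)
    return existed_inside, os.path.exists(tmp_dir)


file_existed, dir_still_exists = run_example()
```

With pytest, the built-in `tmp_path` fixture would serve the same purpose without any manual cleanup.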

pftools/python/parflow/tools/io.py


arbennett · 2022-02-11T00:27:27Z

I noticed a few of the tests failed - I added some fixes to 2 of them, clm_build_export and clm_veg_mapping, but don't know what went wrong with the other few that failed.


smithsg84

Thanks for all the patching.

smithsg84 · 2022-02-16T17:49:04Z

Add simple and fast pure-Python readers and writers of PFB files, written by Andrew Bennett (@arbennett) and Bill Hasling (@wh3248). This eliminates the need for the external ParflowIO dependency. Also add a new backend for the xarray package that lets you open both .pfb files and .pfmetadata files directly as xarray data structures, which are very useful for data wrangling and scientific analysis.

Basic usage of the new functionality:

import parflow as pf
import xarray as xr

# Read a pfb file as a numpy array:
x = pf.read_pfb('/path/to/file.pfb')

# Read a pfb file as an xarray dataset:
ds = xr.open_dataset('/path/to/file.pfb', name='example')

# Write a pfb file along with its distfile
# (p, q, r are assumed to be the processor topology in x, y, z, set by the caller):
pf.write_pfb('/path/to/new_file.pfb', x,
             p=p, q=q, r=r, dist=True)


arbennett added 30 commits September 23, 2021 17:04

Initial commit

7da891b

Starting to set up scaffolding...

a7a9fd1

First pass on loading a dataarray

683e619

First working prototype!

1839859

Test scaffold

ce317a5

Adding initial support for pfmetadata files

5d9f669

Set read_inputs to false by default

628ebb9

Update readme with basic usage

f05fe67

Progress towards lazy loading datastructures

2d4ae57

A bit more backend scaffolding....

fbb602b

Some aspects of lazy loading working, but memory leak somewhere

9abf121

Fix copy vs view when loading data

54398ce

Add functionality for reading inputs

6c946a3

Big update expanding loading capabilities!

efff7ac

Fix lazy loading!

246e9b3

In the middle of some experimenting

1c21894

Attempt at speeding up initial loading by inferring dims and shape

95fcc7a

Update before branch

2e0e54d

Updated infrastructure for CachingFileManager

fbec687

Add default dims

4a077c8

Finally fixed lazy loading! For real!

873e659

Improve performance of loading slices with loadClipOfData'

6a8eed3

Improve shape/dimension inference to be lazy

8dc8d52

Big rigmarole on indexing, still broken for indexing with lists/arrays

df8023d

Fix off by one error on list indexing

0f8f312

Allow for chunking, also simplify 2d timeseries

5d131d2

Improve mf_dataset by disabling strict check on filetype

75f7bfa

Almost have parflowio reimplementation

e769d66

Adding version in case of rollback

c61cc64

Got subarray reading working, starting to work on migration of the backend

75af284

arbennett added 4 commits February 3, 2022 15:19

Fix corner case on time indexing

9f01bf0

Merge branch 'feature/pf_xarray_integration' of github.com:arbennett/parflow into feature/pf_xarray_integration

fabeb0a

Fix deprecation warning

3c043b5

Renaming subgrid conventions

189095c

smithsg84 requested changes Feb 9, 2022


arbennett added 2 commits February 10, 2022 17:06

Fix some writer bugs, improve tests

9195050

Fix typo

b1dad80

arbennett requested a review from smithsg84 February 10, 2022 22:07

smithsg84 previously approved these changes Feb 10, 2022


smithsg84 and others added 3 commits February 10, 2022 15:47

Merge branch 'master' into feature/pf_xarray_integration

2d4fb92

Update tests to use new IO

0d2f521

Merge branch 'feature/pf_xarray_integration' of github.com:arbennett/parflow into feature/pf_xarray_integration

0889ef3

arbennett dismissed smithsg84’s stale review via 0889ef3 February 11, 2022 00:26

arbennett added 12 commits February 11, 2022 13:20

Move np.typing -> np.ndarray for backwards compat

4d963b6

Remove ParflowIO dependency & references

a8b07ca

Allow for writing 2d arrays as pfb with implicit z=1

1455e4b

Forgot to expand dims when writing 2d array

1468c16

Remove requirements_pfb.txt from cmakelists

93a4092

Hopefully fix distfile writing.

b3a16b5

Make imageio required dependency

042e2df

Hopefully fixing writing of distfile

5080224

Write updated pfb in addition to distfile

b6f094b

Fixing tests...

a5e2e27

Fix last test, improve script call, ensure arrays are float64 before writing pfb

8c7d97b

Add conditional to cast of mask

40dc7a5

smithsg84 approved these changes Feb 16, 2022


smithsg84 merged commit 8ed7c29 into parflow:master Feb 16, 2022

arbennett mentioned this pull request Feb 23, 2022

Remove reference to parflowio, relax requirement on numba #370

Merged
