@arbennett arbennett commented Jan 20, 2022

👋 Hi Parflow maintainers! This is my first PR here. I'm Andrew Bennett, working with Laura Condon (@lecondon) & Reed Maxwell (@reedmaxwell) to develop some machine learning emulators of Parflow. While preparing the large datasets that feed the ML algorithms, reading files with the ParflowIO readers became a bottleneck. This pull request implements simple and fast pure-Python readers and writers for PFB files, written by myself and Bill Hasling (@wh3248). This eliminates the need for the external ParflowIO dependency. I have also implemented a new backend for the xarray package that lets you open both .pfb files and .pfmetadata files directly into xarray data structures. These are very useful for data wrangling and scientific analysis. This PR is a merge from my separate repo pf-xarray.

I recognize this is a rather large pull request so I'm happy to meet to discuss the finer points of the implementation. Currently there are some tests, but they are minimal and might be expanded to make sure things interoperate with the rest of the parflow ecosystem. Also, I'm not sure if I should be PRing against master - but I didn't see a staging branch like develop or next. Let me know if this should be changed!

Some basic usage of the new functionality:

import parflow as pf
import xarray as xr

# Read a pfb file as numpy array:
x = pf.read_pfb('/path/to/file.pfb')

# Read a pfb file as an xarray dataset:
ds = xr.open_dataset('/path/to/file.pfb', name='example')

# Write a pfb file with distfile
# (p, q, r must already be defined before this call):
pf.write_pfb('/path/to/new_file.pfb', x,
             p=p, q=q, r=r, dist=True)
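
For readers unfamiliar with the format, here is a minimal sketch of what a pure-Python PFB reader and writer can look like. It is not the code in this PR. It assumes the big-endian layout described in the ParFlow manual (origin, grid size, spacing, subgrid count, then one 9-integer header plus data per subgrid), and the names `write_pfb_sketch` and `read_pfb_sketch` are hypothetical. It uses only the standard library and writes a single subgrid.

```python
import io
import struct


def write_pfb_sketch(stream, values, nx, ny, nz, dx=1.0, dy=1.0, dz=1.0):
    """Write a flat sequence (x varying fastest) as a single-subgrid PFB."""
    if len(values) != nx * ny * nz:
        raise ValueError("values must contain nx * ny * nz entries")
    stream.write(struct.pack(">3d", 0.0, 0.0, 0.0))  # grid origin x, y, z
    stream.write(struct.pack(">3i", nx, ny, nz))     # full grid size
    stream.write(struct.pack(">3d", dx, dy, dz))     # grid spacing
    stream.write(struct.pack(">i", 1))               # number of subgrids
    # Subgrid header: offset, extent, refinement (placeholder value 1 here).
    stream.write(struct.pack(">9i", 0, 0, 0, nx, ny, nz, 1, 1, 1))
    stream.write(struct.pack(f">{len(values)}d", *values))


def read_pfb_sketch(stream):
    """Read a PFB stream and return its shape and a flat list of values."""
    _x0, _y0, _z0 = struct.unpack(">3d", stream.read(24))
    nx, ny, nz = struct.unpack(">3i", stream.read(12))
    _dx, _dy, _dz = struct.unpack(">3d", stream.read(24))
    (n_subgrids,) = struct.unpack(">i", stream.read(4))
    data = [0.0] * (nx * ny * nz)
    for _ in range(n_subgrids):
        ix, iy, iz, snx, sny, snz, _rx, _ry, _rz = struct.unpack(
            ">9i", stream.read(36)
        )
        count = snx * sny * snz
        block = struct.unpack(f">{count}d", stream.read(8 * count))
        pos = 0
        # Copy each x-row of the subgrid into its place in the full grid.
        for k in range(iz, iz + snz):
            for j in range(iy, iy + sny):
                start = (k * ny + j) * nx + ix
                data[start:start + snx] = block[pos:pos + snx]
                pos += snx
    return {"shape": (nz, ny, nx), "data": data}


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
original = [float(i) for i in range(4 * 3 * 2)]
write_pfb_sketch(buffer, original, nx=4, ny=3, nz=2)
buffer.seek(0)
result = read_pfb_sketch(buffer)
```

Reading each subgrid with one `struct.unpack` call avoids per-value I/O. A NumPy-based reader would likely replace the row copy loop with array slicing.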


@smithsg84 smithsg84 left a comment


Minor typo and would like to get rid of the hard-coded temp directory path.

header['p'] = '8'
header['q'] = '5'
header['z'] = '1'
if not os.path.exists(TEMP_DIRECTORY):

The TEMP_DIRECTORY should be replaced with a generated value.
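
One way to do this, sketched here under assumptions rather than taken from the PR, is to let the standard library's `tempfile` module create and clean up the directory. The helper names `write_header_file` and `run_with_generated_temp_dir`, the file name, and the header contents are hypothetical and only illustrate the pattern.

```python
import os
import tempfile


def write_header_file(directory, header):
    """Write key=value pairs to a file inside `directory` (hypothetical helper)."""
    path = os.path.join(directory, "example.header")
    with open(path, "w") as handle:
        for key, value in header.items():
            handle.write(f"{key}={value}\n")
    return path


def run_with_generated_temp_dir():
    """Use a generated temporary directory instead of a hard-coded path."""
    header = {"p": "8", "q": "5"}  # illustrative values only
    with tempfile.TemporaryDirectory() as temp_directory:
        path = write_header_file(temp_directory, header)
        existed = os.path.exists(path)
    # The directory and its contents are removed when the block exits.
    return temp_directory, existed


directory, file_existed = run_with_generated_temp_dir()
```

With pytest, the built-in `tmp_path` fixture gives the same effect without the explicit context manager.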

@arbennett arbennett requested a review from smithsg84 February 10, 2022 22:07
smithsg84 previously approved these changes Feb 10, 2022
@arbennett

I noticed a few of the tests failed - I added some fixes to 2 of them, clm_build_export and clm_veg_mapping, but don't know what went wrong with the other few that failed.


@smithsg84 smithsg84 left a comment


Thanks for all the patching.


smithsg84 commented Feb 16, 2022

Add simple and fast pure-Python readers and writers for PFB files, done by myself and Bill Hasling (@wh3248). This eliminates the need for the external ParflowIO dependency. Implemented a new backend for the xarray package that lets you open both .pfb files and .pfmetadata files directly into xarray data structures. These are very useful for data wrangling and scientific analysis.

Basic usage of the new functionality:

import parflow as pf
import xarray as xr

# Read a pfb file as numpy array:
x = pf.read_pfb('/path/to/file.pfb')

# Read a pfb file as an xarray dataset:
ds = xr.open_dataset('/path/to/file.pfb', name='example')

# Write a pfb file with distfile
# (p, q, r must already be defined before this call):
pf.write_pfb('/path/to/new_file.pfb', x,
             p=p, q=q, r=r, dist=True)

