Add the GMTSampleData class to simplify the load_sample_data and list_sample_data functions (#2342)#2342
Add the GMTSampleData class to simplify the load_sample_data and list_sample_data functions (#2342)#2342
Conversation
pygmt/datasets/samples.py
Outdated
| texts = "\n ".join( | ||
| f'- ``"{name}"``: {dataset.description}.' for name, dataset in datasets.items() | ||
| ) | ||
|
|
||
| docstrings = module_func.__doc__.format(dataset_list=texts) | ||
| module_func.__doc__ = textwrap.dedent(docstrings) | ||
| return module_func |
There was a problem hiding this comment.
This decorator inserts the list of available datasets to the load_sample_data function by substituting the {dataset_list} placeholder with the following text strings:
- ``"bathymetry"``: Table of ship bathymetric observations off Baja California
- ``"earth_relief_holes"``: Regional 20 arc-minutes Earth relief grid with holes
- ...
However, because the placeholder {dataset_list} is indented by 8 whitespaces (see Line 341), substituting the placeholder with the text strings would give:
- ``"bathymetry"``: Table of ship bathymetric observations off Baja California
- ``"earth_relief_holes"``: Regional 20 arc-minutes Earth relief grid with holes
- ...
So at line 302, I have to use "\n " (newline with 8 whitespaces) to join the list, which is not ideal but I don't have a better solution for this.
| -------- | ||
| load_sample_data : Load an example dataset from the GMT server. | ||
| """ | ||
| return {name: dataset.description for name, dataset in datasets.items()} |
There was a problem hiding this comment.
Should there be a test_list_sample_data that has a dictionary of all of the functions/descriptions to test line 324?
There was a problem hiding this comment.
Are you suggesting a test to check if the returned value is equal to:
{'bathymetry': 'Table of ship bathymetric observations off Baja California',
'earth_relief_holes': 'Regional 20 arc-minutes Earth relief grid with holes',
'fractures': 'Table of hypothetical fracture lengths and azimuths',
'hotspots': 'Table of locations, names, and symbol sizes of hotpots from Mueller et al., 1993',
'japan_quakes': 'Table of earthquakes around Japan from NOAA NGDC database',
'mars_shape': 'Table of topographic signature of the hemispheric dichotomy of Mars from Smith and Zuber (1996)',
'maunaloa_co2': 'Table of CO2 readings from Mauna Loa',
'notre_dame_topography': 'Table 5.11 in Davis: Statistics and Data Analysis in Geology',
'ocean_ridge_points': 'Table of ocean ridge points for the entire world',
'rock_compositions': 'Table of rock sample compositions',
'usgs_quakes': 'Table of global earthquakes from the USGS'}
There was a problem hiding this comment.
In 518900d, I added a test to check if the returned value is in dict type.
There was a problem hiding this comment.
I think that's fine; I was just going off of the codecov alert saying that like 324 was untested. My thought had been to use something like a list of the dataset names being equal to dict.keys() to avoid a very lengthy comparison, but I think your way works better.
Co-authored-by: Will Schlitzer <schlitzer90@gmail.com>
|
@GenericMappingTools/pygmt-maintainers It would be good if we can have one more reviewer since this PR brings some big changes. |
maxrjones
left a comment
There was a problem hiding this comment.
Thanks for working on this, it'll be a big improvement!
My preference would be to document of the list of datasets manually as the output of a docstring example for list_sample_data (in which case the doctest should catch whether it's out of date). I would prefer avoiding the decorator/docstring inject trick when possible because it makes the docstrings less interpretable for IDEs like VSCode.
Co-authored-by: Yvonne Fröhlich <94163266+yvonnefroehlich@users.noreply.github.com>
That's a good point.
I'm OK with adding the the output of >>> # use list_sample_data to see the available datasets
>>> list_sample_data()
{'bathymetry': 'Table of ship bathymetric observations off Baja California',
'earth_relief_holes': 'Regional 20 arc-minutes Earth relief grid with holes',
'fractures': 'Table of hypothetical fracture lengths and azimuths',
'hotspots': 'Table of locations, names, and symbol sizes of hotpots from Mueller et al., 1993',
'japan_quakes': 'Table of earthquakes around Japan from NOAA NGDC database',
'mars_shape': 'Table of topographic signature of the hemispheric dichotomy of Mars from Smith and Zuber (1996)',
'maunaloa_co2': 'Table of CO2 readings from Mauna Loa',
'notre_dame_topography': 'Table 5.11 in Davis: Statistics and Data Analysis in Geology',
'ocean_ridge_points': 'Table of ocean ridge points for the entire world',
'rock_compositions': 'Table of rock sample compositions',
'usgs_quakes': 'Table of global earthquakes from the USGS'}
>>> # load the sample bathymetry dataset
>>> data = load_sample_data("bathymetry") |
sounds good to me |
|
I tried to follow @maxrjones's suggestion and added a doctest for The default returned value of Then I try to use Now it looks better but pylint reports that some lines are longer than 79 characters. So I only have two options:
I choose option 2 because in my opinion it's more readable, but I have to add |
Description of proposed changes
It's easier to review this PR by looking at the changes in each commit:
load_sample_dataandlist_sample_datato the end. Required by the changes in commit 4641649GMTSampleDataclass, similar to theGMTRemoteDataused inload_remote_dataset.py. This class contains two attributes, the function to load the dataset and the description of the dataset. With theGMTSampleDataclass, theload_sample_dataandlist_sample_datafunctions are greatly simplified and we no longer need to maintain two dictionaries which have the same keys.ea9132c: Add afmt_dataset_listdecorator to automatically inserts the list of available datasets to the docstrings of theload_sample_datafunction, to address Improve the documentation of load_sample_data and list_sample_data #1774.Preview: https://pygmt-dev--2342.org.readthedocs.build/en/2342/api/generated/pygmt.datasets.load_sample_data.html
Fixes #1774 and supersedes #1814.
Reminders
make formatandmake checkto make sure the code follows the style guide.doc/api/index.rst.Slash Commands
You can write slash commands (
/command) in the first line of a comment to performspecific operations. Supported slash commands are:
/format: automatically format and lint the code/test-gmt-dev: run full tests on the latest GMT development version