API functionality revamp, text fixes, README revamp #15
Merged
…SW-CEEM-master
Merge UNSW-CEEM master into fork master - pocket rocket nemosis changes made
API revamped functionaly merge
Contributor
Author
@nick-gorman see outline of all changes above. GUI still needs to be tested (checkbox unticked)
Member
Looks good Abi, I'll merge, compile the GUI, draft a release and publish to PyPI
API functionality revamp (type inference for some API functions) and fix tests, major README changes
Initial PR made 8/5/2021. Leaving the PR open while further changes are being made; as these are incorporated into the PR, I will tick tasks off.
API (Type inference & other changes)
Initial fixes
- `parse_data_types`, which defaults to True for the API and is set to False in a GUI wrapper function (follows the structure of other functions wrapped for the GUI). This parses data types on reading the AEMO csv.
- `parse_data_types` will not parse the data types when reading existing files.
Further functionality
- `cache_compiler` option that has typical cache args from `dynamic_data_compiler` built in (e.g. `keep_csv=False`, `fformat=parquet` or `fformat=feather`, and `data_merge=False`). It will infer data types when CSVs from AEMO are downloaded and read in.
- `cache_compiler`
- `parse_data_types` will remain but will parse data types of the DataFrame regardless of file type (i.e. parsing when a cached or new file is read, not just when a new file is read). Data from csv will always be read in as strings.
- Parsing occurs after `dynamic_data_compiler` has concatenated the list of DataFrames that `_dynamic_data_fetch_loop` returns. Parsing before concatenation can lead to typed columns being reverted to object once concatenation occurs (e.g. INTERVENTION went from Int to object).
- `filter_cols` and `filter_values`: if a user provides a numeric filter value (e.g. RAISE5MIN=5), the pre-parsed DataFrame will have all columns as objects and the filter will therefore return an empty DataFrame (unless the user provides RAISE5MIN="5"). This is not expected behaviour, so parsing occurs before filtering. Datetimes can be filtered using user-provided datetime strings or datetime objects.
- GUI: `parse_data_types=False`, since the GUI uses string joins
- API: `parse_data_types=True`, since operations on columns may require them to be numeric
- `from nemosis import dynamic_data_compiler` is possible
Code readability
- `dynamic_data_compiler` and `cache_compiler` will be broken out into private functions.
Readme
- `dynamic_data_compiler` section with more advanced filtering examples.
- `cache_compiler`, with a note that it will delete csvs in a cache. However, if it detects pre-cached feather or parquet files, it will not do anything (e.g. if `cache_compiler` is run in the GUI cache, it will print that the cache has already been compiled).
Changes to tests
Other
- `data_fetch_methods.py`, `filters.py` and `test_data_fetch_methods.py` styled (flake8)
Testing
Test suite run to ensure newer changes to data_fetch_methods work. Report for tests:
Test Report.pdf
f963eb0 passed
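The ordering described under "Further functionality" (concatenate, then parse types, then filter) can be illustrated with a small pandas sketch. This is not the nemosis implementation: the helper names `read_as_strings` and `parse_types`, the `UNIT_A`/`UNIT_B` values and the two inline CSV chunks are assumptions made up for illustration; only the column names INTERVENTION and RAISE5MIN come from the description above.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two AEMO CSV chunks (real tables have many more columns).
CSV_CHUNKS = [
    "SETTLEMENTDATE,DUID,INTERVENTION,RAISE5MIN\n"
    "2018/01/01 00:05:00,UNIT_A,0,5\n"
    "2018/01/01 00:05:00,UNIT_B,0,0\n",
    "SETTLEMENTDATE,DUID,INTERVENTION,RAISE5MIN\n"
    "2018/01/01 00:10:00,UNIT_A,1,5\n",
]


def read_as_strings(text):
    """Read one CSV chunk with every column kept as a string."""
    return pd.read_csv(io.StringIO(text), dtype=str)


def parse_types(df):
    """Illustrative type parsing for the numeric and datetime columns."""
    out = df.copy()
    for col in ("INTERVENTION", "RAISE5MIN"):
        out[col] = pd.to_numeric(out[col])
    out["SETTLEMENTDATE"] = pd.to_datetime(
        out["SETTLEMENTDATE"], format="%Y/%m/%d %H:%M:%S"
    )
    return out


frames = [read_as_strings(chunk) for chunk in CSV_CHUNKS]

# One way typing can be lost: a parsed frame concatenated with an unparsed one
# no longer has a numeric INTERVENTION column.
mixed = pd.concat([parse_types(frames[0]), frames[1]], ignore_index=True)

# Concatenate first, then parse once, so every column ends up typed.
combined = pd.concat(frames, ignore_index=True)
typed = parse_types(combined)

# Filtering before parsing: the integer 5 never equals the string "5".
unparsed_hits = combined[combined["RAISE5MIN"].isin([5])]
# Filtering after parsing matches the rows a user would expect.
parsed_hits = typed[typed["RAISE5MIN"].isin([5])]

print(mixed["INTERVENTION"].dtype, typed["INTERVENTION"].dtype)
print(len(unparsed_hits), len(parsed_hits))
```

The mixed-frame case is only one plausible route to a column reverting to object; the exact trigger in the real fetch loop may differ.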
New changes tested for API (spot checks) with fresh install of Python on Ubuntu 20.04
- `dynamic_data_compiler` downloads DISPATCHLOAD csv, releases feather file. The returned DataFrame is typed (which should happen for API users), but the saved feather had columns as objects/strings.
- `cache_compiler` releases parquet/feather for DISPATCHLOAD and deletes csv in cache. The remaining file is typed. Different compression engines were passed to the write function and this worked. The file was then reloaded using `dynamic_data_compiler` and this worked, with a typed DataFrame loaded.
Quick performance test:
- `%timeit data_fetch_methods.dynamic_data_compiler("2018/01/01 00:00:00", "2018/01/01 23:55:00", "DISPATCHLOAD", './alt_data')` with precompiled feather cache
New changes tested for GUI (spot checks)