Skip to content

Hyperparameter PDFs an Densities#241

Merged
eddiebergman merged 1 commit intoautoml:masterfrom
hvarfner:pdf
Mar 21, 2022
Merged

Hyperparameter PDFs an Densities#241
eddiebergman merged 1 commit intoautoml:masterfrom
hvarfner:pdf

Conversation

@hvarfner
Copy link
Copy Markdown
Contributor

@hvarfner hvarfner commented Mar 7, 2022

PiBO Pull Request 3 remake - PDFs and Densities

Third pull request - Implements _pdf, pdf and get_max_density for each parameter type, as well as other support needed for PiBO

  • Everything from PiBO PR1 and PR2
  • _pdf and pdf
  • get_max_density
  • remove_parameter_priors with supporting static methods which copy forbiddens and conditions from the old (non-uniform) configspace to the new one
    • static method substitute_hyperparameters_in_conditions()
    • static method substitute_hyperparameters_in_forbiddens()
  • Accompanying tests
  • Documentation in user guide
  • Betaparameter pdfs now consider Unit hypercube scaling (plus skewing induced by Integer/quantization)

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 7, 2022

Codecov Report

Merging #241 (1e77c8e) into master (4cc7bde) will increase coverage by 4.64%.
The diff coverage is n/a.

❗ Current head 1e77c8e differs from pull request most recent head 4538ff4. Consider uploading reports for the commit 4538ff4 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #241      +/-   ##
==========================================
+ Coverage   62.43%   67.07%   +4.64%     
==========================================
  Files          17       17              
  Lines        1637     1637              
==========================================
+ Hits         1022     1098      +76     
+ Misses        615      539      -76     
Impacted Files Coverage Δ
ConfigSpace/read_and_write/pcs_new.py 90.93% <0.00%> (+8.47%) ⬆️
ConfigSpace/read_and_write/pcs.py 85.53% <0.00%> (+19.42%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4cc7bde...4538ff4. Read the comment docs.

@eddiebergman
Copy link
Copy Markdown
Contributor

Hi @hvarfner,

The failing test seems to be a slight floating point numerical difference. You can use assert_allclose from numpy to fix this.

I'll do a full review now :)

Copy link
Copy Markdown
Contributor

@eddiebergman eddiebergman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, I havn't checked the actual correctness of the pdf implementations but I'm assuming with your tests that it is correct :)

Parameters
----------
vector: np.ndarray
the (N, ) vector of imputs for which the probability density
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed the word imput, should be input. Seems the same in all other doc strings, if you have time, a search-replace over imputs would be nice :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@hvarfner
Copy link
Copy Markdown
Contributor Author

hvarfner commented Mar 7, 2022

Cool, thanks! I'll address everything you found later thi afternoon!

And to perhaps make you feel a bit at ease - Making sure that the pdfs were correctly computed was my absolute biggest timesink, so I'm pretty confident that they are correct!

@hvarfner
Copy link
Copy Markdown
Contributor Author

hvarfner commented Mar 8, 2022

Okay, new pass done. I hope everything is adressed once again. I did change the index.rst file too, as the added weights to the CategoricalHPs end up changing the RNG.

LMK if there's anything else that needs to be done!

@eddiebergman
Copy link
Copy Markdown
Contributor

Hi @hvarfner,

I'm happy with the changes and happy to see all the green ticks :) I would ask @mfeurer to take a look which I assume will be towards the end of the week or next. Sorry for the delay on that.

We really do appreciate all the patience you've had to get this implemented. It was a large change and dealt with fixing a few bugs that were present. Many kudos!

I think it highlights some areas of improvements required from us, notably in my view:

  • Documentation of what's expected by implemented Hyperparamters
  • Slight optimizations for sampling/neighbours
  • I would argue the hyperparameters should be split into seperate files at this point.
  • Rethink class heriarchy
  • Use properties for internal object properties, removes a lot of the if/else checking if transformed or not.

We also briefly discussed in side channels of having the distributions be a component of a class, rather than baking it into the class itself. That would make things a lot more flexible and testable.

If you have further thoughts on specific improvements, they would be greatly appreciated!

Best,
Eddie

@hvarfner
Copy link
Copy Markdown
Contributor Author

hvarfner commented Mar 8, 2022

Many thanks to you too, Eddie - and obviously Matthias as well!

Great to hear that your experience was pleasant, as mine was, too. Obviously, it took quite a lot of patience on you guys' end to get all of my quirks sorted out.

I think your bullets make sense. My impression is that there were one or two crossroads in terms of design, where I went with my gut and we had to backtrack. For example, the various domains that parameters work with internally and the intricacies of _transform and _inverse_transform.

I was rather surprised when I first saw the domain that a logged IntegerHP works with internally, and it took me a while to figure out. If I had one improvement, it would probably be that all HPs work with a strict Unit Hypercube internally (self._lower=0, lowest value in domain=0), whether they are discrete/discretized, Categorical - you name it. I do believe this would assist in implementing additional features in the future, and help substantially in testing private methods.

I would agree with the splitting and neighbors point as well.

On the question of class hierarchy, I have not given this a lot of though myself, but I feel like what's in place now is a very valid approach. Whether there's a better one out there, I'm not sure. However, I'd be happy to discuss if you want my input!

Copy link
Copy Markdown
Contributor

@mfeurer mfeurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi folks,

I just had a quick look over the implementation and it looks fine overall. I didn't have the time to look into the unit tests, though, but please go ahead with this PR without me doing so. WRT to @eddiebergman's comments:

Documentation of what's expected by implemented Hyperparamters

Yes! I promised to open an issue to spawn a discussion on the scaling/warping of continuous values, but I'm afraid I will only be able to do so in April. @hvarfner if you want to go ahead on this, please do so :) However, I'm not sure if we can/should map categoricals between 0/1, though.

Slight optimizations for sampling/neighbours

That sounds great!

I would argue the hyperparameters should be split into seperate files at this point.

Totally agree on that. Compiling now takes ages if you just change one HP

Rethink class heriarchy

Yes!

Use properties for internal object properties, removes a lot of the if/else checking if transformed or not.

Maybe, we always need to check whether such Python additions are equally fast in Cython.

We also briefly discussed in side channels of having the distributions be a component of a class, rather than baking it into the class itself. That would make things a lot more flexible and testable.

That would be an amazing option. I'm not sure if we can/should do this for all distributions (because their implementation might slow down sampling), but we should have this as a general option.


Parameters
----------
new_configspace: ConfigurationSpace
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type is incorrect.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return type? Are you referring to the list of conditions? I couldn't find anything wrong, so what are you referring to?

return size

@staticmethod
def substitute_hyperparameters_in_conditions(conditions, new_configspace):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add type hints?

return new_conditions

@staticmethod
def substitute_hyperparameters_in_forbiddens(forbiddens, new_configspace):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above (also the return type).

shrunk unit hypercube range for the Beta hyperparameter case, and tests
have been rewritten to accomodate.
@hvarfner
Copy link
Copy Markdown
Contributor Author

@eddiebergman
Now, all of Matthias' last proposed edits are incorporated, everything squashed, and all the tests should pass. Can we merge?

@eddiebergman
Copy link
Copy Markdown
Contributor

Hi @hvarfner, Seems like the tests are passing, I'll just wait till they're all done and then I'll merge it in :)

@hvarfner
Copy link
Copy Markdown
Contributor Author

@eddiebergman ping
=)

@eddiebergman
Copy link
Copy Markdown
Contributor

Good ping, my bad!

@eddiebergman eddiebergman merged commit 175f798 into automl:master Mar 21, 2022
@hvarfner
Copy link
Copy Markdown
Contributor Author

Thanks!

github-actions bot pushed a commit that referenced this pull request Mar 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants