Conversation
vanpelt
left a comment
There was a problem hiding this comment.
Overall looks great! One addition I would suggest is a quick explainer on how to add new settings. We could put this in a comment of wandb_settings or one of our doc files, but there's a lot to think about between env vars, validation, and serialization that would be good to clearly define.
| html.bind_to_run(mocked_run, "rad", "summary") | ||
| wandb.Html.seq_to_json([html], mocked_run, "rad", "summary") | ||
| assert os.path.exists(html._path) | ||
| wandb.finish() |
There was a problem hiding this comment.
I'm curious why we added all of these? The mocked_run object should be calling finish after yielding it to the test.
| run = wandb.init(entity="ign", project="ignored") | ||
| sweep_projects.append(run.project) | ||
| sweep_entities.append(run.entity) | ||
| run.finish() |
There was a problem hiding this comment.
Same here, I would think the sweep agents are automatically calling finish when the function returns...
| f.write("Not that big") | ||
| art = run.log_code() | ||
| assert sorted(art.manifest.entries.keys()) == ["test.py"] | ||
| run.finish() |
There was a problem hiding this comment.
Same here, the test_settings fixture has:
if wandb.run is not None:
wandb.run.finish()
I'm sure there was a reason for this, but if we can make it "Just Work" in tests it's going to make future devs lives a lot less error prone.
There was a problem hiding this comment.
Yea, I was trying to find the reason for flakiness and the working hypothesis was that maybe for some reason the singleton doesn't get recycled properly or something, which is of course not true since all fixtures are func-scoped.
I later discovered that it's (mainly) because of stuff leaking into os.environ and the filesystem through certain fixtures and tests, but decided to keep these calls in as it seemed more explicit and a settings fixture didn't seem like the first place I'd look for a teardown action like that.
Now that you bring this up, how about ditching it from test_settings and instead doing something like
@pytest.fixture(autouse=True)
def cleanup():
yield
if wandb.run is not None:
wandb.run.finish()What do you think? @raubitsj?
There was a problem hiding this comment.
Yeah, I was also thinking there could be leaking if an exception is thrown in the yield I wonder if we need:
@pytest.fixture(autouse=True)
def cleanup():
try:
yield
finally:
if wandb.run is not None:
wandb.run.finish()| @enum.unique | ||
| class Source(enum.IntEnum): | ||
| OVERRIDE: int = 0 | ||
| BASE: int = 1 # fixme: audit this |
There was a problem hiding this comment.
Would love to know a little more about this comment ;)
There was a problem hiding this comment.
https://youtu.be/HQ47glxcxr0?t=13 ;)
It's a general comment that we should use these more or ditch - e.g. ENTITY or WORKSPACE never come up in the code, ORG only come up in tests, etc.
| } | ||
| self.username: Any = { | ||
| "validator": lambda x: isinstance(x, str), | ||
| } |
There was a problem hiding this comment.
All of these Any annotations make me feel sad. Any reason not to make these NamedTuples or something with strong type gurarentees?
There was a problem hiding this comment.
Yeah.. All these are converted to Property class instances immediately after definition, all of which come with validators that guarantee not only types, but also anything else, specific to each setting (see e.g. https://github.com/wandb/client/pull/3083/files/cf4a6864f7e306dd5d3fc1172fb26f29fc96265f#diff-432d500a1d77db4e4a22825d98e99cb860fee78777f6540b85395c1e5ee28e94R476). Here's my reasoning for doing it this way:
- Properties encapsulate all the logic we want from them: preprocessing, validation, runtime hooks, deps, source priorities (see an example in the pr description). So there are strong type guarantees, but yes, static checkers aren't of much use in this case, action happens at runtime. (So all those
Any's are there just to make mypy happy.) But we should bring our test coverage to ~100% anyways :)- Validation allows programming by contract of sorts (long live PEP316 ^_^), which static checkers won't help with.
- I could have gone with
object.__setattr__to define things (like we're doing now) but that breaks code completion in pycharm (and i'd expect same in vscode), and it's kind of nice to have. - We could define these as Properties straight away:
self.mofl: Property = Property(
name="mofl",
validator=[lambda x: isinstance(x, (int, float)), lambda x: x > 42],
)
or something, it just takes up more space to spell it out since I decided to attach names to them. I'm afraid it might freak mypy out, although I haven't tried to be honest.
There was a problem hiding this comment.
@raubitsj has a great idea about how to get the best of both static and runtime worlds, implementing now.
[the idea is to use typed class attributes + dynamic re-definition as properties propagating type annotation as the first validator in the resulting property object; this does add a bit of pain to the dev experience, but I'm adding a note to the contrib guide + a unit test that ensures people don't screw up in the process].
| continue | ||
| if self._priority_failed(k, source=_source, override=_override): | ||
| env_prefix: str = "WANDB_" | ||
| special_env_var_names = { |
There was a problem hiding this comment.
Would be good to add a note about what makes these specials and when a dev should add something to this list.
|
Many thanks @vanpelt!
|
raubitsj
left a comment
There was a problem hiding this comment.
Awesome. Lots of great stuff here!
Fixes WB-7468
Fixes WB-7692
...and a ton of other stuff
I apologize for the size of this PR and the length of the description below :)
Description
Context: currently, settings are handled in a very complex manner, making it somewhat challenging to add new features and to maintain the existing ones.
The main purpose of this PR (which is likely only the first one in a series of PRs) is to refactor the
wandb.sdk.wandb_settings.Settingsclass to provide:Having addressed the above, I focused on making sure that nothing is broken across the client.
It's been a lot of fun working on this, and I expect even more fun from the review process :)
Implementation details
This PR adds the
Propertyclass that is utilized by theSettings.Property: a class to represent attributes (individual settings) of the Settings object.update()method is used to update the value of a setting. Enforces source priority logic when updating values.is_policyattribute determines the source priority when updating the property value. E.g. ifis_policyisTrue, the smallestSourcevalue takes precedence.Here's a basic example (for more examples, see
tests/wandb_settings_test.py)Settings: a class that usesPropertyobjects to represent configurable settings.wandb/sdk/wandb_settings.py:)Settings.update(). Direct attribute assignment is not allowed.attribute.valueif attribute is aProperty.Settings.make_static()method that we can use to replaceStaticSettings.Other stuff in this PR
CONTRIBUTING.md.'{"code_saving_enabled": true}'inmock_serverctx.Run.settings. We should also start saving them by default with each run as it would provide a lot of useful information that is not otherwise captured.Testing
Numerous (>100) tests, both settings-specific and not, were added or fixed.
pytestworker.Some notes
When designing this, I considered external libraries
attrsandpydanticas potentially useful. Neither seemed like a good option for us:pydanticgenerates scary code behind the scenes.attrsonly ensures conversion and validation at init.Status as of 1/11/2022
{“code_saving_enabled”: True}in the user settings by default, which, since now code saving setting is a policy, has higher priority than the ENV variable that the test sets. In unit tests, I made that configurable and so I dolive_mock_server.set_ctx({"code_saving_enabled": None})before running the corresponding tests — now I am working on making this possible withyea. Shouldn't take too long, but I think the review can proceed in the meantime.