Use REPL context attributes if available to avoid calling JVM methods#5132
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
_env_var_prefix = "DATABRICKS_"


def _use_env_var_if_exists(env_var, *, if_exists=lambda x: os.environ[x]):
Introduced this decorator to make it easier to preserve the existing logic for older runtime versions.
    """

    def decorator(f):
        @functools.wraps(f)
nice use of the decorator factory here. +1
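For readers following along, the decorator factory discussed above might be completed roughly as follows. This is a minimal sketch and not the code merged in the PR: only the signature, `decorator(f)`, and `functools.wraps(f)` appear in the diff, so the body of `wrapper` and the stand-in return value of `get_notebook_path` are assumptions.

```python
import functools
import os

_ENV_VAR_PREFIX = "DATABRICKS_"


def _use_env_var_if_exists(env_var, *, if_exists=lambda name: os.environ[name]):
    """Build a decorator that prefers `env_var` over calling the wrapped function.

    Assumption: when the variable is absent, the original function runs
    unchanged, which preserves the existing logic for older runtimes.
    """

    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            # Assumed behaviour: short-circuit when the variable is set.
            if env_var in os.environ:
                return if_exists(env_var)
            return f(*args, **kwargs)

        return wrapper

    return decorator


@_use_env_var_if_exists(_ENV_VAR_PREFIX + "NOTEBOOK_PATH")
def get_notebook_path():
    # Hypothetical stand-in for the JVM-backed lookup in the real module.
    return "/fallback/from/spark/context"
```

The `if_exists` hook lets a caller transform the raw string (for example, into a boolean) without changing the decorated function.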
BenWilson2 left a comment
really clever, elegant, and simplified solution.
    return None


@_use_env_var_if_exists(_env_var_prefix + "ACL_PATH_OF_ACL_ROOT")
Can we prefix these environment variables with DATABRICKS?
We probably don't need ACL_PATH_OF_ACL_ROOT, since this is used for is_in_databricks_notebook / get_notebook_id. We can rely on DATABRICKS_NOTEBOOK_ID for those.
@dbczumar Thanks for the comment! _env_var_prefix adds DATABRICKS_ or am I missing something?
Doh. Sorry - missed that.
> We probably don't need ACL_PATH_OF_ACL_ROOT, since this is used for is_in_databricks_notebook / get_notebook_id. We can rely on DATABRICKS_NOTEBOOK_ID for those.
Makes sense!
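The suggestion above could look something like the following. It is only a sketch under stated assumptions: the function names come from the review comment, while the `environ` parameter is a hypothetical addition for testability and the real functions may fall back to other lookups when the variable is missing.

```python
import os

_NOTEBOOK_ID_ENV_VAR = "DATABRICKS_NOTEBOOK_ID"


def get_notebook_id(environ=None):
    """Return the notebook ID from the environment, or None if it is unset."""
    # `environ` is a hypothetical parameter; it defaults to the process environment.
    environ = os.environ if environ is None else environ
    return environ.get(_NOTEBOOK_ID_ENV_VAR)


def is_in_databricks_notebook(environ=None):
    """Treat the presence of a notebook ID as evidence of running in a notebook."""
    return get_notebook_id(environ) is not None
```

With this shape, a separate `ACL_PATH_OF_ACL_ROOT` lookup is not needed for notebook detection.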
LGTM once #5132 (comment) is addressed. Thanks Haru!
@BenWilson2 @dbczumar Thanks for the review! I still need to update the code for dynamic metadata (e.g. command run ID).
@_use_env_var_if_exists(_ENV_VAR_PREFIX + "NOTEBOOK_PATH")
def get_notebook_path():
    """Should only be called if is_in_databricks_notebook is true"""
    path = _get_property_from_spark_context("spark.databricks.notebook.path")
does this work with ephemeral notebooks within and without jobs?
Signed-off-by: harupy <hkawamura0130@gmail.com>
30473e2 to 5ac0475
What changes are proposed in this pull request?
Use REPL context attributes if available to avoid calling JVM methods.
How is this patch tested?
Installed mlflow from this branch on Databricks and confirmed that mlflow code runs under multiprocessing.
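A rough way to see why environment-backed lookups suit this scenario: a child process inherits its parent's environment variables, so it can read metadata without touching the JVM. The sketch below uses `subprocess` as a self-contained stand-in for multiprocessing workers. The helper name `read_env_in_child` and the value used are hypothetical, and this is not the test that was run on Databricks.

```python
import os
import subprocess
import sys


def read_env_in_child(name, value):
    """Start a child Python process and return the value it sees for `name`."""
    # The child writes the variable named by its first argument to stdout.
    child_code = "import os, sys; sys.stdout.write(os.environ[sys.argv[1]])"
    # Copy the parent's environment and add the variable under test.
    env = {**os.environ, name: value}
    result = subprocess.run(
        [sys.executable, "-c", child_code, name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(read_env_in_child("DATABRICKS_NOTEBOOK_ID", "1234"))
```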
Does this PR change the documentation?
1. Check the status of the ci/circleci: build_doc check. If it's successful, proceed to the next step, otherwise fix it.
2. Click Details on the right to open the job page of CircleCI.
3. Click the Artifacts tab.
4. Click docs/build/html/index.html.
Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging
Interface
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support
Language
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages
Integrations
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
How should the PR be classified in the release notes? Choose one:
- rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
- rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- rn/feature - A new user-facing feature worth mentioning in the release notes
- rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
- rn/documentation - A user-facing documentation change worth mentioning in the release notes