Stop using Java8 because we no longer support spark < 3.0 #5234
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
def _format_exception(ex):
    return "".join(traceback.format_exception(type(ex), ex, ex.__traceback__))
Removed this unused function to run cross version tests for spark.
# This import statement adds `serializeToBundle` and `deserializeFromBundle` to `Transformer`:
# https://github.com/combust/mleap/blob/37f6f61634798118e2c2eb820ceeccf9d234b810/python/mleap/pyspark/spark_support.py#L32-L33
Change for running cross version tests for mleap.
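The code comment above describes an import that attaches methods to an existing class as a side effect. A minimal sketch of that pattern follows. `Transformer` here is a hypothetical stand-in class, not `pyspark.ml.Transformer`, and the function body is a placeholder rather than MLeap's implementation.

```python
class Transformer:
    """Hypothetical stand-in for the class that gets patched."""


def serializeToBundle(self, path):
    # Placeholder body for illustration only; it just reports what it was given.
    return f"{type(self).__name__} -> {path}"


# Assigning a function to a class attribute makes it available as a method
# on the class and on every subclass.
Transformer.serializeToBundle = serializeToBundle


class MyTransformer(Transformer):
    pass


result = MyTransformer().serializeToBundle("/tmp/model.zip")
```

Because the patch happens at import time, code that calls the added method fails with `AttributeError` if the import is removed, even though a linter may report the import as unused.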
- name: Get Java version
  id: get-java-version
  run: |
    if [ "${{ matrix.package }}" = "mleap" ]
mleap still uses spark 2.4.
Does the latest mleap version already support Spark 3.x?
https://github.com/combust/mleap#requirements says:
MLeap is built against Scala 2.11 and Java 8. Because we depend heavily on Typesafe config for MLeap, we only support Java 8 at the moment.
Until MLeap supports Spark >3.1.x, the solution in this PR will be needed for the netty serialization issue, right?
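The per-package Java selection discussed in this thread can be sketched in Python as below. `java_version_for` is a hypothetical helper. The value `"8"` for mleap follows the MLeap README quoted above; `"11"` for every other package is an assumption, since the rest of the workflow step is not shown here.

```python
def java_version_for(package):
    """Return the Java major version to install for a cross-version test package."""
    # mleap is pinned to Java 8 per its README; "11" is an assumed default.
    return "8" if package == "mleap" else "11"


versions = {pkg: java_version_for(pkg) for pkg in ["mleap", "spark"]}
```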
@pytest.mark.large
def test_spark_module_model_save_with_mleap_and_unsupported_transformer_raises_exception(
Moved this test here from tests/spark/test_spark_model_export.py since it's related to mleap.
if Version(pyspark.__version__) < Version("3.1"):
    # A workaround for this issue:
    # https://stackoverflow.com/questions/62109276/errorjava-lang-unsupportedoperationexception-for-pyspark-pandas-udf-documenta
    spark_home = (
        os.environ.get("SPARK_HOME")
        if "SPARK_HOME" in os.environ
        else os.path.dirname(pyspark.__file__)
    )
    conf_dir = os.path.join(spark_home, "conf")
    os.makedirs(conf_dir, exist_ok=True)
    with open(os.path.join(conf_dir, "spark-defaults.conf"), "w") as f:
        conf = """
spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
"""
        f.write(conf)
A workaround for an issue with spark < 3.1, java11, and pandas_udf:
https://stackoverflow.com/questions/62109276/errorjava-lang-unsupportedoperationexception-for-pyspark-pandas-udf-documenta
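For illustration, the same idea can be written as a small standalone helper that can be exercised without a Spark installation. This is a sketch, not the code in this PR: `write_spark_defaults` is a hypothetical name, and a temporary directory stands in for `SPARK_HOME`.

```python
import os
import tempfile

# The JVM flag used by the workaround.
NETTY_FLAG = "-Dio.netty.tryReflectionSetAccessible=true"


def write_spark_defaults(spark_home):
    """Write spark-defaults.conf under <spark_home>/conf and return its path."""
    conf_dir = os.path.join(spark_home, "conf")
    os.makedirs(conf_dir, exist_ok=True)
    conf_path = os.path.join(conf_dir, "spark-defaults.conf")
    lines = [
        f'spark.driver.extraJavaOptions="{NETTY_FLAG}"',
        f'spark.executor.extraJavaOptions="{NETTY_FLAG}"',
    ]
    # Mode "w" overwrites any existing spark-defaults.conf.
    with open(conf_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return conf_path


# Usage: a temporary directory plays the role of SPARK_HOME.
with tempfile.TemporaryDirectory() as fake_spark_home:
    path = write_spark_defaults(fake_spark_home)
    with open(path) as f:
        written = f.read()
```

Note that opening the file in `"w"` mode replaces any settings already present in `spark-defaults.conf`, which is acceptable in a throwaway CI environment but would need care on a shared installation.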
+1 to writing to the spark-defaults config file. Good solution for this.
BenWilson2 left a comment
LGTM. Nice config solution.
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
What changes are proposed in this pull request?
Stop using java8 because we no longer support spark < 3.0.
How is this patch tested?
Existing checks
Does this PR change the documentation?
Check the status of the ci/circleci: build_doc check. If it's successful, proceed to the next step, otherwise fix it.
Click Details on the right to open the job page of CircleCI.
Click the Artifacts tab.
Click docs/build/html/index.html.
Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/projects: MLproject format, project running backends
area/scoring: MLflow Model server, model deployment tools, Spark UDFs
area/server-infra: MLflow Tracking server backend
area/tracking: Tracking Service, tracking client APIs, autologging
Interface
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
Language
language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages
Integrations
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations
How should the PR be classified in the release notes? Choose one:
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes