Build R documentation using Docker#5188
Conversation
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
e03ad6d to
f240662
Compare
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
| FROM rocker/r-ver:4.1.2 | ||
|
|
||
| WORKDIR /mlflow/mlflow/R/mlflow | ||
| RUN apt-get update -y | ||
| RUN apt-get install git wget libxml2-dev libgit2-dev -y | ||
| RUN TEMP_DEB=$(mktemp) && \ | ||
| wget --directory-prefix $TEMP_DEB https://github.com/jgm/pandoc/releases/download/2.16.2/pandoc-2.16.2-1-amd64.deb && \ | ||
| dpkg --install $(find $TEMP_DEB -name '*.deb') && \ | ||
| rm -rf $TEMP_DEB | ||
| COPY DESCRIPTION . | ||
| COPY .install-deps.R . | ||
| RUN Rscript -e 'source(".install-deps.R", echo = TRUE)' |
There was a problem hiding this comment.
Renamed Docker.build to Docker.dev and add packages required to build the docs.
| +-----------+---------------------------------------------------------+ | ||
| | Argument | Description | | ||
| +===========+=========================================================+ | ||
| | ``model`` | The loaded MLflow model flavor. | | ||
| +-----------+---------------------------------------------------------+ | ||
| | ``data`` | A data frame to perform scoring. | | ||
| +-----------+---------------------------------------------------------+ | ||
| | ``...`` | Optional additional arguments passed to underlying | | ||
| | | predict methods. | | ||
| +-----------+---------------------------------------------------------+ |
There was a problem hiding this comment.
I updated the version of pandoc to 2.16.2 from 2.7.1, which added some changes in this file.
| - run: | ||
| name: Install dev tools | ||
| environment: | ||
| DEBIAN_FRONTEND: noninteractive | ||
| command: | | ||
| apt-get update --yes | ||
| apt-get install sudo git wget curl jq software-properties-common apt-transport-https --yes | ||
|
|
||
| - run: | ||
| name: Install Java | ||
| command: | | ||
| sudo apt-get install default-jdk --yes | ||
| java -version | ||
|
|
||
| - run: | ||
| name: Install R | ||
| command: | | ||
| # How To Install R on Ubuntu 20.04: | ||
| # https://www.digitalocean.com/community/tutorials/how-to-install-r-on-ubuntu-20-04 | ||
|
|
||
| sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 | ||
| sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/' | ||
| sudo apt update -y | ||
| sudo apt install -y r-base libssl-dev libxml2-dev libcurl4-openssl-dev | ||
| R --version | ||
|
|
||
| - run: | ||
| name: Install pandoc | ||
| command: | | ||
| # Install a recent version of pandoc | ||
| TEMP_DEB="$(mktemp)" | ||
| wget -O "$TEMP_DEB" 'https://github.com/jgm/pandoc/releases/download/2.7.2/pandoc-2.7.2-1-amd64.deb' | ||
| sudo dpkg -i "$TEMP_DEB" | ||
| rm -f "$TEMP_DEB" | ||
|
|
||
| - run: | ||
| name: Dump R dependencies | ||
| working_directory: mlflow/R/mlflow | ||
| command: | | ||
| Rscript .dump-r-dependencies.R | ||
|
|
||
| - restore_cache: | ||
| keys: | ||
| - r-cache-{{ checksum "mlflow/R/mlflow/R-version" }} | ||
|
|
There was a problem hiding this comment.
Removed these steps because the docker image contains all required tools.
| docker: | ||
| - image: ubuntu:20.04 | ||
| machine: | ||
| image: ubuntu-2004:202111-01 |
There was a problem hiding this comment.
To use docker volume mounting, we need to use the machine executor:
https://support.circleci.com/hc/en-us/articles/360007324514-How-can-I-use-Docker-volume-mounting-on-CircleCI-
| # Note that this commit is equivalent to commit 6b48255 of Rd2md master | ||
| # (https://github.com/quantsch/Rd2md/tree/6b4825579a2df8a22898316d93729384f92a756b) | ||
| # with a single extra commit to fix rendering of \link tags between methods in R documentation. | ||
| devtools::install_git("https://github.com/smurching/Rd2md", ref = "mlflow-patches") |
There was a problem hiding this comment.
Shall we use commit hash instead of branch which may be changed ?
There was a problem hiding this comment.
Good point, but we usually don't update this branch.
| RUN apt-get install git wget libxml2-dev libgit2-dev -y | ||
| RUN TEMP_DEB=$(mktemp) && \ | ||
| wget --directory-prefix $TEMP_DEB https://github.com/jgm/pandoc/releases/download/2.16.2/pandoc-2.16.2-1-amd64.deb && \ | ||
| dpkg --install $(find $TEMP_DEB -name '*.deb') && \ |
There was a problem hiding this comment.
Can we install pandoc by RUN apt-get install pandoc -y ?
There was a problem hiding this comment.
The version installed by pandoc is 2.5 which is too old and contains a bug.
What changes are proposed in this pull request?
Build the R documentation using docker to make it easier to run and test + simplify the CircleCI workflow.
How is this patch tested?
Does this PR change the documentation?
ci/circleci: build_doccheck. If it's successful, proceed to thenext step, otherwise fix it.
Detailson the right to open the job page of CircleCI.Artifactstab.docs/build/html/index.html.Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact loggingarea/build: Build and test infrastructure for MLflowarea/docs: MLflow documentation pagesarea/examples: Example codearea/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models: MLmodel format, model serialization/deserialization, flavorsarea/projects: MLproject format, project running backendsarea/scoring: MLflow Model server, model deployment tools, Spark UDFsarea/server-infra: MLflow Tracking server backendarea/tracking: Tracking Service, tracking client APIs, autologgingInterface
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows: Windows supportLanguage
language/r: R APIs and clientslanguage/java: Java APIs and clientslanguage/new: Proposals for new client languagesIntegrations
integrations/azure: Azure and Azure ML integrationsintegrations/sagemaker: SageMaker integrationsintegrations/databricks: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/breaking-change- The PR will be mentioned in the "Breaking Changes" sectionrn/none- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/feature- A new user-facing feature worth mentioning in the release notesrn/bug-fix- A user-facing bug fix worth mentioning in the release notesrn/documentation- A user-facing documentation change worth mentioning in the release notes