Skip to content

Workflow: publish Argus, Colossus and QueryNode as docker images#4780

Merged
mnaamani merged 30 commits intomasterfrom
publish-docker-apps
Jul 6, 2023
Merged

Workflow: publish Argus, Colossus and QueryNode as docker images#4780
mnaamani merged 30 commits intomasterfrom
publish-docker-apps

Conversation

@mnaamani
Copy link
Copy Markdown
Member

@mnaamani mnaamani commented Jun 2, 2023

Running a storage-node and distributor-node currently requires

  • cloning the monorepo
  • building packages
  • start docker services with local repo mounted as a bind volume

This is not practical for many reasons, and although is great for development cycle it is not a proper production setup. So in this PR we are making the two nodes build-able into proper independent docker images that will get published to docker hub giving operators more flexibility and deployment options in cloud, k8 clusters etc.

  • proper docker images
  • fixes found when running nodes as docker images.
    • missing direct dependencies
    • TS compiler was not copying openapi spec files into compiled lib/ destination (why is not needed when running from the monorepo?) Is it because of not installing dev dependencies?
  • Update to use node18 -> better performance/security fixes
  • query-node generated schema.graphql is now placed in the storage-node, distributor-node package so we don't require building query-node ahead of the docker images.

The images can be tested by running integration tests with the docker-compose-no-bind-volumes.yml docker compose file.

Successful run:
https://github.com/Joystream/joystream/actions/runs/5437536353

you should be able to pull the images:

docker pull joystream/storage-node:3.4.0
docker pull joystream/distributor-node:1.2.0
docker pull joystream/query-node:1.2.1

I've prepared a standalone repo for running query-node from built docker image. https://github.com/mnaamani/query-node-standalone

@vercel
Copy link
Copy Markdown

vercel bot commented Jun 2, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Updated (UTC)
pioneer-testnet ⬜️ Ignored (Inspect) Jul 5, 2023 3:57pm

@mnaamani mnaamani marked this pull request as ready for review June 5, 2023 05:31
@mnaamani mnaamani added colossus argus Argus distributor node labels Jun 5, 2023
@mnaamani mnaamani force-pushed the publish-docker-apps branch 2 times, most recently from 53e1765 to 5cd1bb4 Compare June 5, 2023 14:43
@mnaamani mnaamani force-pushed the publish-docker-apps branch from 5cd1bb4 to ce431f9 Compare June 5, 2023 16:56
@mnaamani mnaamani removed the request for review from Lezek123 June 9, 2023 07:04
@mnaamani mnaamani marked this pull request as draft June 9, 2023 08:41
@mnaamani
Copy link
Copy Markdown
Member Author

mnaamani commented Jun 9, 2023

After updating from master, storage-node as docker image is failing:

2023-06-09 12:40:58 2023-06-09 08:40:58:4058 error: Server error: TypeError: Cannot read properties of undefined (reading 'sync')
2023-06-09 12:40:58 error Command failed with exit code 105.

Looking into it.

@mnaamani
Copy link
Copy Markdown
Member Author

mnaamani commented Jun 9, 2023

Resolved it by reverting to older verision of mkdirp in storage-node. As there was a conflict with with version used in query-node/codegen. I think we really need to break out the query node into its own repo as it wrecks havock with package dependencies in the yarn worksapce.

@mnaamani mnaamani marked this pull request as ready for review June 9, 2023 17:56
@mnaamani
Copy link
Copy Markdown
Member Author

For some reason in the integration test (content-directory scenario) with the docker images built the CLI is failing to connect to distributor

2023-06-20T06:39:40.309Z integration-tests:job:manage channels and videos through CLI flow 0 failed:
Error: connect ECONNREFUSED 127.0.0.1:3334
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 3334,
  config: {
    url: 'http://127.0.0.1:3334/api/v1/assets/0',
.
.

Tested locally without issues. Something specific to github runner maybe?

@mnaamani
Copy link
Copy Markdown
Member Author

Tested locally without issues. Something specific to github runner maybe?

Again it was dependency issue on building monorepo before docker images.

 (node:28) [MODULE_NOT_FOUND] Error Plugin: @joystream/distributor-cli: Cannot find module '/joystream/node_modules/@joystream/storage-node-client/lib/index.js'. Please verify that the package.json has a valid "main" entry

Will add build step for the storage-node-client in distributor dockerfile.

@zeeshanakram3 zeeshanakram3 added the origo Origo Release label Jun 23, 2023
Copy link
Copy Markdown
Contributor

@zeeshanakram3 zeeshanakram3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query-node generated schema.graphql is now placed in the storage-node, distributor-node package so we don't require building query-node ahead of the docker images.

  1. I see that schema.graphql has also been committed to tests/network-tests package, since integration test has direct dependency on building & running the QN, so maybe the tests/network-tests/src/graphql/schema.graphql can be gitignored?
  2. I think same can be said about cli package too, as during local development it has dependency on QN and most of the commands require running QN, so I think it does not make much sense to commit auto-generated code to the repo and it generally also aligns with the decision we made a while ago to un-commit auto generated files (#4640)

TS compiler was not copying openapi spec files into compiled lib/ destination (why is not needed when running from the monorepo?) Is it because of not installing dev dependencies?

First, tsc does not copy non .js/.d.ts files into the outDir directory defined in tsconfig.json (only exception to this is I think .json files).

Now, when running the storage/distributor nodes locally or with docker-compose's bind mount setup, the OPENAPI_SPEC_PATH resolves to the api-spec file in the src directory and not in the lib directory so the nodes bootstrap successfully.

This is because, by default oclif tries to run a typescript project directly using ts-node without compiling to js artifacts. From the oclif documentation:

We generate CLIs and plugins that use ts-node to make it fast to run the TypeScript code without a compilation step. You won't have to mess around with build configuration using oclif.

So why it does not work in the new docker steup? My hunch is that if oclif can't find ts-node in node_modules it would fallback to running the js artifacts In the /lib dir. And since ts-node is defined as devDependency in the package.json, and the docker image is only installing the dependencies (specified by --production flag to the yarn install), and not the dev dependencies.

Update to use node18 -> better performance/security fixes

Shouldn't we also update the compatible node versions in package.json file

joystream/package.json

Lines 67 to 74 in d61783a

"engines": {
"node": ">=14.0.0",
"yarn": "^1.22.15"
},
"volta": {
"node": "14.18.0",
"yarn": "1.22.15"
}

RUN yarn workspace @joystream/types build
RUN yarn workspace @joystream/metadata-protobuf build
RUN yarn workspace storage-node build
RUN find . -name "node_modules" -type d -prune
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this command is added here? It seems like a debug relic. This command is looking for all directories named "node_modules" starting from the current directory and printing it on console, the same command also exists in distributor-node.Dockerfile

Here is the response of the above command

./query-node/generated/graphql-server/node_modules
./query-node/node_modules
./query-node/codegen/node_modules
./query-node/mappings/node_modules
./devops/prettier-config/node_modules
./devops/eslint-config/node_modules
./types/node_modules
./node_modules
./tests/network-tests/node_modules
./utils/migration-scripts/node_modules
./utils/api-scripts/node_modules
./cli/node_modules
./metadata-protobuf/node_modules
./joystreamjs/node_modules
./storage-node/node_modules
./storage-node/client/node_modules
./distributor-node/node_modules
./distributor-node/client/node_modules

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, perhaps the output was supposed to be piped to xargs to apply rm -fr?
I'll remove it.

COPY ./package.json package.json
COPY ./tsconfig.json ./tsconfig.json

RUN yarn --frozen-lockfile
Copy link
Copy Markdown
Contributor

@zeeshanakram3 zeeshanakram3 Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way Dockerfile is structured, the npm dependencies will be installed even if only the code changed and, no new dependencies were added in the package.json files (in short all the subsequent layers needs to be build again after COPY ./storage-node storage-node even if there was only one liner change in storage-node package). Ideally the following build layers should be placed at the top of the Dockerfile,

COPY ./yarn.lock yarn.lock
COPY ./types/package.json types/package.json
COPY ./metadata-protobuf/package.json metadata-protobuf/package.json
COPY ./storage-node/package.json storage-node/package.json
COPY ./package.json package.json
RUN yarn --frozen-lockfile

so that cached node modules layer can be used whenever there was no change in yarn.lock/package.json files, this approach considerably improves the build time of the images.

I think similar docker build optimisations can also be made in distributor-node.Dockerfile

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I'll apply your suggestions and test it out.

@mnaamani mnaamani changed the title Add workflow to publish argus and colossus docker images Workflow: publish Argus, Colossus and QueryNode as docker images Jul 3, 2023
@mnaamani mnaamani requested a review from zeeshanakram3 July 3, 2023 12:26
Copy link
Copy Markdown
Contributor

@zeeshanakram3 zeeshanakram3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just one thing, did you see the last comment from #4780 (review)

Shouldn't we also update the compatible node versions in package.json file
...

WDYT?

@mnaamani
Copy link
Copy Markdown
Member Author

mnaamani commented Jul 5, 2023

Looks good. Just one thing, did you see the last comment from #4780 (review)

Shouldn't we also update the compatible node versions in package.json file
...

WDYT?

Sorry I totally missed that.
I had actually prepared a separate PR to update for that #4778 which I was planning to finish after this.

Thanks for explaining the oclif/ts-node matter about the openapi spec file not being copied 👍

Regarding checking in the schema.graphql I take your point. I'll revert for cli and integration tests.

Copy link
Copy Markdown
Contributor

@zeeshanakram3 zeeshanakram3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mnaamani mnaamani merged commit aee116a into master Jul 6, 2023
@mnaamani mnaamani deleted the publish-docker-apps branch July 19, 2023 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

argus Argus distributor node colossus origo Origo Release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants