
646 lab 604 custom job selector label #655

Merged
thetechnocrat-dev merged 9 commits into main from
646-lab-604-custom-job-selector-label
Sep 19, 2023

@thetechnocrat-dev thetechnocrat-dev commented Sep 19, 2023

Changes

  1. Add a selector override flag to plex init and plex run
  2. Update Plex to use Bacalhau version 1.0.3

Details

Pass the -s flag to plex run or plex init to hand selectors directly to Bacalhau. The value given replaces the default selector, owner=labdao.

Example

go run main.go init -s owner=josh -t QmZWYpZXsrbtzvBCHngh4YEgME5djnV5EedyTpc8DrK7k2 -i '{"protein": ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"], "small_molecule": ["QmViB4EnKX6PXd77WYSgMDMq9ZMX14peu3ZNoVV1LHUZwS/ZINC000019632618.sdf"]}' --scatteringMethod=dotProduct --autoRun=true -a test

This produces the error

error submitting Bacalhau job: not enough nodes to run job. requested: 1, available: 0

because no node carries the selector owner=josh.
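
The failure above follows from how an equality selector filters nodes: a job is only eligible for nodes whose labels contain every requested key=value pair. A minimal sketch of that idea follows. The helpers parseSelector and countMatching and the node labels are hypothetical and are not taken from plex or Bacalhau, whose real selectors support more operators than plain equality.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSelector turns "k1=v1,k2=v2" into a map of required labels.
// Hypothetical helper: it only understands plain equality terms.
func parseSelector(s string) (map[string]string, error) {
	req := map[string]string{}
	if strings.TrimSpace(s) == "" {
		return req, nil
	}
	for _, part := range strings.Split(s, ",") {
		key, value, ok := strings.Cut(strings.TrimSpace(part), "=")
		if !ok || strings.TrimSpace(key) == "" {
			return nil, fmt.Errorf("invalid selector term %q", part)
		}
		req[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return req, nil
}

// countMatching returns how many nodes carry every required label.
func countMatching(nodes []map[string]string, req map[string]string) int {
	n := 0
	for _, labels := range nodes {
		match := true
		for k, v := range req {
			if got, ok := labels[k]; !ok || got != v {
				match = false
				break
			}
		}
		if match {
			n++
		}
	}
	return n
}

func main() {
	// Made-up cluster: every node is labelled owner=labdao.
	nodes := []map[string]string{
		{"owner": "labdao"},
		{"owner": "labdao", "gpu": "true"},
	}
	for _, sel := range []string{"owner=labdao", "owner=josh"} {
		req, err := parseSelector(sel)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %d matching node(s)\n", sel, countMatching(nodes, req))
	}
}
```

In this made-up cluster owner=josh matches no node, which is the situation the "available: 0" error reports.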

@thetechnocrat-dev thetechnocrat-dev linked an issue Sep 19, 2023 that may be closed by this pull request

linear bot commented Sep 19, 2023

LAB-604 custom job selector label

Be able to put custom labels on plex jobs that pass through to Bacalhau as a label. This will be useful as an instance-type selector for benchmarking.

vercel bot commented Sep 19, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name | Status | Updated (UTC)
docs | Ignored | Sep 19, 2023 2:42pm

@thetechnocrat-dev thetechnocrat-dev temporarily deployed to ci September 19, 2023 01:05 — with GitHub Actions Inactive
Contributor

alabdao commented Sep 19, 2023

If I read the code correctly, when --selector is specified as label=value, it is appended to the hard-coded owner=labdao (or owner=labdaostaging for staging), making the final selector label=value,owner=labdao. If that's correct, it would cause problems whenever we want jobs submitted via plex to run on nodes that don't have owner=labdao set. In short, we don't want any hard-coded selector applied.
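
To make the difference concrete, here is a small sketch contrasting the append behaviour described in this comment with one in which a user-supplied selector replaces the default. The function names appendSelector and replaceSelector are hypothetical and do not come from the plex code base; only the default value owner=labdao is taken from the PR description.

```go
package main

import "fmt"

// defaultSelector is the default named in the PR description.
const defaultSelector = "owner=labdao"

// appendSelector models the behaviour the review comment describes:
// the user's selector is joined onto the hard-coded default.
func appendSelector(user string) string {
	if user == "" {
		return defaultSelector
	}
	return user + "," + defaultSelector
}

// replaceSelector models the alternative: a user-supplied selector
// replaces the default, which is used only when nothing is passed.
func replaceSelector(user string) string {
	if user == "" {
		return defaultSelector
	}
	return user
}

func main() {
	for _, user := range []string{"", "label=value"} {
		fmt.Printf("user=%q append=%q replace=%q\n",
			user, appendSelector(user), replaceSelector(user))
	}
}
```

With appending, label=value still requires owner=labdao on the node, so a node labelled only label=value can never be chosen. With replacement, the default applies only when no selector is given.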

@thetechnocrat-dev thetechnocrat-dev temporarily deployed to ci September 19, 2023 13:58 — with GitHub Actions Inactive
Contributor

acashmoney commented Sep 19, 2023

With selector flag

The expected error case works: passing a node selector that no node matches fails as intended.

go run main.go init -s owner=aakaash -t QmZWYpZXsrbtzvBCHngh4YEgME5djnV5EedyTpc8DrK7k2 -i '{"protein": ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"], "small_molecule": ["QmViB4EnKX6PXd77WYSgMDMq9ZMX14peu3ZNoVV1LHUZwS/ZINC000019632618.sdf"]}' --scatteringMethod=dotProduct --autoRun=true -a test
Plex version (v0.10.4) up to date.
Pinned IO JSON CID: QmWhpHj9qWwHETVwCJtK1o5wXvxF8h7oVBxQQKpp4HkDXK
Created working directory: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/fa205bcb-4fff-4875-9a04-4b17cbdc51e4
Initialized IO file at: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/fa205bcb-4fff-4875-9a04-4b17cbdc51e4/io.json
Processing IO Entries
Starting to process IO entry 0 
Error processing IO entry 0 
error submitting Bacalhau job: not enough nodes to run job. requested: 1, available: 0
Finished processing, results written to /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/fa205bcb-4fff-4875-9a04-4b17cbdc51e4/io.json
Completed IO JSON CID: QmQjaFxq4TpynLh7iLDjAAT96pzSL9T9S9TNd15sJqxSVY

Without selector flag

@thetechnocrat-dev However, I got a context deadline exceeded error in 2 of 5 runs without the selector flag.

go run main.go init -t QmZWYpZXsrbtzvBCHngh4YEgME5djnV5EedyTpc8DrK7k2 -i '{"protein": ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"], "small_molecule": ["QmViB4EnKX6PXd77WYSgMDMq9ZMX14peu3ZNoVV1LHUZwS/ZINC000019632618.sdf"]}' --scatteringMethod=dotProduct --autoRun=true -a test
Plex version (v0.10.4) up to date.
Pinned IO JSON CID: QmWhpHj9qWwHETVwCJtK1o5wXvxF8h7oVBxQQKpp4HkDXK
Created working directory: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/abad6223-2cca-4aaf-827e-2f1154f5669b
Initialized IO file at: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/abad6223-2cca-4aaf-827e-2f1154f5669b/io.json
Processing IO Entries
Starting to process IO entry 0 
Job running...
Bacalhau job id: 422883ea-6648-4f77-98a0-691775da9d0f 
////_🌱___////
Computed default go-libp2p Resource Manager limits based on:
    - 'Swarm.ResourceMgr.MaxMemory': "8.6 GB"
    - 'Swarm.ResourceMgr.MaxFileDescriptors': 30720

Theses can be inspected with 'ipfs swarm resources'.

Error processing IO entry 0 
error downloading Bacalhau results: failed to get ipfs cid 'QmdDEZZzjQNEFXeb3dGeS8ta2tbV5HuLMBgN6qsKqdF9fU': context deadline exceeded
Finished processing, results written to /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/abad6223-2cca-4aaf-827e-2f1154f5669b/io.json
Completed IO JSON CID: QmbpaTejgpAxXmu9qyhNMcbEnCHVDANt8oQuBBJ8ecAVqp

The CID QmdDEZZzjQNEFXeb3dGeS8ta2tbV5HuLMBgN6qsKqdF9fU does exist, but it appears DownloadBacalhauResults isn't able to fetch it. It's unclear to me why this works only 3 out of 5 times.

@thetechnocrat-dev thetechnocrat-dev temporarily deployed to ci September 19, 2023 14:42 — with GitHub Actions Inactive
Author

thetechnocrat-dev commented Sep 19, 2023

@acashmoney thanks for the extra testing. I changed the timeout to 5 minutes, which is what we had before and is also Bacalhau's default.

@alabdao, good catch. I forgot that I coded the selector to append last week. It now overrides the default.

Contributor

@acashmoney acashmoney left a comment

let's goo

With selector flag

go run main.go init -s owner=aakaash -t QmZWYpZXsrbtzvBCHngh4YEgME5djnV5EedyTpc8DrK7k2 -i '{"protein": ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"], "small_molecule": ["QmViB4EnKX6PXd77WYSgMDMq9ZMX14peu3ZNoVV1LHUZwS/ZINC000019632618.sdf"]}' --scatteringMethod=dotProduct --autoRun=true -a test
Plex version (v0.10.4) up to date.
Pinned IO JSON CID: QmWhpHj9qWwHETVwCJtK1o5wXvxF8h7oVBxQQKpp4HkDXK
Created working directory: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/fa3c55e8-9bf9-4d31-925f-9f8671a38315
Initialized IO file at: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/fa3c55e8-9bf9-4d31-925f-9f8671a38315/io.json
Processing IO Entries
Starting to process IO entry 0 
Error processing IO entry 0 
error submitting Bacalhau job: not enough nodes to run job. requested: 1, available: 0
Finished processing, results written to /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/fa3c55e8-9bf9-4d31-925f-9f8671a38315/io.json
Completed IO JSON CID: QmQjaFxq4TpynLh7iLDjAAT96pzSL9T9S9TNd15sJqxSVY

Without selector flag

go run main.go init -t QmZWYpZXsrbtzvBCHngh4YEgME5djnV5EedyTpc8DrK7k2 -i '{"protein": ["QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd/7n9g.pdb"], "small_molecule": ["QmViB4EnKX6PXd77WYSgMDMq9ZMX14peu3ZNoVV1LHUZwS/ZINC000019632618.sdf"]}' --scatteringMethod=dotProduct --autoRun=true -a test
Plex version (v0.10.4) up to date.
Pinned IO JSON CID: QmWhpHj9qWwHETVwCJtK1o5wXvxF8h7oVBxQQKpp4HkDXK
Created working directory: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/02a36ee7-ecd7-4aa6-87e7-ec11a6cf98fc
Initialized IO file at: /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/02a36ee7-ecd7-4aa6-87e7-ec11a6cf98fc/io.json
Processing IO Entries
Starting to process IO entry 0 
Job running...
Bacalhau job id: 3ac8d6a5-e14e-4b1d-9657-f1d677bdfbd2 
////_🌱___////
Computed default go-libp2p Resource Manager limits based on:
    - 'Swarm.ResourceMgr.MaxMemory': "8.6 GB"
    - 'Swarm.ResourceMgr.MaxFileDescriptors': 30720

Theses can be inspected with 'ipfs swarm resources'.

Success processing IO entry 0 
Finished processing, results written to /Users/aakaash/Desktop/code/OPENLAB/plex/jobs/02a36ee7-ecd7-4aa6-87e7-ec11a6cf98fc/io.json
Completed IO JSON CID: QmbW5AqM8jdF8dcu2xe16g23Ykirc8TSjWWqESK4XNfbxe

@thetechnocrat-dev thetechnocrat-dev merged commit 10656ed into main Sep 19, 2023
@thetechnocrat-dev thetechnocrat-dev deleted the 646-lab-604-custom-job-selector-label branch September 19, 2023 15:13
