Skip to content

package: Add support for different query job types:#452

Merged
haiqi96 merged 12 commits into
y-scope:mainfrom
haiqi96:AddJobTypes
Jun 24, 2024
Merged

package: Add support for different query job types:#452
haiqi96 merged 12 commits into
y-scope:mainfrom
haiqi96:AddJobTypes

Conversation

@haiqi96

@haiqi96 haiqi96 commented Jun 19, 2024

Copy link
Copy Markdown
Contributor

Description

This PR is a preparation to let query-scheduler supports more than one type of jobs.
The following changes are made:

  1. Add a new "type" column to the job metadata table.
  2. Adds an abstract base class for QueryJob and QueryConfig, which will be used as the base class for different types of job
  3. Refactor the query-scheduler to handle jobs differently based on their type
  4. Refactor some helper functions in the query-scheduler so they can handle generic QueryJob
  5. Updated job submission logic to add job's type

Validation performed

Built the package
Verified that search and reducer works properly from both commandline and webui

Comment thread components/job-orchestration/job_orchestration/scheduler/constants.py Outdated
SELECT {QUERY_JOBS_TABLE_NAME}.id as job_id
FROM {QUERY_JOBS_TABLE_NAME}
WHERE {QUERY_JOBS_TABLE_NAME}.status={QueryJobStatus.CANCELLING}
AND {QUERY_JOBS_TABLE_NAME}.type={QueryJobType.SEARCH}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assume we only cancel search job to simplify the implementation

@junhaoliao junhaoliao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WebUI changes look good to me.

@haiqi96 haiqi96 changed the title Add job types package: Refactor query job in the search-cluster to support different job types. Jun 19, 2024
@haiqi96 haiqi96 marked this pull request as ready for review June 19, 2024 21:10


class SearchJob(QueryJob):
search_config: SearchConfig

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename this to just "job_config"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think search_config is fine.

@kirkrodrigues kirkrodrigues requested a review from gibber9809 June 24, 2024 13:03
SELECT {QUERY_JOBS_TABLE_NAME}.id as job_id
FROM {QUERY_JOBS_TABLE_NAME}
WHERE {QUERY_JOBS_TABLE_NAME}.status={QueryJobStatus.CANCELLING}
AND {QUERY_JOBS_TABLE_NAME}.type={QueryJobType.SEARCH}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the naming of QueryJobType.SEARCH gets confusing for me here. When I see this my first thought is wondering about what happens to aggregation jobs, but actually QueryJobType.SEARCH represents search AND aggregation jobs.

I think we might want to eventually split these out into different job types, but to avoid holding you up here how about we just rename this to QueryJobType.SEARCH_OR_AGGREGATION which is ugly but more clear at least.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it also require renaming other places in the query_scheduler?
for example, "pending_query_job" will become "pending_query_or_aggregation_job", and "search_config" becomes "search_or_aggregation_config"?

I personally prefer to not rename those, but only change QueryJobType.SEARCH to QueryJobType.SEARCH_OR_AGGREGATION.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's just rename the SEARCH_OR_AGGREGATION and leave everything else. The aggregations are a query of sorts, so I think its fine.



class SearchJob(QueryJob):
search_config: SearchConfig

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think search_config is fine.

@kirkrodrigues kirkrodrigues left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small questions / suggestions.

Comment thread components/job-orchestration/job_orchestration/scheduler/job_config.py Outdated
Comment thread components/job-orchestration/job_orchestration/scheduler/scheduler_data.py Outdated
Comment thread components/job-orchestration/job_orchestration/scheduler/scheduler_data.py Outdated
Comment thread components/webui/imports/api/search/constants.js
Comment thread components/webui/imports/api/search/server/QueryJobsDbManager.js Outdated
Comment thread components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py Outdated
Comment thread components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py Outdated

@kirkrodrigues kirkrodrigues left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title & body, how about:

package: Add support for different query job types:

  • Add abstract classes for QueryJobType and QueryJobConfig.
  • Submit and track job type in the search jobs table.
  • Guard search-job specific handling in the scheduler.

@haiqi96 haiqi96 changed the title package: Refactor query job in the search-cluster to support different job types. package: Add support for different query job types Jun 24, 2024

@gibber9809 gibber9809 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@haiqi96 haiqi96 changed the title package: Add support for different query job types package: Add support for different query job types: Jun 24, 2024
@haiqi96 haiqi96 merged commit 94509a8 into y-scope:main Jun 24, 2024
@haiqi96 haiqi96 deleted the AddJobTypes branch June 28, 2024 14:43
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024
- Add abstract classes for QueryJobType and QueryJobConfig.
- Submit and track job type in the search jobs table.
- Guard search-job specific handling in the scheduler.
junhaoliao pushed a commit to junhaoliao/clp that referenced this pull request May 17, 2026
- Add abstract classes for QueryJobType and QueryJobConfig.
- Submit and track job type in the search jobs table.
- Guard search-job specific handling in the scheduler.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants