Open
Labels
- P1: Issue that should be fixed within a few weeks
- community-backlog
- enhancement: Request for new feature and/or capability
- serve: Ray Serve Related Issue
- usability
Description
Sub-issue of #55833
Add the skeleton for the serve status -v command and its backend API.
This should define the CLI entrypoint, JSON schema, and text renderer with placeholder values.
Later work will populate these sections with real data.
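As a rough illustration of the three pieces named above (CLI entrypoint, JSON schema, text renderer), here is a minimal sketch. Everything in it is an assumption made for illustration: the `VerboseStatus` dataclass, its section names (taken loosely from the "deployment/application/external metrics" wording in the use case), the `--format` flag, and the `PLACEHOLDER` marker are hypothetical and are not Ray's actual schema or CLI. It also uses `argparse` instead of `click` so that it runs with the standard library only.

```python
import argparse
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional

# Hypothetical marker; later work would replace it with real data.
PLACEHOLDER = "<not yet implemented>"


def _placeholder_section() -> Dict[str, Any]:
    """Return a section body that holds only a placeholder value."""
    return {"summary": PLACEHOLDER}


@dataclass
class VerboseStatus:
    """Hypothetical schema for verbose status output (not Ray's real schema)."""

    schema_version: int = 1
    applications: Dict[str, Any] = field(default_factory=_placeholder_section)
    deployments: Dict[str, Any] = field(default_factory=_placeholder_section)
    external_metrics: Dict[str, Any] = field(default_factory=_placeholder_section)


def render_json(status: VerboseStatus) -> str:
    """Serialize the status object to JSON."""
    return json.dumps(asdict(status), indent=2)


def render_text(status: VerboseStatus) -> str:
    """Render each dict-valued section as a small indented text block."""
    lines: List[str] = []
    for section, body in asdict(status).items():
        if not isinstance(body, dict):
            continue  # skip scalar fields such as schema_version
        lines.append(f"== {section} ==")
        for key, value in body.items():
            lines.append(f"  {key}: {value}")
    return "\n".join(lines)


def main(argv: Optional[List[str]] = None) -> str:
    """Parse arguments and return the rendered output as a string."""
    parser = argparse.ArgumentParser(prog="serve-status-sketch")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("--format", choices=["text", "json"], default="text")
    args = parser.parse_args(argv)

    if not args.verbose:
        return "Pass -v to see the verbose sections."

    status = VerboseStatus()
    return render_json(status) if args.format == "json" else render_text(status)
```

Calling `print(main(["-v"]))` prints the text form, and `print(main(["-v", "--format", "json"]))` prints the JSON form. Keeping the schema in one dataclass means both renderers read from the same structure, so contributors filling in a section later would only need to touch that section's field.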
For reference, the implementation will build on:
- Event summarizer:
  ray/python/ray/autoscaler/_private/event_summarizer.py
  Lines 57 to 64 in 73e1131

      def summary(self) -> List[str]:
          """Generate the aggregated log summary of all added events."""
          with self.lock:
              out = []
              for template, quantity in self.events_by_key.items():
                  out.append(template.format(quantity))
              out.extend(self.messages_to_send)
          return out

- CLI integration:
  ray/python/ray/serve/scripts.py
  Lines 658 to 729 in 044f910

      @cli.command(
          short_help="Get the current status of all Serve applications on the cluster.",
          help=(
              "Prints status information about all applications on the cluster.\n\n"
              "An application may be:\n\n"
              "- NOT_STARTED: the application does not exist.\n"
              "- DEPLOYING: the deployments in the application are still deploying and "
              "haven't reached the target number of replicas.\n"
              "- RUNNING: all deployments are healthy.\n"
              "- DEPLOY_FAILED: the application failed to deploy or reach a running state.\n"
              "- DELETING: the application is being deleted, and the deployments in the "
              "application are being teared down.\n\n"
              "The deployments within each application may be:\n\n"
              "- HEALTHY: all replicas are acting normally and passing their health checks.\n"
              "- UNHEALTHY: at least one replica is not acting normally and may not be "
              "passing its health check.\n"
              "- UPDATING: the deployment is updating."
          ),
      )
      @click.option(
          "--address",
          "-a",
          default=os.environ.get("RAY_DASHBOARD_ADDRESS", "http://localhost:8265"),
          required=False,
          type=str,
          help=RAY_DASHBOARD_ADDRESS_HELP_STR,
      )
      @click.option(
          "--name",
          "-n",
          default=None,
          required=False,
          type=str,
          help=(
              "Name of an application. If set, this will display only the status of the "
              "specified application."
          ),
      )
      def status(address: str, name: Optional[str]):
          warn_if_agent_address_set()

          serve_details = ServeInstanceDetails(
              **ServeSubmissionClient(address).get_serve_details()
          )
          status = asdict(serve_details._get_status())

          # Ensure multi-line strings in app_status is dumped/printed correctly
          yaml.SafeDumper.add_representer(str, str_presenter)

          if name is None:
              print(
                  yaml.safe_dump(
                      # Ensure exception traceback in app_status are printed correctly
                      process_dict_for_yaml_dump(status),
                      default_flow_style=False,
                      sort_keys=False,
                  ),
                  end="",
              )
          else:
              if name not in serve_details.applications:
                  cli_logger.error(f'Application "{name}" does not exist.')
              else:
                  print(
                      yaml.safe_dump(
                          # Ensure exception tracebacks in app_status are printed correctly
                          process_dict_for_yaml_dump(status["applications"][name]),
                          default_flow_style=False,
                          sort_keys=False,
                      ),
                      end="",
                  )
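To make the aggregation pattern in the event summarizer excerpt easier to follow, here is a simplified, self-contained stand-in. `MiniSummarizer`, `add`, and `add_message` are hypothetical names chosen for this sketch and should not be read as the real `EventSummarizer` API; only the `summary` method mirrors the quoted lines, and summing quantities per template is an assumed aggregation rule.

```python
import threading
from typing import Dict, List


class MiniSummarizer:
    """Simplified stand-in for a template-keyed event summarizer (hypothetical)."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        # Maps a format template such as "Started {} replicas" to a running total.
        self.events_by_key: Dict[str, int] = {}
        # Free-form messages that are passed through unchanged.
        self.messages_to_send: List[str] = []

    def add(self, template: str, quantity: int) -> None:
        """Accumulate a quantity under its template (assumed rule: sum)."""
        with self.lock:
            self.events_by_key[template] = (
                self.events_by_key.get(template, 0) + quantity
            )

    def add_message(self, message: str) -> None:
        """Queue a literal message for the next summary."""
        with self.lock:
            self.messages_to_send.append(message)

    def summary(self) -> List[str]:
        """Format each template with its total, then append literal messages."""
        with self.lock:
            out = []
            for template, quantity in self.events_by_key.items():
                out.append(template.format(quantity))
            out.extend(self.messages_to_send)
        return out
```

For example, adding `"Started {} replicas"` with quantities 1 and 2 and then calling `summary()` yields a single line for that template with the combined total, followed by any literal messages. A verbose status renderer could collect per-section lines in a similar way before printing them.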
Use case
Provides a visible CLI entrypoint early on, so other contributors can hook in deployment/application/external metrics without conflicts.