When using gemini-3.1-pro-high, the CLI occasionally receives a 503 Service Unavailable error with the reason MODEL_CAPACITY_EXHAUSTED. Although the response includes a RetryInfo detail with a retryDelay (e.g., 9s or 10s), the CLI does not appear to honor the delay and fails the request instead of retrying.
Example error:
Trajectory ID: 89ecc052-02de-47e6-a2f9-62fd1f051cd4
Error: HTTP 503 Service Unavailable
{
  "error": {
    "code": 503,
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "domain": "cloudcode-pa.googleapis.com",
        "metadata": {
          "model": "gemini-3.1-pro-high"
        },
        "reason": "MODEL_CAPACITY_EXHAUSTED"
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "9s"
      }
    ],
    "message": "No capacity available for model gemini-3.1-pro-high on the server",
    "status": "UNAVAILABLE"
  }
}
The CLI should automatically parse the RetryInfo and sleep for the specified duration before retrying the request, since quota remains and the server is only temporarily out of capacity.
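To make the expected behavior concrete, here is a minimal sketch in Python of honoring the server's retryDelay. It is not the CLI's actual code: the function names (parse_retry_delay, call_with_retry), the send callback, the attempt limit, and the 60-second cap are all assumptions chosen for illustration. The only things taken from the error above are the RetryInfo type URL and the "9s" duration format.

```python
import re
import time
from typing import Callable, Optional, Tuple

# Type URL of the RetryInfo detail, as it appears in the error response above.
RETRY_INFO_TYPE = "type.googleapis.com/google.rpc.RetryInfo"


def parse_retry_delay(error_body: dict) -> Optional[float]:
    """Return the RetryInfo delay in seconds, or None if absent or malformed."""
    details = error_body.get("error", {}).get("details", [])
    for detail in details:
        if detail.get("@type") != RETRY_INFO_TYPE:
            continue
        # Durations are encoded as decimal seconds with an "s" suffix,
        # e.g. "9s" or "1.5s".
        match = re.fullmatch(r"(\d+(?:\.\d+)?)s", str(detail.get("retryDelay", "")))
        if match:
            return float(match.group(1))
    return None


def call_with_retry(
    send: Callable[[], Tuple[int, dict]],   # hypothetical: returns (status, body)
    max_attempts: int = 3,                  # assumed limit, not from the report
    max_delay: float = 60.0,                # assumed cap on a single sleep
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send(), sleeping for the server-provided delay after each 503."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 503:
            return status, body
        delay = parse_retry_delay(body)
        # Give up if the server gave no usable delay or attempts are exhausted.
        if delay is None or attempt == max_attempts:
            return status, body
        sleep(min(delay, max_delay))
    return status, body
```

Capping the sleep and the number of attempts keeps a misbehaving server from stalling the CLI indefinitely; the specific limits here are placeholders.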