-
Notifications
You must be signed in to change notification settings - Fork 199
Description
Description:
It is difficult to troubleshoot problems when they materialize as 500 error without any readable description in the logs.
Let's make it possible to get more information about the backend/upstream by default, in the following priority with E2E tests to prove it.
As code paths might not be the same, let's ensure these paths are tested in standalone mode
Most important is a happy path E2E that uses ollama for a simple chat completion. This can use qwen2.5:0.5b or similar tiny model. Again, this must run outside k8s in standalone mode.
Then with that in, we can ensure common failures exist and are reported with relevant HTTP status.
HTTP 404 - test for example an invalid path on the backend, like someone did http://localhost:11434/chat/completions instead of /v1/chat/completions. This ensures an invalid path isn't presented as a 500
HTTP 502 - test for invalid hostname, or incorrect port on localhost
HTTP 504 - make a test server who just hangs. the configuration can tighten the deadline so that the test fails quickly (e.g. 2s)
HTTP 500 - make a test server who receives the request, but sends back a 500. Ensure what was sent back is visible
These tests will ensure we don't use 500 for everything. Even better if we have some sort of high level description in the logs when a 5xx occurs, if there is not HTTP payload to describe it.
Relevant Links:
It was assumed that #724 was a problem in upstream, but when there's 500 it is hard to know why there would be. Especially it is an extremely simple request.