[packages] Very frequent timeouts on FTL device tests

We're seeing all of the `Linux_android android_device_tests_shard_* master` tests time out on most PRs during roughly regular business hours this week. In the logs, the job is usually sitting waiting to run the first FTL test, and when I click through to the dashboard link in the logs, FTL shows that test as "pending". In at least a couple of cases I monitored, the test did eventually run, but only after the LUCI job had already timed out (90 minutes).

I first saw it last Friday, March 13th, but when I re-ran a couple of failing PRs over the weekend they were fine, so I thought it was a temporary hiccup on the FTL side. However, all this week I've been seeing it happen very frequently. The tests only rarely manage to run in time, except when I trigger the re-run early in the morning (Eastern) or late in the evening, when it seems much more likely to work. I only have anecdotes on this, not hard data, though I would expect the FTL team could get general data on usage/backlog for these devices.

In the past when this has happened, it's usually been because the device we are using fell out of the high-capacity pool and we needed to update it. @gmackall verified last Friday that these devices are still listed as `DEVICE_CAPACITY_HIGH`, but that doesn't seem to be true in practice at the moment (unless there's some kind of demand spike for these devices).
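For anyone who wants to re-check the capacity listing, here is a rough sketch of filtering a device catalog dump for `DEVICE_CAPACITY_HIGH` entries. It assumes the catalog has been exported to JSON with `models`, `perVersionInfo`, `versionId`, and `deviceCapacity` fields; that shape is my assumption about the export, and the model IDs in the sample are made up, not real devices. Note that this only reports what the catalog claims, which is exactly the part that doesn't seem to match what we see in practice.

```python
import json


def high_capacity_models(catalog, capacity="DEVICE_CAPACITY_HIGH"):
    """Return sorted (model_id, version_id) pairs listed with the given capacity."""
    matches = []
    for model in catalog.get("models", []):
        # Capacity is assumed to be reported per OS version, not per model.
        for info in model.get("perVersionInfo", []):
            if info.get("deviceCapacity") == capacity:
                matches.append((model["id"], info["versionId"]))
    return sorted(matches)


# Hypothetical sample data, NOT real catalog output.
SAMPLE_CATALOG = json.loads("""
{
  "models": [
    {
      "id": "example_phone_a",
      "perVersionInfo": [
        {"versionId": "33", "deviceCapacity": "DEVICE_CAPACITY_HIGH"},
        {"versionId": "34", "deviceCapacity": "DEVICE_CAPACITY_MEDIUM"}
      ]
    },
    {
      "id": "example_phone_b",
      "perVersionInfo": [
        {"versionId": "34", "deviceCapacity": "DEVICE_CAPACITY_HIGH"}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for model_id, version_id in high_capacity_models(SAMPLE_CATALOG):
        print(f"{model_id} (API {version_id})")
```

The same list would also be a starting point for picking a replacement device if we end up switching.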

We likely need to either try switching to a different, newer high-capacity device, or talk with the FTL folks about what might be going on.

[packages] Very frequent timeouts on FTL device tests #183935
