
Ensure docker-compose runs down process if any error setting up service or agents #3207

Merged
mrodm merged 5 commits into elastic:main from mrodm:failures-multiple-deployers
Jan 20, 2026

Conversation

@mrodm mrodm commented Jan 14, 2026

Ensures that the docker-compose down process is executed whenever the SetUp process of a docker-compose-based Service or Agent deployer fails.

Fixes elastic/integrations#16941

Author's Checklist

  • Validate with services running via terraform and docker-compose scenarios.
  • Validate that container logs are written to a file.
  • Take into account running system tests by steps (e.g. --setup, --no-provision or --tear-down flags).

How to test this package locally

elastic-package stack up -v -d

cd test/packages/parallel/terraform_local

# Force a failure in the terraform code so the container exits with code 1
sed -i 's/locals/locals\nlocals/g' data_stream/local/_dev/deploy/tf/main.tf

## Output diff:
#--- test/packages/parallel/terraform_local/data_stream/local/_dev/deploy/tf/main.tf
#+++ test/packages/parallel/terraform_local/data_stream/local/_dev/deploy/tf/main.tf
#@@ -4,6 +4,7 @@ resource "local_file" "log" {
#   file_permission = "0777"
# }
# 
#+locals
# locals {
#   items ={
#     environment  = "${var.ENVIRONMENT}"

# Running this test with the latest elastic-package release is not going to tear down the service

elastic-package-v0.118.0 test system -v --data-streams local -v

# Docker networks are still present
docker network ls |grep elastic-package-service
# Example output
# e7b338653298   elastic-package-service-64294_default   bridge    local

# Delete the previous docker network, for this example it would be:
docker network rm elastic-package-service-64294_default

# Running this test with the code from this PR runs the tear down process
go run github.com/elastic/elastic-package test system --data-streams local -v

# Check docker networks, no `elastic-package-service-*` should be present:
docker network ls | grep elastic-package-service


elastic-package stack down -v

@mrodm mrodm self-assigned this Jan 14, 2026
mrodm commented Jan 14, 2026

test integrations

@elastic-vault-github-plugin-prod

Created or updated PR in integrations repository to test this version. Check elastic/integrations#16956

mrodm commented Jan 14, 2026

Checking the Buildkite output of PR elastic/integrations#16956

2026/01/14 01:57:45 DEBUG Container uploader (98de2b2f8aa4) status: exited (exit code: 1)
2026/01/14 01:57:45  INFO Write container logs to file: /opt/buildkite-agent/builds/bk-agent-prod-gcp-1768354945608483478/elastic/integrations/build/container-logs/gcs-mock-service-1768355865394902252.log
2026/01/14 01:57:45 DEBUG removing agent...
2026/01/14 01:57:47  INFO Tearing down agent...
2026/01/14 01:57:47 DEBUG tearing down agent using Docker Compose runner
2026/01/14 01:57:48  INFO Write container logs to file: /opt/buildkite-agent/builds/bk-agent-prod-gcp-1768354945608483478/elastic/integrations/build/container-logs/elastic-agent-1768355868031981941.log
 Container elastic-package-agent-netskope-transaction-67348-elastic-agent-1  Stopping
 Container elastic-package-agent-netskope-transaction-67348-elastic-agent-1  Stopped
 Container elastic-package-agent-netskope-transaction-67348-elastic-agent-1  Removing
 Container elastic-package-agent-netskope-transaction-67348-elastic-agent-1  Removed
 Network elastic-package-agent-netskope-transaction-67348_default  Removing
 Network elastic-package-agent-netskope-transaction-67348_default  Removed
2026/01/14 01:57:48 DEBUG deleting test policies...
2026/01/14 13:38:04 DEBUG Container uploader (b2091147a858) status: exited (exit code: 1)
2026/01/14 13:38:04 DEBUG tearing down service using Docker Compose runner
 Container elastic-package-service-626066727-uploader-1  Stopping
 Container elastic-package-service-626066727-gcs-mock-service-1  Stopping
 Container elastic-package-service-626066727-uploader-1  Stopped
 Container elastic-package-service-626066727-azure-blob-storage-emulator-1  Stopping
 Container elastic-package-service-626066727-gcs-mock-service-1  Stopped
 Container elastic-package-service-626066727-azure-blob-storage-emulator-1  Stopped
2026/01/14 13:38:05  INFO Write container logs to file: /opt/buildkite-agent/builds/bk-agent-prod-gcp-1768396933189470609/elastic/integrations/build/container-logs/gcs-mock-service-1768397885256789478.log
 Container elastic-package-service-626066727-gcs-mock-service-1  Stopping
 Container elastic-package-service-626066727-uploader-1  Stopping
 Container elastic-package-service-626066727-gcs-mock-service-1  Stopped
 Container elastic-package-service-626066727-gcs-mock-service-1  Removing
 Container elastic-package-service-626066727-uploader-1  Stopped
 Container elastic-package-service-626066727-uploader-1  Removing
 Container elastic-package-service-626066727-uploader-1  Removed
 Container elastic-package-service-626066727-azure-blob-storage-emulator-1  Stopping
 Container elastic-package-service-626066727-azure-blob-storage-emulator-1  Stopped
 Container elastic-package-service-626066727-azure-blob-storage-emulator-1  Removing
 Container elastic-package-service-626066727-azure-blob-storage-emulator-1  Removed
 Container elastic-package-service-626066727-gcs-mock-service-1  Removed
 Network elastic-package-service-626066727_default  Removing
 Network elastic-package-service-626066727_default  Removed
2026/01/14 13:38:05 DEBUG removing agent...
2026/01/14 13:38:06  INFO Tearing down agent...
2026/01/14 13:38:06 DEBUG tearing down agent using Docker Compose runner
2026/01/14 13:38:06  INFO Write container logs to file: /opt/buildkite-agent/builds/bk-agent-prod-gcp-1768396933189470609/elastic/integrations/build/container-logs/elastic-agent-1768397886593943528.log
 Container elastic-package-agent-netskope-transaction-48220-elastic-agent-1  Stopping
 Container elastic-package-agent-netskope-transaction-48220-elastic-agent-1  Stopped
 Container elastic-package-agent-netskope-transaction-48220-elastic-agent-1  Removing
 Container elastic-package-agent-netskope-transaction-48220-elastic-agent-1  Removed
 Network elastic-package-agent-netskope-transaction-48220_default  Removing
 Network elastic-package-agent-netskope-transaction-48220_default  Removed
2026/01/14 13:38:07 DEBUG deleting test policies...

mrodm commented Jan 14, 2026

/test

@mrodm mrodm changed the title from "Add defer function to ensure docker-compose runs down process if any error" to "Ensure docker-compose runs down process if any error setting up service or agents" Jan 14, 2026
}
logger.Debug("Tearing down service due to setup error")
// Update svcInfo with the latest info before tearing down
agent.agentInfo = agentInfo
Contributor Author

The teardown process requires agent.agentInfo.Name. Depending on where the error was raised, that value may not be set yet in agent.agentInfo, but it is already set in the agentInfo variable.

The same applies to the service deployer changes.
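
A minimal sketch of the point above, assuming simplified stand-in types rather than the real elastic-package deployers: `agentInfo`, `deployedAgent`, `setUp` and the `tornDown` field are all hypothetical names used only for illustration. It shows why the deferred function copies the local info into the struct before calling TearDown: the error can be raised after the name was computed locally but before it was stored.

```go
package main

import (
	"errors"
	"fmt"
)

// agentInfo is a hypothetical stand-in for the deployer's info struct.
type agentInfo struct {
	Name string
}

// deployedAgent is a hypothetical stand-in for the deployed agent handle.
type deployedAgent struct {
	agentInfo agentInfo
	tornDown  []string // names passed to TearDown, recorded for illustration
}

// TearDown needs agentInfo.Name to know which resources to bring down.
func (a *deployedAgent) TearDown() error {
	if a.agentInfo.Name == "" {
		return errors.New("agent name is not set")
	}
	a.tornDown = append(a.tornDown, a.agentInfo.Name)
	return nil
}

// setUp simulates a SetUp that can fail after the name is computed locally
// but before it is stored in the agent struct.
func setUp(agent *deployedAgent, failEarly bool) (err error) {
	info := agentInfo{}

	// The named return value lets the deferred function see the final error.
	defer func() {
		if err == nil {
			return
		}
		// Copy the latest info so TearDown can find what was created.
		agent.agentInfo = info
		if tdErr := agent.TearDown(); tdErr != nil {
			err = errors.Join(err, tdErr)
		}
	}()

	info.Name = "docker-custom-agent"
	if failEarly {
		return errors.New("simulated failure bringing up containers")
	}

	agent.agentInfo = info
	return nil
}

func main() {
	agent := &deployedAgent{}
	err := setUp(agent, true)
	fmt.Println("setup error:", err)
	fmt.Println("torn down:", agent.tornDown)
}
```

Without the `agent.agentInfo = info` assignment inside the deferred function, TearDown in this sketch would fail with "agent name is not set" on the early-failure path. Joining the teardown error into the returned error is a choice made for this example only.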

@mrodm mrodm marked this pull request as ready for review January 16, 2026 14:52
@mrodm mrodm requested a review from a team January 16, 2026 14:52

@jsoriano jsoriano left a comment

Thanks!

if err != nil {
    processAgentContainerLogs(ctx, p, compose.CommandOptions{
        Env: opts.Env,
    }, agentName)
Member

Don't we need to process logs?

Contributor Author

The agent.TearDown and service.TearDown methods also process the container logs and write them to a file, so those log files are still created with this change.

For instance:
https://buildkite.com/elastic/integrations/builds/36477#019bc194-6d6a-4f7c-9dc9-9fc250eef030/L1819-L1827

2026/01/15 12:31:23 DEBUG Tearing down service due to setup error
2026/01/15 12:31:23 DEBUG tearing down service using Docker Compose runner
 Container elastic-package-service-25ac85762-uploader-1  Stopping
 Container elastic-package-service-25ac85762-gcs-mock-service-1  Stopping
 Container elastic-package-service-25ac85762-uploader-1  Stopped
 Container elastic-package-service-25ac85762-azure-blob-storage-emulator-1  Stopping
 Container elastic-package-service-25ac85762-gcs-mock-service-1  Stopped
 Container elastic-package-service-25ac85762-azure-blob-storage-emulator-1  Stopped
2026/01/15 12:31:24  INFO Write container logs to file: /opt/buildkite-agent/builds/bk-agent-prod-gcp-1768479315190015396/elastic/integrations/build/container-logs/gcs-mock-service-1768480284898784890.log 

Just realized that if the --no-provision flag is set, those logs are not processed and they could be interesting too. I'll work on adding those.

mrodm commented Jan 19, 2026

/test

@elasticmachine
Collaborator

💚 Build Succeeded

cc @mrodm

Comment on lines +146 to +154
defer func() {
    if err == nil {
        return
    }
    logger.Debug("Tearing down service due to setup error")
    // Update svcInfo with the latest info before tearing down
    service.svcInfo = svcInfo
    service.TearDown(context.WithoutCancel(ctx))
}()
Contributor Author

The Terraform service deployer does not support running system tests by steps (--setup, --no-provision, --tear-down flags):

if options.RunSetup || options.RunTearDown || options.RunTestsOnly {
    return nil, errors.New("terraform service deployer not supported to run by steps")
}
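
The deferred teardown in the commented lines passes `context.WithoutCancel(ctx)` (available since Go 1.21) rather than `ctx` itself. A small sketch of why that may matter, assuming a hypothetical `tearDown` helper that refuses to work on a cancelled context (this is not the real deployer code): if setup failed because the context was cancelled, cleanup run with the same context could be skipped as well.

```go
package main

import (
	"context"
	"fmt"
)

// tearDown is a hypothetical cleanup step that, like many context-aware
// operations, gives up immediately when its context is already cancelled.
func tearDown(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return fmt.Errorf("teardown skipped: %w", err)
	}
	return nil
}

// demo runs tearDown twice against an already cancelled context: once
// directly and once through context.WithoutCancel.
func demo() (withCancelled, withDetached error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate an interrupted or timed-out setup

	withCancelled = tearDown(ctx)
	// WithoutCancel keeps the parent's values but drops its cancellation.
	withDetached = tearDown(context.WithoutCancel(ctx))
	return withCancelled, withDetached
}

func main() {
	withCancelled, withDetached := demo()
	fmt.Println("cancelled ctx:", withCancelled)
	fmt.Println("detached ctx: ", withDetached)
}
```

The detached context never reports cancellation, so in real code a separate timeout around the cleanup may still be worth considering.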

@mrodm mrodm requested a review from jsoriano January 19, 2026 13:57
@mrodm mrodm merged commit 483f2f9 into elastic:main Jan 20, 2026
3 checks passed

Development

Successfully merging this pull request may close these issues.

CI failures related to not being available a IPV4 address pool among the defaults running system tests

3 participants