Skip to content

Testing refactor for Jenkins#1098

Merged
melekes merged 8 commits intodevelopfrom
jenkins
Mar 8, 2018
Merged

Testing refactor for Jenkins#1098
melekes merged 8 commits intodevelopfrom
jenkins

Conversation

@zramsay
Copy link
Contributor

@zramsay zramsay commented Jan 11, 2018

What used to be make test_integration and a rabbit hole of bash scripts is now:

make test_cover_fast
make test_cover_slow
make test_apps
make test_persistence
make test_p2p

where each test is called by a seperate Jenkins job.

Some optimizations remain but require feedback.

# run the test container on the local network
docker run -t --rm \
-v "$GOPATH/src/github.com/tendermint/tendermint/test/p2p/:/go/src/github.com/tendermint/tendermint/test/p2p" \
-v "$GOPATH/src/github.com/tendermint/tendermint/test/p2p/:/go/src/github.com/tendermint/tendermint/test/p2p":Z \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: :Z is needed for SELinux to allow writes to the mounted volume on host

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added


if [ $1 == "slow" ]; then
for pkg in ${PKGS[@]}; do
echo $pkg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what determiend these distinctions. types runs way slower on my machine than node or rpc/client.

Is this really helpful ? Why not just run them all together as make test ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i split them up because of non-deterministic failures in the "slow" tests. Sure, another package might be slower or faster; these three "slow" packages just really mean "takes relatively long to run and has been known to reliably fail non-deterministically

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's much easier to debug failures when they're in different jobs. The "slow" tests also generally pump out tons of logs. IMO the goal should be to, yeah, move the slow tests into the fast section, once they're sped up and more deterministic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great thanks for explanation!

$DUMMY_CMD &> "dummy_${name}.log" &
fi
echo "Starting persistent dummy and tendermint: $name"
$DUMMY_CMD &> "dummy_${name}.log" &
Copy link
Contributor

@ebuchman ebuchman Jan 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will I be able to see this log on jenkins ? same for ones below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

working on it

Copy link
Contributor Author

@zramsay zramsay Jan 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, they'll be archived as aritfacts, e.g., here's the coverage.txt that can be copied by a job that sends it to codecov. https://ci.interblock.io/job/05.Test.Tendermint.FeatureBranches.Cover.Fast/

# tendermint should already have exited when it hits the fail index
# but kill -9 for good measure
kill_procs
# kill_procs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? we dont want to leave zombie processes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -62,16 +62,47 @@ draw_deps:
########################################
### Testing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we get a single makefile target to run all the tests so we can easily run the full suite locally? need to ensure its easy to see logs and debug when doing this so we can figure out issues ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@zramsay zramsay force-pushed the jenkins branch 2 times, most recently from 69e985e to 9a32495 Compare January 19, 2018 15:43
Makefile Outdated
# requires 'tester' the image from above
bash test/p2p/test.sh tester

test_all:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally liked the old test_integrations target

Makefile Outdated
vagrant ssh -c 'make test_all'

### go tests without docker
test_unit:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should still keep test target. Maybe alias it to test_unit by default?

test: test_unit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's just keep test to conform with:

# All libs should define `make test` and `make get_vendor_deps`

Makefile Outdated
BUILD_FLAGS = -ldflags "-X github.com/tendermint/tendermint/version.GitCommit=`git rev-parse --short HEAD`"

all: check build test install
all: check build install test_integrations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Jae wanted to run tests before install.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the tests need the binary available on $PATH

@ebuchman
Copy link
Contributor

Can we get this rebased on develop and squash the commits please

@ebuchman
Copy link
Contributor

ebuchman commented Feb 5, 2018

Let's get this passing on circle so we can merge @zramsay

@zramsay
Copy link
Contributor Author

zramsay commented Feb 5, 2018

proto: no slice oenc for *reflect.rtype = []*reflect.rtype
proto: no encoder for Validators []types.Validator [GetProperties]
proto: no slice oenc for *reflect.rtype = []*reflect.rtype
proto: no encoder for Validators []types.Validator [GetProperties]
proto: no coders for types.Header
proto: no encoder for Header types.Header [GetProperties]
proto: no slice oenc for *reflect.rtype = []*reflect.rtype
proto: no encoder for ByzantineValidators []types.Evidence [GetProperties]
proto: no coders for types.Header
proto: no encoder for Header types.Header [GetProperties]
proto: no slice oenc for *reflect.rtype = []*reflect.rtype
proto: no encoder for ByzantineValidators []types.Evidence [GetProperties]
proto: no slice oenc for *reflect.rtype = []*reflect.rtype
proto: no encoder for ValidatorUpdates []types.Validator [GetProperties]
proto: no slice oenc for *reflect.rtype = []*reflect.rtype
proto: no encoder for ValidatorUpdates []types.Validator [GetProperties]

@zramsay zramsay closed this Feb 6, 2018
@zramsay zramsay reopened this Feb 6, 2018
@ebuchman
Copy link
Contributor

ebuchman commented Mar 2, 2018

Can we get this rebased and cleaned up? Maybe @xla can take a look too? Not really Jenkins relevant (though Jenkins uses some of these targets) but would be nice to get merged. Also the unit tests run much faster now so we dont need the fast/slow targets. `

@zramsay
Copy link
Contributor Author

zramsay commented Mar 2, 2018

the fast/slow is gone

@zramsay zramsay requested a review from xla March 2, 2018 07:23
Copy link
Contributor

@xla xla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ebuchman @zramsay This is green like bean and significantly faster than what we had before for the CI runs.

👍 :octocat: :shipit: 🍡

docker run --name run_persistence -t tester bash test/persist/test_failure_indices.sh

# TODO undockerize
# bash test/persist/test_failure_indices.sh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one reason the tests are faster here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol

circle.yml Outdated
- cd "$PROJECT_PATH" && mv test_integrations.log "${CIRCLE_ARTIFACTS}"
- cd "$PROJECT_PATH" && bash <(curl -s https://codecov.io/bash) -f coverage.txt
- cd "$PROJECT_PATH" && mv coverage.txt "${CIRCLE_ARTIFACTS}"
- cd "$PROJECT_PATH" && cp test/logs/messages "${CIRCLE_ARTIFACTS}/docker.log"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need this log

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docker log usually isn't very telling, what information would we lose?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docker log usually isn't very telling, what information would we lose?

huh? all docker containers are logging msgs to docker.log through syslog

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh the eg. p2p integration tests spin up 4 containers and do a bunch of tests with them. The only way to see what's happening on those nodes is the docker.log (we don't pipe it out to circle stdout directly because its so verbose)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely, but the majority of those lines are the output of our binaries which we captured with the test_integration.log.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, clarified now. Would be nice to have a better split of those logs for easier tracing when doing forensics.

@zramsay
Copy link
Contributor Author

zramsay commented Mar 5, 2018

so are we just missing the logs from that last comment?

BUILD_FLAGS = -ldflags "-X github.com/tendermint/tendermint/version.GitCommit=`git rev-parse --short=8 HEAD`"

all: check build test install
all: check build test_integrations install
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really want to run test_integrations here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this was requested (somewhere, can't remember)

docker run --name run_persistence -t tester bash test/persist/test_failure_indices.sh

# TODO undockerize
# bash test/persist/test_failure_indices.sh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol

@xla
Copy link
Contributor

xla commented Mar 6, 2018

I'm not sure if this PR makes sense anymore, with the changes happened to Circle and @greg-szabo efforts. Was able to rebase latest develop and it builds: https://circleci.com/workflow-run/7da83251-b7d6-4284-a572-dd908256b753

@ebuchman
Copy link
Contributor

ebuchman commented Mar 6, 2018

This PR is worth keeping for some of the cleanup (deleting unused files etc.). If someone wants to rebase and merge thatd be great!

@codecov-io
Copy link

codecov-io commented Mar 6, 2018

Codecov Report

Merging #1098 into develop will decrease coverage by 1.14%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1098      +/-   ##
===========================================
- Coverage    60.58%   59.43%   -1.15%     
===========================================
  Files          112      121       +9     
  Lines        10863    11040     +177     
===========================================
- Hits          6581     6562      -19     
- Misses        3644     3865     +221     
+ Partials       638      613      -25
Impacted Files Coverage Δ
lite/memprovider.go 79.06% <0%> (-18.37%) ⬇️
types/validator.go 61.4% <0%> (-8.6%) ⬇️
types/validator_set.go 43.68% <0%> (-6.5%) ⬇️
p2p/switch.go 47.03% <0%> (-4.22%) ⬇️
types/block.go 29.91% <0%> (-3.16%) ⬇️
p2p/node_info.go 51.78% <0%> (-2.94%) ⬇️
state/execution.go 68.54% <0%> (-1.85%) ⬇️
consensus/replay.go 67.75% <0%> (-1.68%) ⬇️
cmd/tendermint/commands/root.go 58.33% <0%> (-1.67%) ⬇️
p2p/conn/connection.go 81.11% <0%> (-1.23%) ⬇️
... and 59 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ce57a6...b0c3942. Read the comment docs.

@xla
Copy link
Contributor

xla commented Mar 6, 2018

@ebuchman Done.

We might wanna squash for a bit cleaner history.

# unless there is a reason not to.
# https://www.gnu.org/software/make/manual/html_node/Phony-Targets.html
.PHONY: check build build_race dist install check_tools get_tools update_tools get_vendor_deps draw_deps get_deps_bin_size test test_race test_integrations test_release test100 vagrant_test fmt metalinter metalinter_all
.PHONY: check build build_race dist install check_tools get_tools update_tools get_vendor_deps draw_depsbuild_test_docker_image test_cover test_apps test_persistence test_p2p test test_race test_libs test_integrations test_release test100 vagrant_test fmt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"draw_depsbuild_test_docker_image" space is missing

@melekes melekes merged commit 13a2013 into develop Mar 8, 2018
@melekes melekes deleted the jenkins branch March 8, 2018 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants