Skip to content

Move CI job definitions to this repo#49096

Closed
alpar-t wants to merge 14 commits intoelastic:masterfrom
alpar-t:move-jobs
Closed

Move CI job definitions to this repo#49096
alpar-t wants to merge 14 commits intoelastic:masterfrom
alpar-t:move-jobs

Conversation

@alpar-t
Copy link
Copy Markdown
Contributor

@alpar-t alpar-t commented Nov 14, 2019

This PR moves over all CI jobs definitions.
All in all I think it reads much better than what we had before.

Some jobs were combined, e.x. the 3rd party is now a single job.
Oss and default distro bwc tests are now a single job and explicit.
Docs checks are the same, they test both oss and default in a single
job.
I did not add specific PR checks for docs and bwc with the oss distro.
Both will test the default and the oss will be tested lather.
The short-cut to only do a docs check on docs only changes is now also
removed to favor simplicity. The build cache should now be sufficient,
but we can add it back if that's not the case.

I'm submitting this for early feedback. I'll do some more testing and make sure to send out an email to the team before merging.

Closes https://github.com/elastic/infra/issues/15187

@alpar-t alpar-t added the :Delivery/Build Build or test infrastructure label Nov 14, 2019
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Build)

Copy link
Copy Markdown
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments.

.ci/build.ps1 Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we taking destructive actions? While I understand we are using ephemeral workers, why does the CI image contain gradle init scripts we don't want to use?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ephemeral workers don't have anything there, but we still have a few workers that are not ephemeral, and we want to make sure we don't keep old scripts around.
We have to use an init script as that's the only way it will be applicable to all invocations without a lot of boilerplate to pass that around.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated: Do we have an issue to add a vagrant image for 2019?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not explicitly. We have an issue to support vagrant as part of the CI image generation. When that happens we'll get all the platforms available in CI as vagrant images

.ci/setup-EAR.sh Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a subdirectory for scripts? This top level directory is beginning to get crowded.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it now. os.ps1 and os.sh will need to move after the old jobs that use them are removed.

qa/build.gradle Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional? If we are to do this, it should be a separate change. Where would we be running these tests with the oss distribution?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. As I understand it we run everything against OSS and then we have a specific build to use the default distribution. Have we inverted this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current jobs no longer assume a default, they pass this in explicitly.
The existing default is not consistent, we have the default distro for docs and oss for the qa directory. I find this to be confusing and hard to reason about coverage when looking at the jobs, so this change makes it consistent that we always default to the default distribution for the sole reason that it's a super set.
You'll see that some of the existing jobs don't have the correct coverage due to this confusion.
I think it's ok to do this as part of this change, because the new jobs do provide the correct coverage and the old ones are meant to go away very soon after we merge this.

Copy link
Copy Markdown
Contributor

@mark-vieira mark-vieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few general questions:

  1. What's the numeric prefix for? Does this influence behavior in some way?
  2. How are these builds going to interact with existing builds that live in the infra repo? How does Jenkins deduplicate builds with the same name, or will they be considered unique because the displayname differs? Is there anyway we could (while things are living side by side) use a different job name prefix so we can put these all under a distinct view?
  3. Speaking of, how do we define views?

.ci/build.sh Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be --build-cache I believe.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, fixed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're adding --build-cache in build.sh I think we can ditch it everywhere else.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this job? Doesn't it overlap with checks part 1 and 2? We don't seem to be doing this in "stages" like we do with intake where we defer the later checks until precommit passes so I'm not sure this helps in catching things more quickly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces the other tests when the >test-mute label is present.
@mark-vieira since your taking this Pr over, it might be a good idea to add comments explaining what the odd pr builder syntax really means

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had decided to ditch the nested virtualization builds? That is what this is, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did, but this is essentially the same job running on bare metal.
It's a replacement for the existing bare metal jobs ( there's an experimental job already running ) . It make better use of the bare metal hardware bu spreading out to more workers.

This PR moves over all CI jobs definitions.
Some jobs were combined, e.x. the 3rd party is now a single job.
Oss and default distro bwc tests are now a single job and explicit.
Docs checks are the same, they test both oss and default in a single
job.
I did not add specific PR checks for docs and bwc with the oss distro.
Both will test the default and the oss will be tested lather.
The short-cut to only do a docs check on docs only changes is now also
removed to favor simplicity. The build cache should now be sufficient,
but we can add it back if that's not the case.
@alpar-t
Copy link
Copy Markdown
Contributor Author

alpar-t commented Nov 19, 2019

  • What's the numeric prefix for? Does this influence behavior in some way?

It does not, it's just to influence the order in which files are listed, I think it makes it a bit easier to think about the jobs.

  • How are these builds going to interact with existing builds that live in the infra repo? How does Jenkins deduplicate builds with the same name, or will they be considered unique because the displayname differs? Is there anyway we could (while things are living side by side) use a different job name prefix so we can put these all under a distinct view?

We could use a different prefix, and a new view, but I was hoping the switch will be quick enough to make that an overkill. There should be no interaction, all these jobs should have unique names, either one not used or -next added.

  • Speaking of, how do we define views?
    There is a way to define them in this repo, but I left that in the infra repo for now, because that's where know the branches we are interested in a view for.
    Creating one for each %BRANCH% would create too many as we have the doc builds on past release branches.

@mark-vieira
Copy link
Copy Markdown
Contributor

I just encountered https://github.com/elastic/infra/issues/17055 with JJBB which is pretty much a blocker for us. I've tried a number of workarounds but cannot successfully get JJBB to configure a job w/ a non-default workspace. Since we rely on using a ramdisk for our build workspace this will need to get sorted out before we migrate.

@mark-vieira
Copy link
Copy Markdown
Contributor

With my limited bandwidth, and the current state of CI, this is simply not a high priority right now. I'm parking this effort for now, in light of other work, as well as the upcoming TeamCity POC which may make this OBE anyway.

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@mark-vieira
Copy link
Copy Markdown
Contributor

Closing this since we are beginning our migration to TeamCity which would make migrating to JJBB unnecessary.

@mark-vieira mark-vieira added Team:Delivery Meta label for Delivery team and removed Team:Core/Infra Meta label for core/infra team labels Nov 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Delivery/Build Build or test infrastructure stalled Team:Delivery Meta label for Delivery team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants