This repository was archived by the owner on Feb 12, 2022. It is now read-only.

@bricef bricef commented Feb 11, 2021

Current plan

  • Jumped back into this PR. Work was put on hold for a short time, but there are still things we can do to move this ahead
  • Everyone's thoughts to be discussed at the next working group meeting (Mar 11, 2021)
  • Schedule one or more real-time working sessions with the group soon thereafter
    1. The goal of these sessions is to address all concerns and bring this PR to a mergeable first iteration. Note that what is merged to main will still be subject to change, but it will be something people can begin engaging with when visiting this repo
    2. This PR (or a linked one) will be the canonical source of truth (GitOps style)
    3. During the sync sessions, we will "break out" into a Zoom conversation and use hackmd.io, where @bricef and @scottrigby initially collaborated before moving that work to this PR. There we can co-edit in real time, preview the markdown as we go, and leave any hackmd.io comments for posterity. We will then move whatever changes are agreed upon back to this PR as commits with descriptive comments and correct Co-author credit
  • Reach general consensus on the first pre-release of the GitOps Principles, revised by the GitOps WG ✅ yay!
  • New PR added to a repo in the OpenGitOps project org by @scottrigby: for the initial pre-release of GitOps Principles revised by the GitOps Working Group, see open-gitops/documents#4

Original PR description

This commit proposes a succinct definition of the GitOps principles with additional clarification.

Co-authored-by: Scott Rigby scott@r6by.com
Signed-off-by: Brice Fernandes brice@weave.works

@bricef bricef marked this pull request as draft February 11, 2021 18:37
- Software agents continuously check that the running system under management matches the desired state configuration, and if it does not, immediately either take remedial action to bring the system back in line with stated expectations or, if this cannot be done, alert a human operator that the system is no longer meeting expectations
- Automated delivery: Delivery of the declarative descriptions, from the repository to runtime environment, is fully automated.
- Software Agents: Reconcilers maintain system state and apply the resources described in the declarative configuration.
- Closed loop: Actions are performed on divergence between the version controlled declarative configuration and the actual state of the target system.
@moshloop commented Feb 11, 2021

When I read closed loop I immediately think of https://en.wikipedia.org/wiki/Control_theory#Open-loop_and_closed-loop_(feedback)_control

Most GitOps implementations are open reconciliation loops: the change is applied irrespective of the target state, and the success or failure of the operation does not feed back into Git. Some GitOps products, e.g. Argo, do try to build closed loops, but their feedback is internal only (either via UI or events). I will open a new PR on this topic, as I think it is an unsolved problem.

Member

It's true the Current State doesn't feed back to the Desired State store, but it does feed back to the Software Agents so they know how the current and desired states do or do not diverge. This part of the Wikipedia page you linked sounds like a correct description to me, where the "controller" maps to GitOps Software Agents:

> The definition of a closed loop control system according to the British Standards Institution is "a control system possessing monitoring feedback, the deviation signal formed as a result of this feedback being used to control the action of a final control element in such a way as to tend to reduce the deviation to zero."

WDYT?
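
To make the distinction in this thread concrete, here is a minimal sketch of a reconciler that uses the observed state as its feedback signal. Everything in it is an assumption for illustration: the state is modelled as a hypothetical mapping from resource name to replica count, and `deviation` and `reconcile_once` are made-up names, not part of any GitOps tool.

```python
from typing import Dict

# Hypothetical model: resource name -> number of running replicas.
State = Dict[str, int]


def deviation(desired: State, actual: State) -> State:
    """Return desired minus actual per resource, omitting resources with no drift."""
    names = desired.keys() | actual.keys()
    diff = {name: desired.get(name, 0) - actual.get(name, 0) for name in names}
    return {name: delta for name, delta in diff.items() if delta != 0}


def reconcile_once(desired: State, actual: State) -> State:
    """One closed-loop step: observe, compute the deviation, act to reduce it to zero."""
    new_actual = dict(actual)
    for name, delta in deviation(desired, actual).items():
        new_actual[name] = new_actual.get(name, 0) + delta
        if new_actual[name] == 0:
            # Nothing is desired for this resource, so it is removed entirely.
            del new_actual[name]
    return new_actual
```

The feedback here goes from the running system to the agent, not back into the Desired State store, which matches the reading of "closed loop" proposed in the reply above. An open-loop agent would apply `desired` without ever calling `deviation`.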


#### What is a system's Desired State?

_Configuration_ is a common feature of most software systems.

Suggested change
_Configuration_ is a common feature of most software systems.
_Configurability_ is an attribute of all systems, whether described externally or hard-coded internally.

I find this change too.. nerdy. While true, the original sentence "clicks" with most heads and does not sound too niche.

Technically this is not true. Many crypto systems are explicitly defined not to be configurable. I don’t equate hard coded with configuration, unless you include reflection or AOP or pointer hacking, etc.

PRINCIPLES.md Outdated
Comment on lines 70 to 76
GitOps concerns the verifiable behaviour of computer systems and their interfaces.
Specifically, GitOps is _not_ about human processes, and is not intended as a model for judging human organisational designs and operational practices.

The GitOps principles are to be used as guiding principles in the development of modern software and system operations. They do not form a concrete specification.

The GitOps principles are a direction, not a destination. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire software systems, they can also be applied loosely to selected parts of larger systems as part of a progressive adoption.

Suggested change
GitOps concerns the verifiable behaviour of computer systems and their interfaces.
Specifically, GitOps is _not_ about human processes, and is not intended as a model for judging human organisational designs and operational practices.
The GitOps principles are to be used as guiding principles in the development of modern software and system operations. They do not form a concrete specification.
The GitOps principles are a direction, not a destination. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire software system, they can also be applied loosely to selected parts of larger systems as part of a progressive adoption.
GitOps makes sense in environments that have a healthy balance between the [socio (human) and technical](https://academic.oup.com/iwc/article/23/1/4/693091) elements.
i.e. GitOps is the structured process in which humans and technical systems interact and the codification of policy around this interaction:
* For a human (or another technical system) to make a change, it must first describe and record the change
GitOps is a *direction*, **NOT** a *destination*. It should be applied pragmatically. For example, whilst it is desirable to apply it strictly to an entire system, it can also be applied selectively to subsystems.

You cannot talk about any non-trivial system without talking about the human element; they are one and the same.

GitOps is a structured mechanism for humans to interact inside the system, not with it. The distinction is subtle but vital nevertheless.

GitOps makes sense in environments that have a healthy balance between the human and technical elements. While most sales processes are heavily weighted towards human interaction, it is not unforeseeable that some exist where GitOps makes sense.

e.g. A sale in progress could require a new environment to be created, and then promoted through various levels of priority, security and reliability on its way into production.

I do think that GitOps also makes sense in environments that don't have a healthy balance yet.
There, GitOps can act as an agent that reconciles the mismatch. ;)

Author

I second this sentiment. (Good link about socio-technical systems too.) I wanted to make sure that the scope excluded human decision making explicitly. I've re-worded to reflect this and make it clearer. Do you think the meaning is clearer and more accurate now?

@kmb385 commented Mar 12, 2021

I appreciated the original statement in line 83 and found it valuable and clarifying to directly call out the separation between our intent, the principles, and a specification. It also addressed a small question with the inclusion of software development, because most GitOps discussion revolves around operations (delivery/reconciliation). If I change the source code of a system to account for new business logic, I believe that change falls under the umbrella of GitOps, because the system's run state will be altered with its next delivery.

Ultimately, that change is part of the system and should be captured, even if its delivery is performed through another change that identifies packaged software to be executed. Most information I read is heavily weighted on Ops. Do others agree that GitOps reaches this far left, to commits of application source code? If so, should it be called out in the scope?

Member

@kmb385 currently this idea is captured in a few places in this PR (specifically "systems" or "subsystems"). Does this address your concern?

Here:

By "Configuration", we mean data that defines how a system or subsystem should behave.

For example, the same web server code may be running on thousands of different servers managed by hundreds of different companies.
The behaviour of an individual webserver will differ based on how it is configured.
Configuration data is typically in the form of files or arguments to a computer program, but some systems may also currently use configuration databases or remote configuration services. Configuration also includes data about what version of code a software system should run, so software version information is also considered configuration.

Together, the aggregate of all configuration data for a system forms its "Desired State". The "Desired State" of a system is defined as data sufficient to recreate the system from nothing so that instances of the system are behaviorally indistinguishable.

Also here:

The definition of a system can be quite broad, and may incorporate human as well as programmatic processes.
For example, is a company's sales process a system?
Should GitOps be applied to it?
Although a vision in which the GitOps practices are applied generally to all processes, human or otherwise is compelling, a more pragmatic approach is preferable, to avoid the risk of attempting to "boil the ocean".

Instead, we should focus on subsystems where the Desired State is well defined, implement the GitOps principles there, and grow out to capture more systems from that initial subsystem.

The PR also retains this from the README:

GitOps is fast becoming the methodology of choice for operating modern cloud native infrastructure and applications

@scottrigby It does, I'm very new to the project and reading the definition of "software system" in the definitions section really helped. My interpretation of that definition is that it does include the application source code.


#### How much of a system must be declared?

Ideally, all of it; and the entire system can be recreated exclusively from its Desired State.
Contributor

I think this is confusing in the context of application data. If I really wanted to recreate an entire system, I'd have to consider recreating the application data that it requires to operate, and this may even be interpreted in the context of disaster recovery (is GitOps a way to recover the latest known state of a system should disaster strike?). This also moves the conversation into scenarios that are perhaps beyond configuration. For instance, should DB migrations be declared as part of the desired state? And should I want to recreate a system matching that state, must GitOps guarantee db schema consistency?

Definitely should mention that DB Migrations Question in a FAQ on the Website. Will be a commonly asked question.

Author

> is GitOps a way to recover the latest known state of a system should disaster strike?

The latest configured behaviour? Yes. The latest state including application data? Not at this time.

> Definitely should mention that DB Migrations Question in a FAQ on the Website. Will be a commonly asked question.

👍 That's a super common discussion.

Author

> system matching that state must GitOps guarantee db schema consistency?

I think so, although this is genuinely difficult currently. Or rather, the db schema at a point in time should be declared. Moving to that schema from some arbitrary starting point is detail left to the reconciling agent. Currently, this will be in the form of a set of unidirectional migrations generated manually. That's an implementation detail though.

Author

I agree that we should mention application data here though.

Contributor

@murillodigital wrote:

> should DB migrations be declared as part of the desired state?

Yes.


@murillodigital wrote:

> must GitOps guarantee db schema consistency?

"Should" instead of "must".

GitOps should guarantee database schema consistency.


@bricef wrote:

> I think so, although this is genuinely difficult currently. Or rather, the db schema at a point in time should be declared. Moving to that schema from some arbitrary starting point is detail left to the reconciling agent. Currently, this will be in the form of a set of unidirectional migrations generated manually. That's an implementation detail though.

Agreed that database schema at a point in time should be declared.

Database migrations should be generated automatically, instead of manually.
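
As a rough illustration of what "declared schema in, generated migration out" could look like, here is a minimal sketch. It assumes a toy schema model (a hypothetical mapping of table name to column name to SQL type) and a made-up `plan_migration` function; it is not how any of the tools listed below work, and it deliberately ignores type changes, dropped tables, constraints and data preservation.

```python
from typing import Dict, List

# Hypothetical schema model: table name -> {column name: SQL type}.
Schema = Dict[str, Dict[str, str]]


def plan_migration(actual: Schema, desired: Schema) -> List[str]:
    """Derive SQL statements that move `actual` towards the declared `desired` schema."""
    steps: List[str] = []
    for table in sorted(desired):
        columns = desired[table]
        if table not in actual:
            body = ", ".join(f"{col} {typ}" for col, typ in sorted(columns.items()))
            steps.append(f"CREATE TABLE {table} ({body});")
            continue
        # Columns declared but not present are added.
        for col in sorted(columns.keys() - actual[table].keys()):
            steps.append(f"ALTER TABLE {table} ADD COLUMN {col} {columns[col]};")
        # Columns present but no longer declared are dropped.
        for col in sorted(actual[table].keys() - columns.keys()):
            steps.append(f"ALTER TABLE {table} DROP COLUMN {col};")
    return steps
```

When the actual schema already matches the declaration, the plan is empty, which is the property a reconciling agent would rely on.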

Examples of implementation details:

  1. Ask HN: How does your development team handle database migrations?
  2. Evolutionary Database Design by @pramodsadalage and @martinfowler at @thoughtworks using tools such as:
    https://github.com/liquibase/liquibase by @liquibase and @Datical
    https://github.com/flyway/flyway by @flyway and @red-gate
    https://github.com/mybatis/migrations by @mybatis
  3. https://github.com/schemahero/schemahero by @schemahero, donated by @replicatedhq to @cncf

From GitOps Principles Committee - April 14, 2021 after 40:50

@bricef said:

> It's more about an aspirational direction, not necessarily what we do right now.

Agreed.


@tonit wrote:

> DB Migrations Question in a FAQ

Agreed.

PRINCIPLES.md Outdated

A version is the Desired State for a system as a whole. It is the canonical form of what we desire the system to be at a point in time.

It is insufficient to version part of the Desired State or to version these parts in separate State Stores. Real software systems often have overarching behaviour that is the result of coupling between components. If the Desired State of these components were to change independently, it would be difficult to map a change in observed behaviour of our system to a single change in Desired State. Being able to make this 1:1 mapping is operationally beneficial, as we can then map behavioural issues of our system directly to the changes that occurred. The utility of having the entire system defined in a single canonical location grows in proportion to the complexity and internal coupling of the system. A web of references to configuration data located in different locations is undesirable, as it makes understanding the desired state particularly difficult.
Contributor

Not sure I follow this in the context of application configuration vs platform configuration as desired state. For instance, I may have some state in which I declare that I need some service running or CRD available on which my application will depend (and maybe other applications as well), and some other state in which I specify the runtime parameters for one specific application (say, endpoints or other runtime-specific configuration for the application itself). Not sure it's the best example, but the point is, as part of a system there may be platform components as well as application components that will be interrelated, and following this statement it seems the argument is, anything that is somehow interrelated should be declared in the same state store. I think that such an interpretation would be at the very least not realistic.

Also I'd change real software systems to production software systems or distributed systems, can't think of a surreal or imaginary software system 😆

Author

😄 Indeed.

Author

> the point is, as part of a system there may be platform components as well as application components that will be interrelated, and following this statement it seems the argument is, anything that is somehow interrelated should be declared in the same state store. I think that such an interpretation would be at the very least not realistic.

I agree that for all practical purposes, this is currently unrealistic, but the general point still stands. It is desirable to have all of a system's configuration in a single state store. I think changing "insufficient" to "undesirable" may be better. Do you think that's sufficient @murillodigital?

Contributor

> > the point is, as part of a system there may be platform components as well as application components that will be interrelated, and following this statement it seems the argument is, anything that is somehow interrelated should be declared in the same state store. I think that such an interpretation would be at the very least not realistic.
>
> I agree that for all practical purposes, this is currently unrealistic, but the general point still stands. It is desirable to have all of a system's configuration in a single state store. I think changing "insufficient" to "undesirable" may be better. Do you think that's sufficient @murillodigital?

How about changing the approach from that which is undesirable or suboptimal to that which is ideal:

> Ideally, the configuration for all interrelated components of a system should be stored in a single state store, in order to have a concise and consolidated view across all components that may be subject to overarching behavior.

Or something along those lines? @bricef

Member

> How about changing the approach from that which is undesirable or suboptimal to that which is ideal

@murillodigital I like this approach flip


<div style="background: #E3F2FD; border-left: 3px solid #0D47A1; padding:10px; margin-bottom:5px; font-style:italic">
Software agents continuously, and automatically, compare a system's <em>Actual State</em> to its <em>Desired State</em>.
If the actual and desired states differ, automated actions are immediately attempted to reconcile them.
Contributor

Should a GitOps-compliant system ALWAYS reconcile on identified drift, or are other mechanisms possible to "close the loop" (alerting or otherwise)? I think this may be a barrier to a lot of organizations at some initial degrees of capability, although I agree this should be the desired behavior. I guess an automated action could be an alert which triggers a human-led attempt at reconciliation...

"Manually Closing the Loop". Definitely a FAQ point for the website.

I think desired state should also include the policy for when reconciliation is allowed to happen.

I desire changes to Y to only occur during change windows, rollback on A, remediate on B and alert on C
I desire X to be Y
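
A minimal sketch of how such a policy could sit next to the desired state itself, under stated assumptions: `ReconcilePolicy`, `decide`, the condition labels and the action names are all hypothetical, and the rule that state-changing actions wait for the change window while alerts are always allowed is one possible reading of the comment above, not an agreed principle.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Dict


@dataclass(frozen=True)
class ReconcilePolicy:
    """Hypothetical policy stored alongside the desired state."""
    window_start: time
    window_end: time
    # Maps a detected condition (e.g. "A", "B", "C") to an action name.
    actions: Dict[str, str] = field(default_factory=dict)


def decide(policy: ReconcilePolicy, condition: str, now: time) -> str:
    """Pick the action for a detected condition, honouring the change window."""
    action = policy.actions.get(condition, "alert")  # unknown conditions only alert
    in_window = policy.window_start <= now <= policy.window_end
    if action in ("remediate", "rollback") and not in_window:
        # Assumption: anything that changes the system waits for the window.
        return "defer"
    return action
```

For example, with a window of 22:00 to 23:59 and `actions={"A": "rollback", "B": "remediate", "C": "alert"}`, condition B detected at 09:00 would be deferred, while condition C would still raise an alert immediately.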

</div>

<!--
- If the software agents fail to bring the system's state in line with its desired state, a human operator is notified.
Contributor

Guess this is along the lines of my previous comment.

PRINCIPLES.md Outdated
### 4. Operations through declaration

<div style="background: #E3F2FD; border-left: 3px solid #0D47A1; padding:10px; margin-bottom:5px; font-style:italic">
When wishing to operate on a software system, a human or software agent will not interact with the running system and modify it directly.
Contributor

Isn't this incorrect? A software agent WILL interact with the system to apply the declaration that the human committed to the state store? (think an operator in a cluster)

Suggested change
When wishing to operate on a software system, a human or software agent will not interact with the running system and modify it directly.
When wishing to operate on a software system, a human or computer program will not interact with the running system and modify it directly. Only the reconciliation-agent will operate on the software system.

Suggested change
When wishing to operate on a software system, a human or software agent will not interact with the running system and modify it directly.
The mechanism in which change is applied to a system by either a human operator or another system is through a declarative change committed to a state store. Reconciliation agents then apply these changes to the systems they control.


If the desired state of a system is not explicitly defined, it is impossible to verify if the system is in a correct state. The state of a running system itself does not provide sufficient information to determine its correctness.

Consider logging into an administration console and seeing that 28 machines are healthily running.
Contributor

I think we should look for an example more consistent with cloud native architecture and operational practices. This is not a realistic, dynamic, horizontally scaling scenario of ephemeral resources, which would most likely be more applicable to the use cases that will be looking to adopt GitOps.

I do think this is still reality. And when "selling" those low-tech firms the cloud-native Kool-Aid, GitOps-like practices are one of the showcases that go really well.
I agree another example for existing cloud architectures would make sense, too (as an addition).

Author

Concur. A better example would be welcome here.

Member

@murillodigital do you want to propose an additional, cloud native example?

Contributor

How about we consider the following example authored by @TomOnTime?

Thank you @murillodigital @tonit @bricef @scottrigby @moshloop @TomOnTime.

Suggested change
Consider logging into an administration console and seeing that 28 machines are healthily running.
An excerpt from https://queue.acm.org/detail.cfm?id=3237207
For example, a desired state for a VM configuration system might state, "There should be three virtual machines named foo1, foo2, and foo3." When the file is processed the first time, the three VMs are created. Processing the same configuration file a second time will leave the system unchanged, as the machines already exist. The configuration file is idempotent. If for some reason foo2 were deleted, processing the file again would re-create foo2.
Contrast this with an imperative configuration language that states, "Add three new virtual machines." This would generate three additional machines every time it is run, eventually exhausting all resources.
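
The contrast in this excerpt can be sketched in a few lines. This is only an illustration under assumptions: VMs are modelled as plain names, `apply_declarative` and `apply_imperative` are hypothetical functions, and the declarative version only creates what is missing (it does not delete undeclared machines, which the excerpt does not discuss).

```python
from typing import List, Set


def apply_declarative(existing: Set[str], desired: Set[str]) -> Set[str]:
    """Create only the desired VMs that are missing; running it again changes nothing."""
    missing = desired - existing
    return existing | missing


def apply_imperative(existing: List[str], count: int) -> List[str]:
    """'Add `count` new virtual machines': every run creates more machines."""
    start = len(existing)
    return existing + [f"vm{start + i + 1}" for i in range(count)]
```

Processing `{"foo1", "foo2", "foo3"}` declaratively any number of times yields the same three machines, and re-creates `foo2` if it was deleted. Running the imperative version twice with `count=3` leaves six machines.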

RATIONALE.md Outdated

If we have no record of the desired state of our system, how can we recover from failures that occur in the transition between states?
Such transitions are very common lifecycle events, such as upgrades, new features being released or scaling our resources. These are where the majority of hard software defects and transient errors occur.
Not only are such failures extremely common, but their likelihood grows algebraically with the number of components in our system.
Contributor

Suggested change
Not only are such failures extremely common, but their likelihood grows algebraically with the number of components in our system.
Not only are such failures extremely common, but their likelihood grows exponentially with the number of components in our system.

Author

There's a very specific reason why that term is used, but the context for it is poor. I'll reword that paragraph to be clearer.

Member

@murillodigital does this paragraph reword make more sense now?

@schlomo commented Feb 14, 2021

I read through the merged view and I can say that I fully agree with all that is written here. Before going into the details of the formulations, I'd like to suggest that we maybe accept this PR as a 0.1.0 version and then iterate from there on. I find such large PRs hard to review.

Some thoughts to ponder, but we can also do that in subsequent PRs:

  • About the content, I noticed that we completely ignore the actual user data that the systems most certainly contain. When we talk about declarative state we mean, of course, everything except the user data that is then put into those systems as part of their intended use. Therefore the claim that the desired state is data sufficient to recreate the system from nothing so that instances of the system are behaviorally indistinguishable could be misleading, as the system's behavior also depends on the user data that was put in before.

    I would suggest also mentioning the user data, and defining our desired state as being so complete that, together with a snapshot of the user data, the recreated systems would be behaviorally indistinguishable.
  • I think that it would be valuable to connect somehow to the DevOps movement, specifically under Scope where we explain that GitOps is about tech and not people. I very much like the picture of GitOps as an evolution of DevOps, and maybe we can simply state that DevOps deals with organizations and human processes while GitOps is focused on the technical side regardless of the organizational setup. We could also note that organizations who already practice DevOps/SRE, or are in the process of getting there, will find it substantially easier to adopt GitOps practices as well.
  • For me, GitOps is a major stepping stone towards achieving a fully automated, hands-off IT operations model, where we humans outtask operations to software that we write and improve instead of doing it ourselves. If you all also see it this way, I'd suggest putting this into the rationale as well.

If you agree with my views then I'll be happy to provide a PR.

PS: I really enjoyed reading about the significance of separating between the what and the how as this is exactly my line of arguing for automated governance:
[image]

If you find these slides useful I'd be happy to contribute those drawings to the working group. They are currently licensed CC-BY-SA.

@tonit left a comment

Well done PR! I think it's a big PR that should be merged soon.
It will not be the last iteration. Let's be a good example for good PRs.


#### What is a system's Desired State?

_Configuration_ is a common feature of most software systems.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this change too.. nerdy. While true, the original sentence "clicks" with most heads and does not sound too niche.


#### How much of a system must be declared?

Ideally, all of it; and the entire system can be recreated exclusively from its Desired State.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely should mention that DB Migrations Question in a FAQ on the Website. Will be a commonly asked question.


<div style="background: #E3F2FD; border-left: 3px solid #0D47A1; padding:10px; margin-bottom:5px; font-style:italic">
Software agents continuously, and automatically, compare a system's <em>Actual State</em> to its <em>Desired State</em>.
If the actual and desired states differ, automated actions are immediately attempted to reconcile them.

"Manually Closing the Loop". Definitely a FAQ point for the website.
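
To make the quoted principle concrete, here is a minimal sketch of a single reconciliation pass. It assumes state can be modeled as a mapping from resource name to replica count; `State`, `plan_actions`, and `reconcile` are hypothetical names for illustration and do not come from the PR or from any GitOps tool.

```python
from typing import Dict, List, Tuple

State = Dict[str, int]          # resource name -> replica count (illustrative model)
Action = Tuple[str, str, int]   # (verb, resource name, target replicas)


def plan_actions(desired: State, actual: State) -> List[Action]:
    """Compare actual to desired state and list the actions that close the gap."""
    actions: List[Action] = []
    for name, replicas in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, replicas))
        elif actual[name] != replicas:
            actions.append(("scale", name, replicas))
    # Anything running that is not declared is removed.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name, 0))
    return actions


def reconcile(desired: State, actual: State) -> State:
    """Apply the planned actions to a copy of the actual state (one pass)."""
    result = dict(actual)
    for verb, name, replicas in plan_actions(desired, actual):
        if verb == "delete":
            result.pop(name)
        else:
            result[name] = replicas
    return result
```

A real agent would repeat this pass continuously and would need to handle actions that fail. In this sketch, an empty plan means the actual and desired states already agree.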

PRINCIPLES.md Outdated
### 4. Operations through declaration

<div style="background: #E3F2FD; border-left: 3px solid #0D47A1; padding:10px; margin-bottom:5px; font-style:italic">
When wishing to operate on a software system, a human or software agent will not interact with the running system and modify it directly.

Suggested change
- When wishing to operate on a software system, a human or software agent will not interact with the running system and modify it directly.
+ When wishing to operate on a software system, a human or computer program will not interact with the running system and modify it directly. Only the reconciliation-agent will operate on the software system.


If the desired state of a system is not explicitly defined, it is impossible to verify if the system is in a correct state. The state of a running system itself does not provide sufficient information to determine its correctness.

Consider logging into an administration console and seeing that 28 machines are healthily running.

I do think this is still reality. And when "selling" those low-tech firms the cloud-native Kool-Aid, GitOps-like practices are one of the showcases that go over really well.
I agree another example for existing cloud architectures would make sense, too (as an addition).
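
The 28-machines example from the quoted text can be sketched in a few lines. This is only an illustration: `verify` is a hypothetical helper, and the desired counts in the usage note below are made-up numbers.

```python
from typing import Optional


def verify(running: int, desired: Optional[int]) -> str:
    """Judge an observed machine count against a declared desired count."""
    if desired is None:
        # Nothing was declared, so the observation alone cannot be judged.
        return "unknown"
    return "correct" if running == desired else "drifted"
```

With no declaration, `verify(28, None)` can only answer "unknown". The same 28 healthy machines are "correct" against a declared 28 and "drifted" against a declared 30.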

Brice Fernandes and others added 6 commits February 17, 2021 14:01
Thanks @tonit! Good catches all.

Co-authored-by: Toni Menzel <toni.menzel@rebaze.com>
Co-authored-by: Leonardo Murillo <leonardo@devops.cr>
Thanks @murillodigital!

Co-authored-by: Leonardo Murillo <leonardo@devops.cr>
Thank you!

Co-authored-by: Moshe Immerman <moshe@flanksource.com>
Co-authored-by: Brian Fox <brianhfox@gmail.com>
Co-authored-by: Leonardo Murillo <leonardo@devops.cr>
Co-authored-by: Moshe Immerman <moshe@flanksource.com>
Signed-off-by: Brice Fernandes <brice@weave.works>
Signed-off-by: Brice Fernandes <brice@weave.works>
@tonit

tonit commented Feb 17, 2021

Thanks @bricef for updating. What's your plan with this? Merge this and move on? This already has 61 comments.
Maybe "Principles" needs to be broken down (as it is too big a block of text)?


#### Why is human readability required?

Operationally, it has been proven time and time again that the canonical Desired State of a system should be human-readable and writable. (See _The UNIX Philosophy (1995) by Mike Gancarz_ and _The Pragmatic Programmer (2000) Section 14 "The Power of Plain Text" by Andrew Hunt and David Thomas_)

Proven is a strong word, and I think we should wait until there is academic research that proves this

Suggested change
- Operationally, it has been proven time and time again that the canonical Desired State of a system should be human-readable and writable. (See _The UNIX Philosophy (1995) by Mike Gancarz_ and _The Pragmatic Programmer (2000) Section 14 "The Power of Plain Text" by Andrew Hunt and David Thomas_)
+ The consensus among operators that the desired state of a system should be human-readable and writable is continuing to grow.

Member

@scottrigby scottrigby Mar 10, 2021

@moshloop Are you saying you find these resources to be not credible?

Is this a sufficient change (less focus on "proven"), and keep the rest otherwise the same?

- Operationally, it has been proven time and time again…
+ Operationally, consensus continues to grow that…

This is not true of ML systems. Human understanding or even observability is an area of ongoing research. Definitely not consensus yet. Growing, maybe. Desirable yes.

Comment on lines +77 to +80
The GitOps principles are to be used as guiding principles in the development of modern software and system operations. They do not form a concrete specification.

The GitOps principles are a _direction_, **not** a _destination_. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire systems, they can also be applied selectively to sub-systems as part of a progressive adoption.

Suggested change
- The GitOps principles are to be used as guiding principles in the development of modern software and system operations. They do not form a concrete specification.
- The GitOps principles are a _direction_, **not** a _destination_. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire systems, they can also be applied selectively to sub-systems as part of a progressive adoption.
+ GitOps is used as a guiding principle for modern system operations. They are a _direction_, **not** a _destination_.
+ As with all principles they should be applied pragmatically. For example, whilst desirable to apply them strictly to entire systems, they can also be applied selectively to sub-systems as part of a progressive adoption.

Member

@scottrigby scottrigby Mar 10, 2021

But GitOps can't be "used as a guiding principle", it's a practice guided by a set of principles. The intention of this PRINCIPLES doc is to define those.

Otherwise I like the rest of this suggestion 👍

## Scope

GitOps concerns the interaction between humans and technical systems, and between technical systems.
GitOps is not concerned with processes of human decision making or organisation, only how human decisions about technical systems are recorded and applied.

I strongly disagree with this line. GitOps is all about human decision-making, i.e. a human deciding to make a change to a system and another human deciding that the change is of low enough risk to be merged.

One of the areas needing improvement is how human decision-making can be augmented by a better understanding of how a change could affect a system. (Tests and linting play a role here but are usually disconnected from the system.) For example: will scaling down cause production workloads to be evicted, or will scaling up exceed my organizational budget?

GitOps also has scaling problems in large organizations, and one of the goals of the WG should be to try and solve them, e.g.:

  • How do multiple teams in an organization interact on a single repo?
  • How is a system decomposed into multiple repositories?
  • How are systematic changes across systems and repos implemented and monitored?
  • How are changes promoted through environments while respecting change windows and organizational change policies?

Suggested change
- GitOps is not concerned with processes of human decision making or organisation, only how human decisions about technical systems are recorded and applied.

Member

If I understand your comment correctly, this is not what was meant here. This is to clarify that the scope of GitOps is to define and reconcile technical systems, not human systems. Humans are always on some level in control of the automated processes. No HAL or Skynet here LOL 😛 But seriously, we could probably word this better.

GitOps is not concerned with processes of human decision making or organisation, only how human decisions about technical systems are recorded and applied.
It is a structured process through which technical systems can be modified reliably.

GitOps is _not_ intended as a model for judging human organisational designs.

Not sure how this could even be attempted using GitOps?

Suggested change
- GitOps is _not_ intended as a model for judging human organisational designs.

Member

For example, keeping an organization's employee reporting structure in git. But I'm not actually 100% sure we should list this as out of scope. For example, certain governmental and legal sources are kept in version control, where changes from those sources could be automatically reconciled to a public-facing website using a GitOps software agent. Let's discuss this in the next co-working session. If it remains controversial, we could leave it out of the first iteration.

CNCF and Kubernetes TOC being an example.

@moshloop

Thanks @bricef for updating. What's your plan with this? Merge this and move on? This already has 61 comments.
Maybe "Principles" needs to be broken down (as it is too big a block of text)?

I don't think this can be just merged and iterated on. Principles are something that requires consensus building, even if it takes time and is hard.

Once there is consensus within the group, the PR should still be held back while final feedback from the broader community is solicited.

@schlomo

schlomo commented Feb 26, 2021

I feel that this written ping-pong is maybe not the optimal way to build consensus. How about everybody adds their thoughts here and then we have a workshop to actually talk about it?

@tonit

tonit commented Feb 26, 2021

I feel that this written ping-pong is maybe not the optimal way to build consensus. How about everybody adds their thoughts here and then we have a workshop to actually talk about it?

+1 THIS

@scottrigby
Member

We have our next meeting tomorrow. I've added this as the first topic in the meeting notes agenda so we can discuss there.

I also updated this PR's description with "current status", and proposed a simple process for real-time collaboration that has worked previously in this and other projects.

See you all tomorrow!

@scottrigby
Member

scottrigby commented Mar 12, 2021

We have our next meeting tomorrow. I've added this as the first topic in the meeting notes agenda so we can discuss there.

I also updated this PR's description with "current status", and proposed a simple process for real-time collaboration that has worked previously in this and other projects.

See you all tomorrow!

Per yesterday's meeting, the next step will be to schedule (in CNCF Slack) synchronous times for interested parties to iterate on the Principles over the next couple of months. My action item is to help coordinate these sync sessions meeting times. This will be communicated in this GitHub PR, Slack (CNCF #wg-gitops-principles), and the mailing list.

@scottrigby
Member

scottrigby commented Mar 17, 2021

Update: anyone interested please add your availability next week for this public GitOps principles sync: https://doodle.com/poll/r2h6dmrs6zmkeizq

This Doodle poll will close at 5PM EST Friday 18 March 2021.

Note: there will be more than one of these syncs, and there will also be a chance to comment async on a simplified PR afterwards, so no one will be left out of the loop.

@scottrigby
Member

Update: the next sync session will be on Friday March 26 1PM EST.

@scottrigby scottrigby added the task Basic task label Mar 24, 2021
@bricef
Author

bricef commented Mar 26, 2021

🚨
IF INTERESTED, FILL IN POLL FOR NEXT MEETING: https://doodle.com/poll/7yfdb2iuqztwprc2

(between Monday 29th March 2021 and Friday 2nd April 2021)
🚨

@scottrigby
Member

We have the most votes for Monday March 29 8pm UTC. I'll plan to host the Zoom meeting from the WG account 👍

@bricef
Author

bricef commented Mar 29, 2021

🚨
IF INTERESTED, FILL IN POLL FOR NEXT MEETING: https://doodle.com/poll/uia7w54hzf76e2u5

(between Monday 5th April 2021 and Friday 9th April 2021)
🚨

@scottrigby
Member

scottrigby commented Apr 3, 2021

Happy weekend everyone 👋

Doodle poll says we have a date for WG Principles Sync # 2

When: Monday 5 April, 7pm Universal / 3pm Eastern / 12pm Pacific
What/Why: Live sync sessions to arrive at an initial, pre-release version of the GitOps Principles (see Current Plan)
How: See the meeting agenda/notes for links to the HackMD doc and Zoom

See you all then 🙂

scottrigby added a commit that referenced this pull request Apr 3, 2021
scottrigby added a commit that referenced this pull request Apr 14, 2021
See #48

Set principles to date TBD

Signed-off-by: Scott Rigby <scott@r6by.com>
iboonox pushed a commit to iboonox/gitops-working-group that referenced this pull request Apr 18, 2021
See gitops-working-group#48

Set principles to date TBD

Signed-off-by: Scott Rigby <scott@r6by.com>
Signed-off-by: iboonox <iheb.dev@gmail.com>

2. [**The principle of immutable configuration versions**](#2-immutable-configuration-versions)

_Desired State_ is stored in a way that supports versioning, immutability of versions, and retains a complete version history.
Contributor

@lloydchang lloydchang Apr 21, 2021

Suggested change
- _Desired State_ is stored in a way that supports versioning, immutability of versions, and retains a complete version history.
+ _Desired State_ is stored in a way that may support versioning and immutability of versions, and may retain a complete version history.

I suggest adding multiple "may"s because this principle shouldn't prevent practitioners from using GitHub, Git, and BFG features such as:

  1. Configuring commit squashing for pull requests: squash
  2. Git Tools - Rewriting History: commit --amend, rebase, filter-branch
  3. BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history: bfg

Why:

  • While versioning, immutability, and completeness are recommended, practitioners should have the flexibility to opt out.
  • Seasoned git users are aware of situations that justify the use of squash, commit --amend, rebase, filter-branch, and bfg.

This relates to https://github.com/gitops-working-group/gitops-working-group/discussions/93#discussioncomment-642197

Thank you @bricef @scottrigby @moshloop @o6uoq

I don't think any form of rewriting history should be allowed in the desired state store.

Rewriting history is perfectly acceptable before it merges into master (the desired state), but once it is desired state you can't erase it.

Likewise, filter-branch and bfg should not be allowed: if you commit a secret, you add a new commit to remove it and then revoke the secret.

If you commit a large file or binary, then you either live with it, or you go off the GitOps path, rewrite history, and then go back on the GitOps path.
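
The position above can be sketched with a toy append-only store. This is an assumption-laden illustration, not how Git or any GitOps tool is implemented: `AppendOnlyStore`, its methods, and the config keys are all hypothetical.

```python
import hashlib
import json
from typing import Dict, List, Tuple

Config = Dict[str, str]


class AppendOnlyStore:
    """Hypothetical desired-state store: versions are added, never edited."""

    def __init__(self) -> None:
        self._versions: List[Tuple[str, Config]] = []

    def commit(self, config: Config) -> str:
        """Append a new version whose id depends on its parent and content."""
        parent = self._versions[-1][0] if self._versions else ""
        payload = json.dumps({"parent": parent, "config": config}, sort_keys=True)
        version_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        self._versions.append((version_id, dict(config)))
        return version_id

    def head(self) -> Config:
        """Return a copy of the latest version (the current desired state)."""
        return dict(self._versions[-1][1])

    def history(self) -> List[Config]:
        """Return copies of every version, oldest first."""
        return [dict(config) for _, config in self._versions]


store = AppendOnlyStore()
store.commit({"replicas": "3", "api_token": "leaked-value"})  # the mistake
fixed = {k: v for k, v in store.head().items() if k != "api_token"}
store.commit(fixed)  # remove the secret with a new version, not a rewrite
```

The first version still contains the token after the fix, which is why the removal has to be paired with revoking the secret.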

Contributor

@lloydchang lloydchang Apr 22, 2021

Can we agree to disagree? Thank you @moshloop @roberthstrand @splushii

I believe rewriting history in the desired state store is appropriate to remove private information.

For example:

GitHub Private Information Removal Policy

We offer this private information removal process as an exceptional service only for high-risk content that violates GitHub's Terms of Service, such as when your security is at risk from exposed access credentials. This guide describes the information GitHub needs from you in order to process a request to remove private information from a repository.

What is Private Information?

For the purposes of this document, “private information” refers to content that (i) should have been kept confidential, and (ii) whose public availability poses a specific or targeted security risk to you or your organization.

"Security risk" refers to a situation involving exposure to physical danger, identity theft, or increased likelihood of unauthorized access to physical or network facilities.

Private information removal requests are appropriate for:

  • Access credentials, such as user names combined with passwords, access tokens, or other sensitive secrets that can grant access to your organization's server, network, or domain.
  • AWS tokens and other similar access credentials that grant access to a third party on your behalf. You must be able to show that the token does belong to you.
  • Documentation (such as network diagrams or architecture) that poses a specific security risk for an organization.
  • Information related to, and posing a security risk to, you as an individual (such as social security numbers or other government identification numbers).

@scottrigby
Member

📣 Update: The Principles Committee has reached general consensus on the first pre-release of the GitOps Principles, revised by the GitOps WG! 🎉
Group consensus was that the next task will be a new PR added to a repo in the OpenGitOps project org. I have updated the checklist at the top of this PR with the next steps. Stay tuned!

@scottrigby
Member

OK this PR has moved to open-gitops/documents#4. Closing this one. Thanks everyone, see you over there!

Labels

task Basic task
