Skip to content

Move agents out of config xml#6271

Merged
maheshp merged 149 commits intogocd:masterfrom
vrushaliwaykole:move-agents-out-of-config-xml
Sep 16, 2019
Merged

Move agents out of config xml#6271
maheshp merged 149 commits intogocd:masterfrom
vrushaliwaykole:move-agents-out-of-config-xml

Conversation

@vrushaliwaykole
Copy link
Contributor

@vrushaliwaykole vrushaliwaykole commented May 8, 2019

As part of moving agents from config to the database (#911), below is the exhausting list of changes required along with their current (up-to-date) status,

  • Add additional columns in existing agents table to hold agent information that was earlier part of config XML
    • Add resources column to hold association with resources
    • Add environments column to hold association with environments
    • Add elasticAgentId and elasticPluginId columns
    • Add disabled column as a flag for disabled agents
    • Add deleted column as a flag for deleted agents.
  • Make changes in saveOrUpdate operations for agents. We refer UUID as a unique identifier throughout the application. Hibernate performs DB operations using the primary key which is id field in the Agents table. As we don't persist database id in in-memory objects. In order to perform database operations, we have to find id for UUID and set that on record to saveOrUpdate
  • Migrate existing agents from config XML
    • Remove <agents/> tag from top level and also from <environments/> tag.
    • Added AgentXmlToDBMigration migrator which will transfer all the records from config XML to database as part of XSL
    • As part of this migration, copy all the fields associated with an agent.
  • Make changes in cookie assignment - cookie field is mandatory in the DB which changes agent registration flow. In the case of auto registration, cookie gets assigned at the time of registration and In the case of manual registration, cookie is assigned when a user enables agent(s) manually.
  • Support soft deletion of agents using deleted boolean column in agents table
  • AgentConfigService and AgentDAO related changes
    • Make AgentDAO class responsible for performing DB operations on agents table
    • Remove existing AgentConfigService class and bring all that functionality into AgentService class
    • Make AgentService the only class responsible for talking to AgentDAO and for maintaining a cache of agents in the form of AgentInstances
    • Make AgentService be the single point of contact for performing any operations on agents. It will decide smartly when to go to DB through AgentDAO and when to return agent(s) detail from the cache
    • AgentService will listen for changes to agents in DB and refresh the agent's cache (AgentInstances) whenever appropriate
    • AgentService will support mechanism to register listeners so that agents changes in DB can be further propagated to interested parties with more relevant details
    • AgentService should make sure that when an environment-agent are associated in config repo then that association should not be allowed to be removed either from UI or from agents API
    • AgentService should make sure that when an environment-agent are associated in config repo then that association should not be stored in the database
    • AgentService should provide a new method to allow associating multiple agents with a single environment. This is to support the use case of editing an environment from the environment edit page.
  • Switch from using GoConfigService to AgentService to respond to any agent related queries and operations
  • Agents and Environments API changes
    • Make changes to environments API so that it does not handle maintaining (add/update/remove) agent associations. It should only handle managing environments in config XML
    • Make agents API changes so that it handles maintaining (add/update/remove) environments, resources association, enabling, disabling and deleting agents in DB instead of in config XML
  • Maintain the sanctity of agents data in DB as well as in-memory by putting in place appropriate locking/synchronization mechanisms wherever necessary
  • Environments UI changes,
    • In environments UI, earlier all the basic environments information along with pipeline, agents and environment variable associations was handled by environments API. This should be changed to make agents association be managed by agents API and rest everything should be managed by environments API
    • This change should also make sure that environment information (and it's pipeline and environment variable association) which is stored in config XML and agents information which is stored in DB are handled as one atomic operation
  • Rename AgentConfig class to Agent class so that there are only two classes that represent Agent. The Agent class represents a database row in the Agents table and AgentInstance contains runtime information of an agent.
  • Make changes to switch from going to *CruiseConfig*, *GoConfig* classes to read/write agents data to and from config XML to go to AgentService to read/write data to and from DB

  • Fix existing test failtures

    • All the failing unit test caused by the above changes
    • All the failing integration tests caused by the above changes
    • All the failing rails tests caused by the above changes
    • All the failing functional tests caused by the above changes
  • Add new version v6 of Agents API
    • Sunset V4 & V5 versions of Agents API
    • Document Agents V6 API
  • Document Environment API v3

  • Add new tests covering all of the above changes

    • Add unit tests
    • Add integration tests
    • Add rails tests
    • Add functional tests
  • Verifications - verify that,

    • For the same set of inputs, Agents API v6 behaves same across master and PR branch except for the anticipated changes
    • For the same set of inputs, Environment API v3 behaves same across master and PR branch except for the anticipated changes
    • There are unit-tests and/or integration tests for affected public methods in
      • AgentsControllerV5
      • AgentsControllerV4
      • EnvironmentsControllerV3
      • AgentService
      • AgentDao
      • GoConfigService (agent methods were removed. No tests to add for this)
      • Agents
      • Agent
      • AgentInstances
      • AgentInstance
      • ResourceConfig(s)
      • EnvironmentConfigService
      • EnvironmentsConfig
      • EnvironmentConfig subclasses
      • EnvironmentAgentConfig
      • Agent(s)UpdateValidator
      • CommaSeparatedString
      • GoConfigFileHelper (Test helper class - no tests needed for this)
      • resources_controller.rb
      • environments_controller.rb
    • Analytics plugin is working fine after these changes (mainly the notifications part)
    • Both the static & elastic agents use cases related to the execution of pipeline is working fine
  • Add Postgres migrations for the addon

@vrushaliwaykole vrushaliwaykole force-pushed the move-agents-out-of-config-xml branch from 1f427f0 to ee7cb12 Compare May 8, 2019 11:24
@vrushaliwaykole vrushaliwaykole force-pushed the move-agents-out-of-config-xml branch 4 times, most recently from 1d8bc2e to e3e18d8 Compare May 17, 2019 09:44
@vrushaliwaykole vrushaliwaykole force-pushed the move-agents-out-of-config-xml branch 5 times, most recently from 58a20c0 to f028eb1 Compare May 24, 2019 12:41
@vrushaliwaykole
Copy link
Contributor Author

vrushaliwaykole commented May 28, 2019

Done:

  1. Agent migration from XML to the database:
    • Agents removal from XML
    • Agents removal under environments
  2. Syncing of AgentInstances with database
  3. Basic job assignment and execution flow for static and elastic agents

TODO (Based on things that we're aware of till date):

  1. Remove reference of agents from EnvironmentConfig
  • Update EnvironmentPipelineMatcher to use agents from database
  • EnvironmentConfigService will no longer be responsible for agent operations. Instead, use AgentService
    As per discussions with @ketan, @maheshp, removing agents from environment config requires a lot of refactoring and hence we decided to keep agent references.
  1. Remove reference of agents from GoConfig
  • GoConfigService will no longer answer agent related queries. Instead, use AgentService
  1. Remove AgentConfig related commands
  • We use commands(Update, Delete, Enable, Disable) to update agents. These commands take care of validations at the time of insert/update. We need to move these validations on the database updates:
    • validatePresenceOfEnvironments while adding/removing environments on agents
    • checkIfResourcesAreBeingUpdatedOnElasticAgents while adding resources on elastic agents
    • validateOperationOnPendingAgents while performing any operation on pending agents before enabing them
  1. Job assignment
  • Check if a job is assigned to an agent based on environments
  1. Agent API
  • Update agent API to use agents from the database and associated environments from config.

Future scope

  1. Performance fixes. Use of cache or in memory AgentInstances for agent queries
  2. Check if we need to add listener which will remove an environment from an agent if the environment is removed from config.

/cc @maheshp @adityasood @rajiesh @ketan

@maheshp
Copy link
Contributor

maheshp commented May 28, 2019

@vrushaliwaykole agents can be assigned to an Environment through ConfigRepo. This is something to keep in mind and verify.

public String getIpAddress() {
return this.ipAddress;
public boolean isElastic() {
return isNotBlank(elasticAgentId) && isNotBlank(elasticPluginId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have polymorphic objects like ElasticAgent extends Agent. I know it might cascade into a lot of changes everywhere else, so feel free to ignore this for now.


private String dburl(Boolean mvccEnabled) {
return "jdbc:h2:" + systemEnvironment.getDbPath() + "/" + configuration.getName()
return "jdbc:h2:" + systemEnvironment.getDbPath().getAbsolutePath() + "/" + configuration.getName()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was getAbsolutePath() deliberately added?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The DB is now being updated during the config migration and it required absolute path for the same.

@vrushaliwaykole vrushaliwaykole force-pushed the move-agents-out-of-config-xml branch 2 times, most recently from feb0548 to 657813c Compare May 30, 2019 11:03
@vrushaliwaykole
Copy link
Contributor Author

Current scenario:
cruise-config.xml does not have an association of agents to environments. But config-repo can still associate agents to environments. There is a validation to validate agent uuid associated with the environment. This checks whether an agent exists or not.

Issue:
Validating whether agent exists or not while validating tree of CruiseConfig does not make any sense as the agents are not part of CruiseConfig. Also, If we try to do so, we will need to access AgentConfigService to check if an agent exists in the database or not. This will open new issues of circular dependencies.

Solution:
Apply validation explicitly.

  1. For config repo:
    Check if agent exists or not while loading the partial config.

  2. For environments API:
    Environments API calls EnvironmentConfigService which takes care of CRUD operations for environments. EnvironmentConfigService calls AgentConfigService to update agent in case of association of agents and environments are changed. The validation can be added here to check if an agent exists or not.

suggestions appreciated.

/cc @akshaydewan @ketan @msaurabh1978 @maheshp

kritika-singh3 and others added 30 commits September 16, 2019 13:50
 - ensured that in case of updating a pending agent, if for some reason, the DB save fails save fails, the cache should not get updated.
 - in Env Config Service - there is only one sync method which will update the local envs copy with the input and also sync it with the agents in DB
…ect and throws appropriate exceptions for various error situations.

AgentsControllerV6.update() is modified to understand those exception and create appropriate response.

Environment config classes have quite a few critical business logic but there were no sufficient tests so added those missing tests for basic environment config even though these were not modified as part of moving agents work
…tion, UnprocessableEntityException to be handled by centralized exception handlers, that are registered using RoutesHelper.init()
… service layer.

Service layer in case of delete throws appropriate exceptions for various scenarios

AgentsController does not handle and rethrows HttpException which are eventually  handled by centralized exception handler

AgentsController handles all exceptions other than HttpException and create appropriate response
…ironment Config methods

Even though these methods are not modified, they are in some way related to agents
…alize(), which loads all agents from DB to be cached in memory.

When there is an error in initialization of Agent Service, application can't function properly.

Hence throwing an exception with appropriate error message which is logged in the application start up code.

Hence not logging the exception in AgentDao.
…ring after commit callback to a generic util which takes a lambda as input to register it as after commit hook.

and rectified exception handling in agent service initialization
…gentsAssociationOfEnvironment() to not be aware of result object.

They now throw appropriate exception for various possible error situations and which are caugh/handled in controller and appropriate message is sent back in the response.

This is still WIP as AgentService.updateAgentsAssociationOfEnvironment() is used by rails controller which needs to be rectified to incorporate mentioned changes.
Reverted all the changes made to maintain the same in the EnvConfig service cache
Updated the AgentEnv Representer class to utlize envs stored in the agent object to determine the unknown configs and render them accordingly
Updated the tests as well.
 - removed the `values` method on `AgentInstances` as the object itself can be utilized as is. For creating a stream, used `stream(spliterator, noParalle)` method
   Updated the relevant tests
 - removed logging on `GoConfigMigrator.migrate` exception catch block as it is already handled by the handler
 - On Env Config Update API v3 - for rename entity exception, used case-insensitive equals
…ult object to the agent service.

instead surrounded the service call with begin-rescue and rendered the appropriate error
…er - which can be invoked from anywhere on demand and is disabled/dormant by default
 - Updated the agent resource validation to add error on `resource` key
 - Added report generation logic at the end of agent-env association performance test.
 - patchEnv method of Env Config Service will no longer accept any list of agents to add/remove.
 - Updated the PatchEnvCommand to remove all logic of agent association/validation
 - Updated the EnvAgentConfig.validateUuidPresent method to add config error if the list of uuids received is null.
   This is required in case a config repo with agent-env association is added to a GoCD server with no agents.
 - Removed all config annotations from EnvAgentConfig(s)
…nt was not found in the DB. This is done to ensure that if an agent is not assigned a cookie until the same has been successfully registered/enabled in the GoCD server.
…adding agent-env association via either config change or agent service listeners
… in the DB

Added logs to specify what all agents are being migrated
Updated the tests to verify for both insert and update agent scenario
 - elastic agent created for a job (pipeline) associated with an env was failing. This is fixed.
 - delete agent request from elastic-agent plugins will now be executed without any validations. We assume that the plugin knows best abt the agents.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants