Development seed files

Development seed files are located in the gitlab/db/fixtures/development/ and gitlab/ee/db/fixtures/development/ folders. These files populate the database with records that help you verify that features, such as charts, work as expected on a local host.

The rake db:seed_fu task runs all development seeds except those gated behind a flag, which is usually passed as an environment variable.
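As a rough illustration of this gating, the following Ruby sketch models which seed files would run for a given environment. It assumes that FILTER is matched as a pattern against seed file names and that a flagged seed checks a single environment variable. The seed_selected? helper and its arguments are hypothetical and are not part of GitLab or the seed-fu gem.

```ruby
# Hypothetical model of seed selection; not GitLab or seed-fu code.
# Assumption: FILTER is a pattern matched against the seed file name.
# Assumption: a flagged seed runs only if its environment variable is non-empty.
def seed_selected?(filename, flag: nil, env: ENV)
  filter = env['FILTER']
  return false if filter && !filename.match?(Regexp.new(filter))

  flag.nil? || !env[flag].to_s.empty?
end

# A seed without a flag runs when no filter excludes it.
puts seed_selected?('31_devops_adoption.rb', env: {})

# A flagged seed is skipped when its flag is missing, even if FILTER matches.
puts seed_selected?('92_dora_metrics', flag: 'SEED_DORA',
                    env: { 'FILTER' => 'dora_metrics' })

# The same seed runs once both FILTER and the flag are set.
puts seed_selected?('92_dora_metrics', flag: 'SEED_DORA',
                    env: { 'FILTER' => 'dora_metrics', 'SEED_DORA' => '1' })
```

Passing the environment as an argument keeps the helper easy to try out without changing the real process environment.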

The following table summarizes the seeds and tasks that can be used to generate data for features.

| Feature | Command | Seed |
|---------|---------|------|
| DevOps Adoption | FILTER=devops_adoption bundle exec rake db:seed_fu | 31_devops_adoption.rb |
| Value Streams Dashboard | FILTER=cycle_analytics SEED_VSA=1 bundle exec rake db:seed_fu | 17_cycle_analytics.rb |
| Value Streams Dashboard overview counts | FILTER=vsd_overview_counts SEED_VSD_COUNTS=1 bundle exec rake db:seed_fu | 93_vsd_overview_counts.rb |
| Value Stream Analytics | FILTER=customizable_cycle_analytics SEED_CUSTOMIZABLE_CYCLE_ANALYTICS=1 bundle exec rake db:seed_fu | 30_customizable_cycle_analytics |
| CI/CD analytics | FILTER=ci_cd_analytics SEED_CI_CD_ANALYTICS=1 bundle exec rake db:seed_fu | 38_ci_cd_analytics |
| Contributions Analytics, Productivity Analytics, Code review Analytics, Merge Request Analytics | FILTER=productivity_analytics SEED_PRODUCTIVITY_ANALYTICS=1 bundle exec rake db:seed_fu | 90_productivity_analytics |
| Repository Analytics | FILTER=14_pipelines NEW_PROJECT=1 bundle exec rake db:seed_fu | 14_pipelines |
| Issue Analytics, Insights | NEW_PROJECT=1 bin/rake gitlab:seed:insights:issues | insights Rake task |
| DORA metrics | SEED_DORA=1 FILTER=dora_metrics bundle exec rake db:seed_fu | 92_dora_metrics |
| Code Suggestion data in ClickHouse | FILTER=ai_usage_stats bundle exec rake db:seed_fu | 94_ai_usage_stats |
| GitLab Duo | SEED_GITLAB_DUO=1 FILTER=gitlab_duo bundle exec rake db:seed_fu | 95_gitlab_duo |
| GitLab Duo: Seed failed CI jobs for Root Cause Analysis (/troubleshoot) evaluation | LANGCHAIN_API_KEY=$Key bundle exec rake gitlab:duo_chat:seed:failed_ci_jobs | seed_failed_ci_jobs |
| Pipeline metrics | FILTER=pipeline_metrics SEED_PIPELINE_METRICS=1 bundle exec rake db:seed_fu | 98_pipeline_metrics.rb |

Seed project and group resources for GitLab Duo

The gitlab:duo:setup Rake task executes the development seed file for GitLab Duo project and group resources. In self-managed mode, the task is idempotent and skips reseeding if the gitlab-duo group already exists. To force reseeding from the setup task, set GITLAB_DUO_RESEED=1.
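The skip-or-reseed rule for self-managed mode can be sketched as a small decision helper. This is only an illustration under stated assumptions: duo_reseed? is a hypothetical name, the group lookup is reduced to a boolean argument, and the actual setup task may perform additional checks.

```ruby
# Hypothetical decision helper for the self-managed case; not the task's real code.
# group_exists: whether the gitlab-duo group is already present.
def duo_reseed?(group_exists:, env: ENV)
  # Nothing has been seeded yet, so seeding is always needed.
  return true unless group_exists

  # The group exists: reseed only when explicitly forced.
  env['GITLAB_DUO_RESEED'] == '1'
end

puts duo_reseed?(group_exists: false, env: {})
puts duo_reseed?(group_exists: true, env: {})
puts duo_reseed?(group_exists: true, env: { 'GITLAB_DUO_RESEED' => '1' })
```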

To run the seed directly (outside the setup task) and recreate all resources:

SEED_GITLAB_DUO=1 FILTER=gitlab_duo bundle exec rake db:seed_fu

Configurable paths

By default, the seeder creates a group at gitlab-duo with a project at gitlab-duo/test, cloned from the test-repo. You can override these defaults with the following environment variables:

| Environment variable | Default | Description |
|----------------------|---------|-------------|
| GITLAB_DUO_GROUP_PATH | gitlab-duo | Path of the group to create. |
| GITLAB_DUO_PROJECT_PATH | test | Path of the project to create inside the group. |
| GITLAB_DUO_PROJECT_CLONE_URL | https://gitlab.com/.../test-repo.git | Git URL to clone as the project repository. |

For example, to seed into a custom group and project:

GITLAB_DUO_GROUP_PATH=my-duo-group GITLAB_DUO_PROJECT_PATH=my-project \
  SEED_GITLAB_DUO=1 FILTER=gitlab_duo bundle exec rake db:seed_fu

The same environment variables are respected by the gitlab:duo:setup Rake task.
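The defaults and overrides described in this section can be sketched in Ruby as follows. The variable names and default values come from the table above, but the duo_seed_paths helper is hypothetical and is not the seeder's actual code. The clone URL is left out because its default is elided in this document.

```ruby
# Hypothetical helper: resolve the group and project paths from the environment,
# falling back to the documented defaults.
def duo_seed_paths(env = ENV)
  group = env.fetch('GITLAB_DUO_GROUP_PATH', 'gitlab-duo')
  project = env.fetch('GITLAB_DUO_PROJECT_PATH', 'test')

  { group_path: group, project_path: project, full_path: "#{group}/#{project}" }
end

# No overrides: the default gitlab-duo/test location.
p duo_seed_paths({})

# Custom group and project, matching the shell example above.
p duo_seed_paths({ 'GITLAB_DUO_GROUP_PATH' => 'my-duo-group',
                   'GITLAB_DUO_PROJECT_PATH' => 'my-project' })
```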

Fixed IDs and evaluation framework compatibility

GitLab Duo group and project resources are also used by the Central Evaluation Framework for automated GitLab Duo evaluation. Some evaluation datasets refer to group or project resources. For instance, the prompt "Summarize issue #123" requires a corresponding issue record in PostgreSQL.

Currently, this development seed file and the evaluation datasets are managed separately. To keep the integration working, the seeder must create the same group and project resources on every run. For example, the ID and IID of each inserted PostgreSQL record must be identical every time the seeding process runs.

When using the default group and project paths (gitlab-duo/test), the seeder assigns a fixed base ID of 1_000_000 to all seeded records (group, project, epic, issue, merge request, and so on). This guarantees deterministic IDs that the evaluation datasets depend on.

When you provide custom paths through the environment variables above, fixed IDs are not applied and PostgreSQL assigns IDs automatically.

Custom-path seeds are fully functional for interactive use of GitLab Duo features (Duo Chat, code suggestions, and so on). The only limitation is that the Central Evaluation Framework datasets expect the fixed IDs created by the default paths. If you need to run automated evaluations, use the default gitlab-duo/test paths.
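The ID behavior described above can be sketched as follows. This is a minimal illustration, assuming that fixed IDs apply only when both the group and project paths are left at their defaults. The seed_base_id helper and the constant names are hypothetical and not taken from the seeder.

```ruby
# Hypothetical constants; the values come from this document.
DEFAULT_DUO_FULL_PATH = 'gitlab-duo/test'
DUO_FIXED_BASE_ID = 1_000_000

# Returns the fixed base ID for the default paths, or nil to signal that
# PostgreSQL should assign IDs automatically.
# Assumption: both paths must be the defaults for fixed IDs to apply.
def seed_base_id(group_path, project_path)
  "#{group_path}/#{project_path}" == DEFAULT_DUO_FULL_PATH ? DUO_FIXED_BASE_ID : nil
end

p seed_base_id('gitlab-duo', 'test')
p seed_base_id('my-duo-group', 'my-project')
```

Returning nil for custom paths mirrors the idea that the database, not the seeder, chooses the IDs in that case.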

The following projects depend on these fixtures:

See this architecture doc for more information.