Skip to content

nixos/matrix-synapse: add override for PostgreSQL database assertion#236062

Closed
lschuermann wants to merge 1 commit intoNixOS:masterfrom
lschuermann:dev/synapse-postgres-assertion-override
Closed

nixos/matrix-synapse: add override for PostgreSQL database assertion#236062
lschuermann wants to merge 1 commit intoNixOS:masterfrom
lschuermann:dev/synapse-postgres-assertion-override

Conversation

@lschuermann
Copy link
Copy Markdown
Member

Description of changes

This adds an option to override the (useful) assertion that PostgreSQL must be enabled if Synapse is configured to use a database-connection to localhost (or ::1, 127.0.0.1). It was introduced to ensure that an upgrade to NixOS 20.03 does not result in PostgreSQL being disabled, as it is no longer automatically enabled in the Synapse module.

However, this assertion is problematic in more specific use-cases. For instance, we run multiple Synapse instances on a single machine within NixOS containers, which share the host's network namespace. In this setup, a PostgreSQL service is running on localhost, just not defined as part of the container's configuration. Also, this assertion prevents running PostgreSQL through means other than the NixOS module, or using other more complex network configurations.

Thus, this introduces an option to override this check.

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.11 Release Notes (or backporting 23.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

This adds an option to override the (useful) assertion that PostgreSQL
must be enabled if Synapse is configured to use a database-connection
to `localhost` (or `::1`, `127.0.0.1`). It was introduced to ensure
that an upgrade to NixOS 20.03 does not result in PostgreSQL being
disabled, as it is no longer automatically enabled in the Synapse
module.

However, this assertion is problematic in more specific use-cases. For
instance, we run multiple Synapse instances on a single machine within
NixOS containers, which share the host's network namespace. In this
setup, a PostgreSQL service is running on localhost, just not defined
as part of the container's configuration. Also, this assertion
prevents running PostgreSQL through means other than the NixOS module,
or using other more complex network configurations.

Thus, this introduces an option to override this check.
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jun 5, 2023
@lschuermann lschuermann requested a review from mweinelt June 5, 2023 09:42
@mweinelt mweinelt requested review from D4ndellion, Ekleog, Ma27, Ralith, mguentner and sumnerevans and removed request for mweinelt June 5, 2023 09:45
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Jun 5, 2023
'';
};

overrideLocalPostgresCheck = mkOption {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the naming is slightly misleading since the condition using this option (hasLocalPostgresDB) also influences how unit ordering of matrix-synapse.service looks like (i.e. whether or not to depend on postgresql.service) and the option's name and description should reflect that.

Unfortunately I can't think of a better option name right now, sorry.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point, and something I didn't think of. However, I find it surprising / magic(TM) that this is at all dependent on the database URL as configured in the synapse configuration.

It seems that currently there is no strict dependency on PostgreSQL other than both being enabled, and the start order being encoded in the synapse unit's After= setting. In contrast to Requires=, the systemd.unit manpage seems to suggest that this solely ensures ordering, but does not specify a hard requirement on the specified unit. This also holds in practice: specifying a non-existent service in After= is fine, whereas Requires= will generate a warning.

Keeping this in mind, it seems there's two options here:

  • Right now, a local PostgreSQL database configured within Synapse will require PostgreSQL to be enabled, and started beforehand. We can generalize this behavior and say, if a PostgreSQL service is enabled through the corresponding NixOS module (generating a postgres.service unit), always start synapse after that (unconditionally add After=postgres.service). The one edge-case where the behavior would be different is when the system hosts a PostgreSQL database, but Synapse connects to a non-local instance instead.
  • When a user chooses to override this check, it doesn't really make any sense to depend on an arbitrary postgres.service, either specified manually or generated through the corresponding unit file. Thus it also seems perfectly reasonable to remove this After= dependency when the override is used.

I also can't really come up with a better name. I'm leaning towards the latter option; could a compromise be that I add this as a remark in the option's documentation? Given that this is an edge-case, it seems fair to assume users would read that comment.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I find it surprising / magic(TM) that this is at all dependent on the database URL as configured in the synapse configuration

Yeah, noticed that too and wanted to fix it later on, but right now I have a different idea:

  • drop the assertion because it was mainly supposed to be a helper to ease the migration. Currently, everyone who runs synapse should have postgres configured properly and everyone who's about to do so can see from the docs that configuring postgres manually is required, i.e. there's no real value to keep the assertion.
  • add a requires = that adds postgresql.service the same way after does. This will work fine for the majority of the cases.
  • in your edge case you can fix the problem on your own entirely by doing systemd.services.matrix-synapse.requires = lib.mkForce [];.

And finally, we don't need to come up with an option name on our own then ;-)

wdyt?

Ma27 added a commit to Ma27/nixpkgs that referenced this pull request Oct 9, 2023
…B to be up

Closes NixOS#236062

The PR NixOS#236062 was submitted because of the following problem: a synapse
instance was running in a NixOS container attached to the host network
and a postgresql instance on the host as database. In this setup,
synapse connected to its DB via 127.0.0.1, but the DB wasn't locally set
up and thus not configured in NixOS (i.e.
`config.services.postgresql.enable` was `false`). This caused the
assertion removed in this patch to fail.

Over three years ago this assertion was introduced when this module
stopped doing autoconfiguration of postgresql entirely[1] because a
breaking change in synapse couldn't be managed via an auto-upgrade on
our side. To make sure people don't deploy their DB away by accident,
this assertion was introduced.

Nowadays this doesn't serve any value anymore because people with
existing instances should've upgraded by now (otherwise it's their job
to carefully read the release notes when missing upgrades for
several years) and people deploying fresh instances are instructed by
the docs to also configure postgresql[2].

Instead, it only causes issues in corner cases like NixOS#236062, so after
some discussion in that PR I think it's time to remove the assertion
altogether.

Also, there's no `Requires=` for `postgresql.service` in the systemd
units which means that it's not strictly guaranteed that the DB is up
when synapse starts up. This is fixed now by adding `requires`. To avoid
being bitten by above mentioned cases again, this only happens if
`config.services.postgresql.enable` is `true`.

If somebody uses a non-local postgresql, but has also deployed a local
postgresql instance on the synapse server (rather unlikely IMHO), it's
their job to opt out of this behavior with `mkForce` (this is precisely one
of the use-cases `mkForce` and friends were built for IMHO).

[1] NixOS#80447
[2] https://nixos.org/manual/nixos/stable/#module-services-matrix-synapse
@Ma27 Ma27 closed this in #259980 Oct 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants