Skip to content

sql: support client-provided password hashes#72579

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
knz:20211105-pw
Dec 1, 2021
Merged

sql: support client-provided password hashes#72579
craig[bot] merged 3 commits intocockroachdb:masterfrom
knz:20211105-pw

Conversation

@knz
Copy link
Copy Markdown
Contributor

@knz knz commented Nov 9, 2021

Fixes #50757

We wish to use this in the CC control plane, when provisioning SQL
accounts in new clusters, or when users manipulate their user list in
the CC management console.

Release note (security update): It is now possible to pre-compute the
hash of the password credentials of a SQL user client-side, and set
the SQL user's password using the hash, so that the CockroachDB never
sees the password string in clear in the SQL session.

This auto-detection is subject to the new cluster setting
server.user_login.store_client_pre_hashed_passwords.enabled. This setting
defaults to true (i.e. feature enabled).

This feature is meant for use in automation/orchestration, when the
control plane constructs passwords for users outside of CockroachDB,
and there is an architectural desire to ensure that cleartext
passwords are not transmitted/stored in-clear.

Note: when the client provides the password hash, CockroachDB
cannot carry any checks on the internal structure of the password,

such as minimum length, special characters, etc.

Should a deployment require such checks to be performed database-side,
the operator would need to disable the mechanism via the cluster
setting named above. When upgrading a cluster from a previous version,
to ensure that the feature remains disabled throughout the upgrade,
use the following statement prior to the upgrade: sql INSERT INTO system.settings(name, value, "valueType") VALUES('server.user_login.store_client_pre_hashed_passwords.enabled', 'false', 'b');

(We do not recommend relying on the database to perform password
checks. Our recommended deployment best practice is to implement
credential definitions in a control plane / identity provider that is
separate from the database.)

Release note (sql change): The CREATE ROLE and ALTER ROLE
statements now
accept password hashes computed by the client app. For example:
CREATE USER foo WITH PASSWORD 'CRDB-BCRYPT$2a$10$.....'.

Note: this feature is not meant for use by human users / in
interactive sessions; it is meant for use in programs, using the
computation algorithm described below.

This auto-detection can be disabled by changing the cluster setting
server.user_login.store_client_pre_hashed_passwords.enabled to false.

Note: this design mimics the behavior of PostgreSQL, which recognizes
pre-computed password hashes when presented to the regular PASSWORD
option (postgresql.org/docs/14/sql-createrole.html).

The password hashes are auto-detected based on their lexical
structure. For example, any password that starts with the prefix
CRDB-BCRYPT$, followed by a valid encoding of a bcrypt hash (as
detailed below), is considered a candidate password hash.

To ascertain whether a password hash will be recognized as such,
orchestration code can use the new built-in function
crdb_internal.check_password_hash_format().

Currently, CockroachDB only recognizes password hashes computed using
Bcrypt, as follows (we detail this algorithm so that orchestration
software can implement their own password hash computation, separate
from the database):

  1. take the cleartext password string.

  2. append the following byte array to the password:
    e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    (these are 32 hex-encoded bytes)

    (What are these bytes? it's the SHA-256 hash of an empty string. Why
    is it appended? This is a historical oddity in the CockroachDB with
    no particular reason. It adds no security.)

  3. choose a Bcrypt cost. (CockroachDB servers use cost 10 by default.)

  4. generate a bcrypt hash of the string generated at step 2 with the
    cost chosen at step 3, as per

    https://en.wikipedia.org/wiki/Bcrypt

    or

    https://bcrypt.online

    Note: at this point, CockroachDB only supports hashes computed using
    Bcrypt version 2a.

  5. Encode the hash into the format recognized by CockroachDB: the
    string CRDB-BCRYPT, followed by the standard bcrypt hash
    encoding ($2a$...).

Summary:

Hash method Recognized by check_password_hash_format() ALTER/CREATE USER WITH PASSWORD
crdb-bcrypt yes (CRDB-BCRYPT$2a$...) recognized if enabled via cluster setting
scram-sha-256 yes (SCRAM-SHA-256$4096:...) not implemented yet (issue #42519)
md5 yes (md5...) obsolete, will likely not be implemented

@knz knz requested review from a team, RichardJCai and aaron-crl November 9, 2021 17:45
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Nov 9, 2021

@bdarnell @aaron-crl a question remains of whether we want this feature to have an opt-out configuration setting? I don't think so, but please confirm.

Also we could perhaps add a check on the minimum/maximum value of the bcrypt cost?

@knz knz requested a review from a team November 9, 2021 17:50
@knz knz force-pushed the 20211105-pw branch 2 times, most recently from bb62637 to cb886ae Compare November 9, 2021 19:09
@rafiss
Copy link
Copy Markdown
Collaborator

rafiss commented Nov 9, 2021

Just curious, how does this interact with the scram authentication method described at https://www.postgresql.org/docs/14/auth-password.html and #42519 ?

I'm curious why we're deciding to introduce a new HASHED PASSWORD role option? Based on my read of those PG docs, it seems like the way PG supports receiving hashed passwords is if the user sets the password_encryption setting in postgresql.conf. Perhaps instead of a special HASHED PASSWORD role option or a .conf file, we could add a cluster setting like server.auth.password_encryption with allowed values of password or scram-bcrypt. (Open question: not sure what to do with the password_encryption session setting.)

I'm suggesting this because:

  1. having fewer role options might be less confusing, especially when we have both a HASHED PASSWORD and a PASSWORD role option
  2. This doesn't seem like the kind of thing that operators would do by hand, since it involves computing hashes. So if they have existing tools, it might be an obstacle to update those tools to use this new HASHED PASSWORD syntax. Especially if there's an expectation that they should be able to use a tool that worked with Postgres.

NB: I might be misunderstanding the intent of this PR. Let me know if the PR is solving a different problem than the one I'm talking about.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Nov 10, 2021

These are valid questions.

how does this interact with the scram authentication method ?

It's orthogonal, and complementary to it. We can provision SCRAM seeds/hashes via the new HASHED PASSWORD syntax, when we start supporting SCRAM.

why we're deciding to introduce a new HASHED PASSWORD role option?

I added this text to the release note:

Note that this design differs from PostgreSQL, which recognizes
pre-computed password hashes when presented to the regular PASSWORD
option (https://www.postgresql.org/docs/14/sql-createrole.html). The
choice to diverge from PostgreSQL was made after observing that the pg
logic would prevent a user from using a random string of characters as
password that happens to have a similar structure as a password hash.

Specific quote from the pg doc, for context: If the presented password string is already in MD5-encrypted or SCRAM-encrypted format, then it is stored as-is regardless of password_encryption (since the system cannot decrypt the specified encrypted password string, to encrypt it in a different format).

This PR's choice to eschew auto-detection and use separate syntax was requested by @bdarnell in this comment.

I am personally ambivalent about whether to use a new syntax or have the baseline PASSWORD option auto-detect hashes. Note that if we decided to change this later, this would be backward-compatible with this PR (we are not painting ourselves in a corner with this design).

Based on my read of those PG docs, it seems like the way PG supports receiving hashed passwords is if the user sets the password_encryption setting in postgresql.conf.

That's incorrect. password_encryption is only used/applied when the client provides a cleartext password. It's not possible to convert a user-supplied hash to a different hash function.

Perhaps instead of a special HASHED PASSWORD role option or a .conf file, we could add a cluster setting like server.auth.password_encryption with allowed values of password or scram-bcrypt. (Open question: not sure what to do with the password_encryption session setting.)

When we support multiple hashing functions, yes, we will want to do this. This is discussed in this section of the RFC.

This current PR does not add new password encoding methods so we do not need to add a configuration option yet.

  1. having fewer role options might be less confusing, especially when we have both a HASHED PASSWORD and a PASSWORD role option
  • nb: the HASHED PASSWORD "option" is not really a separate option from PASSWORD; it's merely a modifier on it. They are handled the same in the code modulo a conditional.
  • If the main objection is confusion, I would be OK with not documenting the HASHED PASSWORD syntax, or disclaiming that it is only intended for programmatic uses (not interactive).
  1. This doesn't seem like the kind of thing that operators would do by hand

definitely not, you're right. This is exclusively meant for use by automation, in our case in the CC control plane logic that provisions SQL accounts in new clusters.

I added notes to that effect in the commit message and release note.

If they have existing tools, it might be an obstacle to update those tools to use this new HASHED PASSWORD syntax.

There is no current way we can make existing postgres tools compatible with CockroachDB, even if we supported the postgres way to provide client-computed hashes to the server (using the pg syntax/semantics explained above), for another reason: we currently do not support any pg-compatible hash function (pg only supports md5 and scram, crdb currently only supports bcrypt).

If/when we do support scram, and we want compatibility with pg tools, we can revisit the choice to use a different syntax and perhaps consider doing the same auto-detection that pg does (with the same caveat about security as currently mentioned by Ben and in the release note). This will be possible since, as pointed out above, we can make this chance in a backward-compatible way with this PR. I'd like to delay this choice until strictly necessary.

Let me know if the PR is solving a different problem than the one I'm talking about.

This PR is part of the larger-scope project to "make passwords better" in the CC infrastructure. It has been pointed out in two places:

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Nov 10, 2021

Note that even with the separate syntax as spelled out here, this code is still relying on auto-detection to distinguish bcrypt-hashed passwords and scram-hashed passwords.

In my opinion, this is OK: the textual encoding of the hashes contains the type of hash as prefix. Bcrypt hashes start with $2a$, md5 (which we will not support) start with md5: and the scram hashes start with scram-sha-256:.

However, if we don't like it, I have a strawman proposal: provide the chosen hash function as an extra positional argument in the syntax, for example:

CREATE USER foo WITH PASSWORD '$2a$10$.....' USING HASH 'bcrypt';

(I added a final commit in this PR to implement just that, for discussion. We can remove it if we don't like it and keep just HASHED PASSWORD).

Here are some arguments in favor:

  • makes it possible to double check the intent of the code (I find this suspicious: the prefix in the hash encoding also signals intent by declaring which hash function is being used)
  • clarifies to a human inspector what is going on (I also find this suspicious: the crdb bcrypt is really the odd one out, and when we support scram, the scram-sha-256 prefix in the hash value will clarify what is going on already)

Consider why these arguments are suspicious:

CREATE USER foo WITH PASSWORD 'scram-sha-256:.....' USING HASH 'scram-sha-256'; -- this seems redundant?

Here are some arguments against providing the hashing function as separate syntax:

  • it seems largely redundant.
  • it forces the orchestration / automation code to store crdb's idea of the name of the hash function alongside the value in the control plane, for when the credentials are meant to be re-populated in multiple clusters or after a backup. It's likely that the orchestration will either not need to store the hash function type, or will have its own (internal) representation for it. Requiring the automation to store crdb's hash function type string creates an odd coupling.

All in all, I'd be in favor of using auto-detection instead, i.e. remove the last commit with the alternate syntax. But I welcome arguments either way.

@catj-cockroach catj-cockroach requested review from catj-cockroach and removed request for a team November 19, 2021 17:22
Copy link
Copy Markdown
Contributor

@catj-cockroach catj-cockroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, 10 of 11 files at r2, 9 of 9 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @knz, and @RichardJCai)


pkg/security/password.go, line 85 at r2 (raw file):

// CheckPasswordHashValidity checks that a (user-provided) password
// hash is recognized as a valid hash.
func CheckPasswordHashValidity(ctx context.Context, hashedPassword string) ([]byte, error) {

Why is context.Context necessary here? It doesn't appear we're using it?

Copy link
Copy Markdown
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 11 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @catj-cockroach, and @RichardJCai)


pkg/security/password.go, line 85 at r2 (raw file):

Previously, catj-cockroach (Cat J) wrote…

Why is context.Context necessary here? It doesn't appear we're using it?

It makes it easier to add logging when troubleshooting.

Copy link
Copy Markdown
Collaborator

@jeffswenson jeffswenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a slight preference for auto detecting the hash. As you say, the lack of compatible hash functions means using the syntax does not make us compatible with postgres. But I still think we should default to doing things the postgres way.

Overall I think this PR is great. All of the variants discussed would work for Cockroach Cloud.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @catj-cockroach, @knz, and @RichardJCai)


pkg/security/password.go, line 85 at r3 (raw file):

// CheckPasswordHashValidity checks that a (user-provided) password
// hash is recognized as a valid hash.
func CheckPasswordHashValidity(ctx context.Context, hashedPassword string) ([]byte, error) {

What do you think about creating a custom prefix for CRDB's hashing scheme? I'm thinking something like:

CREATE USER userhpw WITH ENCRYPTED PASSWORD 'crdb-brcypt:$2a$10$....'

That is a better match for the 'scram-sha-256:.....' syntax. It also reduces the probability of a randomly generated password looking like a bcrypt hashed password.

Copy link
Copy Markdown
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @catj-cockroach, @jeffswenson, and @RichardJCai)


pkg/security/password.go, line 85 at r3 (raw file):

Previously, JeffSwenson (Jeff Swenson) wrote…

What do you think about creating a custom prefix for CRDB's hashing scheme? I'm thinking something like:

CREATE USER userhpw WITH ENCRYPTED PASSWORD 'crdb-brcypt:$2a$10$....'

That is a better match for the 'scram-sha-256:.....' syntax. It also reduces the probability of a randomly generated password looking like a bcrypt hashed password.

If we were designing CockroachDB from scratch, then yes, absolutely.

However, in order to support transparent backward compatibility with existing CockroachDB deployments, this doesn't work. We have to recognize the non-prefixed bcrypt variant in any case. And then if we have to recognize it anyway... why would we then optionally add the prefix? it feels to me like noise.

Copy link
Copy Markdown
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews so far.

In response to the latest comments, I have removed the extra commit from the PR to use an alternate syntax (with USING).

And instead, I added a new extra commit that does hash auto-detection like PostgreSQL, upon invitation by Jeff.

PTAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @catj-cockroach, @jeffswenson, @knz, and @RichardJCai)

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Nov 24, 2021

@bdarnell can you have a look too? You were the main proponent of using a separate syntax form for pre-hashed passwords. How do you feel now?

(the extra commit also removes the new syntax "WITH HASHED PASSWORD" and sticks to the baseline "WITH PASSWORD". If those new semantics are accepted, I will squash both into just 1 so that the release notes can be simplified.)

Copy link
Copy Markdown
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR is great. i'm in favor of the current state! just had small nits

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @catj-cockroach, @jeffswenson, @knz, and @RichardJCai)


-- commits, line 95 at r6:
can this paragraph spell out a bit more what each of these components means? one of them is the algorithm, one of them is the cost? or if it differs from alg-to-alg then it'd be helpful to reference some standard format. then again, since we only support bcrypt right now, we might as well just be specific about that here


pkg/sql/logictest/testdata/logic_test/role, line 876 at r6 (raw file):


statement error pq: conflicting role options
CREATE ROLE rolewithcreate WITH PASSWORD foo HASHED PASSWORD bar

nit: i think this test is no longer relevant now with the latest changes

Copy link
Copy Markdown
Collaborator

@jeffswenson jeffswenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @catj-cockroach, @knz, and @RichardJCai)


pkg/security/password.go, line 85 at r3 (raw file):

Previously, knz (kena) wrote…

If we were designing CockroachDB from scratch, then yes, absolutely.

However, in order to support transparent backward compatibility with existing CockroachDB deployments, this doesn't work. We have to recognize the non-prefixed bcrypt variant in any case. And then if we have to recognize it anyway... why would we then optionally add the prefix? it feels to me like noise.

Is there a technical need to couple the storage format to the format expected by ALTER USER? I suspect we will eventually need to decouple them. For example, we may want to replace the storage format with a protobuf when we add support for multiple passwords.

Copy link
Copy Markdown
Contributor

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sql / role option stuff LGTM, don't have expertise to comment on the actual hashing mechanisms.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @catj-cockroach, and @knz)


-- commits, line 32 at r6:
This release note is no longer accurate right?

Copy link
Copy Markdown
Contributor

@catj-cockroach catj-cockroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the only risk for automatically detecting different hashes would be if we have two different algorithms producing similar output. Given that we only support bcrypt here, I don't see any issue in not adding the USING HASH syntax. If we decide to add another hashing algorithm, we should discuss at that point.

The autodetection logic looks good to me!

Reviewed 2 of 15 files at r4, 13 of 13 files at r5, 13 of 13 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl and @knz)

Copy link
Copy Markdown
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @jeffswenson, and @knz)


pkg/security/password.go, line 85 at r3 (raw file):

Previously, JeffSwenson (Jeff Swenson) wrote…

Is there a technical need to couple the storage format to the format expected by ALTER USER? I suspect we will eventually need to decouple them. For example, we may want to replace the storage format with a protobuf when we add support for multiple passwords.

Aha, 💡 !

Thanks for the hint. You're right, we can be prescriptive about the SQL syntax of the hash even if we strip the prefix in the bcrypt case under the hood before storage. That also removes the asymmetry with postgresql, and simplifies the heuristic for auto-detection. Thanks for the tip! I'll amend the PR accordingly.

Copy link
Copy Markdown
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, @catj-cockroach, @jeffswenson, @knz, @rafiss, and @RichardJCai)


-- commits, line 59 at r8:
I agree it should be enabled by default. However, we do want the option to disable still; I rewrote the release note to clarify:

Note: when the client provides the password hash, CockroachDB
cannot carry any checks on the internal structure of the password,

such as minimum length, special characters, etc.
Should a deployment require such checks to be performed database-side,
the operator would need to disable the mechanism via the cluster
setting named above.


-- commits, line 68 at r8:

Previously, bdarnell (Ben Darnell) wrote…

I'd include the dollar sign in this prefix, and reject any passwords that have a hash-format prefix but are not validly encoded. (edit: I see later on that that's what the implementation does, but this description sounds like passwords starting with BCRYPT are allowed as long as the remaining characters are not properly formatted for bcrypt).

Clarified.


-- commits, line 91 at r8:

The prefix for this format should be something like "CRDB-BCRYPT"

Good idea. Done.

No need for us to go into a detailed-but-informal description of what bcrypt encoding means.

The algorithm should be documented so that it can be implemented reliably in a 3rd party tool without peeking at the crdb source code.


-- commits, line 100 at r8:

Previously, bdarnell (Ben Darnell) wrote…

The 2 here is not really version 2 of bcrypt, it's version two of crypt (https://en.wikipedia.org/wiki/Crypt_(C)#Key_derivation_functions_supported_by_crypt). I would say that CockroachDB supports bcrypt versions 2 and 2a (it looks like it also accepts all other one-character minor versions but treats them all equivalent to 2a).

Done.


pkg/security/password.go, line 85 at r3 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Note that the same insight would let us support the md5 password format without taking a step backwards in security (I am not advocating for this, just mentioning it in case we want to soften the "never" language. I don't think it's worth the effort since SCRAM is a better solution, but this would have been a reasonable option if md5 were the only non-plaintext password option in pgwire).

We could define a new storage format BCRYPT-MD5 (i.e. bcrypt(cost, md5(password))) and when the password is set using the input format password or MD5$hash we could compute and store the BCRYPT-MD5 encoding. That would let us support the pgwire md5 auth protocol in addition to plaintext (and as a bonus would be a first step to phasing out the hash-of-empty-string accident for new passwords).

interesting. filing a separate issue #73337 to track.


pkg/sql/sem/builtins/builtins.go, line 6100 at r8 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Nit: "check_password" to me implies comparing an input password/hash to the stored one for authn purpose. I think I'd call this validate_password_hash (or add _format to the name?)

I'm going for _format.

Copy link
Copy Markdown
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, @catj-cockroach, @jeffswenson, @knz, @rafiss, and @RichardJCai)


-- commits, line 59 at r8:

Previously, knz (kena) wrote…

I agree it should be enabled by default. However, we do want the option to disable still; I rewrote the release note to clarify:

Note: when the client provides the password hash, CockroachDB
cannot carry any checks on the internal structure of the password,

such as minimum length, special characters, etc.
Should a deployment require such checks to be performed database-side,
the operator would need to disable the mechanism via the cluster
setting named above.

In that case I think a different name for the setting would be appropriate. Putting "detect" in the name suggests that the setting is about the autodetection causing some theoretically-valid passwords to be rejected or misinterpreted, while an explicit syntax might allow for pre-hashed and raw passwords to be passed unambiguously. The setting name should instead have something to do with the enforcement of password constraints (and since we don't have those constraints yet, maybe the introduction of the setting should wait until we do).


-- commits, line 91 at r8:

Previously, knz (kena) wrote…

The prefix for this format should be something like "CRDB-BCRYPT"

Good idea. Done.

No need for us to go into a detailed-but-informal description of what bcrypt encoding means.

The algorithm should be documented so that it can be implemented reliably in a 3rd party tool without peeking at the crdb source code.

Yes, but this documentation is incomplete: it just refers to "the bcrypt hash", but that hash is specified alongside its nonstandard base64-encoding and in practice all implementations I know of give you the encoded form instead of something for you to encode with this alphabet yourself.

We wish to use this in the CC control plane, when provisioning SQL
accounts in new clusters, or when users manipulate their user list in
the CC management console.

Release note (security update): It is now possible to pre-compute the
hash of the password credentials of a SQL user client-side, and set
the SQL user's password using the hash, so that the CockroachDB never
sees the password string in clear in the SQL session.

This auto-detection is subject to the new cluster setting
`server.user_login.store_client_pre_hashed_passwords.enabled`. This setting
defaults to `true` (i.e. feature enabled).

This feature is meant for use in automation/orchestration, when the
control plane constructs passwords for users outside of CockroachDB,
and there is an architectural desire to ensure that cleartext
passwords are not transmitted/stored in-clear.

Note: **when the client provides the password hash, CockroachDB
cannot carry any checks on the internal structure of the password,**
such as minimum length, special characters, etc.

Should a deployment require such checks to be performed database-side,
the operator would need to disable the mechanism via the cluster
setting named above. When upgrading a cluster from a previous version,
to ensure that the feature remains disabled throughout the upgrade,
use the following statement prior to the upgrade: ```sql INSERT INTO
system.settings(name, value, "valueType")
VALUES('server.user_login.store_client_pre_hashed_passwords.enabled', 'false',
'b'); ```

(We do not recommend relying on the database to perform password
checks. Our recommended deployment best practice is to implement
credential definitions in a control plane / identity provider that is
separate from the database.)

Release note (sql change):  The `CREATE ROLE` and `ALTER ROLE`
statements now
accept password hashes computed by the client app. For example:
`CREATE USER foo WITH PASSWORD 'CRDB-BCRYPT$2a$10$.....'`.

Note: this feature is not meant for use by human users / in
interactive sessions; it is meant for use in programs, using the
computation algorithm described below.

This auto-detection can be disabled by changing the cluster setting
`server.user_login.store_client_pre_hashed_passwords.enabled` to `false`.

Note: this design mimics the behavior of PostgreSQL, which recognizes
pre-computed password hashes when presented to the regular PASSWORD
option (https://www.postgresql.org/docs/14/sql-createrole.html).

The password hashes are auto-detected based on their lexical
structure. For example, any password that starts with the prefix
`CRDB-BCRYPT$`, followed by a valid encoding of a bcrypt hash (as
detailed below), is considered a candidate password hash.

To ascertain whether a password hash will be recognized as such,
orchestration code can use the new built-in function
`crdb_internal.check_password_hash_format()`.

Currently, CockroachDB only recognizes password hashes computed using
Bcrypt, as follows (we detail this algorithm so that orchestration
software can implement their own password hash computation, separate
from the database):

1. take the cleartext password string.
2. append the following byte array to the password:
   e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
   (these are 32 hex-encoded bytes)

   (What are these bytes? it's the SHA-256 hash of an empty string. Why
   is it appended? This is a historical oddity in the CockroachDB with
   no particular reason. It adds no security.)

3. choose a Bcrypt cost. (CockroachDB servers use cost 10 by default.)
4. generate a bcrypt hash of the string generated at step 2 with the
   cost chosen at step 3, as per

   https://en.wikipedia.org/wiki/Bcrypt

   or

   https://bcrypt.online/

   Note: at this point, CockroachDB only supports hashes computed using
   Bcrypt version 2a.

5. Encode the hash into the format recognized by CockroachDB: the
   string `CRDB-BCRYPT`, followed by the standard bcrypt hash
   encoding (`$2a$...`).

Summary:

| Hash method     | Recognized by `check_password_hash_format()` | ALTER/CREATE USER WITH PASSWORD           |
|-----------------|----------------------------------------------|-------------------------------------------|
| `crdb-bcrypt`   | yes (`CRDB-BCRYPT$2a$...`)                   | recognized if enabled via cluster setting |
| `scram-sha-256` | yes (`SCRAM-SHA-256$4096:...`)               | not implemented yet (issue cockroachdb#42519)        |
| `md5`           | yes (`md5...`)                               | obsolete, will likely not be implemented  |
Copy link
Copy Markdown
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, @catj-cockroach, @jeffswenson, @knz, @rafiss, and @RichardJCai)


-- commits, line 59 at r8:

Previously, bdarnell (Ben Darnell) wrote…

In that case I think a different name for the setting would be appropriate. Putting "detect" in the name suggests that the setting is about the autodetection causing some theoretically-valid passwords to be rejected or misinterpreted, while an explicit syntax might allow for pre-hashed and raw passwords to be passed unambiguously. The setting name should instead have something to do with the enforcement of password constraints (and since we don't have those constraints yet, maybe the introduction of the setting should wait until we do).

  1. I see your point about conveying the meaning in the name of the cluster stting. changed to server.user_login.store_client_pre_hashed_passwords.enabled.

  2. we already have password constraints, thanks to your particular insistence to make #51502 happen.


-- commits, line 91 at r8:

Previously, bdarnell (Ben Darnell) wrote…

Yes, but this documentation is incomplete: it just refers to "the bcrypt hash", but that hash is specified alongside its nonstandard base64-encoding and in practice all implementations I know of give you the encoded form instead of something for you to encode with this alphabet yourself.

TIL - I didn't know the spec included the representation of the hash.
Simplified the release note accordingly.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Dec 1, 2021

all right, all this seems to hold together now. (We can do followup PRs to fix stuff if necessary)

bors r=RichardJCai,JeffSwenson,catj-cockroach,rafiss,bdarnell

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 1, 2021

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 1, 2021

Build succeeded:

@craig craig bot merged commit e2140ec into cockroachdb:master Dec 1, 2021
@knz knz deleted the 20211105-pw branch December 1, 2021 20:03
@jeffswenson
Copy link
Copy Markdown
Collaborator

@knz I am interested in back porting this to v21.2. Do you have any objections to me creating a back port pr? I think it should be safe as long as I back port the setting default as "false".

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Dec 8, 2021

@jeffswenson No objections. But we'll also need to teach @abhinavg6 about the backport process and how to weigh the pros/cons of a backport and issue a PM approval.

@abhinavg6
Copy link
Copy Markdown

Based on following information from @jeffswenson

Serverless uses this feature to reduce the connection setup latency from ~120ms to ~40ms. Connection setup latency is important for the Serverless product, because it is designed to be used from Serverless compute platforms like AWS Lambda. Lambda functions are started on demand and are unable to preallocate connections. As a result connection setup latency is something our users are likely to measure and it is in the critical path of serving requests. The thought is we would enable this setting by default for Serverless clusters. Serverless is still in beta, so we can accept more risk than the rest of Cockroach Cloud. If we decide not to back port the feature, then all of the passwords generated between now and May (the likely release date of 22.1) will be generated using the existing expensive hash function. Once the feature is released, we would need to ask our users to rotate their passwords in order for them to take advantage of the improved performance.

I approve the backport of this functionality into 21.2 release, so we don't create a future tax burden for our Serverless customers. We should ensure that the following principles are met:

  • Is this additive-only and only runs for customers who have specifically “opted in” to it? - I think the answer to that is Yes since it's controlled by a cluster setting.
  • Is the new code protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in customers?

@jeffswenson
Copy link
Copy Markdown
Collaborator

blathers backport 21.2

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Dec 14, 2021

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from b208dc7 to blathers/backport-release-21.2-72579: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@knz
Copy link
Copy Markdown
Contributor Author

knz commented Dec 15, 2021

@jeffswenson do you know how to issue a backport by hand?

jeffswenson added a commit to jeffswenson/cockroach that referenced this pull request Dec 15, 2021
This commit contains the non-trivial changes needed to backport cockroachdb#72579.

Release note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this pull request Dec 16, 2021
This commit contains the non-trivial changes required to backport cockroachdb#72579.

server.user_login.store_client_pre_hashed_passwords.enabled's default
value was changed to false. Changing the default required updating some
of the tests to clear and set the setting.

The logic tests were adjusted to account for the lack of cockroachdb#71498. cockroachdb#71498
changed the AST representation of the role name to an identifier. This
is mostly transparent to users, but it changes the normalized
representation.

Release note (sql change): This backports the
server.user_login.store_client_pre_hashed_passwords.enabled setting. The
backported default value is false. In 22.1 the default is true.
jeffswenson added a commit to jeffswenson/cockroach that referenced this pull request Dec 17, 2021
This is part of the backport of cockroachdb#72579.

server.user_login.store_client_pre_hashed_passwords.enabled's default
value was changed to false. Changing the default required updating some
of the tests to clear and set the setting.

Release note (sql change): This backports the
server.user_login.store_client_pre_hashed_passwords.enabled setting. The
backported default value is false. In 22.1 the default is true.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: recognize client-supplied hashes in WITH PASSWORD like pg

8 participants