cli: the `--insecure` flag will be removed and replaced by easier-to-use always-secure options

Epic: CRDB-12037

The `--insecure` flag disables a number of security controls that users fully expect to be in place, including developers and people who are just trying out CockroachDB.

Instead, we aim to ensure that all clusters are secure, but offer **new secure options** where the user can choose the combination of security features that best matches their needs and their environment.

(Additionally, the word "insecure" gives the wrong impression that CockroachDB is not a secure database. The "insecure mode" only really exists for internal testing by the CockroachDB team and should not have been exposed to users from the start.)

- [What you need to know in a post-insecure world](#what-you-need-to-know-in-a-post-insecure-world)
  - [New clusters](#new-clusters)
  - [Existing clusters](#existing-clusters-previously-run-with-insecure-mode)
- [Rationale](#rationale)
  - [Background](#background)
  - [Motivation](#motivation) 
  - [Strategy](#strategy)

## What you need to know in a post-insecure world

CockroachDB aims to remain easy to use when all clusters are secured!

### New clusters


Simplified decision making:
- for developers, single machine, no security requirements: 
  - `cockroach demo`, automatically secure
  - single-node secure cluster, with auto-managed TLS
- multiple machines or prod system:
  - for node-node traffic:
    - no special requirement on inter-node traffic: **auto-managed node-node TLS**
    - special requirements: 
      - org-mandated PKI: use org-mandated PKI to manage node-node TLS
      - user responsible for network security: no-TLS RPC
  - for node-client traffic:
    - no special requirements: **auto-managed SQL/HTTP TLS with either password or TLS authn**  
      **This is the most common pg-compatible scenario!**
    - special requirements:
      - org-mandated PKI: use org-mandated cert manager to manage SQL/HTTP TLS
      - user responsible for network security: no-TLS SQL/HTTP

Cluster is secure in all cases.
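
The decision list above can be read as two independent lookups, one per traffic type. Below is a minimal sketch of that reading; the `Requirement` enum, the two tables and the `choose` helper are hypothetical names introduced for illustration, and the option strings are descriptions copied from the list, not CockroachDB flags.

```python
from enum import Enum


class Requirement(Enum):
    """Hypothetical labels for the three branches of the decision list."""
    NONE = "no special requirement"
    ORG_PKI = "org-mandated PKI"
    USER_SECURED_NETWORK = "user responsible for network security"


# Option descriptions copied from the decision list (not real flag names).
NODE_NODE = {
    Requirement.NONE: "auto-managed node-node TLS",
    Requirement.ORG_PKI: "org-mandated PKI manages node-node TLS",
    Requirement.USER_SECURED_NETWORK: "no-TLS RPC",
}

NODE_CLIENT = {
    Requirement.NONE: "auto-managed SQL/HTTP TLS with password or TLS authn",
    Requirement.ORG_PKI: "org-mandated cert manager manages SQL/HTTP TLS",
    Requirement.USER_SECURED_NETWORK: "no-TLS SQL/HTTP",
}


def choose(node_node: Requirement, node_client: Requirement) -> dict:
    """Pick one option per traffic type; the two choices are independent."""
    return {
        "node-node": NODE_NODE[node_node],
        "node-client": NODE_CLIENT[node_client],
    }
```

For example, `choose(Requirement.NONE, Requirement.USER_SECURED_NETWORK)` pairs auto-managed node-node TLS with no-TLS SQL/HTTP, which corresponds to the "nodes and app inside the same k8s cluster" case discussed under Motivation.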

### Threat model

**Node-node connections**

| Threat \ Protection                                   | TLS transport+authn                       | no-TLS + Shared secret       | no-TLS + no auth (`--insecure`) | Mitigation possible via network restrictions? |
|-------------------------------------------------------|-------------------------------------------|------------------------------|---------------------------------|----------------------------------------------|
| Mistaken node join from wrong cluster                 | protected                                 | protected                    | vulnerable                      | no                                           |
| Complete cluster takeover via malicious node          | protected                                 | vulnerable via bruteforce    | vulnerable                      | partial                                      |
| Attacker MITM snoop of authn credentials              | protected                                 | vulnerable                   | vulnerable                      | yes                                          |
| Attacker accesses confidential data via network snoop | protected                                 | vulnerable                   | vulnerable                      | yes                                          |
| Attacker modifies cluster data in-flight              | protected                                 | vulnerable                   | vulnerable                      | yes                                          |
| Compromised authn credentials                         | Fix via cert rotation/revocation (online) | Update secret + full restart | N/A                             | partial                                      |

Note: the vulnerabilities in the 2nd and 3rd columns are amplified when the SQL and RPC listeners share the same TCP listener (address/port).

Partial mitigation is possible by using a separate `--sql-addr` and fencing off the node-to-node traffic into a private network.

**Node-client connections**

| Threat \ Protection                                        | TLS transport+authn                       | no-TLS + SCRAM authn     | no-TLS + no auth (`--insecure`) | Mitigation possible via network restrictions? |
|------------------------------------------------------------|-------------------------------------------|--------------------------|---------------------------------|-----------------------------------------------|
| Mistaken connect from/to wrong app or wrong cluster        | protected                                 | protected                | vulnerable                      | no                                            |
| Complete cluster takeover via escalation of privilege      | protected                                 | protected                | vulnerable                      | partial                                       |
| Rogue client app unrestricted access to data of 1 SQL user | protected                                 | protected                | vulnerable                      | partial                                       |
| Attacker MITM snoop of authn credentials                   | protected                                 | protected                | vulnerable                      | yes                                           |
| Attacker accesses app data via network snoop               | protected                                 | vulnerable               | vulnerable                      | yes                                           |
| Attacker modifies app data in-flight                       | protected                                 | vulnerable               | vulnerable                      | yes                                           |
| Compromised authn credentials                              | Fix via cert rotation/revocation (online) | Update password (online) | N/A                             | no                                            |

Note that the first column in the table above assumes that clients also validate server
TLS certs! Otherwise the following table applies:

| Threat \ Protection                                        | Two-way cert validation                   | Only srv auths client |
|------------------------------------------------------------|-------------------------------------------|-----------------------|
| Mistaken connect from/to wrong app or wrong cluster        | protected                                 | protected             |
| Complete cluster takeover via escalation of privilege      | protected                                 | protected             |
| Rogue client app unrestricted access to data of 1 SQL user | protected                                 | protected             |
| Attacker MITM snoop of authn credentials                   | protected                                 | vulnerable            |
| Attacker accesses app data via network snoop               | protected                                 | vulnerable            |
| Attacker modifies app data in-flight                       | protected                                 | vulnerable            |
| Compromised authn credentials                              | Fix via cert rotation/revocation (online) | still vulnerable (!)  |
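
To compare protection levels programmatically, the first node-client table can be encoded as plain data. The sketch below is illustrative only: the `COLUMNS` and `NODE_CLIENT_THREATS` names and the `vulnerable_threats` helper are hypothetical, and the "Compromised authn credentials" row is left out because its cells describe remediation rather than a protected/vulnerable status.

```python
from typing import List

# Column order follows the first node-client table.
COLUMNS = (
    "TLS transport+authn",
    "no-TLS + SCRAM authn",
    "no-TLS + no auth (--insecure)",
)

# One entry per table row: threat -> status under each column.
NODE_CLIENT_THREATS = {
    "Mistaken connect from/to wrong app or wrong cluster":
        ("protected", "protected", "vulnerable"),
    "Complete cluster takeover via escalation of privilege":
        ("protected", "protected", "vulnerable"),
    "Rogue client app unrestricted access to data of 1 SQL user":
        ("protected", "protected", "vulnerable"),
    "Attacker MITM snoop of authn credentials":
        ("protected", "protected", "vulnerable"),
    "Attacker accesses app data via network snoop":
        ("protected", "vulnerable", "vulnerable"),
    "Attacker modifies app data in-flight":
        ("protected", "vulnerable", "vulnerable"),
}


def vulnerable_threats(protection: str) -> List[str]:
    """Return the threats marked 'vulnerable' under the given column."""
    idx = COLUMNS.index(protection)
    return [
        threat
        for threat, statuses in NODE_CLIENT_THREATS.items()
        if statuses[idx] == "vulnerable"
    ]
```

Calling `vulnerable_threats("no-TLS + SCRAM authn")` returns only the two network-snooping rows, which are exactly the ones the table marks as mitigable via network restrictions.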



### Existing clusters previously run with "insecure mode"

1. determine the secure configuration that suits your needs (see above)
2. prepare configuration:
   - create authn credentials for SQL users (a simple password is enough)
   - prepare server flags and TLS certs that suit the needs identified above
   - add flag `--security-upgrade` to node configs.
3. perform rolling restart of cluster. At this point the cluster accepts but does not mandate the secure config.
4. remove flag `--security-upgrade` from node configs
5. perform 2nd rolling restart. This enforces the secure configuration.
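
The reason for two rolling restarts can be illustrated with a small simulation. This is a sketch under a stated assumption, not documented behavior: it interprets "accepts but does not mandate the secure config" as a node with `--security-upgrade` accepting both plain and secure connections. The `Mode` enum, the `ACCEPTS` table and the helper functions are hypothetical names.

```python
from enum import Enum
from itertools import combinations


class Mode(Enum):
    INSECURE = "insecure"    # legacy node started with --insecure
    UPGRADING = "upgrading"  # secure config plus --security-upgrade
    SECURE = "secure"        # secure config, flag removed


# Assumed connection kinds each mode accepts (an interpretation, see above).
ACCEPTS = {
    Mode.INSECURE: {"plain"},
    Mode.UPGRADING: {"plain", "secure"},
    Mode.SECURE: {"secure"},
}


def can_talk(a: Mode, b: Mode) -> bool:
    """Two nodes can connect if they share at least one connection kind."""
    return bool(ACCEPTS[a] & ACCEPTS[b])


def fully_connected(nodes) -> bool:
    """True if every pair of nodes can still connect."""
    return all(can_talk(a, b) for a, b in combinations(nodes, 2))


def rolling_restart(nodes, new_mode):
    """Switch nodes one at a time, yielding the state after each restart."""
    for i in range(len(nodes)):
        nodes[i] = new_mode
        yield list(nodes)


def upgrade(n: int):
    """Run both rolling restarts and return every intermediate state."""
    nodes = [Mode.INSECURE] * n
    states = []
    for phase in (Mode.UPGRADING, Mode.SECURE):
        states.extend(rolling_restart(nodes, phase))
    return states
```

Under this model, `all(fully_connected(s) for s in upgrade(3))` holds, whereas a direct switch such as `[Mode.SECURE, Mode.INSECURE, Mode.INSECURE]` is not fully connected. That is why the intermediate phase exists.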

Advanced decision flowchart under the fold

<details>

![Variants of secure clusters](https://user-images.githubusercontent.com/642886/92719011-dc80e200-f362-11ea-88cd-4844702c81f2.png)

</details>

## Rationale

### Background

For context, `--insecure` does the following:

1. deactivates TLS handshakes for node-node connections
2. deactivates TLS handshakes for node-client RPC connections
3. deactivates TLS handshakes for node-client SQL connections over TCP
4. deactivates TLS handshakes for node-client HTTP connections
5. deactivates node-to-node authentication
6. deactivates HTTP authentication
7. deactivates SQL authentication
8. deactivates SQL authorization
9. deactivates certain SQL features, so as to not create the illusion of security

### Motivation

We know from experience that customers who advocate for "insecure mode" really only care about one of two things:

- avoiding the complexity of setting up TLS certificates for node-to-node authentication, or
- using TLS-less SQL connections because they are running a server over a privileged network, for example when the nodes and the SQL app are inside the same k8s cluster (see cockroachdb/helm-charts#228, cockroachdb/cockroach#44842).

These users certainly do not want to disable authentication and authorization, yet *we also know that users do not realize that `--insecure` disables these internal protections* inside cockroachdb.

Some users also care about point 4 in the list above (TLS for HTTP connections) because setting up TLS certs in a web browser is a pain. However, since v20.1 we have a solution for those users: `--unencrypted-localhost-http` makes the HTTP endpoint localhost-only without TLS.

### Strategy

- [x] in v20.2 make `--insecure` deprecated
- [x] Solve cockroachdb/cockroach#44842 so that users get a choice to connect with SQL over non-TLS connections - cockroachdb/cockroach#53991
- [x] Solve cockroachdb/cockroach#48069 to polish the HTTP probe experience
- [ ] Advance cockroachdb/cockroach#51991 to simplify the TLS usage in the common case
- [ ] Possibly solve cockroachdb/cockroach#54007 to offer an online upgrade path from insecure clusters to secure
- [ ] advertise these mechanisms in docs
- [ ] update roachtests to always use a secure cluster, sometimes without TLS to preserve the ability to capture network activity between clients and nodes
- [ ] remove support for the flag in *servers* in v21.1 or v21.2
- [ ] remove support for the flag in *clients* in the release after that (to preserve the ability for a new client to connect to a server running the version before)




Jira issue: CRDB-3870
