cli: the --insecure flag will be removed and replaced by easier-to-use always-secure options #53404
Epic: CRDB-12037
The --insecure flag disables a number of security controls that even developers and people trying out CockroachDB fully expect to be in place.
Instead, we aim to ensure that all clusters are secure, but offer new secure options where the user can choose the combination of security features that best matches their needs and their environment.
(Additionally, the word "insecure" gives the wrong impression that CockroachDB is not a secure database. The "insecure mode" only really exists for internal testing by the CockroachDB team and should not have been exposed to users from the start.)
What you need to know in a post-insecure world
CockroachDB aims to remain easy to use when all clusters are secured!
New clusters
Simplified decision making:
- for developers, single machine, no security requirements:
  - cockroach demo, automatically secure
  - single-node secure cluster, with auto-managed TLS
- multiple machines or prod system:
  - for node-node traffic:
    - no special requirement on inter-node traffic: auto-managed node-node TLS
    - special requirements:
      - org-mandated PKI: use org-mandated PKI to manage node-node TLS
      - user responsible for network security: no-TLS RPC
  - for node-client traffic:
    - no special requirements: auto-managed SQL/HTTP TLS with either password or TLS authn.
      This is the most common pg-compatible scenario!
    - special requirements:
      - org-mandated PKI: use org-mandated cert manager to manage SQL/HTTP TLS
      - user responsible for network security: no-TLS SQL/HTTP
Cluster is secure in all cases.
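As a rough illustration, the decision list above can be written as a small lookup. This is only a sketch: the `Requirement` enum and the `choose_security_config` function are hypothetical names made up for this example, and the returned strings simply restate the options listed above rather than real configuration values.

```python
from enum import Enum


class Requirement(Enum):
    """Hypothetical labels for the three situations named in the list above."""
    NONE = "no special requirements"
    ORG_PKI = "org-mandated PKI"
    USER_NETWORK_SECURITY = "user responsible for network security"


# Options for node-node traffic, restated from the list above.
NODE_NODE = {
    Requirement.NONE: "auto-managed node-node TLS",
    Requirement.ORG_PKI: "org-mandated PKI manages node-node TLS",
    Requirement.USER_NETWORK_SECURITY: "no-TLS RPC",
}

# Options for node-client traffic, restated from the list above.
NODE_CLIENT = {
    Requirement.NONE: "auto-managed SQL/HTTP TLS with password or TLS authn",
    Requirement.ORG_PKI: "org-mandated cert manager manages SQL/HTTP TLS",
    Requirement.USER_NETWORK_SECURITY: "no-TLS SQL/HTTP",
}


def choose_security_config(single_machine_dev,
                           node_node=Requirement.NONE,
                           node_client=Requirement.NONE):
    """Return the option(s) from the decision list for the given situation."""
    if single_machine_dev:
        # Developer on one machine: nothing to decide, TLS is auto-managed.
        return {"deployment": "cockroach demo or single-node secure cluster",
                "tls": "auto-managed TLS"}
    return {"node_node": NODE_NODE[node_node],
            "node_client": NODE_CLIENT[node_client]}


if __name__ == "__main__":
    print(choose_security_config(False, node_client=Requirement.USER_NETWORK_SECURITY))
```

The two axes (node-node and node-client) are chosen independently, which is why the sketch keeps them in separate tables.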
Threat model
Node-node connections
| Threat \ Protection | TLS transport+authn | no-TLS + Shared secret | no-TLS + no auth (--insecure) | Mitigation possible via network restrictions? |
|---|---|---|---|---|
| Mistaken node join from wrong cluster | protected | protected | vulnerable | no |
| Complete cluster takeover via malicious node | protected | vulnerable via bruteforce | vulnerable | partial |
| Attacker MITM snoop of authn credentials | protected | vulnerable | vulnerable | yes |
| Attacker accesses confidential data via network snoop | protected | vulnerable | vulnerable | yes |
| Attacker modifies cluster data in-flight | protected | vulnerable | vulnerable | yes |
| Compromised authn credentials | Fix via cert rotation/revocation (online) | Update secret + full restart | N/A | partial |
Note: the vulnerabilities in the 2nd and 3rd columns are amplified by sharing the same TCP listener (address/port) between SQL and RPC listeners.
Partial mitigation possible via separate --sql-addr and fencing off the node-to-node traffic into a private network.
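To make the node-node table easier to query, here is a minimal sketch that encodes its first five rows as data and lists the threats left open by a given transport choice. The mode keys (`tls`, `shared_secret`, `insecure`) and the function names are hypothetical; the cell values are copied from the table above.

```python
# Rows copied from the node-node table: (threat, tls, shared_secret, insecure, network_mitigation)
NODE_NODE_THREATS = [
    ("Mistaken node join from wrong cluster", "protected", "protected", "vulnerable", "no"),
    ("Complete cluster takeover via malicious node", "protected", "vulnerable via bruteforce", "vulnerable", "partial"),
    ("Attacker MITM snoop of authn credentials", "protected", "vulnerable", "vulnerable", "yes"),
    ("Attacker accesses confidential data via network snoop", "protected", "vulnerable", "vulnerable", "yes"),
    ("Attacker modifies cluster data in-flight", "protected", "vulnerable", "vulnerable", "yes"),
]

# Column index of each (hypothetically named) mode within a row.
MODE_COLUMN = {"tls": 1, "shared_secret": 2, "insecure": 3}


def residual_threats(mode):
    """Threats that are not marked 'protected' for the given mode."""
    col = MODE_COLUMN[mode]
    return [row[0] for row in NODE_NODE_THREATS if row[col] != "protected"]


def not_fixable_by_network(mode):
    """Residual threats that network restrictions cannot mitigate at all."""
    col = MODE_COLUMN[mode]
    return [row[0] for row in NODE_NODE_THREATS
            if row[col] != "protected" and row[4] == "no"]


if __name__ == "__main__":
    for mode in MODE_COLUMN:
        print(mode, len(residual_threats(mode)), not_fixable_by_network(mode))
```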
Node-client connections
| Threat \ Protection | TLS transport+authn | no-TLS + SCRAM authn | no-TLS + no auth (--insecure) | Mitigation possible via network restrictions? |
|---|---|---|---|---|
| Mistaken connect from/to wrong app or wrong cluster | protected | protected | vulnerable | no |
| Complete cluster takeover via escalation of privilege | protected | protected | vulnerable | partial |
| Rogue client app unrestricted access to data of 1 SQL user | protected | protected | vulnerable | partial |
| Attacker MITM snoop of authn credentials | protected | protected | vulnerable | yes |
| Attacker accesses app data via network snoop | protected | vulnerable | vulnerable | yes |
| Attacker modifies app data in-flight | protected | vulnerable | vulnerable | yes |
| Compromised authn credentials | Fix via cert rotation/revocation (online) | Update password (online) | N/A | no |
Note that the first column in the table above assumes that clients also validate server TLS certs! Otherwise the following table applies:
| Threat \ Protection | Two-way cert validation | Only srv auths client |
|---|---|---|
| Mistaken connect from/to wrong app or wrong cluster | protected | protected |
| Complete cluster takeover via escalation of privilege | protected | protected |
| Rogue client app unrestricted access to data of 1 SQL user | protected | protected |
| Attacker MITM snoop of authn credentials | protected | vulnerable |
| Attacker accesses app data via network snoop | protected | vulnerable |
| Attacker modifies app data in-flight | protected | vulnerable |
| Compromised authn credentials | Fix via cert rotation/revocation (online) | still vulnerable (!) |
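The difference between the two columns above comes down to whether the client checks the server's certificate. As a hedged illustration using only Python's standard `ssl` module (not any CockroachDB-specific API), the sketch below builds a client-side TLS context in both modes. The helper names `make_client_context` and `validates_server` are made up for this example, and the `ca_file` argument would be a path to the cluster's CA certificate, which is not specified here.

```python
import ssl


def make_client_context(ca_file=None, validate_server=True):
    """Build a client-side TLS context.

    With validate_server=True the client checks the server certificate chain
    and hostname (the "two-way cert validation" column). With False the
    connection is still encrypted, but the server identity is unchecked, so a
    man-in-the-middle can impersonate it (the "only srv auths client" column).
    """
    ctx = ssl.create_default_context(purpose=ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if not validate_server:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def validates_server(ctx):
    """True if the context verifies both the certificate chain and the hostname."""
    return ctx.verify_mode == ssl.CERT_REQUIRED and ctx.check_hostname


if __name__ == "__main__":
    print(validates_server(make_client_context()))                       # validating client
    print(validates_server(make_client_context(validate_server=False)))  # non-validating client
```

For PostgreSQL-compatible clients built on libpq, the corresponding distinction is roughly `sslmode=verify-full` versus `sslmode=require`.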
Existing clusters previously run with "insecure mode"
- determine the secure configuration that suits your needs (see above)
- prepare configuration:
  - create authn credentials for SQL users (can be simple password)
  - prepare server flags and TLS certs that suit the needs identified above
- add flag --security-upgrade to node configs.
- perform rolling restart of cluster. At this point the cluster accepts but does not mandate the secure config.
- remove flag --security-upgrade from node configs
- perform 2nd rolling restart. This enforces the secure configuration.
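The two-pass procedure above can be sketched as a small simulation. This does not invoke the cockroach CLI; it only models each node's flag set to show the order of operations. `--security-upgrade` is the flag proposed in this issue, while the node names, the helper functions, and the sample `--certs-dir=certs` value are assumptions for illustration.

```python
def rolling_restart(nodes, update_flags):
    """Restart nodes one at a time, applying update_flags to each node's flag set."""
    for node in nodes:
        update_flags(node["flags"])  # in a real cluster, only this node is down now
        node["restarts"] += 1


def plan_upgrade(node_names, secure_flags):
    """Simulate the two rolling restarts described in the list above."""
    nodes = [{"name": name, "flags": set(), "restarts": 0} for name in node_names]

    # Pass 1: add the prepared secure flags plus --security-upgrade.
    # The cluster now accepts, but does not yet mandate, the secure config.
    def first_pass(flags):
        flags.update(secure_flags)
        flags.add("--security-upgrade")

    rolling_restart(nodes, first_pass)

    # Pass 2: drop --security-upgrade so the secure config is enforced.
    rolling_restart(nodes, lambda flags: flags.discard("--security-upgrade"))
    return nodes


if __name__ == "__main__":
    for node in plan_upgrade(["n1", "n2", "n3"], {"--certs-dir=certs"}):
        print(node["name"], sorted(node["flags"]), node["restarts"])
```

Doing this in two passes means no node ever rejects a peer that has not been reconfigured yet, which is what allows the migration to happen without taking the whole cluster down.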
Advanced decision flowchart under the fold
Rationale
Background
For context, --insecure does the following:
- deactivates TLS handshakes for node-node connections
- deactivates TLS handshakes for node-client RPC connections
- deactivates TLS handshakes for node-client SQL connections over TCP
- deactivates TLS handshakes for node-client HTTP connections
- deactivates node-to-node authentication
- deactivates HTTP authentication
- deactivates SQL authentication
- deactivates SQL authorization
- deactivates certain SQL features, so as to not create the illusion of security
Motivation
We know from experience that customers who advocate for "insecure mode" really only care about one of two things:
- they want to avoid the complexity of setting up TLS certificates for node-to-node authentication, or
- they want TLS-less SQL connections because they are running a server over a privileged network, for example when the nodes and the SQL app are inside the same k8s cluster (see server,ui: infinite HTTP redirect when accessing UI for secure cluster behind nginx-ingress in k8s helm-charts#228, cli,server: enable SQL clients on TCP with non-TLS modes #44842).
These users certainly do not want to disable authentication and authorization, yet we also know that users do not realize that --insecure disables these internal protections inside CockroachDB.
Some users also care about point 4 of the Background list (TLS for HTTP connections) because setting up TLS certs in a web browser is a pain. However, since v20.1 we have a solution for those users: --unencrypted-localhost-http makes the HTTP endpoint localhost-only without TLS.
Strategy
- in v20.2 make --insecure deprecated
- Solve cli,server: enable SQL clients on TCP with non-TLS modes #44842 so that users get a choice to connect with SQL over non-TLS connections - pgwire: accept non-TLS client conns safely in secure mode #53991
- Solve server: handle some HTTP APIs without TLS #48069 to polish the HTTP probe experience
- Advance RFC: auto-generation of TLS certificates for new clusters and newly added nodes #51991 to simplify the TLS usage in the common case
- Possibly solve server: enable secure clusters with authenticated gRPC, but without TLS #54007 to offer an online upgrade path from insecure clusters to secure
- advertise these mechanisms in docs
- update roachtests to always use a secure cluster, sometimes without TLS to preserve the ability to capture network activity between clients and nodes
- remove support for the flag in servers in v21.1 or v21.2
- remove support for the flag in clients in the release after that (to preserve the ability for a new client to connect to a server running the version before)
Jira issue: CRDB-3870
