Adding stack_monitoring_agent role by ycombinator · Pull Request #34369 · elastic/elasticsearch

ycombinator · 2018-10-09T17:06:43Z

Motivation

Starting with 6.4.0 users have been able to monitor (in the x-pack sense) Kibana using Metricbeat. And starting with 6.5.0 users will be able to monitor Elasticsearch using Metricbeat as well.

If users are using more than a Basic license with security enabled in Elasticsearch (xpack.security.enabled: true), then Metricbeat will need to provide a username and password to connect to Elasticsearch. The username must belong to a user with a security role that has sufficient privileges for Metricbeat to collect monitoring data for the stack and index it into monitoring indices, .monitoring-*. For discussion purposes, let's call this role stack_monitoring_agent.

Metricbeat needs to call APIs of each Elastic stack product that it's monitoring to collect metrics for these products. So the stack_monitoring_agent role must have sufficient privileges to call these APIs.

Additionally, Metricbeat needs to connect to the Elasticsearch monitoring cluster in order to index monitoring data collected from various Elastic stack products into .monitoring-* indices. So the stack_monitoring_agent role must have sufficient privileges to perform this indexing operation, including creating new time-based indices when necessary.

Access Granularity

The above section suggests that we have a single stack_monitoring_agent role to cover all privileges necessary for collection as well as indexing of monitoring data. This makes the stack_monitoring_agent role a very coarsely-grained role.

The benefit of a coarse level of granularity is that it is easier for users to get started with Metricbeat stack monitoring and security: at minimum, they need only create a single user having this single role and they are all set to use that user in their Metricbeat module configurations for Elastic stack modules (for collection purposes) as well as their Metricbeat elasticsearch output configuration (for indexing purposes). The other benefit of a single, coarse role is that it also provides abstraction: we may increase or reduce privileges behind this single role over time without requiring any changes from the user. The downside of this coarse level of granularity, of course, is lessened security due to broad access regardless of which specific product a single Metricbeat instance might be monitoring.

At the other end of the granularity spectrum is having a plethora of fine-grained roles for collecting monitoring data from specific Elastic stack products as well as for indexing monitoring data into specific monitoring indices (.monitoring-es-*, .monitoring-kibana-*, etc.). The benefit of this fine level of granularity is increased security because of the ability to provide fine-tuned access. The downside is the potential for a user to screw up their configuration in defining security users and associating the right roles with these users and further with setting up Metricbeat configuration to use the right security users in the right configuration files.

And, of course, there are variations of access granularity in between the two ends of the spectrum mentioned above.

After discussion with @ruflin, we decided to go with the coarse granularity approach in order to favor ease-of-use of of the box. Concretely, that means we will provide a built-in stack_monitoring_agent security role.

Of course, users who want to have finer-grained access may choose to create their own custom roles with the desired access granularity. In order to do this, users may need to know about some lower-level details about stack monitoring, including details that might change over time (such as index names). We may consider documenting these details for such users, either publicly or as a support knowledge base article, probably based on demand.

This PR implements this stack_monitoring_agent built-in role.

elasticmachine · 2018-10-10T01:01:44Z

Pinging @elastic/es-security

elasticmachine · 2018-10-10T01:01:45Z

Pinging @elastic/es-core-infra

ycombinator · 2018-10-10T01:26:19Z

We'll handle documentation for this new built-in stack_monitoring_agent role in a separate PR(s), as part of the broader effort for documenting Stack monitoring with Metricbeat that @lcawl is working on.

ruflin

Looks like CI failure is related? Overall LGTM.

albertzaharovits · 2018-10-14T20:55:18Z

...core/src/main/java/org/elasticsearch/xpack/core/security/authz/store/ReservedRolesStore.java

@ycombinator can you please explain why manage_index_templates is required ?
Isn't .indices(".monitoring-*", "metricbeat-*").privileges("index", "create_index") enough?

can you please explain why manage_index_templates is required ?

When Metricbeat starts up today, it checks if the Metricbeat index template exists and, if not, tries to create it.

Isn't .indices(".monitoring-", "metricbeat-").privileges("index", "create_index") enough?

Creating templates is a cluster-level privilege (AFAIK) so having these index-level privileges is not enough.

One thing I could do here to tighten things up a bit is to specifically grant the indices:admin/template/get and indices:admin/template/put cluster-level privileges instead of the broader manage_index_templates cluster-level privilege.

[EDIT] I've done this in a33449a73e7. Let me know what you think.

What I was thinking about, was, do you need both to put template and create index. Cannot Metricbeat use one or the other. Preferably create index because we have tighter authz for it (the privilege has the index name pattern), although in this case, when for example there is a centralized monitoring cluster, a template would be more suitable.

[EDIT] I've done this in a33449a. Let me know what you think.

So, you've removed the delete action but put is an overwrite (our fault not yours 😃). It's fine either way.

albertzaharovits · 2018-10-14T20:56:12Z

...core/src/main/java/org/elasticsearch/xpack/core/security/authz/store/ReservedRolesStore.java

Also, could you please explain why .indices(".kibana*").privileges("read") is required?

could you please explain why .indices(".kibana*").privileges("read") is required?

The kibana module in metricbeat calls the Kibana Settings API to retrieve a specific setting, xPack:defaultAdminEmail and index it into .monitoring-kibana-*. This value is then used for Monitoring's Cluster Alerts feature today. This feature uses Watcher today; every Cluster Alerts watch queries .monitoring-kibana-* for the value of this setting in order to decide which email address to send Cluster Alerts notifications to.

In the future, when Alerting is a Kibana feature, we will no longer need to index this setting's value in .monitoring-kibana-*. At that time we can remove this particular API call from metricbeat and this privilege as well.

I am not sorry, but I understand that Metricbeat calls a Kibana API and does not require to query the .kibana index (that API might do it on its account, but Metricbeat does not handle it itself). Please correct me here if I'm wrong.
The .kibana index is a treasure trove, we should be wary when reading it outside of through Kibana itself.

It's correct that Beats does not query the index directly but through the API. As far as I understood it so far still the user needs to have these access rights so the API call is executed correctly. @ycombinator @chrisronline can perhaps confirm or correct?

The user which metricbeat is using does need .kibana privileges as the user's role is used to determine permissions within the Kibana api. This particular api needs to query against the .kibana index so the user who makes the api request needs the appropriate permissions

The .kibana index is a treasure trove, we should be wary when reading it outside of through Kibana itself.

Adding to what @ruflin and @chrisronline said, the GET /api/settings Kibana API that Metricbeat calls reads the .kibana index, but only exposes a very narrow view of what's in that index. Here's the result of a call I just made locally while using the elastic superuser:

{ "cluster_uuid":"zLXBfdzPSrO12OBIz3ybyg", "settings":{ "xpack":{ "default_admin_email":"foo@bar.com" }, "kibana":{ "uuid":"5b2de169-2785-441b-ae8c-186a1936b17d", "name":"Shaunaks-MacBook-Pro-2.local", "index":".kibana", "host":"localhost", "transport_address":"localhost:5601", "version":"7.0.0-alpha1", "snapshot":false, "status":"green" } } }

It may not be super obvious from the code in Kibana for this API, but it only returns the xPack:defaultAdminEmail setting and other metadata about the kibana instance:

https://github.com/elastic/kibana/blob/master/x-pack/plugins/monitoring/server/kibana_monitoring/collectors/get_settings_collector.js

Okay understood! Thank you @ycombinator and @chrisronline .

ycombinator · 2018-10-15T01:30:03Z

jenkins, test this

tvernum · 2018-10-15T05:37:33Z

My feeling is that the choice to use a single role is the wrong trade-off. but it may not matter too much, as long as the following 2 things are true:

remote_monitoring_agent provides all the access needed to perform the indexing of monitoring data
metricbeat will support using different users for collection & indexing.

The main use case I'm worried about here is customers with a central monitoring cluster. In that case, the admin for the monitoring cluster will want to issue credentials that allow indexing the data, but will not want to grant read/monitor access to that cluster.
I would like us to have an easy to setup answer to that problem. I think remote_monitoring_agent is the answer, provided that this new stack-monitoring doesn't require anything outside of that role.

I think the counter-side, of the monitored cluster not wanting to grant monitoring-indexing privileges is also real, but far less likely to be a concern.

they need only create a single user having this single role

Why are we proposing that the customer needs to create the user? Shouldn't we ship a reserved user with the necessary role(s)?

ycombinator · 2018-10-15T07:34:25Z

@tvernum Thanks for your comment. I think I'm following but just a quick clarification: when you said remote_monitoring_agent, did you mean the stack_monitoring_agent role introduced in this PR? Asking because there is already an existing remote_monitoring_agent role that is used currently and has slightly different privileges.

If I'm following you correctly, at minimum what we'd want to provide (in terms of roles) would be:

a role to index data into the monitoring cluster.
a role to collect data from a production (aka monitored) cluster.

In essence, that'd mean splitting up the stack_monitoring_agent role introduced in this PR into the above two roles.

Further, we'd want to provide built-in users associated with each of these roles, for ease of setup.

Am I understanding this correctly?

albertzaharovits · 2018-10-15T10:07:14Z

I am with @tvernum on splitting the role in two. And also, I think he suggests that the remote_monitoring_agent role we have now, can be used as the index half of the proposed stack_monitoring_agent.

We don't need to create two users out of the box. The Metricbeat user (that we ship with? - no creation during setup?) can have these two roles. But when the customer asks how to deal with the scenarios Tim's describing we have an easy answer, that being to create a user with only one of these roles.

tvernum · 2018-10-16T03:58:22Z

@albertzaharovits has understood me correctly.

For background, I think the original proposal falls into a mistake we have made a few times before where we make it easy to get started with something that meets the security needs of the most common case, but when a customer has tighter security needs, we don't have a lot to offer them out-of-the-box and we put the work on to them.

I would ike us to being looking for ways that we can give customers a variety of options along that ease-of-use/tightness-of-security balance, that is, as you need to tighten security you don't get a disproportionate drop in ease-of-use.

For this, I think 1 builtin user with 2 roles is a good way to offer that.

The easiest option is to use the builtin user
Customers who don't want to use the builtin user (e.g. they want to avoid "known generic users", or want to use PKI auth, or an LDAP user) can create a new user with those 2 roles
Customers who want to split collection and indexing can create new users with whichever role is appropriate.

I'm open to other approaches, but I don't think we should take a position of "if you want to separate collection and indexing then you need to define your own roles and then test & update them with every release". We want to support customers who need tighter security than our default setup.

tvernum · 2018-10-16T04:10:59Z

My earlier comment was intentionally refering to remote_monitoring_agent because at the time I felt like it would be a reasonable approximation of the indexing role that we need, and maybe that would be good enough (because I think the "indexing only" requirement is likely to be the higher priority).

So at the very least we could then document "if you are collecting your stack monitoring data into a remote cluster, you can do it with any user that has the remote_monitoring_agent role".

I'd still be OK with that option, but having a builtin user and 2 completely distinct roles would be better.

tvernum · 2018-10-19T02:54:22Z

@ycombinator I hope you don't mind, I pushed to your branch to fix the broken test from the previous CI run.

ycombinator · 2018-10-19T04:46:02Z

Not at all, @tvernum. Appreciate the help!

ycombinator · 2018-10-19T08:42:51Z

@tvernum CI has gone green after your latest commit. Mind giving this PR another review? Thanks.

ycombinator · 2018-10-23T06:17:55Z

Hey @tvernum, GitHub seems to be back to normal now. Mind reviewing this PR again? Thanks.

tvernum

LGTM, but I'd like @jaymode to comment on the changes to the agent role.

We'll also need to do docs updates, but that can be a separate PR.

tvernum · 2018-10-23T06:45:54Z

...core/src/main/java/org/elasticsearch/xpack/core/security/authz/store/ReservedRolesStore.java

-                                RoleDescriptor.IndicesPrivileges.builder().indices(".monitoring-*").privileges("all").build() },
+                                RoleDescriptor.IndicesPrivileges.builder().indices(".monitoring-*").privileges("all").build(),
+                                RoleDescriptor.IndicesPrivileges.builder()
+                                    .indices("metricbeat-*").privileges("index", "create_index").build() },


@jaymode Any thoughts here?
This is grant write access to the metricbeat-* indices for an existing role.
On the one hand I worry about expanding the access for roles that are in use by customers, on the other hand creating an ever increasing set of new roles is a usability nightmare for customers.

Are you happy to run with this as is, and handle it in the release notes?

I am happy to go with this; reserved roles are controlled by us and if we feel this makes sense, which I think it does, then expanding access is ok.

ycombinator · 2018-10-23T06:55:34Z

Thanks @tvernum. I'll wait for @jaymode's comments. Re: docs updates we are handling that separately over here: elastic/stack-docs#127.

tvernum · 2018-10-23T07:13:19Z

@ycombinator Unfortunately, we currently have references to the builtin users & roles across a bunch of docs in multiple repos, but I imagine @lcawl has it under control.

stack-docs/docs/en/stack/security/authentication/built-in-users.asciidoc
stack-docs/docs/en/stack/security/authorization/built-in-roles.asciidoc
stack-docs/docs/en/stack/security/get-started-builtin-users.asciidoc
stack-docs/docs/en/stack/security/get-started-security.asciidoc
stack-docs/docs/upgrading-stack.asciidoc
elasticsearch/x-pack/docs/en/security/configuring-es.asciidoc
elasticsearch/docs/reference/commands/setup-passwords.asciidoc
elasticsearch/docs/reference/setup/install/windows.asciidoc

tvernum · 2018-10-23T07:15:56Z

@elastic/microsoft Does the MSI installer need to get updated when we add a new user to setup-passwords? If so, this PR will introduce a new remote_monitoring_user.

pickypg

LGTM. At a certain point do we expect to deprecate and remove the old users, thereby forcing the user to choose to create new, specific users or live with the coarse version?

ycombinator · 2018-10-24T08:03:39Z

At a certain point do we expect to deprecate and remove the old users, thereby forcing the user to choose to create new, specific users or live with the coarse version?

@pickypg Just so I'm understanding clearly, specifically which old users are you referring to?

jaymode · 2018-10-24T14:15:18Z

@joegallo fyi, for 6.5 there will be some more role updates

ycombinator · 2018-10-24T14:19:05Z

Thanks @jaymode. @pickypg and others, we can continue further conversation on some of the other questions posted here, but I think this PR is good to merge now. Merging...

* Adding stack_monitoring_agent role * Fixing checkstyle issues * Adding tests for new role * Tighten up privileges around index templates * s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user * Fixing checkstyle violation * Fix test * Removing unused field * Adding missed code * Fixing data type * Update Integration Test for new builtin user

ycombinator · 2018-10-24T14:31:39Z

Backport to 6.x PR: #34805

* Adding stack_monitoring_agent role * Fixing checkstyle issues * Adding tests for new role * Tighten up privileges around index templates * s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user * Fixing checkstyle violation * Fix test * Removing unused field * Adding missed code * Fixing data type * Update Integration Test for new builtin user

russcam · 2018-10-26T02:48:08Z

Thanks for the ping @tvernum.

The MSI installer does not use setup-passwords.bat file, so a new user added to the batch file will not affect the installer. Is remote_monitoring_user required to be configured? If so, we will need to update the installer

tvernum · 2018-10-26T03:06:58Z

Thanks @russcam - I was aware that you don't actually call the passwords tool, it was more a case of whether you/we want to keep the MSI in sync with users handled by setup-passwords.
I don't have a strong opinion on that, other than that the stack monitoring docs will be will probably all assume that the user has a password and the admin knows it. However, I suspect that is often not true, and the docs should probably start with a message of "If you don't know the password, or never set a password, then ..."

Given that the person doing the initial install of the stack will often not know which users they will need, or even what their purposes is, I wonder whether we ought to move to a model where we configure the minimum set of passwords (elastic & kibana probably) and then document the process for manually setting passwords for other users when they're needed.

tvernum · 2018-10-26T03:08:15Z

@colings86 Is there a reason why we don't want the :Security/Authorization label?

colings86 · 2018-10-26T07:24:50Z

@tvernum The current script that generates the release notes does not cope well with PRs that have multiple area labels because it doesn't know where in the release notes to put them. Whilst this is not ideal and needs to be fixed in the new script we are planning for release up until we have this new script the release manager for ES (me for 6.5.0) will need to make sure all PRs only have 1 area label. On this PR monitoring seemed like the more appropriate of the two but I'm happy to switch it the other way if you think thats better?

russcam · 2018-10-29T00:11:02Z

@tvernum the installer only configures elastic, kibana and logstash_system as of now, we do not configure beats_system either.

I wonder whether we ought to move to a model where we configure the minimum set of passwords (elastic & kibana probably) and then document the process for manually setting passwords for other users when they're needed.

Agreed - keep the "from download to up and running" configuration as simple as possible.

* Adding stack_monitoring_agent role * Fixing checkstyle issues * Adding tests for new role * Tighten up privileges around index templates * s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user * Fixing checkstyle violation * Fix test * Removing unused field * Adding missed code * Fixing data type * Update Integration Test for new builtin user

ycombinator added WIP :Core/Infra/Monitoring DEPRECATED, DO NOT USE :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC labels Oct 10, 2018

ycombinator force-pushed the security-role-stack-monitoring branch from 3880abd to 580e2c7 Compare October 10, 2018 03:02

ycombinator added review and removed WIP labels Oct 10, 2018

ycombinator requested review from jaymode, pickypg, ruflin and tlrx October 10, 2018 03:18

ruflin reviewed Oct 10, 2018

View reviewed changes

ycombinator force-pushed the security-role-stack-monitoring branch from 580e2c7 to d2c9824 Compare October 10, 2018 08:36

rjernst removed the review label Oct 10, 2018

lcawl mentioned this pull request Oct 10, 2018

[DOCS] Adding stack_monitoring_agent role elastic/stack-docs#127

Merged

ycombinator added v7.0.0 v6.5.0 backport pending labels Oct 14, 2018

albertzaharovits reviewed Oct 14, 2018

View reviewed changes

This was referenced Oct 15, 2018

[DOCS] Adds security for monitoring steps elastic/kibana#24052

Merged

[DOCS] Updates Elasticsearch monitoring tasks #34339

Merged

tvernum approved these changes Oct 23, 2018

View reviewed changes

pickypg approved these changes Oct 23, 2018

View reviewed changes

ycombinator merged commit 0f1a5ec into elastic:master Oct 24, 2018

ycombinator deleted the security-role-stack-monitoring branch October 24, 2018 14:20

ycombinator mentioned this pull request Oct 24, 2018

Built-in roles and user for stack monitoring with beats #34805

Merged

ycombinator removed the backport pending label Oct 24, 2018

colings86 added >enhancement and removed :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC labels Oct 25, 2018

lcawl mentioned this pull request Nov 7, 2018

[DOCS] Adds remote_monitoring_user to security tutorial elastic/stack-docs#143

Merged

jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Conversation

ycombinator commented Oct 9, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Access Granularity

Uh oh!

elasticmachine commented Oct 10, 2018

Uh oh!

elasticmachine commented Oct 10, 2018

Uh oh!

ycombinator commented Oct 10, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ruflin left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ycombinator Oct 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ycombinator Oct 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ycombinator Oct 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ycombinator commented Oct 15, 2018

Uh oh!

tvernum commented Oct 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ycombinator commented Oct 15, 2018

Uh oh!

albertzaharovits commented Oct 15, 2018

Uh oh!

tvernum commented Oct 16, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tvernum commented Oct 16, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tvernum commented Oct 19, 2018

Uh oh!

ycombinator commented Oct 19, 2018

Uh oh!

ycombinator commented Oct 19, 2018

Uh oh!

ycombinator commented Oct 23, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tvernum left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ycombinator commented Oct 23, 2018

Uh oh!

tvernum commented Oct 23, 2018

Uh oh!

tvernum commented Oct 23, 2018

ycombinator commented Oct 9, 2018 •

edited

Loading

ycombinator commented Oct 10, 2018 •

edited

Loading

ycombinator Oct 15, 2018 •

edited

Loading

ycombinator Oct 15, 2018 •

edited

Loading

ycombinator Oct 15, 2018 •

edited

Loading

tvernum commented Oct 15, 2018 •

edited

Loading

tvernum commented Oct 16, 2018 •

edited

Loading

tvernum commented Oct 16, 2018 •

edited

Loading

ycombinator commented Oct 23, 2018 •

edited

Loading

tvernum commented Oct 26, 2018 •

edited

Loading