Add Azure tutorial that shows Elastic Agent#2174

Merged
dedemorton merged 19 commits into elastic:main from dedemorton:issue#2091_azure
Oct 10, 2022

@dedemorton
Contributor

@dedemorton dedemorton commented Sep 10, 2022

Preview: https://observability-docs_2174.docs-preview.app.elstc.co/guide/en/observability/master/monitor-azure-elastic-agent.html

Open issues and questions:

  • @zmoog In the Elastic Agent tutorial, which integrations do we want to showcase? Right now we only show billing metrics. TBH, when I look at Kibana, I can't tell if there is a high level integration we can use to capture all the things. Need to know specifically which integrations we want to showcase in this tutorial. Should be at least one for metrics (billing?) and one for logs.
    Decision: We will cover activity logs or Active Directory logs...not both. DeDe will make final decision.
  • @zmoog What do users need to do to send logs to an event hub and which of these steps should we document in the tutorial?
    Outcome: DeDe and Maurizio met to go through each scenario. DeDe will go through the steps and document them.
  • Does everyone agree that putting the native Azure integration steps in a separate topic makes sense? yes
  • @zmoog Are the details in the cloud docs accurate? I would like to point to this content for more info about the native Azure integration.
    Decision: We'll assume that these docs are correct and point to them.

Plus all the comments here: #2174 (comment)

TODO before merging:

TODO after merging:

@dedemorton dedemorton added the backport-8.4 Automated backport with mergify label Sep 10, 2022
@dedemorton dedemorton self-assigned this Sep 10, 2022
@ghost

ghost commented Sep 10, 2022

A documentation preview will be available soon:

@zmoog
Contributor

zmoog commented Sep 21, 2022

which integrations do we want to showcase?

We have four Azure integrations, each with one or more data streams:

  • Azure Logs
    • Activity logs
    • Active Directory logs
      • Sign-in logs
      • Audit logs
    • Firewall logs
    • Platform logs
    • Spring Cloud logs (soon to be renamed Spring App logs)
    • Identity Protection logs (coming soon)
    • Provisioning logs (coming soon)
  • Azure Metrics
    • Compute VM
    • Container Instance
    • Container Registry
    • Container Service
    • ...
  • Azure Application Insights Metrics
    • Application Insights Metrics
    • Application State Insights Metrics
  • Azure Billing Metrics
    • Billing Metrics

For the Azure Logs, from my experience, the most used ones are the Activity logs and the two Active Directory logs (sign-in and audit logs). I would pick one from Sign-in or Activity logs.

@dedemorton
Contributor Author

@zmoog From reading our Azure Logs integration docs, it looks like the steps for exporting activity logs to an event hub (using legacy collection) are quite different from exporting Active Directory logs. That makes me wonder if we should cover activity and Active Directory logs. On the other hand, the steps for exporting Active Directory logs look more straightforward and recommending something called legacy collection seems like a bad idea. WDYT?

@zmoog
Contributor

zmoog commented Sep 26, 2022

That makes me wonder if we should cover activity and Active Directory logs.

IMO we should cover activity or Active Directory logs.

On the other hand, the steps for exporting Active Directory logs look more straightforward and recommending something called legacy collection seems like a bad idea.

Activity and Directory logs are VERY similar from the integration perspective; the only difference is the source of the logs.

Here's a quick diagram I am working on for the Azure Logs revamp I started today:

  ┌────────────────┐   ┌──────────────┐   ┌────────────────┐                    
  │    Azure AD    │   │  Diagnostic  │   │ Azure AD logs  │                    
  │  <<service>>   │──▶│   settings   │──▶│ <<event hub>>  │──┐                 
  └────────────────┘   └──────────────┘   └────────────────┘  │   ┌────────────┐
                                                              │   │  Elastic   │
                                                              ├──▶│   Agent    │
  ┌────────────────┐   ┌──────────────┐   ┌────────────────┐  │   └────────────┘
  │ Azure Monitor  │   │  Diagnostic  │   │ Activity logs  │  │                 
  │  <<service>>   ├──▶│   settings   │──▶│ <<event hub>>  │──┘                 
  └────────────────┘   └──────────────┘   └────────────────┘                                

@zmoog
Contributor

zmoog commented Sep 26, 2022

About the choice of "activity logs vs. Active Directory logs" as the integration to showcase alongside Azure Billing, we can take a quick tour of both during our Zoom call later today.

@zmoog
Contributor

zmoog commented Sep 26, 2022

About what users need to do to send logs to an event hub, we can check what I am writing for the already mentioned Azure Logs doc revamp.

I believe there's an overlap between the two documents, but you can probably give me some advice about what goes in the tutorial and what goes in the general doc.

@zmoog
Contributor

zmoog commented Sep 26, 2022

Does everyone agree that putting the native Azure integration steps in a separate topic makes sense?

Yeah! AFAIK the Native Azure Integration does not require an Agent [1], and we should handle it differently.

Footnotes

  1. At least not for all its features.

@zmoog
Contributor

zmoog commented Sep 26, 2022

Are the details in the cloud docs accurate

I do not have significant experience with the Native Azure Integration, but AFAIK they are accurate.

@dedemorton
Contributor Author

dedemorton commented Sep 30, 2022

@zmoog The content in this topic is ready to review: https://observability-docs_2174.docs-preview.app.elstc.co/guide/en/observability/master/monitor-azure-elastic-agent.html.

I think it's OK for the tutorial to focus on the big stuff and point users to the integrations documentation for more detail. I'm not going to try to sync up the tasks we cover because the tutorial style is different and the tasks won't stay in sync anyhow. So try to review for accuracy, not total consistency with what you've written. :-)

I have some remaining questions for you to consider as you review the content:

  • Should we state that Elastic Agent 8.4.2 or later is a prereq? (8.4.2 is required to see forecast data because there is a bug in earlier versions.) We don't usually mention bug fix releases, tho. Added a note to indicate that users should install the latest version of Elastic Agent

  • Under "Create an Azure service principal," is this still the recommended way to control access? I see a note on the Azure page for app registrations, but I don't know if it's relevant: "Starting June 30th, 2020 we will no longer add any new features to Azure Active Directory Authentication Library (ADAL) and Azure AD Graph. We will continue to provide technical support and security updates but we will no longer provide feature updates. Applications will need to be upgraded to Microsoft Authentication Library (MSAL) and Microsoft Graph." I'm going to leave the docs as they are for now. We can fix this later if we need to, especially since rewriting these docs wasn't the assignment...though that's more or less what I've done

  • Under "Grant access permission for your service principal," exactly which permissions are required? My setup had Reader and Billing Reader, but I don’t know if they are both required.

  • Under "Step 5: Visualize Azure activity logs," the "failed to find message" lines in the screen grab are not a good look, but I don't know how to fix them. This was a problem for the AWS logs too. I'm going to leave this as-is for now. I've blown away all my deployments and can't really see how to change this, but since it's what customers see initially, I think it's OK. We can change this in a later update if users are confused.

  • Under "Configure diagnostic settings to send logs to the event hub," should we tell users which activity logs to collect? I am not sure myself.

  • Under "Configure the Azure Logs integration to collect activity logs," do users specify the connection string of the namespace OR the event hub? Fixed the docs to say the connection string of the event hub namespace. The docs already have a link to the integrations doc for more info about settings, so I think we're covered there.

@dedemorton dedemorton marked this pull request as ready for review September 30, 2022 02:43
@dedemorton dedemorton requested a review from a team as a code owner September 30, 2022 02:43
@dedemorton dedemorton added the backport-8.5 Automated backport with mergify label Sep 30, 2022
@dedemorton dedemorton requested a review from zmoog September 30, 2022 02:43
@dedemorton
Contributor Author

@zmoog I am running into an error when I try to test the steps in the tutorial about using the native integration. When I try to create an Elasticsearch (Elastic Cloud) resource, I get the following message:

{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.","details":[{"code":"500","message":"There is an existing Elastic Cloud account associated with your email address that has active deployments not running Azure. Please log in to [cloud.elastic.co](http://cloud.elastic.co/) and remove any deployments that are not running on Azure and try again. Contact [support@elastic.co](mailto:support@elastic.co) if you need further assistance."}]}

I really wanted to get this done today, so I deleted all my deployments on cloud, but I am still getting the same message. I even tried creating a deployment in Elastic Cloud on Azure to "grease the skids" but no luck. I feel like I saw something somewhere about issues related to our elastic cloud accounts, but can't remember the details. Can you help?

I'm blocked on testing the native steps until I figure out what's wrong.
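
For anyone triaging the same failure: the actionable text in a payload like the one quoted above sits under `details`, not in the top-level `message`. A minimal Python sketch for pulling it out follows. The function name `deployment_failure_messages` is made up for illustration, and the sample is an abbreviated copy of the quoted error, not a complete Azure response.

```python
import json
from typing import List


def deployment_failure_messages(payload: str) -> List[str]:
    """Return the nested detail messages from a DeploymentFailed-style payload."""
    error = json.loads(payload)
    # The top-level message is generic; the useful text is in each detail entry.
    return [detail.get("message", "") for detail in error.get("details", [])]


# Abbreviated copy of the quoted payload; the messages are shortened here.
sample = json.dumps({
    "code": "DeploymentFailed",
    "message": "At least one resource deployment operation failed.",
    "details": [
        {
            "code": "500",
            "message": "There is an existing Elastic Cloud account associated with your email address.",
        }
    ],
})

messages = deployment_failure_messages(sample)
print(messages[0])
```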

@zmoog
Contributor

zmoog commented Oct 3, 2022

Should we state that Elastic Agent 8.4.2 or later is a prereq? (8.4.2 is required to see forecast data because there is a bug in earlier versions.) We don't usually mention bug fix releases, tho.

Forecast data stopped working reliably right after we shipped the upgrade. There is a backport in progress to 7.17.7, but version 8.4.2 is required to have it working across the existing Azure account types.

What about recommending using the latest Agent version for 7.x and 8.x to get support for the latest API changes from Azure?

@zmoog
Contributor

zmoog commented Oct 3, 2022

Under "Create an Azure service principal," is this still the recommended way to control access?

Not 100% sure here. I believe app registration is still the way to go, but I will double-check whether the upcoming changes affect app registration or only the authentication library.

I didn't know about this post, but we were aware of the in-progress transition to the Microsoft Identity service because of changes in the Azure SDK for Go, which we use to access Azure services from the integrations and Beats. Thank you for bringing this up! The post adds more context.

The new Azure SDK for Go library that supports Microsoft Identity [1] requires Go version 1.18. Fortunately, Go 1.18 landed in 7.x and 8.x, so we can finally move forward with the plans to switch to azidentity.

Footnotes

  1. The library is called azidentity.

@zmoog
Contributor

zmoog commented Oct 3, 2022

Under "Grant access permission for your service principal," exactly which permissions are required? My setup had Reader and Billing Reader, but I don't know if they are both required.

I understand that Azure Billing Metrics only requires the built-in "Billing Reader" role assignment. The "Reader" role is a broader role that gives the app access to a much wider set of information.

The "Reader" role gives the app access to 5991 individual permissions across multiple Azure services. The "Billing Reader" role gives the app access to 86 permissions.

To check the permissions list:

  • Visit our Azure subscription > Access Control (IAM)
  • Select the "Role" tab
  • Filter the role name in the "search by" field
  • Click on the "View" link in the "Details" column

Azure Billing Metrics only requires a handful of permissions. I think we could write instructions about how to write a custom role with the strictly required permissions only. I believe using the Azure built-in role is okay from a security standpoint and can provide a better user experience.

However, we could also provide a list with all the permissions required, so security-inclined customers have the option to nail down the exact permission set the app needs to work.

@dedemorton WDYT?
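
To make the custom-role option concrete, here is a minimal Python sketch that assembles a role definition in the JSON layout Azure uses for custom roles (`Name`, `IsCustom`, `Description`, `Actions`, `NotActions`, `AssignableScopes`). Everything specific in it is an assumption: the helper name `build_custom_role`, the role name, the all-zero subscription ID, and above all the single entry in `Actions`, which is a placeholder and not the verified permission set for Azure Billing Metrics. The thread does not enumerate the required permissions.

```python
import json
from typing import Dict, List


def build_custom_role(name: str, description: str,
                      actions: List[str], subscription_id: str) -> Dict:
    """Assemble a custom role definition scoped to a single subscription."""
    return {
        "Name": name,
        "IsCustom": True,
        "Description": description,
        # De-duplicate and sort so the output is stable across runs.
        "Actions": sorted(set(actions)),
        "NotActions": [],
        "AssignableScopes": [f"/subscriptions/{subscription_id}"],
    }


role = build_custom_role(
    name="Elastic Billing Metrics Reader (example)",  # hypothetical role name
    description="Read-only access for billing metrics collection.",
    # ASSUMPTION: placeholder action only; replace with the verified list.
    actions=["Microsoft.Consumption/*/read"],
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder ID
)

print(json.dumps(role, indent=2))
```

The built-in "Billing Reader" role avoids maintaining a file like this, which is the user-experience trade-off mentioned above.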

@zmoog
Contributor

zmoog commented Oct 3, 2022

Under "Step 5: Visualize Azure activity logs," the "failed to find message" lines in the screen grab are not a good look, but I don't know how to fix them. This was a problem for the AWS logs too.

Yeah, it does not look good. This probably happens because the Logs stream page looks for a field named "message" by default, but Azure logs (and probably also AWS logs) don't have such a field.

I think this can be customized in the Logs stream option, so we and the users can pick an existing and meaningful field (for example, "category" or something specific to this log category).

@zmoog
Contributor

zmoog commented Oct 3, 2022

Under "Configure diagnostic settings to send logs to the event hub," should we tell users which activity logs to collect? I am not sure myself.

Yeah, we added tables with a detailed list of supported log categories in the recent Azure Logs documentation updates. Maybe we can point users to those tables so they get the most up-to-date version while we update existing integrations and add new ones? I'm not sure the tables cover Activity logs yet, but we are working on it so we can add them.

@zmoog
Contributor

zmoog commented Oct 3, 2022

Under "Configure the Azure Logs integration to collect activity logs," do users specify the connection string of the namespace OR the event hub?

The connection string is on the event hub namespace:

Event hub namespace > NAMESPACE > Shared access policies > POLICY > key > Connection string

Since we last met, I expanded the event hub setup portion in elastic/integrations#4300 to better describe this step. Let me know if you think the doc describes this step well enough.
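
One quick way to tell the two kinds of connection string apart: a string copied from a shared access policy on an individual event hub carries an `EntityPath=<hub name>` segment, while one copied from the namespace does not. The Python sketch below checks for that. The helper names and all values are placeholders for illustration, not real credentials or part of the integration.

```python
from typing import Dict


def parse_connection_string(conn_str: str) -> Dict[str, str]:
    """Split an Event Hubs connection string into its key/value parts."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue
        # Split on the first "=" only; keys often end in "=" padding.
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts


def is_namespace_level(conn_str: str) -> bool:
    """True when there is no EntityPath, meaning the string is namespace-scoped."""
    return "EntityPath" not in parse_connection_string(conn_str)


# Placeholder values only.
namespace_conn = (
    "Endpoint=sb://example-namespace.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;"
    "SharedAccessKey=PLACEHOLDERKEY="
)
hub_conn = namespace_conn + ";EntityPath=example-hub"

print(is_namespace_level(namespace_conn))
print(is_namespace_level(hub_conn))
```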

@dedemorton
Contributor Author

@zmoog OK, I think I've figured out how to resolve most of my open issues. I have a couple follow-up questions:

  • Regarding roles/permissions, it's probably best to recommend a built-in role (to avoid making the tutorial even longer). I think you are saying that the Billing Reader role is required, but not Reader. Is that correct? I don't have time to run through the steps all over again. :-)

  • For the diagnostic settings, can you just confirm that the categories I show in the screen capture here are supported? I can add a link to the Azure logs integration docs later.

@zmoog
Contributor

zmoog commented Oct 4, 2022

  • Regarding roles/permissions, it's probably best to recommend a built-in role (to avoid making the tutorial even longer). I think you are saying that the Billing Reader role is required, but not Reader. Is that correct? I don't have time to run through the steps all over again. :-)

I used Billing Reader during all my tests, so I'm confident it is a good fit for the Azure Billing Metrics integration.

For the diagnostic settings, can you just confirm that the categories I show in the screen capture here are supported? I can add a link to the Azure logs integration docs later.

I don't have a list ready now, but I think the log categories listed are supported. Once complete, the list with the supported log categories will show up in the internal integration doc in Kibana and https://docs.elastic.co/integrations/azure/activitylogs.

@dedemorton
Contributor Author

@zmoog And now for the latest chapter in the saga "All the Azure Monitoring stuff that doesn't work as expected": I am not getting activity logs when I use the native Azure integration.

image

The diagnostic settings in Azure look OK (I think):
image

I followed the steps in the tutorial and deployed the resource before enabling Logs & Metrics. Do I need to redeploy after enabling those options? (I am seeing platform logs, so I don't think this is my problem.)

@dedemorton
Contributor Author

@zmoog Native steps are updated! I'm not going to worry about the activity logs not showing up right now, but someone needs to investigate to see if there's a software issue or the steps are wrong.

This tutorial is ready for a final review. Thanks again for your help with this.

@dedemorton dedemorton requested a review from zmoog October 7, 2022 01:41
@zmoog
Contributor

zmoog commented Oct 10, 2022

I'm not going to worry about the activity logs not showing up right now, but someone needs to investigate to see if there's a software issue or if the steps are wrong.

Hey @dedemorton, I'll check the activity logs and create an issue if something is wrong or get back to you if there's something related to the tutorial.

Contributor

@zmoog zmoog left a comment

LGTM. I just added a couple of non-blocking comments.

@dedemorton dedemorton merged commit c2ce10c into elastic:main Oct 10, 2022
@dedemorton dedemorton deleted the issue#2091_azure branch October 10, 2022 23:17
mergify bot pushed a commit that referenced this pull request Oct 10, 2022
Adds new tutorial based on the existing one plus updates the native tutorial steps.

(cherry picked from commit c2ce10c)
dedemorton added a commit that referenced this pull request Oct 10, 2022
Adds new tutorial based on the existing one plus updates the native tutorial steps.

(cherry picked from commit c2ce10c)

Co-authored-by: DeDe Morton <dede.morton@elastic.co>