nreese commented May 8, 2024
|
Oh my gosh, thank you so much @nreese! |
|
And resolves https://github.com/elastic/security-team/issues/3067 😂 |
|
@spong and @stephmilovic Thanks for all of the linked issues. This seems to be a popular request. I have 2 questions
|
This is a great start, but we can definitely expand on it. For example, @patrykkopycinski has a PR open with faux malware events to use with the attack discovery feature: #182918
Ah, I didn't realize this was targeting developers only. As you've built this with the example plugins, I think adding But yeah, from the perspective of new users, as covered in Garrett's issue and Kseniia's issue, we'd love to put this in front of new customers with the rest of the sample data. What would it take to make it available there? |
Very low technical effort. The implementation would just be moved out of examples and into either Might be more pushback on this solution, since one of the downsides of sample data is that it bloats the Kibana distribution size. Also, there has been pushback on solution sample data in production, since Product wants users to go through the process of setting up real data workflows. Easily accessible sample data does not push users towards setting up workflows. |
@MikePaquette I saw you weigh in on Garrett's ticket. How do you feel about packaging security sample data with Kibana? |
|
Given the reasons outlined in the last attempt that didn't make it, it's probably best to keep this developer-centric. I didn't know about Knowing that, we can have supplemental sample data both locally and in cloud PRs, so that's a big win in my book. Perhaps the user-facing initiative has stalled (that comment was from a while ago), but it seems best to let security product drive that one. |
|
Thanks @nreese
Agreed, let's keep this for developer examples only.
Yes, we've decided to invest in steering users to demo systems rather than investing in making a robust and safe process for using sample data on a system that might become a production customer system. |
Please describe where/how the proposed sample data was obtained and verify that the sample data:
- contains no personally identifiable information of any person
- contains no confidential information of Elastic or any other person, organization, or company
- is not subject to any license or copyright
- is not otherwise restricted for this use case
|
@nreese, is this the data that I prepared on the |
It is the data pulled from the cluster you provided. Is there a more complete data set I could use? |
No. We could improve it, but only to a certain degree. I suggest changing the description; otherwise users will be disappointed by the limited use they can make of that data. |
Thanks. This is a good place to start, and we can always iterate on the data set. @cavokz, would you mind answering #182979 (review), since you have more knowledge of where the data is coming from? |
The IP addresses are totally random; they come from this Geneve formula. The geo info comes from the Faker geo provider, which in turn takes the data from geonames.org, where it's licensed under the Creative Commons Attribution 3.0 License. |
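For illustration only, here is a minimal sketch of how random, non-identifying IPv4 addresses can be produced with the Python standard library. It is not the Geneve formula referenced above, and the function names `random_ipv4` and `sample_source_ips` are hypothetical, chosen just for this example.

```python
import ipaddress
import random


def random_ipv4(rng: random.Random) -> str:
    """Return a uniformly random IPv4 address as a dotted-quad string."""
    # Draw a 32-bit integer and let the ipaddress module format it.
    return str(ipaddress.IPv4Address(rng.getrandbits(32)))


def sample_source_ips(count: int, seed: int = 0) -> list[str]:
    """Generate `count` random addresses (hypothetical helper, not Geneve's API)."""
    # A fixed seed keeps the generated sample data reproducible across runs.
    rng = random.Random(seed)
    return [random_ipv4(rng) for _ in range(count)]


if __name__ == "__main__":
    for ip in sample_source_ips(3):
        print(ip)
```

Because the addresses are drawn from a seeded generator rather than copied from real traffic, the same seed always yields the same data set, and no address is derived from a real host.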