[Security Assistant] Vertex chat model#193032
Pinging @elastic/security-solution (Team: SecuritySolution)
```ts
    maxOutputTokens: DEFAULT_TOKEN_LIMIT,
  },
  ...(systemInstruction
    ? { system_instruction: { role: 'user', parts: [{ text: systemInstruction }] } }
```
@pgayvallet Was there a reason for adding `role`? I don't see it in the API docs. I think all we need is `parts`.
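Following that comment, the body construction with `role` dropped might look like the sketch below. `buildBody` is a hypothetical helper, and the `system_instruction` shape follows the Vertex AI `generateContent` request format; this is illustrative, not the connector's actual code.

```typescript
// Sketch of the request-body construction under discussion. Per the review
// comment, `role` is omitted since only `parts` appears in the API docs.
interface Part {
  text: string;
}

interface GeminiBody {
  system_instruction?: { parts: Part[] };
}

// Hypothetical helper: only attach system_instruction when a prompt exists.
function buildBody(systemInstruction?: string): GeminiBody {
  return {
    ...(systemInstruction
      ? { system_instruction: { parts: [{ text: systemInstruction }] } }
      : {}),
  };
}
```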
```ts
  "The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
  'You are an assistant that is an expert at using tools and Elastic Security, doing your best to use these tools to answer questions or follow instructions. It is very important to use tools to answer the question or follow the instructions rather than coming up with your own answer. Tool calls are good. Sometimes you may need to make several tool calls to accomplish the task or get an answer to the question that was asked. Use as many tool calls as necessary.';
const KB_CATCH =
```
```ts
export const GEMINI_SYSTEM_PROMPT =
  `ALWAYS use the provided tools, as they have access to the latest data and syntax.` +
  "The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
```
Could we rename to something like `GEMINI_MAIN_SYSTEM_PROMPT`?
peluja1012 left a comment
Reviewed the code with Steph and ran evaluations. Results for ES|QL generation with Gemini improved, while results for other models like gpt-4o and Sonnet 3.5 remained consistently high, so there are no regressions on other models. Evaluation results for custom knowledge improved for Gemini as well. Thanks for the great work, Steph!
💛 Build succeeded, but was flaky
Starting backport for target branches: 8.x
(cherry picked from commit aae8c50)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions? Please refer to the Backport tool documentation.
# Backport

This will backport the following commits from `main` to `8.x`:

- [[Security Assistant] Vertex chat model (#193032)](#193032)

### Questions?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Steph Milovic <stephanie.milovic@elastic.co>
Summary
Works towards addressing #189771 by changing the Security Assistant over from `ActionsClientGeminiChatModel` to `ActionsClientChatVertexAI`.

Adds a new chat model `ActionsClientChatVertexAI` that extends `ChatVertexAI`. This is the model meant to be used with Gemini JSON auth. Our current Gemini chat model (`ActionsClientGeminiChatModel`) extends `ChatGoogleGenerativeAI`, which does not support the same authentication methods as we use with Gemini. Additionally, `ChatVertexAI` uses the proper request body format, while `ChatGoogleGenerativeAI` uses something close that puts the system prompt as a user message rather than in the appropriate `systemInstruction` property. Moving the system prompt to the proper field makes a big difference in result quality.

Prompt improvements

Thanks to help from @afirstenberg, we have a shiny new system prompt for Gemini that is working much better for us. The prompt hammers home how great tool use is, and reinforces behavior with positive statements rather than negative ones. I also applied this positive-reinforcement strategy to the Gemini prompt in `generate_chat_title.ts` and the prompt for the `nl-to-esql-tool`.

User prompt
The strategy Allen suggested also includes prepending a "user prompt" to the last user message in the conversation. The "user prompt" is not saved in persistent history; when the conversation history is sent with a follow-up question, the "user prompt" appears only in the last user message sent to Gemini.
You can see an example of the "user prompt" in place within a conversation on this trace. In this trace we see the system prompt at the top as expected, then the conversation history, then only the most recent message has the user prompt prepended:

LangSmith tests
I have my own tests that I run with the assistant, and then tests with the evaluator. In my tests, the first row is the old chat model `ActionsClientGeminiChatModel`. The next two rows are `ActionsClientChatVertexAI` without streaming and `ActionsClientChatVertexAI` with streaming. `ActionsClientChatVertexAI` successfully passes each test, and outperforms `ActionsClientGeminiChatModel` in some cases.

Code
💚 = Improvement
💛 = Same
❌ = Error Result
ESQL

KnowledgeBaseTool

AlertsCountTool

OpenAndAcknowledgedAlertsTool

I ran the ES|QL Generation Regression dataset against the new chat model. There is a significant boost in correctness, and the Vertex model is indeed more regularly invoking the tools than the previous model. We are also not hitting the `GraphRecursionError` we see getting hit with the previous model. The following are screenshots of the ES|QL Generation Regression for each model using `gemini-1.5-pro-002`. You see significant improvements in Vertex.

ES|QL Generation Regression dataset
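For context, LangGraph aborts a run when a graph exceeds its configured step budget (`recursionLimit`), surfacing a `GraphRecursionError`. The toy loop below shows the same guard pattern; it is illustrative only and not LangGraph's actual implementation (`runAgentLoop` and `RecursionLimitError` are made up for this sketch).

```typescript
// Toy illustration of the recursion-limit guard behind errors like
// LangGraph's GraphRecursionError (not the library's actual code).
class RecursionLimitError extends Error {}

// `step` returns the next state, or null once the agent has a final answer.
function runAgentLoop(
  step: (state: string) => string | null,
  recursionLimit = 25
): string {
  let state = 'start';
  for (let i = 0; i < recursionLimit; i++) {
    const next = step(state);
    if (next === null) return state; // finished before the budget ran out
    state = next;
  }
  // A model that keeps calling tools forever hits this instead of spinning.
  throw new RecursionLimitError(`Exceeded recursionLimit of ${recursionLimit}`);
}
```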
LangSmith Playground
An advantage to extending `ChatVertexAI` is that, since it will work with our API `credentialsJson`, we can use the LangSmith playground to test prompts and iterate. To do so, select an `ActionsClientChatVertexAI` model run and hit the playground button.

Once in the playground, ensure VertexAI is the selected model and that you've entered our valid `credentialsJson` in the Secrets & API Keys area. Now you can iterate on the system prompt to ensure desired results.

To test
Select Gemini as your conversation connector. Run through the prompts from my tests above, or your own prompts that you've found challenge Gemini.