[Security Assistant] Vertex chat model#193032
Pinging @elastic/security-solution (Team: SecuritySolution)
```ts
    maxOutputTokens: DEFAULT_TOKEN_LIMIT,
  },
  ...(systemInstruction
    ? { system_instruction: { role: 'user', parts: [{ text: systemInstruction }] } }
```
@pgayvallet Was there a reason for adding `role`? I don't see it in the API docs. I think all we need is `parts`.
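Following that comment, the body construction with `role` dropped might look like the sketch below. `buildBody` is a hypothetical helper, and the `system_instruction` shape follows the Vertex AI `generateContent` request format; this is illustrative, not the connector's actual code.

```typescript
// Sketch of the request-body construction under discussion. Per the review
// comment, `role` is omitted since only `parts` appears in the API docs.
interface Part {
  text: string;
}

interface GeminiBody {
  system_instruction?: { parts: Part[] };
}

// Hypothetical helper: only attach system_instruction when a prompt exists.
function buildBody(systemInstruction?: string): GeminiBody {
  return {
    ...(systemInstruction
      ? { system_instruction: { parts: [{ text: systemInstruction }] } }
      : {}),
  };
}
```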
```ts
  "The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
  'You are an assistant that is an expert at using tools and Elastic Security, doing your best to use these tools to answer questions or follow instructions. It is very important to use tools to answer the question or follow the instructions rather than coming up with your own answer. Tool calls are good. Sometimes you may need to make several tool calls to accomplish the task or get an answer to the question that was asked. Use as many tool calls as necessary.';
const KB_CATCH =
```
```ts
export const GEMINI_SYSTEM_PROMPT =
  `ALWAYS use the provided tools, as they have access to the latest data and syntax.` +
  "The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
```
Could we rename to something like `GEMINI_MAIN_SYSTEM_PROMPT`?
peluja1012 left a comment
Reviewed the code with Steph and ran evaluations. Results for ES|QL generation with Gemini improved, while results for other models like gpt-4o and Sonnet 3.5 remained consistently high, so there are no regressions on other models. Evaluation results for custom knowledge improved for Gemini as well. Thanks for the great work, Steph!
💛 Build succeeded, but was flaky
Starting backport for target branches: 8.x
(cherry picked from commit aae8c50)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions? Please refer to the Backport tool documentation.
# Backport

This will backport the following commits from `main` to `8.x`:

- [[Security Assistant] Vertex chat model (#193032)](#193032)

### Questions?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Steph Milovic <stephanie.milovic@elastic.co>
Summary
Works towards addressing #189771 by changing the Security Assistant over from `ActionsClientGeminiChatModel` to `ActionsClientChatVertexAI`.

Adds a new chat model `ActionsClientChatVertexAI` that extends `ChatVertexAI`. This is the model meant to be used with Gemini JSON auth. Our current Gemini chat model (`ActionsClientGeminiChatModel`) extends `ChatGoogleGenerativeAI`, which does not support the same authentication methods as we use with Gemini. Additionally, `ChatVertexAI` uses the proper request body format, while `ChatGoogleGenerativeAI` uses something close that puts the system prompt as a user message rather than in the appropriate `systemInstruction` property. Moving the system prompt to the proper field makes a big difference in result quality.

Prompt improvements

Thanks to help from @afirstenberg, we have a shiny new system prompt for Gemini that is working much better for us. The prompt hammers home how great tool use is, and reinforces behavior with positive statements rather than negative ones. I also applied this positive-reinforcement strategy to the Gemini prompt in `generate_chat_title.ts` and the prompt for the `nl-to-esql-tool`.

User prompt
The strategy Allen suggested also includes prepending a "user prompt" to the last user message in the conversation. The "user prompt" is not saved in persistent history; when the conversation history is sent with a follow-up question, the "user prompt" appears only in the last user message sent to Gemini.
You can see an example of the "user prompt" in place within a conversation on this trace. In this trace we see the system prompt at the top as expected, then the conversation history, then only the most recent message has the user prompt prepended:

LangSmith tests
I have my own tests that I run with the assistant, and then tests with the evaluator. In my tests, the first row is the old chat model `ActionsClientGeminiChatModel`. The next two rows are `ActionsClientChatVertexAI` without streaming and `ActionsClientChatVertexAI` with streaming. `ActionsClientChatVertexAI` successfully passes each test, and outperforms `ActionsClientGeminiChatModel` in some cases.

Code
💚 = Improvement
💛 = Same
❌ = Error Result
ESQL

KnowledgeBaseTool

AlertsCountTool

OpenAndAcknowledgedAlertsTool

I ran the ES|QL Generation Regression dataset against the new chat model. There is a significant boost in correctness, and the Vertex model is indeed more regularly invoking the tools than the previous model. We are also not hitting the `GraphRecursionError` we see getting hit with the previous model. The following are screenshots of the ES|QL Generation Regression for each model using `gemini-1.5-pro-002`. You see significant improvements in Vertex.

ES|QL Generation Regression dataset
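For context, LangGraph aborts a run when a graph exceeds its configured step budget (`recursionLimit`), surfacing a `GraphRecursionError`. The toy loop below shows the same guard pattern; it is illustrative only and not LangGraph's actual implementation (`runAgentLoop` and `RecursionLimitError` are made up for this sketch).

```typescript
// Toy illustration of the recursion-limit guard behind errors like
// LangGraph's GraphRecursionError (not the library's actual code).
class RecursionLimitError extends Error {}

// `step` returns the next state, or null once the agent has a final answer.
function runAgentLoop(
  step: (state: string) => string | null,
  recursionLimit = 25
): string {
  let state = 'start';
  for (let i = 0; i < recursionLimit; i++) {
    const next = step(state);
    if (next === null) return state; // finished before the budget ran out
    state = next;
  }
  // A model that keeps calling tools forever hits this instead of spinning.
  throw new RecursionLimitError(`Exceeded recursionLimit of ${recursionLimit}`);
}
```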
LangSmith Playground
An advantage to extending `ChatVertexAI` is that, since it will work with our API `credentialsJson`, we can use the LangSmith playground to test prompts and iterate. To do so, select an `ActionsClientChatVertexAI` model run and hit the playground button.

Once in the playground, ensure VertexAI is the selected model and that you've entered our valid `credentialsJson` in the Secrets & API Keys area. Now you can iterate on the system prompt to ensure desired results.

To test
Select Gemini as your conversation connector. Run through the prompts from my tests above, or your own prompts that you've found challenge Gemini.