
[Security Assistant] Vertex chat model #193032

Merged

stephmilovic merged 32 commits into elastic:main from stephmilovic:vertex_chat
Oct 4, 2024

Conversation


@stephmilovic stephmilovic commented Sep 16, 2024

Summary

Works toward addressing #189771 by switching the Security Assistant from ActionsClientGeminiChatModel to ActionsClientChatVertexAI.

Adds a new chat model, ActionsClientChatVertexAI, that extends ChatVertexAI. This is the model meant to be used with Gemini JSON auth. Our current Gemini chat model (ActionsClientGeminiChatModel) extends ChatGoogleGenerativeAI, which does not support the authentication methods we use with Gemini. Additionally, ChatVertexAI uses the proper request body format, while ChatGoogleGenerativeAI uses something close that puts the system prompt in a user message rather than in the dedicated systemInstruction property. Moving the system prompt to the proper field makes a big difference in result quality.
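The request-body difference can be sketched as follows. This is an illustrative comparison of the two payload shapes, not code from this PR; the prompt strings are made up:

```typescript
// Illustrative sketch (not this PR's code) of the two request-body shapes.
const systemPrompt = 'You are an expert Elastic Security assistant.'; // made-up prompt
const userQuestion = 'How many open alerts do I have?';

// ChatVertexAI: the system prompt lives in the dedicated systemInstruction field.
const vertexBody = {
  systemInstruction: { parts: [{ text: systemPrompt }] },
  contents: [{ role: 'user', parts: [{ text: userQuestion }] }],
};

// ChatGoogleGenerativeAI (approximate): the system prompt rides along as an
// extra user turn inside contents instead of a dedicated field.
const generativeBody = {
  contents: [
    { role: 'user', parts: [{ text: systemPrompt }] },
    { role: 'user', parts: [{ text: userQuestion }] },
  ],
};

console.log('systemInstruction' in vertexBody); // true
console.log(generativeBody.contents.length); // 2: system prompt mixed into the turns
```

Keeping the system prompt in its own field means the model treats it as standing instructions rather than as one more user turn, which is the quality difference described above.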

Prompt improvements

Thanks to help from @afirstenberg, we have a shiny new system prompt for Gemini that is working much better for us. The prompt hammers home how great tool use is, and reinforces behavior with positive statements rather than negative. I also applied this positive reinforcement strategy to the Gemini prompt in generate_chat_title.ts and the prompt for the nl-to-esql-tool.

User prompt

The strategy Allen suggested also includes prepending a "user prompt" to the last user message in the conversation. The "user prompt" is not saved in persistent history, so when the conversation history is sent with a follow-up question, the "user prompt" appears only in the last user message sent to Gemini.
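The prepend-to-the-last-user-message-only behavior can be sketched like this. The helper, types, and prompt text below are hypothetical illustrations, not this PR's actual implementation:

```typescript
// Hypothetical sketch: prepend an ephemeral "user prompt" to the final user
// message only, leaving the stored conversation history untouched.
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

// Illustrative prefix text, not the real prompt.
const USER_PROMPT_PREFIX = 'Remember to use the available tools.\n\n';

function withUserPrompt(history: Message[]): Message[] {
  const lastUserIdx = history.map((m) => m.role).lastIndexOf('user');
  if (lastUserIdx === -1) return history;
  // Return copies so the persisted history is never mutated.
  return history.map((m, i) =>
    i === lastUserIdx ? { ...m, content: `${USER_PROMPT_PREFIX}${m.content}` } : m
  );
}

const history: Message[] = [
  { role: 'user', content: 'Summarize my alerts.' },
  { role: 'assistant', content: 'Here is a summary of your alerts.' },
  { role: 'user', content: 'Which host is most affected?' },
];

const outgoing = withUserPrompt(history);
console.log(outgoing[2].content.startsWith(USER_PROMPT_PREFIX)); // true
console.log(history[2].content.startsWith(USER_PROMPT_PREFIX)); // false: history untouched
```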

You can see an example of the "user prompt" in place within a conversation on this trace. In this trace we see the system prompt at the top as expected, then the conversation history, then only the most recent message has the user prompt prepended:
[Screenshot: trace showing the user prompt prepended only to the most recent message]

LangSmith tests

I have my own tests that I run with the assistant, and then tests with the evaluator. In my tests, the first row is the old chat model ActionsClientGeminiChatModel. The next 2 rows are ActionsClientChatVertexAI without streaming and ActionsClientChatVertexAI with streaming. ActionsClientChatVertexAI successfully passes each test, and outperforms ActionsClientGeminiChatModel in some cases.

Code

💚 = Improvement
💛 = Same
❌ = Error Result

| ESQLKnowledgeBaseTool | AlertsCountTool | OpenAndAcknowledgedAlertsTool | Alert Summary | Remediate Conversation p1 | p2 | p3 |
| --- | --- | --- | --- | --- | --- | --- |
| ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | Could not complete as previous step errored |
| 💚 ActionsClientChatVertexAI | 💛 ActionsClientChatVertexAI | 💚 ActionsClientChatVertexAI | 💛 ActionsClientChatVertexAI | 💛 ActionsClientChatVertexAI | 💚 ActionsClientChatVertexAI | 💚 ActionsClientChatVertexAI |
| 💚 ActionsClientChatVertexAI streaming | 💛 ActionsClientChatVertexAI streaming | 💚 ActionsClientChatVertexAI streaming | 💛 ActionsClientChatVertexAI streaming | 💛 ActionsClientChatVertexAI streaming | 💚 ActionsClientChatVertexAI streaming | 💚 ActionsClientChatVertexAI streaming |

I ran the ES|QL Generation Regression dataset against the new chat model. There is a significant boost in correctness, and the Vertex model does invoke the tools more regularly than the previous model. We also no longer hit the GraphRecursionError we saw with the previous model. The following are screenshots of the ES|QL Generation Regression run for each model using gemini-1.5-pro-002; the Vertex results show significant improvements.

ES|QL Generation Regression dataset

[Screenshot: ES|QL Generation Regression results for each model]

LangSmith Playground

An advantage of extending ChatVertexAI is that, because it works with our API credentialsJson, we can use the LangSmith playground to test prompts and iterate. To do so, select an ActionsClientChatVertexAI model run and hit the playground button:
[Screenshot: playground button on an ActionsClientChatVertexAI model run]

Once in the playground, ensure VertexAI is the selected model and that you've entered our valid credentialsJson in the Secrets & API Keys area. You can then iterate on the system prompt to ensure the desired results.
[Screenshot: LangSmith playground with VertexAI selected]

To test

Select Gemini as your conversation connector. Run through the prompts from my tests above, or your own prompts that you've found challenged Gemini.

@stephmilovic stephmilovic added the release_note:skip (Skip the PR/issue when compiling release notes), Team: SecuritySolution (Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc.), Team:Security Generative AI, and v8.16.0 labels Sep 16, 2024
@stephmilovic stephmilovic marked this pull request as ready for review September 17, 2024 00:40
@stephmilovic stephmilovic requested a review from a team as a code owner September 17, 2024 00:40
@elasticmachine

Pinging @elastic/security-solution (Team: SecuritySolution)

@stephmilovic stephmilovic requested review from a team as code owners October 3, 2024 17:19
maxOutputTokens: DEFAULT_TOKEN_LIMIT,
},
...(systemInstruction
? { system_instruction: { role: 'user', parts: [{ text: systemInstruction }] } }
Contributor Author
@pgayvallet Was there a reason for adding role? I don't see it in the API docs; I think all we need is parts.

"The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
'You are an assistant that is an expert at using tools and Elastic Security, doing your best to use these tools to answer questions or follow instructions. It is very important to use tools to answer the question or follow the instructions rather than coming up with your own answer. Tool calls are good. Sometimes you may need to make several tool calls to accomplish the task or get an answer to the question that was asked. Use as many tool calls as necessary.';
const KB_CATCH =
Contributor Author
I found that if I do not add this catch and Gemini gets empty knowledge base results from a user question, it will not try to answer. See the differences in these traces for before and after I added KB_CATCH.
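The idea behind KB_CATCH can be sketched like this. The string and helper below are illustrative assumptions, not the PR's actual code:

```typescript
// Hypothetical sketch of the KB_CATCH idea: when the knowledge base tool
// returns nothing, append an instruction so Gemini still attempts an answer
// rather than giving up. The prompt string and helper name are illustrative.
const KB_CATCH =
  'If the knowledge base returned no results, answer from your own knowledge and say that no relevant documents were found.';

function formatKbResult(docs: string[]): string {
  // Non-empty results pass through untouched; empty results get the catch instruction.
  return docs.length > 0 ? docs.join('\n') : `No knowledge base results found. ${KB_CATCH}`;
}

console.log(formatKbResult(['Favorite index: .alerts'])); // passes the doc through
console.log(formatKbResult([]).includes(KB_CATCH)); // true: empty results trigger the catch
```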

export const GEMINI_SYSTEM_PROMPT =
`ALWAYS use the provided tools, as they have access to the latest data and syntax.` +
"The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
Contributor
Could we rename this to something like GEMINI_MAIN_SYSTEM_PROMPT?


@peluja1012 peluja1012 left a comment


Reviewed the code with Steph and ran evaluations. Results for ES|QL generation for Gemini improved. Results for other models like gpt-4o and Sonnet 3.5 remained consistently high. So no regressions on other models. Evaluation results for custom knowledge improved for Gemini as well. Thanks for the great work, Steph!

@kibana-ci

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #68 / discover/esql discover esql view switch modal should not show switch modal when switching to a data view while a saved search is open

Metrics [docs]

Unknown metric groups

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/langchain | 1 | 2 | +1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/langchain | 1 | 2 | +1 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@stephmilovic stephmilovic merged commit aae8c50 into elastic:main Oct 4, 2024
@kibanamachine
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/11181020258

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 4, 2024
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Oct 4, 2024
# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security Assistant] Vertex chat model
(#193032)](#193032)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Steph Milovic <stephanie.milovic@elastic.co>
tiansivive pushed a commit to tiansivive/kibana that referenced this pull request Oct 7, 2024

Labels

- release_note:skip — Skip the PR/issue when compiling release notes
- Team:Security Generative AI — Security Generative AI
- Team: SecuritySolution — Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc.
- v8.16.0
- v9.0.0
