[WorkplaceAI] SharePoint Online stack connector#248737

Merged
mattnowzari merged 33 commits into elastic:main from mattnowzari:sharepoint_online
Jan 29, 2026

Contributor

@mattnowzari mattnowzari commented Jan 12, 2026

Summary

This PR adds a SharePoint Online v2 stack connector.

The actions this connector provides are:

  • search (accepts a query and entity type to search)
  • getAllSites (retrieve all sites, potentially large return)
  • getSitePages (retrieve all pages of a given site)
  • getSitePageContents (get the content of a site page)
  • getSite (retrieve a single given site)
  • getSiteDrives (retrieve the drives of a single given site)
  • getDriveItems (retrieve all items of a single given drive)
  • downloadDriveItems (download the contents of a single drive item via a driveId and itemId)
  • downloadItemFromURL (download the contents of a single drive item via its downloadUrl)
  • getSiteLists (retrieve the lists of a single given site)
  • getSiteListItems (retrieve the items of a single given list of a single given site)
  • callGraphAPI (transparent action that allows an LLM with knowledge of the Microsoft Graph API to formulate and call endpoints at will. Intended to be a power user feature)

What this PR does not include

  • Data Source registration
  • Workflow YAML definitions

These will be provided in a fast-follow PR, as this one is quite sizeable already.

Checklist

Check the PR satisfies following conditions.

Reviewers should verify this PR satisfies this list as well.

Release Notes

  • A new connector for SharePoint Online has been added to the available list of connectors

Contributor

github-actions bot commented Jan 12, 2026

🔍 Preview links for changed docs

/**
* Connector output schemas (from connector specs v2)
*/
export const ConnectorSpecsOutputSchemas = new Map<string, Record<string, z.ZodSchema>>(
Contributor Author

Note to reviewers - This and the code in schema.ts were added to stop the Workflow YAML editor from giving warnings when stringing together the outputs of one SPO step with other steps.

Cards on the table, I kind of let Cursor take the reins on these particular schema code diffs. They don't seem to have any adverse side effects but if others think they'd be better served as a separate PR I can remove them.

@mattnowzari mattnowzari marked this pull request as ready for review January 22, 2026 18:58
@mattnowzari mattnowzari requested review from a team as code owners January 22, 2026 18:58
@mattnowzari mattnowzari added the backport:skip and release_note:feature labels Jan 22, 2026
Member

@florent-leborgne florent-leborgne left a comment

Leaving some suggestions to move the docs to a slightly different location of the Connectors section

Member

@seanstory seanstory left a comment

Nice stuff! Most of my comments are from knowing too much about sharepoint specifically.

};
}

const contentUrl = `${baseUrl}/content`;
Member

Should we try to use the /content URL first, before we return a link to a download URL?

This downloadUrl path feels misleading to me. I'd expect that to be the input to a downloadDriveItem tool, not the output.

Contributor Author

Asking some clarifying questions 😄

Should we try to use the /content URL first, before we return a link to a download URL?

Are you suggesting we should have a more automated behavior, where we attempt /content and simply return downloadUrl to the user if those fail (most likely bc it's a PDF or .docx or something)? Not entirely opposed to this if so, just want to make sure I understand. I have a lightly held opinion that it's not amazing if actions return two different types of data given a single input set - either doc contents or a URL - but again, it's lightly held.

I'd expect that to be the input to a downloadDriveItem tool, not the output.

The way I envisioned this working was that it'd accept a driveId and itemId as input, as they are outputs of other tools, then attempt to get the content in a format the LLM could work with, and then give the user the downloadUrl if it can't... which means, in its current form, the LLM would call this tool again with a different input set. I'm not sure how ideal that behavior is, either.

A big simplification would be to just have getDriveItems return the downloadUrl as part of its payload and then have this action only accept downloadUrls as input, but I'd have to test how well this works.

Hopefully I've understood what your concerns are, if not please course-correct me 🫡
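
To make the trade-off in this comment concrete, here is a minimal TypeScript sketch of a single action that returns either text or a URL. The `DriveItemResult` type, the `toDriveItemResult` helper, and the list of text-like MIME prefixes are all hypothetical and not taken from this PR. A tagged union is one possible way to keep the two result shapes distinguishable for a caller.

```typescript
// Hypothetical result type: one action, two possible shapes, told apart by `kind`.
type DriveItemResult =
  | { kind: 'text'; text: string }
  | { kind: 'downloadUrl'; downloadUrl: string };

// Assumption: MIME prefixes considered directly readable as text.
const TEXT_LIKE_PREFIXES: string[] = ['text/', 'application/json'];

function isTextLike(mimeType: string): boolean {
  return TEXT_LIKE_PREFIXES.some((prefix) => mimeType.startsWith(prefix));
}

// Prefer the text content when it is present and readable; otherwise fall back to the URL.
function toDriveItemResult(item: {
  mimeType: string;
  content?: string;
  downloadUrl: string;
}): DriveItemResult {
  if (item.content !== undefined && isTextLike(item.mimeType)) {
    return { kind: 'text', text: item.content };
  }
  return { kind: 'downloadUrl', downloadUrl: item.downloadUrl };
}
```

A caller would branch on `kind` before using the payload, which is the extra handling the "two different types of data" concern refers to.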

Member

Are you suggesting we should have a more automated behavior, where we attempt /content and simply return downloadUrl to the user if those fail

I haven't done any googling or investigation here. So take this with a grain of salt. But my concern is that we might have some responses where there's a content endpoint that can:

  1. give us the actual text contents

and there's a downloadUrl where we could:

  1. download a big file
  2. ship that big file somewhere else for processing
  3. get the actual text contents

If we can jump straight to the text, I want to make sure we do, to avoid unnecessary bandwidth, compute, and latency.

A big simplification would be to just have getDriveItems return the downloadUrl as part of its payload and then have this action only accept downloadUrls as input, but I'd have to test how well this works.

I think that could be a good piece of metadata to include. The flow I'd expect is:

  1. getMany (mostly metadata)
  2. LLM introspection (do any of these look promising?)
  3. fetchOneorSeveral (full contents, using metadata from 1 to limit the number of results)

If downloadUrl isn't included in 1, and another pass has to be done to get that metadata, this feels inefficient.
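
The three-step flow above can be sketched roughly as follows. This is only an illustration: `DriveItemMeta`, `selectPromising`, and `planFetches` are hypothetical names, and the keyword match is a stand-in for the LLM introspection step, not how an LLM would actually choose.

```typescript
// Hypothetical metadata shape returned by a getMany-style action (step 1).
interface DriveItemMeta {
  id: string;
  name: string;
  downloadUrl?: string; // carried in step 1 so no second metadata pass is needed
}

// Step 2 stand-in: a simple keyword match in place of real LLM introspection.
function selectPromising(items: DriveItemMeta[], keyword: string, limit: number): DriveItemMeta[] {
  const needle = keyword.toLowerCase();
  return items.filter((item) => item.name.toLowerCase().includes(needle)).slice(0, limit);
}

// Step 3 input: the URLs to fetch, skipping items that carry no downloadUrl.
function planFetches(selected: DriveItemMeta[]): string[] {
  const urls: string[] = [];
  for (const item of selected) {
    if (item.downloadUrl !== undefined) {
      urls.push(item.downloadUrl);
    }
  }
  return urls;
}

// Example listing with made-up values.
const listing: DriveItemMeta[] = [
  { id: '1', name: 'Q4 Budget.xlsx', downloadUrl: 'https://example.invalid/budget' },
  { id: '2', name: 'Team photo.png', downloadUrl: 'https://example.invalid/photo' },
  { id: '3', name: 'Budget notes.txt' },
];
const fetchPlan = planFetches(selectPromising(listing, 'budget', 5));
```

Because item `3` has no `downloadUrl`, it drops out of the plan, which is the situation where a second metadata pass would otherwise be required.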

Contributor Author

Ok, I think I got it:

  1. We should make sure to return downloadUrls as part of our getMany-style actions for drive items
  2. I'm going to create a new action (downloadItemFromURL or something) that can acquire the file via a downloadUrl, which we can hopefully use as an action for shipping more complicated formats elsewhere for processing
  3. I'm going to remove the downloadUrl code path from the current downloadDriveItem action. This existing action should just be our way of getting straight to text content. I'll also remove the base64 encoding format too.

Net result will be clearer definitions:

  • An action (downloadDriveItem) that can directly get text content
  • Another action (downloadItemFromURL) that can download a file via URL to potentially farm out for further processing

🤞 Hopefully I've understood the concerns and provided a good way forward!
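
As a rough sketch of how the two actions' inputs differ, assuming the public Microsoft Graph v1.0 base URL and its `/drives/{drive-id}/items/{item-id}/content` endpoint: `buildContentUrl` and `isAcceptableDownloadUrl` are hypothetical helpers for illustration, and the https-only check is an assumption rather than behavior confirmed by this PR.

```typescript
// Assumption: the public Microsoft Graph v1.0 base URL.
const GRAPH_BASE_URL = 'https://graph.microsoft.com/v1.0';

// downloadDriveItem-style input: identifiers produced by other actions.
function buildContentUrl(driveId: string, itemId: string): string {
  const drive = encodeURIComponent(driveId);
  const item = encodeURIComponent(itemId);
  return `${GRAPH_BASE_URL}/drives/${drive}/items/${item}/content`;
}

// downloadItemFromURL-style input: a hypothetical guard that only accepts
// absolute https URLs with a non-empty host.
function isAcceptableDownloadUrl(downloadUrl: string): boolean {
  return /^https:\/\/[^\s/]+(\/\S*)?$/.test(downloadUrl);
}
```

Keeping the two input shapes separate means each action returns one kind of result for one kind of input.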

Member

👍 this sounds like a reasonable path forward. Thanks!

Member

@florent-leborgne florent-leborgne left a comment

LGTM for docs, thanks!

Member

florent-leborgne commented Jan 27, 2026

Can we add a Team or Feature label so that it's categorized nicely from the start in the release notes? See https://stunning-adventure-qrvr1k2.pages.github.io/release-notes/kibana-pr-best-practices-for-dev/ (Elastic internal) Thanks!

.array(z.enum(['site', 'list', 'listItem', 'drive', 'driveItem']))
.optional()
.describe('Entity types to search'),
region: z
Contributor Author

@mattnowzari mattnowzari Jan 27, 2026

Note to reviewers - apparently region is required for Search after all :(

Image

It's the only place where it's required, so I've made it an optional input with NAM as the default, so an LLM can always pass it if necessary. This way, we don't have to ask for it at connector set-up time.

That being said, I don't feel terribly strongly about this, so if we want to revert to asking for a region at set-up time I can do so.
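
For illustration, a minimal sketch of how an optional region with a NAM default might flow into a search request. The body shape is an assumption based on the Microsoft Graph search API (`requests`, `entityTypes`, `query.queryString`, `region`); `buildSearchBody` and the default entity type are hypothetical, and the region codes are the ones mentioned in this thread.

```typescript
type SearchRegion = 'NAM' | 'EUR' | 'APC' | 'LAM' | 'MEA';
type SearchEntityType = 'site' | 'list' | 'listItem' | 'drive' | 'driveItem';

// Assumed request body shape for a Graph search query.
interface SearchRequestBody {
  requests: Array<{
    entityTypes: SearchEntityType[];
    query: { queryString: string };
    region: SearchRegion;
  }>;
}

// Hypothetical helper: region is optional for the caller and falls back to NAM.
function buildSearchBody(
  queryString: string,
  entityTypes: SearchEntityType[] = ['driveItem'],
  region: SearchRegion = 'NAM'
): SearchRequestBody {
  return { requests: [{ entityTypes, query: { queryString }, region }] };
}
```

With this shape, a caller that knows the tenant lives elsewhere can pass, for example, `'EUR'`, while everyone else gets the default without a connector-level setting.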

Member

Nice, I like this solution.

Contributor

We can have search region be an optional field, something like:

schema: z.object({
    region: z
      .enum(['NAM', 'EUR', 'APC', 'LAM', 'MEA'])
      .optional()
      .default('NAM')
      .describe(
        'Search region (NAM=North America, EUR=Europe, APC=Asia Pacific, LAM=Latin America, MEA=Middle East/Africa)'
      )
      .meta({
        label: 'Search region',
      }),
  }),

Feel free to leave this for a follow-up, and only if we're interested.

Member

Wait, sanity check: we're doing an SPO connector that does not run on behalf of the user, is that correct?

Contributor Author

That's correct I believe, it does not run on behalf of a user.

Member

@seanstory seanstory Jan 29, 2026

@artem-shelkovnikov , the "on behalf of the user" bit is blocked by #246655.

Once we merge that, we can simply change this line to

type: 'oauth_authorization_code_grant'

instead of

type: 'oauth_client_credentials'

and everything should "just work".

But that PR is probably going to be blocked for a bit. Hence the temporary alternative auth mechanism.
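
For context on what the temporary client-credentials mechanism implies, here is a hedged sketch of the standard OAuth 2.0 client-credentials token request against the Microsoft identity platform. It is not taken from this PR and may not match how the connector framework performs the exchange; `buildClientCredentialsTokenRequest` and the `TokenRequest` shape are hypothetical.

```typescript
interface TokenRequest {
  url: string;
  body: string; // application/x-www-form-urlencoded payload
}

// Hypothetical helper: builds the token request an app-only (not on behalf of a user)
// client would send. No user is involved, so permissions are application permissions.
function buildClientCredentialsTokenRequest(
  tenantId: string,
  clientId: string,
  clientSecret: string
): TokenRequest {
  const params: Array<[string, string]> = [
    ['grant_type', 'client_credentials'],
    ['client_id', clientId],
    ['client_secret', clientSecret],
    ['scope', 'https://graph.microsoft.com/.default'],
  ];
  const body = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  const url = `https://login.microsoftonline.com/${encodeURIComponent(tenantId)}/oauth2/v2.0/token`;
  return { url, body };
}
```

An authorization-code flow would instead involve a user sign-in and delegated permissions, which is why a switch of auth type may also affect permission setup.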

Member

I think it'll be a bit different in details - for example there's an endpoint in the PR that uses region param because it's needed by application permission requests.

We'd need to update some endpoints here and there + update the instructions for setting up the delegated permissions instead of application permissions

Member

@seanstory seanstory left a comment

LGTM. I know there's still plenty of follow-up work to do, but this PR is already quite large, so I'm supportive of getting it in.

@mattnowzari
Contributor Author

Hey there @elastic/workflows-eng 👋 - would appreciate a review of this PR whenever y'all have a moment to do so 🙏

@mattnowzari mattnowzari removed the request for review from a team January 29, 2026 17:36
@mattnowzari mattnowzari enabled auto-merge (squash) January 29, 2026 17:59
@mattnowzari mattnowzari merged commit 995c37b into elastic:main Jan 29, 2026
16 checks passed
@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #49 / Fleet Endpoints Integrations inputs_with_standalone_docker_agent "before all" hook for "generate a valid config for standalone agents"
  • [job] [logs] FTR Configs #21 / integrations For each artifact list under management "after all" hook in "For each artifact list under management"
  • [job] [logs] FTR Configs #21 / integrations For each artifact list under management "before all" hook in "For each artifact list under management"

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| dataSources | 96 | 98 | +2 |
| stackConnectors | 650 | 653 | +3 |
| workflowsManagement | 1256 | 1257 | +1 |
| total | | | +6 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| dataSources | 81.2KB | 85.5KB | +4.2KB |
| stackConnectors | 1.1MB | 1.1MB | +11.7KB |
| workflowsManagement | 1.4MB | 1.4MB | +7.5KB |
| total | | | +23.4KB |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| stackConnectors | 77.0KB | 77.0KB | +72.0B |

Unknown metric groups

async chunk count

| id | before | after | diff |
| --- | --- | --- | --- |
| dataSources | 11 | 12 | +1 |
| stackConnectors | 123 | 124 | +1 |
| total | | | +2 |

History

mbondyra added a commit to mbondyra/kibana that referenced this pull request Jan 30, 2026
…iew_cps

* commit '5f7fec57cb01883038810bd735a0666683b49904': (116 commits)
  [Security Solution][Attacks/Alerts][Setup and miscellaneous] Advanced setting to control feature visibility (elastic#250157) (elastic#250830)
  Fix synthtrace `fetch` usage (elastic#250950)
  [APM] Add Nodes and Edges components and selection logic (elastic#250937)
  [Docs] Update alerting-settings.md and add serverless value for one parameter (elastic#250842)
  [Agent Builder] filestore: initial implementation (elastic#250043)
  [CPS] Support CPS in Vega ESQL (elastic#250693)
  Adjustments to cascade document esql helpers (elastic#250560)
  [Security Solutions] Trial Companion - adds ai chat and elastic agent detectors (elastic#250908)
  [Obs Presentation] Code Scanning Alert Fixes (elastic#250858)
  [performance] add return and refresh render scenarios to dashboard journeys (elastic#250939)
  skip failing test suite (elastic#245458)
  Add Cloud Forwarder onboarding tile to O11y Solution (elastic#250325)
  [Traces] Remove APM unified trace waterall embeddable registration (elastic#250808)
  [Discover] [Metrics] Fix: metrics grid titles do not update on order change (elastic#250963)
  [a11y] Fix Eui modal title annoucment (elastic#250459)
  [Cloud Security] [Fleet] Add cloud connector access scope for input or package level credential definitions (elastic#250280)
  [WorkplaceAI] SharePoint Online stack connector (elastic#248737)
  [Response Ops][Task Manager] Update functions do not handle API key invalidation (elastic#249109)
  [Osquery] Remove @kbn/timelines-plugin dependency from osquery plugin (elastic#250055)
  [One Discover][Logs UX] Update OpenTelemetry Semantic Conventions (elastic#250346)
  ...
hannahbrooks pushed a commit to hannahbrooks/kibana that referenced this pull request Jan 30, 2026
## Summary

This PR adds a SharePoint Online `v2` stack connector.

The `actions` this connector provides are:

- search (accepts a query and entity type to search)
- getAllSites (retrieve all sites, potentially large return)
- getSitePages (retrieve all pages of a given site)
- getSitePageContents (get the content of a site page)
- getSite (retrieve a single given site)
- getSiteDrives (retrieve the drives of a single given site)
- getDriveItems (retrieve all items of a single given drive)
- downloadDriveItems (download the contents of a single drive item via a
driveId and itemId)
- downloadItemFromURL (download the contents of a single drive item via
its downloadUrl)
- getSiteLists (retrieve the lists of a single given site)
- getSiteListItems (retrieve the items of a single given list of a single
given site)
- callGraphAPI (transparent action that allows an LLM with knowledge of
the Microsoft Graph API to formulate and call endpoints at will.
Intended to be a power user feature)

## What this PR does not include
- Data Source registration
- Workflow YAML definitions

These will be provided in a fast-follow PR, as this one is quite
sizeable already.

## Checklist

Check the PR satisfies following conditions. 

Reviewers should verify this PR satisfies this list as well.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [x]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] The PR description includes the appropriate Release Notes section,
and the correct `release_note:*` label is applied per the
[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

## Release Notes
- A new connector for SharePoint Online has been added to the available
list of connectors

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Florent Le Borgne <florent.leborgne@elastic.co>
Labels

backport:skip, release_note:feature, Team:Workchat, v9.4.0

7 participants