Add OneDrive data connector #30

Merged
nickscamara merged 3 commits into firecrawl:main from mogery:mog/onedrive
Apr 23, 2024

@mogery
Member

@mogery mogery commented Apr 5, 2024

Fixes #28

/claim #28

This PR adds a OneDrive data connector. It works largely by converting complex file formats to PDF via the OneDrive API and then parsing the result with LlamaParse. The following file formats are parsed into documents:

  • Word (.docx, .doc, .dot)
  • PowerPoint (.pptx, .ppt, .pps, .pot, .ppa)
  • Excel (.xls, .xlt, .xla, .xlsx, .xlsm)
  • OpenOffice documents (.odt)
  • OpenOffice presentations (.odp)
  • OpenOffice spreadsheets (.ods)
  • Saved e-mail formats (.eml, .msg)
  • Markdown (.md)
  • EPub (.epub)
  • PDF (.pdf)
  • Plaintext (.txt)
  • Rich Text Files (.rtf)
  • HTML (.html, .htm)
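
To illustrate the convert-then-parse flow described above, here is a minimal sketch, not the code from this PR. It decides how a file would be handled from its extension and builds the Microsoft Graph URL that requests a PDF rendition (`/content?format=pdf`). The names `planFor`, `extensionOf`, `pdfConversionUrl`, and `ParsePlan` are hypothetical, and the split between "convert", "already a PDF", and "read as text" is an assumption; the PR may route individual formats differently.

```typescript
// Hypothetical handling plans; the actual connector may use different categories.
type ParsePlan = "convert-then-parse" | "parse-pdf" | "read-as-text" | "skip";

// Extensions from the PR description, minus .pdf and .txt, which this sketch
// assumes need no PDF conversion step.
const CONVERTIBLE: ReadonlySet<string> = new Set([
  "docx", "doc", "dot",
  "pptx", "ppt", "pps", "pot", "ppa",
  "xls", "xlt", "xla", "xlsx", "xlsm",
  "odt", "odp", "ods",
  "eml", "msg",
  "md", "epub", "rtf",
  "html", "htm",
]);

// Returns the lowercase extension without the dot, or "" if there is none.
function extensionOf(filename: string): string {
  const dot = filename.lastIndexOf(".");
  return dot <= 0 ? "" : filename.slice(dot + 1).toLowerCase();
}

// Picks a plan for a file based only on its extension.
function planFor(filename: string): ParsePlan {
  const ext = extensionOf(filename);
  if (ext === "pdf") return "parse-pdf"; // assumption: sent to the parser as-is
  if (ext === "txt") return "read-as-text"; // assumption: no parser needed
  return CONVERTIBLE.has(ext) ? "convert-then-parse" : "skip";
}

// Builds the Microsoft Graph request URL for a PDF rendition of a drive item.
// The /me/drive prefix is an assumption; shared or site drives use other prefixes.
function pdfConversionUrl(itemId: string): string {
  const id = encodeURIComponent(itemId);
  return `https://graph.microsoft.com/v1.0/me/drive/items/${id}/content?format=pdf`;
}
```

A caller would fetch `pdfConversionUrl(item.id)` with a bearer token, follow the redirect to the converted file, and hand the resulting bytes to the PDF parser. Keeping the extension check separate from the network call makes the routing easy to test in isolation.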

@nickscamara
Member

nickscamara commented Apr 14, 2024

sorry about the wait @mogery - ccing @ericciarla on this one!

@nickscamara nickscamara requested a review from ericciarla April 14, 2024 01:49
@mogery
Member Author

mogery commented Apr 16, 2024

any updates on this?

@nickscamara nickscamara merged commit c26abf8 into firecrawl:main Apr 23, 2024
@nickscamara
Member

Hey @mogery, so sorry, I don't think Eric had time to review. I just took a look and it looks great! Thank you, and sorry for the wait again!

@nickscamara
Member

@ericciarla can you check and make sure @mogery received the bounty? I am not sure if it triggered, because I was the one who closed the PR.

@mogery
Member Author

mogery commented Apr 23, 2024

> I just took a look and looks great! Thank you and sorry for the wait again!

Thanks & no worries about the wait :)

> can you check and make sure @mogery received the bounty? I am not sure if it triggered because I was the one that closed the pr?

I have not. Eric might have to go on the dashboard and confirm.

@mogery mogery deleted the mog/onedrive branch April 23, 2024 06:30
@mogery
Member Author

mogery commented Apr 23, 2024

Oh. I did not claim the bounty in my PR. 🤦🏻

EDIT: added it

Development

Successfully merging this pull request may close these issues.

Add Microsoft OneDrive as a data connector