Add support for OpenAI built-in web search tool #227

Merged
twang849 merged 13 commits into supercog-ai:main from twang849:new-tool
Jun 20, 2025

Conversation

@twang849
Contributor

@twang849 twang849 commented Jun 11, 2025

What

  • Added web search tool "openai_websearch.py"
  • Added tool to init.py so that it can be imported as "OpenAIWebSearchTool"
  • Modified the OSS Deep Researcher agent to utilize this tool instead of Tavily

Closes #212

Contributor

@drbrady8800 drbrady8800 left a comment

This looks good. I wouldn't rely on the LLM to call launch model before it tries a web search (it is never called in deep research, so every call fails). I would just create the client each time you perform the web search.
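
A minimal sketch of the suggestion above, assuming a tool class with one search method. The names `WebSearchTool`, `perform_web_search`, `client_factory`, and `FakeClient` are hypothetical and are not taken from the PR; the stand-in client exists only so the sketch runs without network access.

```python
from typing import Callable, Protocol


class SearchClient(Protocol):
    """Hypothetical interface for whatever API client the tool wraps."""

    def search(self, query: str) -> str: ...


class WebSearchTool:
    """Illustrative tool; not the PR's actual implementation."""

    def __init__(self, client_factory: Callable[[], SearchClient]):
        # client_factory stands in for the code that builds the real client.
        self._client_factory = client_factory

    def perform_web_search(self, query: str) -> str:
        # Build the client inside the call, so the search never depends on
        # the LLM having invoked a separate "launch" step first.
        client = self._client_factory()
        return client.search(query)


class FakeClient:
    """Stand-in client so the sketch runs offline."""

    def search(self, query: str) -> str:
        return f"results for: {query}"


tool = WebSearchTool(client_factory=FakeClient)
print(tool.perform_web_search("agent frameworks"))
```

Constructing the client per call costs a little extra setup each time, but it removes the ordering dependency that caused every call to fail in deep research.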

@twang849
Contributor Author

Thanks, makes sense, I will make the change.

@twang849 twang849 requested a review from drbrady8800 June 16, 2025 22:33
Contributor

@drbrady8800 drbrady8800 left a comment

Looks good; there are a couple of issues with naming and passing args through. Try to test deep research to make sure it works fully while you test. That is our most complex agent, so if it works, you can be reasonably sure simpler agents will too. The source deduplication is also not working. Sources are embedded in this format: text. We should parse for those and use the title / URL. For testing, let's use gpt-4o-mini-search-preview; it is much cheaper.

@twang849
Contributor Author

@drbrady8800 I fixed a number of the issues that you commented on in the code. For the sources, I replaced the _deduplicate_and_format_sources function with format_sources, which removes the embedded links and places them into a dictionary variable called sources. However, I realized that the model doesn't need the code to sort out repeated sources, so right now that function is not called in query_web_content. Please let me know what you think.
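
A rough sketch of what a helper like format_sources could look like. The exact embedded-link format is not shown in this thread, so the sketch assumes Markdown inline links of the form `[title](url)`; the regular expression, the return type, and the choice to key the dictionary by URL are all assumptions. The `pattern.sub(replacer, content)` shape mirrors the snippet quoted later in the review.

```python
import re
from typing import Dict, Tuple

# Assumption: sources are embedded as Markdown inline links, "[title](url)".
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def format_sources(content: str) -> Tuple[str, Dict[str, str]]:
    """Strip embedded links from content and collect them as {url: title}."""
    sources: Dict[str, str] = {}

    def replacer(match: "re.Match[str]") -> str:
        title, url = match.group(1), match.group(2)
        # Keep the first title seen for each URL; repeated URLs collapse.
        sources.setdefault(url, title)
        # Leave only the link text in the body.
        return title

    formatted_content = LINK_PATTERN.sub(replacer, content)
    return formatted_content, sources


text = "See [Docs](https://example.com/docs) and [Docs again](https://example.com/docs)."
body, found = format_sources(text)
print(body)
print(found)
```

Because the function holds no state, it can live in a shared utility module and be called without instantiating the tool.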

@twang849 twang849 requested a review from drbrady8800 June 17, 2025 22:35
Contributor

@drbrady8800 drbrady8800 left a comment

I am still seeing issues with the deep researcher:
[image]

The sources are never cited. I would either figure out a way to have the sources cited correctly (this might be tough because they are enumerated by section and then all sections are appended together) or not modify the deep researcher at all.
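
One possible way to handle per-section numbering, sketched under heavy assumptions: each section is taken to be a pair of text containing `[n]` markers and a list of that section's source URLs, in citation order. That data shape, the marker syntax, and the name `merge_sections` are hypothetical; this is not how the deep researcher is known to work.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical shape: (text with "[n]" markers, that section's source URLs).
Section = Tuple[str, List[str]]


def merge_sections(sections: List[Section]) -> Tuple[str, List[str]]:
    """Append sections and renumber local [n] markers into one global list."""
    global_index: Dict[str, int] = {}
    merged_parts: List[str] = []

    for text, local_sources in sections:
        def renumber(match, local_sources=local_sources):
            local_n = int(match.group(1))
            if not 1 <= local_n <= len(local_sources):
                # Unknown marker: leave it untouched rather than guess.
                return match.group(0)
            url = local_sources[local_n - 1]
            # First sighting of a URL gets the next global number.
            number = global_index.setdefault(url, len(global_index) + 1)
            return f"[{number}]"

        merged_parts.append(re.sub(r"\[(\d+)\]", renumber, text))

    # Only sources that were actually cited appear in the global list.
    ordered_sources = sorted(global_index, key=global_index.get)
    return "\n\n".join(merged_parts), ordered_sources


report, all_sources = merge_sections([
    ("A [1] B [2]", ["u1", "u2"]),
    ("C [1] D [2]", ["u2", "u3"]),
])
print(report)
print(all_sources)
```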

return replacement

formatted_content = pattern.sub(replacer, content)
print(str(formatted_content) + "\n\n" + str(sources))
Contributor

Remove this print

@twang849
Contributor Author

twang849 commented Jun 18, 2025

@drbrady8800 Okay, I will take a look at it again. Also, when you say that format_sources should be a util function, do you mean prefixing it with an underscore or something else entirely? Thanks

@drbrady8800
Contributor

Also, when you say that format_sources should be a util function, do you mean prefixing it with an underscore or something else?

I could see an argument for either, but I would suggest just putting it into a util file (probably the one in the tools directory). That way it can be referenced without importing the tool itself. Since it is not stateful at all, I think that makes the most sense.

@twang849
Contributor Author

twang849 commented Jun 18, 2025

@drbrady8800 Okay, I moved the helper function to tools.utils.registry; let me know if that is what you meant. I also changed it so that the sources are deduplicated and listed at the end of each web search, as they were with the Tavily tool. It seems to be working fine now: I've run it three times and it has correctly shown sources at the end each time. There is a weird bug, though, where the Introduction is shown after the Conclusion. I'm not sure whether this has something to do with the new tool.
[Screenshots: Capture1, Capture2, Capture3]
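
A small sketch of deduplicating sources and listing them at the end of a search result, as described above. The function name `append_sources`, the `(title, url)` input shape, and the "Sources:" layout are assumptions made for illustration; the PR's actual output format is not shown in this thread.

```python
from typing import Iterable, List, Set, Tuple


def append_sources(content: str, sources: Iterable[Tuple[str, str]]) -> str:
    """Return content followed by a deduplicated source list.

    Duplicates are detected by exact URL match; the first title wins.
    """
    seen: Set[str] = set()
    lines: List[str] = []
    for title, url in sources:
        if url in seen:
            continue  # skip repeated URLs, keeping first-seen order
        seen.add(url)
        lines.append(f"- {title}: {url}")

    if not lines:
        # Nothing to cite: return the body unchanged.
        return content
    return content + "\n\nSources:\n" + "\n".join(lines)


result = append_sources(
    "Body",
    [("A", "http://a"), ("A2", "http://a"), ("B", "http://b")],
)
print(result)
```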

Contributor

@drbrady8800 drbrady8800 left a comment

Looks great! Thanks for bearing with all the changes; excited to have native OpenAI in here. If you could just move the util and remove the print statement, we should be good!

@twang849
Contributor Author

@drbrady8800 Awesome - just committed those changes!

@twang849 twang849 merged commit b9f6783 into supercog-ai:main Jun 20, 2025
2 checks passed
@twang849 twang849 deleted the new-tool branch June 20, 2025 15:28
ritzz26 pushed a commit that referenced this pull request Jun 23, 2025
### What
- Added web search tool "openai_websearch.py"
- Added tool to init.py so that it can be imported as "OpenAIWebSearchTool"
- Created new util py file called text_parsing
- Modified the OSS Deep Researcher agent to utilize this tool instead of Tavily