Tao Yu (@taoyds) / X

Tao Yu

@taoyds

@XLangNLP lab, asst. prof. @HKUniversity. author of OpenCUA, OSWorld, Aguvis, Spider, OpenAgents, Text2Reward, Instructor.

Seattle

Joined March 2016

Tao Yu
@taoyds
Oct 14, 2022
A new way to work w. LMs! Binder, an easy neuro-symbolic paradigm: 1. Parse input ➡️ SQL/Python bound w. GPT3 Codex API calls 2. Codex + PL interpreter execute ➡️ answer. No training & few-shot! ➡️ SOTA 🆚 chain-of-thought: interpretable & robust ⬆️ 🆚 NL2Code: coverage ⬆️ lm-code-binder.github.io
Tao Yu
@taoyds
Nov 22, 2022
💥New benchmark💥 ds1000-code-gen.github.io DS-1000, a data science code generation benchmark with 1K questions about 7🐍libraries. Spent ~1200 expert hours! It is the only one that 1⃣ focuses on everyday applications 2⃣ includes natural intents & contexts 3⃣has test cases 1/🧵
Tao Yu
@taoyds
Oct 12, 2023
🚀🚀🚀Lots of people working on LM agents recently! Open models like Llama/CodeLlama not quite up to ChatGPT's level? Our 🎉Lemur🎉- SOTA open foundation models for language agents, matching ChatGPT on🤖15 agent tasks🤖! arxiv.org/abs/2310.06830 github.com/OpenLemur/Lemur
Yiheng Xu
@yihengxu_
Oct 12, 2023
1/ 🧵 🎉 Introducing Lemur-70B & Lemur-70B-Chat: 🚀Open & SOTA Foundation Models for Language Agents! The closest open model to GPT-3.5 on 🤖15 agent tasks🤖! 📄Paper: arxiv.org/abs/2310.06830 🤗Model @huggingface : huggingface.co/OpenLemur More details 👇
Tao Yu
@taoyds
Jan 28, 2022
📣UnifiedSKG: Lots of #NLProc researchers separately study tasks that link text to structured knowledge (Table/DB/KB...). We unify 21 such tasks into a Seq2Seq format with T5 to foster idea sharing & multitasking, with very competitive performance! Paper&Code: github.com/hkunlp/unified… 👇
Tao Yu
@taoyds
Oct 22, 2024
🍅Excited to see @AnthropicAI using 🚀our OSWorld🚀(NeurIPS'24) to benchmark computer use! 🍋OSWorld will soon support parallel cloud running, much faster! 🍓More multimodal agent open-source big projects coming soon from @XLangNLP in Nov- stay tuned! 👇os-world.github.io
Anthropic
@AnthropicAI
Oct 22, 2024
Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use. Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.
Tao Yu
@taoyds
Mar 30, 2023
In Memory of My beloved Ph.D. Advisor @dragomir_radev 🕯️R.I.P. 🕯️
Harlan Krumholz
@hmkyale
Mar 30, 2023
The #AI community, the #computerscience community, the @YaleSEAS community, and humanity have suddenly lost a remarkable person, @dragomir_radev - kind and brilliant, devoted to his family and friends... gone too soon. A sad day @Yale @YINSedge @YaleCompsci #NLP2023
Tao Yu
@taoyds
Oct 17, 2023
Beyond our Lemur: OPEN LMs for language agents Introducing 💥OpenAgents💥: an OPEN platform for language agents in the wild! Analyze data, call plugins, control your browser as ChatGPT Plus, but with OPEN SOURCE code!! 📑: arxiv.org/abs/2310.10634 Code: github.com/xlang-ai/OpenA…
Zhoujun (Jorge) Cheng
@ChengZhoujun
Oct 17, 2023
💥OpenAgents💥: an OPEN platform for language agents in the wild Analyze data, call plugins, control your browser as ChatGPT Plus, but with OPEN Code for 1⃣Easy deployment 2⃣Full stack 3⃣Chat Web UI 4⃣Agent methods 5⃣… arxiv.org/abs/2310.10634 Code: github.com/xlang-ai/OpenA… 👇
Tao Yu
@taoyds
Aug 10, 2023
After 5 months of dedicated work from >15 researchers & developers, we're thrilled to introduce 🚀OPEN-SOURCE language model Agents🚀! Try demos: chat.xlang.ai 🥑 Stay tuned for open-source code, model, framework, evaluation & more at github.com/xlang-ai!
XLANG NLP Lab
@XLangNLP
Aug 10, 2023
1/6🚀Announcing XLang language model (LM) Agents: 📊Data Agent: LM + code & data tools 🔧Plugins Agent: LM + 200+ API plugins 🌐Web Agent: LM + web control Try demo: chat.xlang.ai Stay tuned for open-source code & models github.com/xlang-ai See more examples!👇
Tao Yu
@taoyds
Dec 8, 2024
Text-to-SQL has been my passion since Yale Spider 1.0! But as LLMs master it, real-world complexity demands more. 🚀After a year of work, Spider 2.0 shows the gap: o1 achieves just 17%! The path to production deployment is still long but exciting! more👉spider2-sql.github.io
XLANG NLP Lab
@XLangNLP
Dec 8, 2024
🎉Announcing Spider 2.0 Text-to-SQL challenge in the LLM era! 6 years after our Yale Spider 1.0, we're pushing it forward with: 🍊Real complex cloud DBs (3000+ cols) 🍋Multi-dialect SQL complexity 🍎Agentic coding workflows 🧐Best o1 only solves 17%! 👉spider2-sql.github.io
Tao Yu
@taoyds
Apr 12, 2024
🚀Multimodal agents are on the rise in 2024! But even building an app/domain-specific agent env is hard😰. Our real computer OSWorld env allows you to define agent tasks about arbitrary apps on diff. OS w.o. crafting new envs. 🧐Benchmarked #VLMs on 369 OSWorld tasks: #GPT4V >> #Claude3
Tianbao Xie
@TianbaoX
Apr 12, 2024
🤔Can we assess agents across various apps & OS w.o. crafting new envs? OSWorld🖥️: A unified, real computer env for multimodal agents to evaluate open-ended computer tasks with arbitrary apps and interfaces on Ubuntu, Windows, & macOS. + annotated 369 real-world computer tasks
Tao Yu
@taoyds
Feb 16, 2024
🚀Instructor🚀embeddings recently hit 2M downloads on @huggingface! Now, excited to introduce 🚀GritLM🚀, the first SINGLE LM achieving SoTA in BOTH text embedding (MTEB) & generative tasks (BBH etc)! Great team effort w. @Muennighoff & @hongjin_su! 📰: arxiv.org/abs/2402.09906👇
Niklas Muennighoff
@Muennighoff
Feb 16, 2024
Introducing GRIT🦾to unify text embedding 🔢& generation 📝. GritLM is open SoTA on embedding (MTEB) & generative tasks (BBH etc.) – Both in 1 model. See 🧵for how GRIT🦾 makes RAG >60% faster & more 📜arxiv.org/abs/2402.09906 💻github.com/ContextualAI/g… 1/12
Tao Yu
@taoyds
Jan 23, 2025
When working on OSWorld (os-world.github.io), we drew inspiration from Universe (openai.com/index/universe). While OpenAI dropped the idea (maybe too early), we persisted despite similar concerns, driven by our passion. 🚀Glad to see OpenAI return (openai.com/index/computer…)!
OpenAI
@OpenAI
Jan 23, 2025
Introduction to Operator & Agents openai.com/index/introduc…
Tao Yu
@taoyds
Oct 15, 2022
📢📢 Play with our Binder demo: huggingface.co/spaces/hkunlp/…! Binder: an easy but SOTA neuro-symbolic paradigm built on GPT-3 Codex & a SQL/Python interpreter. Inject GPT-3 Codex prompt API calls into programming languages!
Tao Yu
@taoyds
Aug 13, 2021
Life update: Thrilled to join @HKUniversity🇭🇰as an asst. prof. and build the HKU #NLProc lab(nlp.cs.hku.hk) with @ikekong. We have multiple openings for PhD/RA👨‍🔬! Come and visit us if you’re ever in HK🏙! Also, I’ll spend a year at @uwnlp working with @nlpnoah & Mari!