John Nay (@johnjnay) / X

1,813 posts

founder & CEO of Norm Ai // founding CEO of Brooklyn AI (acquired by TIAA Nuveen) // more at linkedin.com/in/johnjnay/

nyc

law.stanford.edu/directory/john…

Joined December 2015

John Nay
@johnjnay
Mar 31, 2023
HuggingGPT
-Human requests something
-ChatGPT
1 Plans tasks
2 Selects AI models based on HuggingFace descriptions
3 Manages cooperation of expert models to execute subtasks
4 Summarizes results
Covers many sophisticated tasks across modalities & domains
arxiv.org/abs/2303.17580
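The four stages listed in this post can be sketched roughly as below. This is only an illustration and not the paper's implementation: the `MODELS` registry, the keyword-overlap selection rule, and the " then " task splitter are all hypothetical stand-ins for what an LLM would do at each stage.

```python
from typing import Dict, List

# Hypothetical registry of "expert models": each has a text description and a
# callable. These are stand-ins, not real HuggingFace models.
MODELS: Dict[str, dict] = {
    "captioner": {
        "description": "image caption describe picture",
        "run": lambda task: f"caption({task})",
    },
    "translator": {
        "description": "translate text language",
        "run": lambda task: f"translation({task})",
    },
}


def plan_tasks(request: str) -> List[str]:
    """Stage 1 stand-in: split the request into subtasks on ' then '."""
    return [part.strip() for part in request.split(" then ") if part.strip()]


def select_model(task: str) -> str:
    """Stage 2 stand-in: pick the model whose description shares the most words with the task."""
    words = set(task.lower().split())
    return max(
        MODELS,
        key=lambda name: len(words & set(MODELS[name]["description"].split())),
    )


def run_pipeline(request: str) -> str:
    """Stages 3 and 4 stand-in: run each subtask on its selected model, then join the results."""
    outputs = []
    for task in plan_tasks(request):
        name = select_model(task)
        outputs.append(f"{name}: {MODELS[name]['run'](task)}")
    return "; ".join(outputs)


summary = run_pipeline("describe this picture then translate the text")
```

In a real system each stand-in would be a prompt to the controller LLM, with model descriptions inserted into the prompt for the selection step.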
John Nay
@johnjnay
Apr 7, 2023
LLMs Are Better Than Human Data Annotators
-GPT-3 was helpful but not better than humans (e.g. arxiv.org/abs/2108.13487)
-GPT-3.5 is about on par w/ humans (e.g. arxiv.org/abs/2303.16854 w/ self-explanations)
-GPT-4 is better than $25/hr humans (e.g. arxiv.org/abs/2304.03279)
John Nay
@johnjnay
Jan 29, 2023
French researchers converted their tax code into computer code
Compiles to Python & provides insights about "essence" of France's income tax computations
Government is officially transitioning to this for production.
Paper: arxiv.org/abs/2011.07966
Code: github.com/MLanguage/mlang
John Nay
@johnjnay
Apr 10, 2023
LLMs as Generative Agents in Social Simulations
Augment LLM:
-Store record of agent's "life"
-Synthesize its memories into reflections
-Retrieve memories dynamically & plan
Simulations produce human-like individual behavior & emergent social interaction
arxiv.org/abs/2304.03442
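A minimal sketch of the store, reflect, and retrieve steps named above. The scoring rule (word overlap plus recency plus a hand-set importance value) and every name here are assumptions made for illustration; the post does not specify how retrieval is scored, and a real agent would call an LLM to write the reflection.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memory:
    text: str
    step: int          # logical time at which the memory was recorded
    importance: float  # hypothetical hand-set score between 0 and 1


@dataclass
class Agent:
    memories: List[Memory] = field(default_factory=list)
    clock: int = 0

    def observe(self, text: str, importance: float = 0.5) -> None:
        """Store one record of the agent's 'life'."""
        self.clock += 1
        self.memories.append(Memory(text, self.clock, importance))

    def retrieve(self, query: str, k: int = 2) -> List[str]:
        """Return the k memories with the highest combined score for the query."""
        query_words = set(query.lower().split())

        def score(m: Memory) -> float:
            relevance = len(query_words & set(m.text.lower().split()))
            recency = m.step / self.clock
            return relevance + recency + m.importance

        ranked = sorted(self.memories, key=score, reverse=True)
        return [m.text for m in ranked[:k]]

    def reflect(self, query: str) -> str:
        """Stand-in for an LLM reflection: join the top memories and store the result."""
        summary = "Reflection: " + " | ".join(self.retrieve(query))
        self.observe(summary, importance=0.9)
        return summary


agent = Agent()
agent.observe("ate breakfast with Maria", importance=0.2)
agent.observe("Maria is planning a party", importance=0.8)
agent.observe("watered the plants", importance=0.1)
top = agent.retrieve("Maria party")
reflection = agent.reflect("Maria party")
```

Storing the reflection back into the same memory list means later retrievals can surface it alongside raw observations.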
John Nay
@johnjnay
Feb 1, 2023
AI research output from prolific institutions:
- Google
- Microsoft
- Stanford
- Meta
- Amazon
- DeepMind
- OpenAI
John Nay
@johnjnay
Feb 26, 2023
LLMs are exhibiting emergent behaviors at scale (e.g., see @_jasonwei's jasonwei.net/blog/emergence)
In this context, revisiting books on emergence of social, economic & biological phenomena
Complexity science may have a resurgence
John Nay
@johnjnay
Apr 1, 2023
Forums for LLM Agents to Communicate Can Improve Outputs
1) Human provides task
2) "Decider" Agent produces output
3) "Researcher" & Decider Agents discuss
4) Decider decides
Big improvement over base GPT4 on medical summarization & care plan generation
arxiv.org/abs/2303.17071
John Nay
@johnjnay
Apr 3, 2023
LLMs Can Iteratively Self-Refine
-LLM creates draft
-Provides its own feedback
-Iteratively refines
On all 7 eval tasks (review & code rewriting, toxicity removal, responses, acronyms, stories, etc.) outputs are preferred by humans & by automated metrics
arxiv.org/abs/2303.17651
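The draft, feedback, refine loop described here can be sketched as follows. The three `toy_*` functions are hypothetical stand-ins for LLM calls, and the stopping rule (feedback returns `None`) is an assumption made for this example rather than something stated in the post.

```python
from typing import Callable, Optional


def self_refine(
    task: str,
    generate: Callable[[str], str],
    feedback: Callable[[str, str], Optional[str]],
    refine: Callable[[str, str, str], str],
    max_iters: int = 3,
) -> str:
    """Create a draft, then alternate feedback and refinement until no critique remains."""
    draft = generate(task)
    for _ in range(max_iters):
        critique = feedback(task, draft)
        if critique is None:  # assumed stopping rule: nothing left to fix
            break
        draft = refine(task, draft, critique)
    return draft


# Toy stand-ins for the three roles; a real system would prompt the same LLM for each.
def toy_generate(task: str) -> str:
    return "llm can self refine"


def toy_feedback(task: str, draft: str) -> Optional[str]:
    if not draft[0].isupper():
        return "capitalize the first letter"
    if not draft.endswith("."):
        return "end with a period"
    return None


def toy_refine(task: str, draft: str, critique: str) -> str:
    if critique.startswith("capitalize"):
        return draft[0].upper() + draft[1:]
    return draft + "."


result = self_refine("write one sentence", toy_generate, toy_feedback, toy_refine)
```

Capping the loop with `max_iters` keeps the cost bounded when the feedback step never reports success.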
John Nay
@johnjnay
Apr 18, 2023
Gisting: 26x Compression of LLM Prompts
-Trains LLM to compress prompts into smaller sets of "gist" tokens to be reused for compute efficiency
-Can be easily trained as part of instruction fine-tuning
-FLOPs reductions, time speedups & storage savings
arxiv.org/abs/2304.08467
John Nay
@johnjnay
Mar 24, 2023
Reflexion-Based GPT-4 Agent is State-of-the-Art on Code Gen
Iteratively refines code, shifting “accuracy bottleneck” from correct code gen to correct test gen
HumanEval accuracy:
-Reflexion-based GPT-4 88%
-GPT-4 67.0%
-CodeT 65.8%
-PaLM 26.2%
Code: github.com/noahshinn024/r…
John Nay
@johnjnay
Mar 23, 2023
A Self-Reflecting LLM Agent
Equips LLM-based agent w/
-dynamic memory
-a self-reflective LLM
-a method for detecting hallucinations
Challenge agent to learn from its own mistakes
-Evaluate on knowledge-intensive tasks
-Outperforms ReAct agents
Paper: arxiv.org/abs/2303.11366
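A rough sketch of the "learn from its own mistakes" loop applied to code generation, as in the two posts above. It is not the linked repository's code: `toy_writer` is a hypothetical stand-in for the LLM, the candidate must define a function named `solution` (an assumption of this sketch), and the tests are supplied by hand here, whereas the post notes that generating correct tests is the real bottleneck.

```python
from typing import Callable, List, Optional, Tuple

TestCase = Tuple[tuple, object]  # (positional arguments, expected return value)


def run_tests(source: str, tests: List[TestCase]) -> Optional[str]:
    """Execute candidate source and return a failure message, or None if every test passes."""
    namespace: dict = {}
    exec(source, namespace)  # only safe for trusted toy code like this example
    fn = namespace["solution"]
    for args, expected in tests:
        got = fn(*args)
        if got != expected:
            return f"solution{args} returned {got!r}, expected {expected!r}"
    return None


def reflexion_loop(
    write_code: Callable[[List[str]], str],
    tests: List[TestCase],
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Retry code generation, feeding the accumulated reflections back to the writer."""
    reflections: List[str] = []
    source = ""
    for _ in range(max_trials):
        source = write_code(reflections)
        failure = run_tests(source, tests)
        if failure is None:
            break
        reflections.append(f"Previous attempt failed: {failure}")
    return source, reflections


def toy_writer(reflections: List[str]) -> str:
    """Stand-in for the LLM: the first attempt is buggy, later attempts use the reflection."""
    if not reflections:
        return "def solution(a, b):\n    return a - b\n"
    return "def solution(a, b):\n    return a + b\n"


final_source, notes = reflexion_loop(toy_writer, [((1, 2), 3)])
```

The reflections list plays the role of the agent's memory: each failed trial adds one verbal note that the next attempt can condition on.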
John Nay
@johnjnay
Mar 9, 2023
ChatGPT for Training Data
1 ChatGPT rephrases each training sentence into multiple conceptually similar but semantically different sentences
2 Train smaller model
Outperforms SoTA data augmentation methods for few-shot learning text classification
Paper: arxiv.org/abs/2302.13007
John Nay
@johnjnay
Apr 22, 2023
LlamaAcademy: Fine-tuning LLMs to Learn How to Talk to APIs
Pipeline:
-Crawling
-GPT-4 data gen
-Fine-tuning Vicuna-13B on synthetic data
LLM can then read new API docs (Stripe, Notion, etc.), gen code
Instead of hosting API docs, host API implementation
github.com/danielgross/Ll…
John Nay
@johnjnay
Apr 6, 2023
Simple Self-Improvement of Code LLMs
1) Pre-train & Fine-tune code LLM, gaining knowledge
2) LLM then generates pseudo outputs
3) Add that to original data & train for next epoch
Significantly improves code summarization & code generation performance
arxiv.org/abs/2304.01228