All articles

Feb 26, 2026
pplx-embed: State-of-the-Art Embedding Models for Web-Scale Retrieval
Today we are releasing pplx-embed-v1 and pplx-embed-context-v1, two state-of-the-art text embedding models built for real-world, web-scale retrieval.

research
Feb 4, 2026
Evaluating Deep Research Performance in the Wild with the DRACO Benchmark
DRACO: a Cross-Domain Benchmark for Deep Research Accuracy, Completeness, and Objectivity

security
Dec 2, 2025
BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents
Defense architecture, benchmark, and detection model for securing AI agents in open-world web environments.

systems
Nov 5, 2025
RDMA Point-to-Point Communication for LLM Systems
Elegant tool to address emerging LLM communication patterns

systems
Nov 4, 2025
Enabling Trillion-Parameter Models on AWS EFA
Make trillion-parameter models available with cloud platform portability

search
Sep 25, 2025
Architecting and Evaluating an AI-First Search API
Building a scalable Search API that handles 200 million daily queries using hybrid retrieval and intelligent context curation for AI models

systems
Sep 24, 2025
Weight Transfer for RL Post-Training in under 2 seconds
Ultra-fast cross-GPU model sync

systems
Sep 8, 2025
GPT-OSS on Day 0
Day-0 support for GPT-OSS on H200 by adapting ROSE with FP8, sink attention, and MoE optimizations

systems
Aug 1, 2025
Disaggregated Prefill and Decode
Separating prefill and decode across devices speeds up LLM inference