Inspiration
Memory is quickly becoming the bottleneck of AI usage. There are many solutions out there, like RAG (Retrieval-Augmented Generation), but they are fundamentally "band-aids." RAG relies on surface-level similarity between the query and stored chunks, often missing the logic connecting two distant facts. Worse, it dumps thousands of tokens into the context window, making inference slow and expensive (attention cost grows linearly with context length, O(N) per generated token).
I asked myself: "Why do we need to retrieve text at all?" Humans don't re-read a whole textbook to answer a question; we traverse a mental path to the answer. I wanted to build an AI that does the same—solving the reasoning problem inside the latent vector space.
What it does
It solves RAG's twin problems of inaccuracy and context bloat: instead of retrieving text for the LLM to re-read, Manifold generates a single vector embedding representing what the answer would be, based on the input sources and the user's query.
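To make the idea concrete, here is a minimal sketch of the interface: many context embeddings go in, one fixed-size "answer" vector comes out. The attention-style pooling here is a toy stand-in I chose for illustration, not Manifold's actual flow model.

```python
import numpy as np

def predict_answer_vector(context_vecs: np.ndarray, query_vec: np.ndarray) -> np.ndarray:
    """Toy stand-in for Manifold's model: weigh context vectors by
    relevance to the query and return one answer embedding, instead of
    returning text chunks the way RAG does."""
    # Softmax weights: how relevant each context vector is to the query.
    scores = context_vecs @ query_vec
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # The output is one vector, regardless of how many context vectors exist.
    return weights @ context_vecs

rng = np.random.default_rng(0)
contexts = rng.normal(size=(1000, 64))  # 1,000 context embeddings (illustrative sizes)
query = rng.normal(size=64)
answer_vec = predict_answer_vector(contexts, query)
print(answer_vec.shape)  # (64,) -- constant-size output, however large the context
```

The key property is that the downstream LLM only ever sees this one vector, never the 1,000 source chunks.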
How we built it
Manifold bypasses the context window entirely. Instead of retrieving text chunks, Manifold:

1. Compresses massive contexts into multiple high-dimensional states.
2. Solves the user's query using a Neural ODE (Ordinary Differential Equation) solver: it physically "flows" random noise vectors along a learned trajectory until they hit the "answer" vector.
3. Injects this raw thought vector directly into an LLM (Qwen2.5).

The result? The LLM "hallucinates" the correct answer, because we manually injected the truth into its brain. This allows for O(1) generation cost with respect to context size: no matter how large the context is, the LLM only generates the answer tokens, never processing the context tokens.
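Step 2 can be sketched with a simple Euler integrator. In Manifold the vector field is a learned network; in this illustrative sketch I substitute a fixed linear field that pulls the state toward a target, so the "flow from noise to answer" behavior is easy to see.

```python
import numpy as np

def flow_to_answer(noise: np.ndarray, target: np.ndarray,
                   steps: int = 100, dt: float = 0.01, k: float = 5.0) -> np.ndarray:
    """Euler-integrate the toy ODE dz/dt = k * (target - z).
    A real Neural ODE replaces the right-hand side with a trained network."""
    z = noise.copy()
    for _ in range(steps):
        z = z + dt * k * (target - z)  # one Euler step along the trajectory
    return z

rng = np.random.default_rng(42)
target = rng.normal(size=16)   # stand-in "answer" vector
noise = rng.normal(size=16)    # random starting point
z_final = flow_to_answer(noise, target)
# After integration, the state is far closer to the answer than the noise was.
print(np.linalg.norm(z_final - target) < np.linalg.norm(noise - target))  # True
```

In practice an adaptive ODE solver (e.g. via a library like torchdiffeq) would replace the fixed-step Euler loop.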
Challenges we ran into
I had a hard time projecting the flow model's output vector into the LLM (Qwen2.5) in a way that actually influenced its thinking process.
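One common way to frame this projection (a sketch of the general technique, not necessarily Manifold's exact mechanism) is to map the thought vector into the LLM's embedding space and prepend it as a single "soft token." The dimensions and projection below are hypothetical placeholders, not Qwen2.5's real sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
FLOW_DIM, HIDDEN_DIM = 64, 1024  # hypothetical sizes for illustration

# A learned linear projection (random weights here) from flow space
# into the LLM's token-embedding space.
W_proj = rng.normal(scale=0.02, size=(FLOW_DIM, HIDDEN_DIM))

def inject_thought(thought_vec: np.ndarray, token_embeds: np.ndarray) -> np.ndarray:
    """Project the answer vector and prepend it as one soft token,
    so the decoder conditions on it without ever seeing the context."""
    soft_token = thought_vec @ W_proj              # (HIDDEN_DIM,)
    return np.vstack([soft_token, token_embeds])   # (T + 1, HIDDEN_DIM)

thought = rng.normal(size=FLOW_DIM)
prompt_embeds = rng.normal(size=(8, HIDDEN_DIM))   # 8 prompt-token embeddings
augmented = inject_thought(thought, prompt_embeds)
print(augmented.shape)  # (9, 1024)
```

With a framework like Hugging Face Transformers, such an augmented matrix would be passed to the model via `inputs_embeds` instead of token IDs; the hard part is training `W_proj` so the soft token lands in a region the LLM treats as meaningful.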
Accomplishments that we're proud of
I was able to train the model on a synthetic, AI-generated dataset of around 1,200 examples, and it learned to effectively find the needle in a haystack: ignoring the noise and combining multiple similar context passages to generate an answer vector.
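A needle-in-a-haystack training example can be sketched like this. The real dataset was AI-generated text; here I fabricate it directly in embedding space with made-up sizes, purely to show the shape of the task: many distractors, one planted "needle" correlated with the query.

```python
import numpy as np

def make_needle_example(rng, n_chunks: int = 50, dim: int = 32):
    """One synthetic training pair: distractor vectors plus a single
    'needle' correlated with the query (sizes are illustrative)."""
    chunks = rng.normal(size=(n_chunks, dim))                 # noise / distractors
    needle_idx = int(rng.integers(n_chunks))
    query = rng.normal(size=dim)
    chunks[needle_idx] = query + 0.1 * rng.normal(size=dim)   # plant the needle
    return chunks, query, needle_idx

rng = np.random.default_rng(7)
chunks, query, idx = make_needle_example(rng)
# Cosine similarity should peak at the planted needle.
sims = chunks @ query / (np.linalg.norm(chunks, axis=1) * np.linalg.norm(query))
print(int(sims.argmax()) == idx)  # True
```

Training on pairs like these teaches the model to route through the one relevant vector while the 49 distractors contribute nothing.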
What's next for Manifold
The projection: I need to be able to sync the output vector with Qwen (or whichever LLM is used), likely by fine-tuning the projection with LoRA.