Bitext. We help AI understand humans.

Some of your RAG-related issues have an easy & quick solution: decompounding

Apr 27, 2026 | AI, Decompounding, Lemmatization, Machine Learning, semantic, Stemming

Most teams working with Elasticsearch, OpenSearch or RAG pipelines focus on ranking, embeddings or model quality when trying to improve relevance. But in many cases, the issue starts much earlier: in how text is normalized before indexing. In a previous post, we...

Some of your RAG-related issues have an easy & quick solution: lemmatization

Apr 15, 2026 | AI, Lemmatization, Machine Learning, NER, NLP, semantic, Stemming

Some RAG issues have a simpler fix than people think: better text normalization. One common culprit is stemming. Stemming is a blunt, error-prone approach: it strips word endings mechanically, without properly accounting for morphology, part of speech, or context....

Unstructured Synthetic Text: Beyond Tabular Data

Oct 7, 2022 | Chatbots, Core NLP for AI engines, Generative AI, NLP, Stemming, Synthetic data, text analysis

The case for evaluation of NLU platforms. Synthetic image and video have proven to be a big success for cost-cutting. Synthetic text is following suit: tabular data (that is, data organized in a table with rows and columns) is becoming mainstream already, and the...

Multilingual Synthetic Training Data For Intent Detection

Oct 6, 2022 | Chatbots, Core NLP for AI engines, Generative AI, NLP, Stemming, Synthetic data, text analysis

What is synthetic training data? Synthetic training data is the data that is used to train an NLU engine. An NLU engine allows chatbots to understand the intent of user queries. The training data is enriched by data labeling or data annotation, with information about...

Evaluate the Quality of your Chatbots and Conversational Agents

Jul 4, 2019 | Chatbots, conversational, Core NLP for AI engines, Generative AI, NLP, Stemming, Synthetic data, text analysis

It is always important to evaluate the quality of your chatbots and conversational agents in order to know their real health, accuracy and efficiency. Chatbot accuracy can only be increased by constantly evaluating and retraining it with new data that answers your...

6 keys to building a successful human-like chatbot

Feb 1, 2018 | Chatbots, conversational, Core NLP for AI engines, Generative AI, NLP, Stemming, Synthetic data, text analysis

Talking, expressing ourselves through words, and using speech to exchange information is something that comes naturally to humans. Then why don’t we just talk with goods and service providers on the internet, instead of using all kinds of user interfaces, buttons and...
