
Context
AI agent connected to an internal PDF knowledge base to speed up support and improve answer quality.
Challenge
- Uneven PDF quality (scanned pages requiring OCR, inconsistent layouts, noisy text)
- Need for sourced answers and strict uncertainty handling
- Need to reduce resolution time without degrading answer quality
Solution
- Cleaning pipeline + OCR when needed, normalization, and detection of useful sections
- Semantic chunking + citations + confidence threshold
- Automated human fallback and improvement loop via support feedback
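
As an illustration of how the citations, confidence threshold, and human fallback could fit together, here is a minimal Python sketch. It is not the project's actual code: the `Chunk` structure, the `route` function, the 0.75 threshold, and the assumption that retrieval scores lie in [0, 1] are all hypothetical, chosen only to show the routing logic.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str     # chunk content retrieved from the knowledge base
    source: str   # originating PDF file name (hypothetical field)
    page: int     # page number inside that PDF (hypothetical field)
    score: float  # retrieval similarity, assumed to be normalized to [0, 1]


# Hypothetical cutoff; a real value would be tuned on support feedback.
CONFIDENCE_THRESHOLD = 0.75


def route(chunks, threshold=CONFIDENCE_THRESHOLD, top_k=3):
    """Answer with citations if retrieval is confident, otherwise escalate."""
    # Keep the top_k best-scoring chunks.
    ranked = sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]
    # Only chunks at or above the threshold may support an answer.
    confident = [c for c in ranked if c.score >= threshold]
    if not confident:
        # Nothing reliable enough: hand the request to a human agent.
        return {"action": "escalate", "context": [], "citations": []}
    return {
        "action": "answer",
        "context": [c.text for c in confident],
        "citations": [f"{c.source}, p. {c.page}" for c in confident],
    }
```

In this sketch, a request whose best chunk scores below the threshold is never answered automatically, so the agent escalates rather than guessing. The `context` list would be passed to the language model, and `citations` would be attached to the reply.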
Impact
- More consistent answers, grounded in the internal documentation
- Estimate: response time for simple requests roughly halved
- Estimate: level-2 escalations reduced by roughly 20-40%
Tech stack
- Node.js
- n8n
- Qdrant/FAISS
- Python
- OpenAI/Ollama
- Docker
Preview (private project)