How LanceDB Supercharged Our Knowledge Graph
How we scaled our knowledge graph by switching from pgvector to LanceDB, enabling millisecond search across millions of vectors.
Marcos Placona / Aug 4, 2025 / 2 min read
Quick context: LanceDB just published a deep dive on how Dosu moved from pgvector to LanceDB to keep pace with thousands of GitHub events and an ever-growing body of code. If you’re short on time, here’s the essence and why it matters.
One stubborn bottleneck
Our first production search stack ran on pgvector. It was fine until every model tweak forced a migration script. Development velocity ground to a halt, and query performance at scale was shaky.
Why LanceDB clicked
- File-based, local-first: point Dosu at a directory and start prototyping with zero migrations.
- Hybrid vector + text search out of the box, tuned for code semantics and developer intent.
- Time-travel versioning allows our agents to reason over any historical state of a repository.
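To make the last two ideas concrete, here is a minimal, standard-library-only sketch of what "hybrid search" and "time-travel versioning" mean conceptually. It is not LanceDB's API or implementation: `VersionedTable`, `hybrid_search`, the `alpha` weight, and the sample rows are all hypothetical names and values chosen for illustration.

```python
import math


class VersionedTable:
    """Toy append-only table: every add() records a new immutable version."""

    def __init__(self):
        self._versions = [[]]  # version 0 is the empty table

    @property
    def latest(self):
        return len(self._versions) - 1

    def add(self, rows):
        # Copy-on-write: earlier snapshots are never mutated.
        self._versions.append(self._versions[-1] + list(rows))
        return self.latest

    def rows(self, version=None):
        return self._versions[self.latest if version is None else version]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(table, query_vec, query_text, version=None, alpha=0.5, k=3):
    """Rank rows by a weighted blend of vector similarity and keyword overlap."""
    terms = set(query_text.lower().split())
    scored = []
    for row in table.rows(version):
        words = set(row["text"].lower().split())
        keyword = len(terms & words) / len(terms) if terms else 0.0
        score = alpha * cosine(query_vec, row["vector"]) + (1 - alpha) * keyword
        scored.append((score, row["text"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical data: two writes produce two versions of the table.
table = VersionedTable()
v1 = table.add([{"vector": [1.0, 0.0], "text": "parse config file"}])
v2 = table.add([{"vector": [0.0, 1.0], "text": "retry failed webhook"}])

latest_hits = hybrid_search(table, [0.0, 1.0], "webhook retry")
old_hits = hybrid_search(table, [0.0, 1.0], "webhook retry", version=v1)
```

Querying the latest version ranks the webhook row first, while the same query pinned to `v1` only sees the row that existed at that point. A production system would use an index and a real text-ranking function rather than this linear scan.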
Impact so far
Within weeks, we swapped in LanceDB and scaled to thousands of customers. Engineers spend time shipping features, not babysitting migrations. Teams using Dosu now enjoy millisecond search across millions of vectors and 80% automated issue labeling in days.
Explore LanceDB's new Public Space
We’ve made LanceDB one of our first Public Spaces, a free, read-only knowledge base where anyone can chat with Dosu and get answers pulled from the project’s code, commits, discussions, and docs.
No setup and no login: just ask away and let Dosu fetch context for you. To try it, head over to LanceDB’s Public Space. For the full story behind Dosu Public Spaces and how they fit into our mission of making engineering knowledge accessible to everyone, check out July’s Dosu Drop announcement.
Looking ahead
Versioning and richer domain hooks are on the roadmap, turning “docs drift” into a solved problem and letting AI agents pull accurate context on demand.
Read the complete case study. LanceDB’s write-up provides an in-depth look at the architecture, benchmarks, and lessons learned. If you’re evaluating vector databases or wrestling with migration fatigue, grab a coffee and give it a read.
Do you have questions about how we implemented LanceDB or would like to see it live in Dosu? Ping me and let’s chat.