2. Building A Roadmap

Next Steps:

Roadmap proposed by Claude:

| Step | Title | Description |
| --- | --- | --- |
| Step 1 | Upgrading Our Datastore | Swap index.pkl for pgvector. Get a real vector database in place first. Everything else gets easier once you have proper persistent storage with metadata filtering. This is also where your Spring Boot background becomes relevant: pgvector is just Postgres. |
| Step 2 | Adding Hybrid Search | Once you’re on pgvector you can add keyword search alongside vector search almost for free, since Postgres has full text search built in. This immediately improves retrieval quality before you add more data. |
| Step 3 | Adding more data sources | Now that the foundation is solid, add the codebase first. It’s simpler than Slack because it’s just files. Slack is the messiest source: short messages, lots of noise, and threading makes chunking tricky. Save it for last. |
| Step 4 | Spring Boot API layer | Wrap everything in an API so it’s not just scripts anymore. |
| Step 5 | Slack integration | Now you have something worth wiring up to Slack. |
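
To make Step 2 a bit more concrete, here is a rough sketch of how hybrid search might look. The roadmap doesn't say how the keyword and vector results get combined, so reciprocal rank fusion is my own assumption here, not something Claude proposed. The `chunks` table and its `id` and `content` columns are hypothetical too. `to_tsvector`, `plainto_tsquery` and `ts_rank` are the built-in Postgres full text search functions.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List, Sequence

# Keyword side of hybrid search, using Postgres built-in full text search.
# Assumption: a table chunks(id, content); adjust to the real schema.
KEYWORD_SQL = """
SELECT id
FROM chunks
WHERE to_tsvector('english', content) @@ plainto_tsquery('english', %s)
ORDER BY ts_rank(to_tsvector('english', content),
                 plainto_tsquery('english', %s)) DESC
LIMIT %s
"""


def reciprocal_rank_fusion(
    rankings: Iterable[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Merge several ranked id lists into one.

    Each id scores 1 / (k + rank) per list it appears in (rank starts at 1),
    so ids that rank well in both the vector and keyword lists rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by the id's string form for stability.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))


# Example: ids from a vector search and from a keyword search.
vector_hits = [1, 2, 3]
keyword_hits = [2, 4]
merged = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(merged)  # [2, 1, 4, 3]
```

Chunk 2 wins because it shows up in both lists, which is the behaviour we'd want from hybrid search.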

Today’s work: Step 1 is complete. We have successfully migrated from an index.pkl file to a pgvector database, which our RAG application now queries to retrieve context. The previous process of loading the index file and then finding the relevant chunks is now streamlined into one function that embeds the query and uses cosine similarity directly to find the top_k most relevant contexts.
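
For reference, a minimal sketch of what that single retrieval function could look like. This is not the exact code from the project: the `chunks` table, its `content` and `embedding` columns, and the `embed` callable are all hypothetical names. The one thing taken from pgvector itself is the `<=>` operator, which returns cosine distance, so similarity is `1 - distance`. The cursor is assumed to be a DB-API cursor that uses `%s` placeholders (psycopg2 style).

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a table chunks(content text, embedding vector(N)).
TOP_K_SQL = """
SELECT content, 1 - (embedding <=> %s::vector) AS cosine_similarity
FROM chunks
ORDER BY embedding <=> %s::vector
LIMIT %s
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats as pgvector's text input, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def retrieve_context(
    cursor,
    query: str,
    embed: Callable[[str], Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Embed the query, then let Postgres rank chunks by cosine distance."""
    literal = to_vector_literal(embed(query))
    # The same vector is passed twice: once for the score, once for ordering.
    cursor.execute(TOP_K_SQL, (literal, literal, top_k))
    return [(row[0], float(row[1])) for row in cursor.fetchall()]
```

Passing `embed` in as a parameter keeps the sketch independent of whichever embedding model the app actually uses.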

To do: we’ll have to figure out how to avoid storing duplicate context, because every time we run ingest_confluence.py it adds the same information again.
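
One possible approach, sketched here as an idea and not something that's implemented yet: hash each chunk's text and let Postgres reject repeats with `INSERT ... ON CONFLICT DO NOTHING`. This assumes a hypothetical `chunks` table with a `content_hash` column that has a UNIQUE constraint. The helper names are made up for illustration, and the cursor is again assumed to be a DB-API cursor with `%s` placeholders.

```python
import hashlib
from typing import Sequence

# Assumption: chunks(content_hash text UNIQUE, content text, embedding vector(N)).
INSERT_SQL = """
INSERT INTO chunks (content_hash, content, embedding)
VALUES (%s, %s, %s::vector)
ON CONFLICT (content_hash) DO NOTHING
"""


def content_hash(text: str) -> str:
    """SHA-256 of the text with whitespace collapsed, so trivial
    formatting differences don't create a 'new' chunk."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def insert_chunk(cursor, text: str, embedding: Sequence[float]) -> bool:
    """Insert a chunk unless one with the same hash already exists.

    Returns True if a row was written, False if it was skipped as a duplicate
    (relies on the driver reporting affected rows via cursor.rowcount).
    """
    literal = "[" + ",".join(str(float(x)) for x in embedding) + "]"
    cursor.execute(INSERT_SQL, (content_hash(text), text, literal))
    return cursor.rowcount == 1
```

This only catches exact repeats (after whitespace normalisation). If a Confluence page is edited, its chunks would hash differently and be stored alongside the old ones, so handling updates would need something more, such as tracking a page id and version.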

This post is licensed under CC BY 4.0 by the author.
