OnboardingOS
03 · Developer Tools
RAG · Code Embeddings · LangChain · Python · Vector Search

Ingests an entire codebase, git history and docs — then answers "where does authentication happen?" with exact file references. Your codebase, finally explained.

Status: Building
Domain: Developer Tools
Core Tech: RAG · Embeddings
Target: Teams & solo devs
The Problem

Week one at a new codebase.
The loneliest week in software.

Every developer knows the feeling. You join a project — or come back to your own after three months — and the codebase is a maze. Where does the auth logic live? Why is this abstraction here? What does this service actually do and who calls it? The README hasn't been updated since 2021. The docs are aspirational. The comments lie.

You either interrupt a senior colleague every 20 minutes, or you spend hours playing detective across 200,000 lines of code to answer a question that should take 10 seconds. OnboardingOS makes the codebase legible: permanently, for everyone, from day one.

OnboardingOS · Ask your codebase

Where does user authentication happen and what happens when a token expires?

Authentication is handled in src/auth/middleware.py (line 47) via JWT validation. On expiry, the TokenExpiredError is caught in the middleware and returns a 401. The token refresh flow is in src/auth/refresh.py — it calls UserService.rotate_token() which issues a new pair.

→ src/auth/middleware.py:47 · src/auth/refresh.py:23 · src/services/user.py:189

Why is the payment service calling the notification service directly? Is that intentional?

Based on git history, this was a quick fix in commit a3f91b2 (March 2023) by @sarah during an incident. The intention was to move this to an event queue — there's a TODO comment on line 134 and an open GitHub issue #447. It's technical debt, not intentional architecture.

→ git:a3f91b2 · src/payments/service.py:134 · GitHub Issue #447
The Approach

RAG over your entire codebase

OnboardingOS ingests source files, git history, commit messages, PR descriptions, inline comments, and any documentation. Everything is chunked, embedded, and stored in a vector database. When you ask a question, it retrieves the most relevant context and generates an answer grounded in your actual code.
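As a rough illustration of the chunk, embed, store, retrieve loop described above, here is a minimal sketch in plain Python. It is not the OnboardingOS implementation: the hashed bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for the vector database, and every name (`Chunk`, `chunk_file`, `retrieve`, `cite`) is hypothetical.

```python
import hashlib
import math
import re
from dataclasses import dataclass

DIM = 1024  # toy vector size (assumption); a real embedding model defines its own


@dataclass
class Chunk:
    path: str
    start_line: int  # 1-based line where the chunk begins
    text: str


def chunk_file(path: str, source: str, lines_per_chunk: int = 20) -> list[Chunk]:
    """Split a file into fixed-size line windows, remembering where each starts."""
    lines = source.splitlines()
    return [
        Chunk(path, i + 1, "\n".join(lines[i:i + lines_per_chunk]))
        for i in range(0, len(lines), lines_per_chunk)
    ]


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def retrieve(question: str, index: list[tuple[Chunk, list[float]]], k: int = 3) -> list[Chunk]:
    """Return the k chunks with the highest cosine similarity to the question."""
    q = embed(question)
    ranked = sorted(
        index,
        key=lambda item: sum(a * b for a, b in zip(q, item[1])),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]


def cite(chunks: list[Chunk]) -> str:
    """Format retrieved chunks as path:line references."""
    return " · ".join(f"{c.path}:{c.start_line}" for c in chunks)
```

Because each chunk keeps its path and starting line, whatever the generation step produces can be returned alongside references in the `path:line` form shown in the example answers. The retrieved chunks would be passed to a language model as context; that step is omitted here.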

The Difference

It knows the why, not just the what.

Git history contains decisions — why an abstraction was introduced, when a shortcut was taken, what was being fixed at 2am. By including commit messages and PR descriptions in the knowledge base, OnboardingOS can answer "why does this work this way?" — the question no documentation ever answers.
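One way commit messages could be turned into documents for the knowledge base is sketched below. This is an assumption about the approach, not the project's actual ingestion code: it reads `git log` with a custom `--pretty=format:` string, using control characters as separators so multi-line commit bodies survive parsing. `CommitDoc`, `read_git_log`, and `parse_git_log` are hypothetical names.

```python
import subprocess
from dataclasses import dataclass

FIELD_SEP = "\x1f"   # unit separator between fields
RECORD_SEP = "\x1e"  # record separator between commits
# git placeholders: %H hash, %an author name, %ad author date, %s subject, %b body
LOG_FORMAT = "%H%x1f%an%x1f%ad%x1f%s%x1f%b%x1e"


@dataclass
class CommitDoc:
    sha: str
    author: str
    date: str
    text: str  # subject plus body: the part worth embedding


def read_git_log(repo_path: str) -> str:
    """Dump the commit history of a local repository in the format above."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--date=short", f"--pretty=format:{LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_git_log(raw: str) -> list[CommitDoc]:
    """Turn raw log output into one document per commit."""
    docs = []
    for record in raw.split(RECORD_SEP):
        # Strip only newlines: Python treats "\x1f" as whitespace, so a bare
        # strip() would eat the separator before an empty commit body.
        record = record.strip("\n")
        if not record:
            continue
        sha, author, date, subject, body = record.split(FIELD_SEP)
        body = body.strip()
        text = f"{subject}\n\n{body}" if body else subject
        docs.append(CommitDoc(sha, author, date, text))
    return docs
```

Each `CommitDoc` can then be embedded like any other chunk, with the hash kept as metadata so an answer can point back to a specific commit. PR descriptions would need a separate source, such as the hosting platform's API, which is not shown here.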

Features

What OnboardingOS does

🧭
Natural Language Queries
Ask questions the way you'd ask a colleague. Get answers with exact file and line references, not vague summaries.
📜
Git History Awareness
Understands why things are the way they are. Commit messages and PR descriptions are first-class knowledge, not afterthoughts.
🗺️
Dependency Maps
Visual and queryable dependency graphs. "What would break if I changed this function?" answered before you touch a line.
🔄
Always Up to Date
Re-indexes on every push. The knowledge base stays current as the codebase evolves — no manual documentation updates needed.
👥
Team Knowledge Base
Shared across the team. One developer's conversation with the codebase builds context that benefits everyone.
🔒
Runs Locally
Your code stays on your machine or your infrastructure. Nothing is sent to external servers unless you explicitly configure it.
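To make the Dependency Maps idea concrete, here is a minimal sketch of how "what would break if I changed this function?" could be answered for a single Python file. It uses the standard-library `ast` module as a stand-in for Tree-sitter, matches calls by bare name only (so it ignores imports and receiver types), and the function names `build_callers` and `impacted_by` are hypothetical.

```python
import ast
from collections import defaultdict, deque


def build_callers(source: str) -> dict[str, set[str]]:
    """Map each called name to the set of functions that call it."""
    callers: dict[str, set[str]] = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call):
                func = inner.func
                # foo() -> "foo"; obj.foo() -> "foo" (receiver type is ignored)
                name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
                if name:
                    callers[name].add(node.name)
    return callers


def impacted_by(changed: str, callers: dict[str, set[str]]) -> set[str]:
    """Breadth-first walk up the caller graph from the changed function."""
    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    seen.discard(changed)  # a recursive function should not list itself
    return seen
```

The result is the set of direct and transitive callers of the changed function. A real implementation would need to resolve names across modules and classes, which this single-file sketch deliberately skips.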
Stack

How it's built

🦜 LangChain: RAG Pipeline
📐 ChromaDB: Vector Store
🤗 HuggingFace: Embeddings
🌳 Tree-sitter: Code Parsing
🐍 Python: Core Backend
FastAPI: Query API
⚛️ React: Chat UI
🔌 VS Code: Extension

Currently in development

RAG pipeline designed · Embedding strategy defined · Building ingestion layer for Python codebases first
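
For a sense of what a Python-first ingestion step might look like, the sketch below splits a source file into one chunk per top-level function or class, keeping the line span of each. This is an illustrative assumption rather than the actual ingestion layer: it uses the standard-library `ast` module in place of Tree-sitter, and `python_chunks` is a hypothetical name.

```python
import ast


def python_chunks(path: str, source: str) -> list[dict]:
    """One chunk per top-level function or class, with its line span."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "path": path,
                "name": node.name,
                "start_line": node.lineno,    # 1-based line of the def/class keyword
                "end_line": node.end_lineno,  # available since Python 3.8
                "text": "\n".join(lines[node.lineno - 1:node.end_lineno]),
            })
    return chunks
```

Chunking on syntactic boundaries keeps each embedded unit self-contained, and the stored `start_line` is what would let an answer cite a location such as `path:line`.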
