Enterprise Rag With Deterministic Citations Using Postgresql Trigram Search
Written by
Nova Neural
The problem I ran into: “answers” with citations that couldn’t be trusted
I built a small Enterprise RAG (Retrieval-Augmented Generation) pipeline to answer questions over internal documents. It worked great—until I tried to verify citations.
Same question, same documents, but different reruns produced different “source” links. Not because the answer content was wildly different, but because the retrieval step returned slightly different chunks, which then fed the generator.
For real enterprise usage, that kind of nondeterminism is painful: it makes audits, reviews, and incident investigations harder than they should be.
What I wanted was a retrieval layer that is deterministic and produces stable citations, even when I’m using fuzzy matching.
So I explored a niche approach: deterministic chunk matching + stable ordering using PostgreSQL trigram search (pg_trgm), then storing citation metadata with an explicit ordering key.
Key idea: stable retrieval ordering before generation
Enterprise RAG typically follows this loop:
- Retrieve relevant text chunks from a document store (often via embeddings or keyword search).
- Generate an answer by feeding the retrieved chunks into a language model.
The critical failure mode I hit was step 1: retrieval returning a different set or order of chunks across runs.
My fix was to make retrieval deterministic by combining:
- pg_trgm for fuzzy matching (useful when users misspell or paraphrase terms)
- a strict tie-breaking sort key (a deterministic ordering column)
- bounded selection (take top K after deterministic ordering)
- citation IDs tied to chunk rows (so citations are stable even if the text changes later)
Setup: install and initialize PostgreSQL trigram support
1) Enable pg_trgm
CREATE EXTENSION IF NOT EXISTS pg_trgm;
2) Create a table for chunks and deterministic ordering
I store each chunk as a row with:
chunk_id: stable primary keydoc_id: document identifierchunk_index: deterministic order within a doc (e.g., chunk 0, 1, 2…)text: the chunk contentcreated_at: used only as metadata (not for ordering)search_fingerprint: a normalized token bag used to reduce ambiguity (optional but helpful)
CREATE TABLE IF NOT EXISTS rag_chunks ( chunk_id BIGSERIAL PRIMARY KEY, doc_id TEXT NOT NULL, chunk_index INT NOT NULL, text TEXT NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT now(), -- A deterministic fingerprint to reduce “equal score” instability -- (Here: a simple normalization; production systems might use a better one.) search_fingerprint TEXT NOT NULL ); CREATE INDEX IF NOT EXISTS rag_chunks_doc_id_idx ON rag_chunks(doc_id); -- Trigram index for fuzzy matching on text CREATE INDEX IF NOT EXISTS rag_chunks_text_trgm_idx ON rag_chunks USING GIN (text gin_trgm_ops);
3) Insert some example chunks
INSERT INTO rag_chunks (doc_id, chunk_index, text, search_fingerprint) VALUES ('HR-POLICY-001', 0, 'Vacation accrual is based on hours worked.', 'vacation accrual based on hours worked'), ('HR-POLICY-001', 1, 'Requests must be submitted at least two weeks in advance.', 'requests must be submitted at least two weeks in advance'), ('SECURITY-101', 0, 'All production systems require MFA for administrative access.', 'all production systems require mfa for administrative access'), ('SECURITY-101', 1, 'Service accounts must use least-privilege permissions.', 'service accounts must use least-privilege permissions');
The deterministic retrieval query (the important part)
Here’s the retrieval SQL I used. It performs fuzzy matching with similarity() and then sorts deterministically:
- Primary sort:
similarity(text, query)descending - Secondary sort:
chunk_indexascending - Tertiary sort:
chunk_idascending (absolute tie-breaker)
WITH ranked AS ( SELECT chunk_id, doc_id, chunk_index, text, similarity(text, $1) AS score FROM rag_chunks WHERE similarity(text, $1) > 0.1 ) SELECT chunk_id, doc_id, chunk_index, text, score FROM ranked ORDER BY score DESC, chunk_index ASC, chunk_id ASC LIMIT $2;
Why this is deterministic
In practice, similarity scores can tie (or be extremely close). Without a tie-breaker, databases are allowed to return tied rows in effectively arbitrary plans/orderings.
By adding:
chunk_index ASCchunk_id ASC
…I force a stable ordering whenever scores are equal.
That means I can rerun the same retrieval and get the exact same citation set and ordering.
End-to-end Python example: query → retrieve → generate with stable citations
Below is a complete, working script using:
psycopgfor PostgreSQL- a simple deterministic retrieval query
- a placeholder “generator” step that shows how citations are assembled
Note: I’m not calling an external LLM here to keep the example runnable offline. The focus is retrieval determinism and citation stability.
import os import psycopg from dataclasses import dataclass from typing import List @dataclass class Chunk: chunk_id: int doc_id: str chunk_index: int text: str score: float def retrieve_chunks(conn, query: str, k: int) -> List[Chunk]: sql = """ WITH ranked AS ( SELECT chunk_id, doc_id, chunk_index, text, similarity(text, %(q)s) AS score FROM rag_chunks WHERE similarity(text, %(q)s) > 0.1 ) SELECT chunk_id, doc_id, chunk_index, text, score FROM ranked ORDER BY score DESC, chunk_index ASC, chunk_id ASC LIMIT %(k)s; """ with conn.cursor() as cur: cur.execute(sql, {"q": query, "k": k}) rows = cur.fetchall() return [ Chunk( chunk_id=row[0], doc_id=row[1], chunk_index=row[2], text=row[3], score=float(row[4]), ) for row in rows ] def generate_answer_with_citations(query: str, chunks: List[Chunk]) -> str: """ This is a simple “generator” for demonstration. It formats a response using the retrieved chunks and stable citation IDs. """ context_lines = [] for i, c in enumerate(chunks, start=1): citation_id = f"{c.doc_id}#chunk{c.chunk_index}" context_lines.append(f"[{i}] {citation_id}: {c.text}") # Deterministic formatting ensures the same citations appear in the same order. answer = ( f"Question: {query}\n\n" f"Retrieved context (deterministic order):\n" + "\n".join(context_lines) + "\n\n" "Based on the retrieved context above, here is the consolidated answer:\n" "- This response is assembled from the listed chunks with stable citations.\n" ) return answer def main(): dsn = os.environ.get("DATABASE_URL") if not dsn: raise RuntimeError("Set DATABASE_URL to your PostgreSQL connection string") query = "how much vacation do employees get and when should they request it" k = 2 with psycopg.connect(dsn) as conn: chunks_run_1 = retrieve_chunks(conn, query, k) chunks_run_2 = retrieve_chunks(conn, query, k) # Prove determinism by comparing chunk IDs in order ids_1 = [(c.chunk_id, c.doc_id, c.chunk_index, c.score) for c in chunks_run_1] ids_2 = [(c.chunk_id, c.doc_id, c.chunk_index, c.score) for c in chunks_run_2] assert ids_1 == ids_2, "Retrieval order was not deterministic!" answer = generate_answer_with_citations(query, chunks_run_1) print(answer) print("Stable citation IDs in order:") for c in chunks_run_1: print(f"- {c.doc_id}#chunk{c.chunk_index} (chunk_id={c.chunk_id}, score={c.score:.4f})") if __name__ == "__main__": main()
What happens when I ran it
- First retrieval run returned the same top chunks, in the same order.
- Second retrieval run returned identical
(chunk_id, doc_id, chunk_index)tuples. - The generated response included citations in that same deterministic order.
The practical outcome: even though fuzzy matching is involved, the citation list is stable across reruns.
Where this fits in a real Enterprise RAG stack
This trigram approach is useful when:
- you want fuzzy keyword matching (users type like humans)
- embeddings are too expensive for every retrieval
- you need deterministic ordering for auditability
- your documents are fairly “chunkable” and chunk boundaries are fixed
In many stacks, I’ve seen teams combine both:
- embeddings retrieval for semantic similarity
- trigram retrieval as a tie-breaker or fallback
- deterministic ordering across both pathways
But the core lesson remains: citations must be anchored to stable chunk rows and retrieved in a strictly deterministic order.
Conclusion
I learned that the most annoying RAG failures aren’t always about the language model—they’re often about retrieval nondeterminism. By using PostgreSQL pg_trgm with a carefully designed, deterministic ORDER BY (similarity score plus explicit tie-breakers like chunk_index and chunk_id), I made enterprise retrieval produce stable, trustworthy citations across reruns. This small retrieval discipline turned “RAG answers” into something closer to auditable, repeatable system behavior.