How Aura Works

Aura is a clinical mental health assistant powered by a hybrid search pipeline, cross-encoder reranking, and structured clinical data from the DSM-5-TR. Every architectural decision is measured against a 150-query evaluation harness.

Search Pipeline

Each query passes through query expansion, dual-path retrieval (semantic + keyword), cross-encoder reranking, and LLM generation with constitutional safety principles.

1. Query: user input (0ms)
2. Expansion: 80+ synonyms (~1ms)
3. Embedding: MiniLM-L6-v2 (~50ms)
4. Hybrid Search: vector + BM25 (~20ms)
5. Reranking: cross-encoder (~300ms)
6. LLM: Llama 3.3 70B (~1.5s)
7. Safety: constitutional principles (in-prompt)
8. Response: SSE stream (~2s total)
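
As a rough check on the "~2s total" figure, the per-stage estimates listed above can be summed. This is only a sketch: it assumes the stages run strictly one after another and that the in-prompt safety step adds no separate latency, neither of which the page states. The dictionary keys are illustrative names, not identifiers from the Aura codebase.

```python
# Per-stage latency estimates copied from the pipeline listing, in milliseconds.
# Assumption: stages run sequentially; the in-prompt safety step is not timed separately.
STAGE_LATENCY_MS = {
    "expansion": 1,
    "embedding": 50,
    "hybrid_search": 20,
    "reranking": 300,
    "llm": 1500,
}


def total_latency_ms(stages):
    """Sum the per-stage estimates into an end-to-end budget."""
    return sum(stages.values())


def dominant_stage(stages):
    """Return the name of the stage with the largest estimate."""
    return max(stages, key=stages.get)


if __name__ == "__main__":
    print(total_latency_ms(STAGE_LATENCY_MS))  # 1871
    print(dominant_stage(STAGE_LATENCY_MS))    # llm
```

Under these assumptions the stages add up to about 1.9 seconds, consistent with the quoted total, and LLM generation accounts for most of it.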

Retrieval Benchmarks

Measured against 150 gold Q&A pairs across 5 categories: scope, clinical depth, differential diagnosis, safety, and edge cases.

Run the eval harness to generate benchmark data:

npm run eval

System Architecture

Knowledge Base

Articles: 594
Search chunks: 8,753
DSM-5-TR disorders: 88
Personality disorders: 10

Search Pipeline

Embedding dimensions: 384d
Vector / BM25 weight: 60/40
Reranker model: ms-marco-MiniLM-L-6-v2
Candidates → results: 20→8
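
One way the 60/40 weighting and the 20→8 cut could fit together is sketched below. The page does not say how vector and BM25 scores are normalized or combined, so the min-max normalization and weighted sum here are assumptions, and `rerank_score` is a hypothetical stand-in for the cross-encoder rather than a real model call.

```python
def min_max(scores):
    """Scale scores to [0, 1] so vector and BM25 values are comparable (assumed step)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(vector_scores, bm25_scores, w_vector=0.6, w_bm25=0.4, candidates=20):
    """Weighted-sum fusion; a document missing from one path scores 0 on that path."""
    v = min_max(vector_scores) if vector_scores else {}
    b = min_max(bm25_scores) if bm25_scores else {}
    fused = {
        doc: w_vector * v.get(doc, 0.0) + w_bm25 * b.get(doc, 0.0)
        for doc in set(v) | set(b)
    }
    # Sort by descending fused score, breaking ties by document id for determinism.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:candidates]


def rerank(candidates, rerank_score, results=8):
    """Re-order fused candidates with a scoring callable and keep the top results."""
    rescored = sorted(candidates, key=lambda kv: -rerank_score(kv[0]))
    return [doc for doc, _ in rescored[:results]]


if __name__ == "__main__":
    vector = {"a": 0.9, "b": 0.5, "c": 0.1}
    bm25 = {"b": 12.0, "c": 3.0, "d": 6.0}
    print([doc for doc, _ in fuse(vector, bm25)])  # ['b', 'a', 'd', 'c']
```

In this toy example document `b` ranks first because it scores reasonably on both paths, which is the usual argument for combining semantic and keyword retrieval before the more expensive reranking step.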

Safety Pipeline

Safety classification: 3-tier
Crisis escalation: 988
Rate limit burst: 20/min
Constitutional principles: in-prompt
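
The page gives only the figure "20/min" for the rate limit. One plausible reading is a token bucket that allows a burst of 20 requests and refills at 20 tokens per minute. The sketch below illustrates that reading; the class and parameter names are hypothetical, and Aura's actual limiter may work differently.

```python
class TokenBucket:
    """Token-bucket limiter: up to `capacity` requests in a burst, refilled over time."""

    def __init__(self, capacity=20, refill_per_second=20 / 60, now=0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last = now

    def allow(self, now):
        """Return True and consume a token if a request at time `now` (seconds) is permitted."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket()
    burst = [bucket.allow(0.0) for _ in range(21)]
    print(burst.count(True))  # 20 requests pass, the 21st is rejected
    print(bucket.allow(6.0))  # True: about two tokens have refilled after six seconds
```

Passing the clock in as an argument keeps the sketch deterministic; a real service would more likely read a monotonic clock and keep one bucket per client.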

Model Stack

Generation model: Llama 3.3 70B
Embedding model: all-MiniLM-L6-v2
Inference provider: Groq
Temperature: 0.3

Method Comparison

Retrieval quality across search methods, measured on queries with known expected sources.

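| Method | Recall@3 | Recall@5 | Recall@8 | MRR | NDCG@10 |
| --- | --- | --- | --- | --- | --- |
| BM25 Only | | | | | |
| Hybrid (Vector + BM25) | 89.9% | 92.0% | 93.1% | 87.7% | 87.9% |
| Hybrid + Reranking | | | | | |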
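
For readers unfamiliar with the metrics in this comparison, the sketch below shows the standard textbook definitions of Recall@k, reciprocal rank (averaged over queries to give MRR), and NDCG@k. It assumes binary relevance, where a source is either expected or not. The eval harness may define or aggregate these differently, and the function names are illustrative.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the expected sources that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first expected source, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance NDCG: discounted gain divided by the best achievable gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


if __name__ == "__main__":
    ranked = ["x", "a", "y", "b"]   # retrieved order for one query
    relevant = {"a", "b"}           # expected sources for that query
    print(recall_at_k(ranked, relevant, 3))   # 0.5
    print(reciprocal_rank(ranked, relevant))  # 0.5
    print(round(ndcg_at_k(ranked, relevant), 3))
```

Recall@k only asks whether the expected sources appear in the top k, while MRR and NDCG also reward placing them earlier, which is why the three can move independently when a reranker is added.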

Knowledge Graph

Interactive visualization of relationships between conditions, screening tools, and diagnostic categories extracted from structured clinical data.


Contact

Interested in our research or data? Reach out at contact@moodspan.org.