The Challenge
Legal professionals spend substantial billable hours sifting through irrelevant precedents returned by keyword-based search tools. Traditional keyword search (e.g., BM25) fails when key legal arguments are phrased differently, while modern neural vector search blurs the line between a case's facts and the judge's final decision, acting as a "black box." Standard tools cannot distinguish what happened in a case from why it matters, leading to irrelevant results, missed analogous precedents, and wasted effort.
Our Approach
Designed an Automated Document Structuring (Facetization) pipeline that uses a deterministic LLM to segment raw, unstructured judgments into distinct legal facets: facts, issues, decision, and reasoning.
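The facetization step can be sketched as a prompt-and-parse loop: ask the LLM (run deterministically, e.g. at temperature zero) to return the four facets as JSON, then parse the reply into a typed record. The `llm` callable, the prompt wording, and the `FacetedCase` structure below are illustrative assumptions, not the project's actual implementation.

```python
import json
from dataclasses import dataclass

FACETS = ("facts", "issues", "decision", "reasoning")

# Hypothetical prompt; the real pipeline's instructions may differ.
PROMPT = (
    "Split the judgment below into a JSON object with the keys "
    f"{', '.join(FACETS)}. Copy the relevant passages verbatim; "
    "do not summarize.\n\n{judgment}"
)

@dataclass
class FacetedCase:
    facts: str
    issues: str
    decision: str
    reasoning: str

def facetize(judgment: str, llm) -> FacetedCase:
    """Call a deterministic LLM (llm: prompt -> str) and parse its
    JSON reply into the four legal facets."""
    raw = llm(PROMPT.format(judgment=judgment))
    data = json.loads(raw)
    # Missing keys default to empty strings rather than failing hard.
    return FacetedCase(**{k: data.get(k, "") for k in FACETS})
```

Keeping the facet text verbatim (rather than summarized) is what later enables the system to return the exact matching passage as evidence.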
Built a Hybrid Search Architecture that runs lexical (BM25) and semantic (dense ANN) searches in parallel, using Reciprocal Rank Fusion (RRF) to form a high-recall candidate pool.
Developed a Section-Aware Re-ranking stage that executes fine-grained scoring across structured case facets, using query-wise Z-score normalization to resolve the scale mismatch between BM25 keyword scores and cosine similarities.
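Query-wise Z-score normalization standardizes each retriever's scores across one query's candidate set (zero mean, unit variance) before they are combined, so unbounded BM25 scores and [-1, 1] cosine similarities become directly comparable. A minimal sketch; the unweighted sum in `fuse_scores` is an illustrative assumption:

```python
import statistics

def zscore(scores):
    """Normalize one query's candidate scores to zero mean, unit variance."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores) or 1.0  # guard against constant scores
    return [(s - mean) / std for s in scores]

def fuse_scores(bm25_scores, cosine_scores):
    """Combine per-candidate scores after normalizing each list
    independently for the current query."""
    zb, zc = zscore(bm25_scores), zscore(cosine_scores)
    return [b + c for b, c in zip(zb, zc)]
```

Normalizing per query (rather than with corpus-wide statistics) matters because BM25 score ranges vary widely from one query to the next.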
Implemented dynamically learned section weights that prioritize crucial elements like legal reasoning for the final ranking.
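Once each facet has a normalized score, the final ranking score is a weighted sum over facets, with weights learned from relevance data (e.g., by grid search or a learning-to-rank fit). The weight values below are purely illustrative placeholders showing reasoning weighted most heavily, not the learned values:

```python
# Illustrative placeholder weights; the real weights are learned, not fixed.
SECTION_WEIGHTS = {
    "facts": 0.2,
    "issues": 0.15,
    "decision": 0.15,
    "reasoning": 0.5,  # legal reasoning prioritized for final ranking
}

def final_score(section_scores):
    """Weighted sum of per-facet scores; missing facets contribute 0."""
    return sum(
        SECTION_WEIGHTS[s] * section_scores.get(s, 0.0)
        for s in SECTION_WEIGHTS
    )
```

With this shape of weighting, a candidate matching on reasoning alone outranks one matching on facts alone, which is the intended behavior.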
Engineered Explainable Outputs that return the exact section of text that triggered the match alongside a concise, LLM-generated rationale, plus Party-Stance Detection that labels whether the retrieved case supports, opposes, or is neutral to a specific party's position.
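The explainable output can be modeled as a structured hit that carries the winning facet's exact text as evidence, alongside the LLM-generated rationale and the stance label. The `SearchHit` record and `build_hit` helper below are hypothetical names sketching that shape; the rationale generation and stance classification themselves (LLM calls in the pipeline) are assumed to happen upstream:

```python
from dataclasses import dataclass

STANCES = ("supports", "opposes", "neutral")

@dataclass
class SearchHit:
    case_id: str
    matched_section: str   # facet whose score triggered the match
    matched_text: str      # exact passage returned as evidence
    rationale: str         # concise LLM-generated explanation
    stance: str            # relative to the querying party's position

def build_hit(case_id, section_scores, sections, rationale, stance):
    """Assemble an explainable result from per-facet scores and texts."""
    assert stance in STANCES
    top = max(section_scores, key=section_scores.get)
    return SearchHit(case_id, top, sections[top], rationale, stance)
```

Returning the verbatim passage rather than only a score is what lets a lawyer verify the match without opening the full judgment.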


