Monthly Rank: #27
Lifetime Rank: #85
18 June 2026
Counsel Check

Grade the supervision, not the prose. A court-grounded AI-supervision assessment for lawyers.

Daniel Hill

About the Project

Verification is not correctness. Can your lawyers catch the AI's fabricated authority? Counsel Check is a court-grounded competency drill that measures who on your team can catch what legal AI got wrong.

Counsel Check measures whether a firm's lawyers can catch the ways legal AI fails. Reviewers work through reconstructed filings (a skeleton argument, written submissions, a witness statement, an advice note) drawn from real UK matters in which AI-fabricated or misapplied authorities reached a court or tribunal (Ayinde, Harber, Al-Haroun) and the lawyers were referred to their regulators. Reviewers highlight, flag, and comment on suspect passages; a live citation grader checks each authority against UK government records (legislation.gov.uk and the National Archives) in real time.
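The existence and identity checks described above can be sketched as follows. This is a minimal illustration, not the project's actual grader: the citation pattern covers only a few neutral-citation shapes, the names `extract_citations` and `grade` are hypothetical, and the `records` dictionary stands in for a lookup against legislation.gov.uk or the National Archives. The sample citations are invented placeholders, not real cases. Only citation strings are passed to the grading step, which is one way the grader could avoid ever receiving document text.

```python
import re

# Illustrative pattern for UK neutral citations such as "[2099] EWHC 1 (Admin)"
# or "[2099] EWCA Civ 12". An assumption for this sketch, not a complete grammar.
CITATION = re.compile(
    r"\[\d{4}\]\s+[A-Z]{4,5}(?:\s+(?:Civ|Crim))?\s+\d+(?:\s+\([A-Za-z]+\))?"
)


def extract_citations(text: str) -> list[str]:
    """Pull citation strings out of a filing, in order of appearance."""
    return [" ".join(m.group(0).split()) for m in CITATION.finditer(text)]


def _name_tokens(name: str) -> set[str]:
    """Lower-cased words of a case name, ignoring the 'v' separator."""
    return set(re.findall(r"[a-z0-9]+", name.lower())) - {"v"}


def grade(citation: str, claimed_name: str, records: dict[str, str]) -> str:
    """Existence check first, then identity check.

    `records` maps a citation to the case name held by the official source.
    In a real system this would be a network lookup (hypothetical here).
    """
    official_name = records.get(citation)
    if official_name is None:
        return "not_found"  # fabricated authority: nothing at that citation
    claimed = _name_tokens(claimed_name)
    if claimed and claimed <= _name_tokens(official_name):
        return "verified"
    # The citation resolves, but to a different case than the filing claims.
    return "identity_mismatch"
```

The `identity_mismatch` outcome is the interesting one: a citation can exist and still not be the case the filing says it is, so an existence check alone would pass it.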

What makes it different is the ground truth. There is no LLM in the grading: every failure and answer is anchored in the published judgments. The engine scores on a failure-mode taxonomy (we call the underlying phenomenon context divergence, not "hallucination"): fabricated authority, plausible-citation camouflage, quotation fabrication, misapplied ratio, jurisdiction bleed, and more. The result is a personal scorecard (calibrated vs. over-trusting vs. over-flagging) and a firm-level gap view that turns "should we widen AI access?" into an evidence-based decision.
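One way to picture the three scorecard outcomes is as a function of two rates: how many of the planted failures a reviewer caught, and how many clean passages they flagged anyway. The sketch below is an assumption about how such a classification could work, not the engine's actual rubric; the `Review` type, the `profile` function, and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Review:
    planted: set[str]  # ids of passages that really contain a failure
    flagged: set[str]  # ids of passages the reviewer flagged
    total: int         # number of reviewable passages in the exercise


def profile(r: Review, min_catch: float = 0.8, max_false: float = 0.2) -> str:
    """Classify a reviewer; the default thresholds are illustrative only."""
    clean = r.total - len(r.planted)
    catch_rate = len(r.planted & r.flagged) / len(r.planted) if r.planted else 1.0
    false_rate = len(r.flagged - r.planted) / clean if clean else 0.0
    if false_rate > max_false:
        return "over-flagging"   # flags many passages that are actually fine
    if catch_rate < min_catch:
        return "over-trusting"   # lets real failures through
    return "calibrated"
```

A reviewer who flags everything catches every planted failure but lands in "over-flagging", which is why the sketch tests the false-flag rate before the catch rate.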

This is an assessment, not a certification: it tells a firm where to train, not whether a lawyer is "qualified."

Demo build: no login, nothing stored; annotations live only in the browser tab and the grader never receives document text.

Key Features

Court-grounded exercises reconstructed from real UK AI-misuse cases (Ayinde, Harber, Al-Haroun) plus a constructed "real-but-wrong" drill.

Live citation grader: real-time existence + identity checks against legislation.gov.uk and the National Archives, no LLM in the loop.

Eight-mode failure taxonomy (context divergence), including the "resolves green but isn't that case" camouflage trap.

Process-aware scoring: distinguishes calibrated reviewers from over-trusting and over-flagging.

Firm gap view: capability map by failure mode + evidence-based AI-access tiering.

Privacy-first demo: no login, no data retention, grader never sees document text.
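The firm gap view listed above amounts to aggregating individual results by failure mode. A minimal sketch, assuming each result is recorded as a (reviewer, failure mode, caught) tuple; the `gap_view` name and the record shape are hypothetical, and the failure-mode labels are taken from the taxonomy described earlier.

```python
from collections import defaultdict
from typing import Iterable


def gap_view(results: Iterable[tuple[str, str, bool]]) -> list[tuple[str, float]]:
    """Return (failure_mode, catch_rate) pairs, weakest mode first."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # mode -> [caught, seen]
    for _reviewer, mode, caught in results:
        totals[mode][1] += 1
        if caught:
            totals[mode][0] += 1
    rates = {mode: caught / seen for mode, (caught, seen) in totals.items()}
    return sorted(rates.items(), key=lambda item: item[1])
```

Sorting weakest first puts the failure modes where the firm most needs training at the top of the view.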

Help Needed

Looking for UK firm design partners (L&D, risk/GC, and litigation teams) to pilot the assessment against their own workflows, plus feedback from judges and regulators on the failure-mode taxonomy and scoring rubric.

About the Creator

Daniel Hill
R&D at nnLabs