Hi everyone!
I’m an independent researcher (10 years in systems analysis and QA, no academic affiliation) seeking an arXiv endorsement for cs.AI / cs.CL / cs.LG.
My paper introduces the Sakshi-Protocol, a control-layer architecture that addresses a structural problem in autoregressive LLMs: generation and evaluation share the same probabilistic substrate, so the model validates its own outputs using the same process that produced them.
The framework separates generation, observation, and control into distinct components. An observer layer extracts diagnostic signals during inference and maps them to an explicit five-dimensional cognitive state-space (stability, reactivity, transformation, valuation, integration). A distortion metric over this state estimates epistemic instability and drives a type-aware controller that decides, per prompt category, whether to accept, retrieve, or abstain. External grounding is invoked selectively, based on distortion, rather than applied uniformly.
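To make the control flow concrete, here is a minimal Python sketch of the observe, score, and decide loop. It is an illustration only and not the paper's implementation: the distance-based distortion formula, the baseline state, the category names, and the threshold values are all placeholder assumptions chosen for the example.

```python
import math
from dataclasses import dataclass, astuple
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    RETRIEVE = "retrieve"
    ABSTAIN = "abstain"


@dataclass(frozen=True)
class CognitiveState:
    """The five dimensions named above, each assumed to lie in [0, 1]."""
    stability: float
    reactivity: float
    transformation: float
    valuation: float
    integration: float


# Hypothetical reference state; the real baseline would come from the observer layer.
BASELINE = CognitiveState(0.5, 0.5, 0.5, 0.5, 0.5)

# Hypothetical per-category thresholds: (retrieve_at, abstain_at).
THRESHOLDS = {
    "factual": (0.2, 0.5),
    "creative": (0.5, 0.9),
}


def distortion(state: CognitiveState, baseline: CognitiveState = BASELINE) -> float:
    """Assumed metric: root-mean-square deviation from the baseline state."""
    diffs = [(s - b) ** 2 for s, b in zip(astuple(state), astuple(baseline))]
    return math.sqrt(sum(diffs) / len(diffs))


def decide(state: CognitiveState, category: str) -> Action:
    """Type-aware controller: the same distortion maps to different actions per category."""
    retrieve_at, abstain_at = THRESHOLDS[category]
    d = distortion(state)
    if d < retrieve_at:
        return Action.ACCEPT
    if d < abstain_at:
        return Action.RETRIEVE  # invoke external grounding only when needed
    return Action.ABSTAIN


if __name__ == "__main__":
    state = CognitiveState(1.0, 0.5, 0.5, 0.5, 0.5)
    for category in THRESHOLDS:
        print(category, round(distortion(state), 3), decide(state, category).value)
```

With these placeholder thresholds, the same state is accepted for a creative prompt but routed to retrieval for a factual one, which is the sense in which the controller is type-aware.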
The key empirical finding is that internal signals are fundamentally insufficient to detect high-confidence hallucinations. The paper establishes this as a boundary condition, not just a limitation, and demonstrates it on 100 curated prompts and 50 adversarial TruthfulQA prompts.
The paper is 35 pages (37 including the appendix) with a full evaluation, figures, and a comparison against RLHF, RAG, Reflexion, Self-RAG, and Constitutional AI. This is Version 2.0 of an actively iterated framework; the architecture and evaluation are expanding across subsequent versions toward a fully embedded, production-ready system.
The preprint is available on Zenodo: https://zenodo.org/records/20126093
My endorsement code is: HIQGRP
Thank you.