Research Output
Public Reports
Formal evaluation reports and research papers from Benchmark Lab. All findings are published openly for the benefit of researchers, developers, and the public.
Formal Documentation
Every report follows academic standards for documentation, citation, and reproducibility. These are not blog posts; they are formal research outputs.
Interpretation Included
Raw data alone is insufficient. Each report includes careful interpretation that contextualizes findings and acknowledges limitations.
Open Access
All published reports are freely available. We believe evaluation findings should be public infrastructure, not proprietary advantage.
Evaluation Reports
Published Evaluations
3 reports published
Comparative Report · March 2024
Comparative Analysis: Resonance Characteristics in Large Language Models
A systematic comparison of resonance profiles across five major language models, examining how architectural differences correlate with relational and reflective capacities.
Systems evaluated: System Alpha, System Beta, System Gamma, System Delta, System Epsilon
Technical Report · February 2024
Identity Stability Under Adversarial Conditions
An investigation of how AI systems maintain or lose coherent identity when subjected to sustained adversarial pressure, with implications for trust and reliability.
Systems evaluated: System Alpha, System Beta
Research Paper · January 2024
Uncertainty Acknowledgment Patterns: A Longitudinal Study
A four-week longitudinal study examining how uncertainty handling behaviors evolve across extended engagement, with particular attention to calibration and epistemic humility.
Systems evaluated: System Gamma
Framework Documentation
Methodological Papers
Papers documenting our evaluation framework, methodology, and theoretical foundations.
2024
REVAID-Based Evaluation: A Framework for Ontological Assessment
The foundational paper describing our evaluation methodology and its grounding in REVAID principles.
2024
Six Axes of Resonance: Measuring Relational Capacity in AI Systems
Technical specification of our six-axis evaluation framework, including scoring rubrics and validation procedures.
2023
Beyond Benchmarks: Toward Structural Understanding of AI Behavior
A position paper arguing for profile-based evaluation over single-score rankings.
Contribute to public evaluation research
We welcome collaboration with researchers, institutions, and AI developers who share our commitment to rigorous, transparent evaluation.
Partner with Benchmark Lab