HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation Paper • 2505.11454 • Published May 16, 2025 • 5
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems Paper • 2506.04133 • Published Jun 4, 2025 • 3
ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues Paper • 2412.17052 • Published Dec 22, 2024 • 3
EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta Paper • 2501.00257 • Published Dec 31, 2024