Paper ▲ 5 • research-paper • advanced
- Pure LLM judges often mis‑evaluate complex, multi‑step outputs because they lack explicit reasoning and verification mechanisms.
- The paper introduces a modular “agent‑as‑judge” system that first plans an evaluation strategy, then invokes external tools (e.g., calculators, code runners) to verify intermediate claims.
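The plan-then-verify loop described above can be sketched in a few lines. This is a toy illustration, not the paper's actual system: `plan`, `verify_claim`, and `judge` are hypothetical names, the "planner" is a trivial claim splitter, and the only "external tool" is a sandboxed arithmetic evaluator standing in for a calculator.

```python
def plan(answer: str) -> list[str]:
    """Planning step (toy): break a candidate answer into atomic claims."""
    return [c.strip() for c in answer.split(";") if c.strip()]

def verify_claim(claim: str) -> bool:
    """Tool step (toy): verify an 'expr=value' claim with a calculator tool."""
    lhs, _, rhs = claim.partition("=")
    try:
        # eval with empty builtins as a stand-in "calculator" tool
        return eval(lhs, {"__builtins__": {}}) == float(rhs)
    except Exception:
        return False

def judge(answer: str) -> float:
    """Judge score = fraction of intermediate claims that pass verification."""
    claims = plan(answer)
    return sum(verify_claim(c) for c in claims) / max(len(claims), 1)

score = judge("2+3=5; 10*4=40; 7-2=4")  # one of three claims fails
```

The point of the structure is that the judgment is grounded in per-claim tool checks rather than a single holistic LLM verdict.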
Paper ▲ 6 • research-paper • advanced
- Tokens with the highest predictive entropy dominate the semantic output of vision-language (VL) models; tampering with only these few tokens yields large degradations.
- Entropy‑driven attacks achieve comparable (or greater) success with far lower perturbation budgets than naïve or gradient‑based token attacks.
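The target-selection step of such an attack reduces to ranking token positions by predictive entropy and spending the perturbation budget only on the top few. A minimal sketch, assuming per-position next-token distributions are available (function names are illustrative, not from the paper):

```python
import math

def token_entropy(probs: list[float]) -> float:
    """Shannon entropy of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def top_entropy_positions(dists: list[list[float]], k: int) -> list[int]:
    """Pick the k positions with highest predictive entropy as the
    (small) perturbation budget of an entropy-driven attack."""
    ents = [(token_entropy(p), i) for i, p in enumerate(dists)]
    return sorted(i for _, i in sorted(ents, reverse=True)[:k])

dists = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction -> low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform -> maximal entropy
    [0.70, 0.10, 0.10, 0.10],  # moderately uncertain
]
targets = top_entropy_positions(dists, k=1)
```

Compared with a gradient-based search over all positions, this ranking needs only one forward pass, which is why the budget can be so much smaller.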
Paper • research-paper • advanced
- Traditional Transformers and RNNs reside in a “Metric Phase” where causal order can be broken by semantic noise, causing hallucinations.
- By formulating inference as a Symmetry‑Protected Topological (SPT) phase, logical operations become analogous to non‑Abelian anyon braiding, giving them immunity to local perturbations.
Paper • research-paper • advanced
- A shared hypernetwork generates client‑specific VAE decoders and class‑conditional latent priors from lightweight private codes, enabling personalization without exposing raw data.
- Differential privacy is enforced at the hypernetwork level by clipping and adding Gaussian noise to aggregated gradients, protecting against gradient-based leakage.
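The clip-and-noise step is standard DP-SGD-style aggregation applied to the hypernetwork's gradients. A minimal sketch with plain lists (hyperparameter names like `clip_norm` and `noise_mult` are illustrative, not the paper's):

```python
import math
import random

def dp_aggregate(client_grads, clip_norm=1.0, noise_mult=1.1, seed=0):
    """Clip each client's hypernetwork gradient to L2 norm <= clip_norm,
    average across clients, then add Gaussian noise calibrated to the
    clipping bound (the per-coordinate sensitivity is clip_norm / n)."""
    rng = random.Random(seed)
    n = len(client_grads)
    dim = len(client_grads[0])
    agg = [0.0] * dim
    for g in client_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for j in range(dim):
            agg[j] += g[j] * scale
    sigma = noise_mult * clip_norm / n
    return [a / n + rng.gauss(0.0, sigma) for a in agg]
```

Clipping bounds any single client's influence on the update, which is what makes the added Gaussian noise yield a formal privacy guarantee against gradient-based leakage.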
Paper • research-paper • advanced
- Treating attention matrices as token‑level graphs lets spectral analysis separate sound from unsound mathematical proofs.
- Four graph-spectral metrics (Fiedler value, high-frequency energy ratio, smoothness, spectral entropy) achieve large effect sizes (Cohen's d up to 3.30) across seven models from four families, without any training or fine-tuning.
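Two of the listed metrics can be computed directly from an attention matrix's graph Laplacian. A minimal sketch assuming a single symmetrized attention head as the token graph (the paper's exact construction may differ):

```python
import numpy as np

def spectral_metrics(attn: np.ndarray):
    """Treat an attention matrix as a weighted token graph and compute
    two of the summary's metrics: the Fiedler value (second-smallest
    Laplacian eigenvalue) and the spectral entropy of the Laplacian
    spectrum normalized into a probability distribution."""
    W = (attn + attn.T) / 2           # symmetrize attention into edge weights
    np.fill_diagonal(W, 0.0)          # drop self-attention self-loops
    d = W.sum(axis=1)
    L = np.diag(d) - W                # combinatorial graph Laplacian
    eig = np.sort(np.linalg.eigvalsh(L))
    fiedler = float(eig[1])           # algebraic connectivity of the token graph
    p = eig / eig.sum() if eig.sum() > 0 else eig
    p = p[p > 0]
    spectral_entropy = float(-(p * np.log(p)).sum())
    return fiedler, spectral_entropy
```

A near-disconnected token graph (e.g. a proof whose steps never attend to each other) drives the Fiedler value toward zero, which is the kind of signal that separates sound from unsound derivations without any training.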