Research Paper

Entropy‑Guided Token Attacks on Vision‑Language Models

Authors: Mengqi He
Organization: Hugging Face
Published: 2026-01-09 • Added: 2026-01-09

Key Insights

  • Tokens with the highest predictive entropy dominate the semantic output of vision‑language (V‑L) models; perturbing only these few tokens yields large semantic degradation.
  • Entropy‑driven attacks achieve comparable (or greater) success with far lower perturbation budgets than naïve or gradient‑based token attacks.
  • The vulnerability transfers across diverse V‑L architectures (e.g., CLIP, BLIP, ViLT), indicating a systemic weakness in multimodal alignment mechanisms.
  • Computing token entropy from the model’s own output distribution provides an efficient, model‑agnostic way to select attack targets without requiring full gradient information.

Abstract

Selective adversarial attacks that target high-entropy tokens in vision-language models achieve significant semantic degradation at reduced perturbation budgets, and the underlying vulnerability transfers across different architectures.

Full Analysis

**Source:** [HuggingFace](https://huggingface.co/papers/2512.21815) | [arXiv](https://arxiv.org/abs/2512.21815)

---

*Topics: multimodal, ai-safety, computer-vision*
*Difficulty: advanced*
*Upvotes: 6*