Vol. 1 · Ed. 2026
CyberGlossary
Entry № 786

Model Denial of Service

What is Model Denial of Service?

Model Denial of Service: OWASP LLM04, driving an LLM application into runaway resource consumption (long contexts, infinite loops, expensive tool fan-out) so that it slows, becomes unavailable, or generates a ruinous cloud bill.


Model Denial of Service (LLM04 in the 2023 edition of the OWASP Top 10 for LLM Applications; the 2025 edition folds it into LLM10, Unbounded Consumption) describes attacks that exhaust the compute and budget behind an LLM-powered system rather than saturating the network in front of it. Common patterns include flooding the model with maximum-context inputs to drive up token cost; crafting recursive or self-referential prompts that trigger long generations; abusing tool-calling agents so that one request cascades into dozens of expensive sub-calls; submitting inputs that defeat caching; and exploiting retrieval pipelines to pull massive documents into every request. The blast radius is both operational (the chatbot becomes unusable) and financial (a single attacker can run up five- or six-figure inference bills in hours). Mitigations include strict per-user input and output token caps, max-step limits on agent loops, semantic and exact-match caching, rate limits on tool fan-out, asynchronous queueing with budget guards, and observability dashboards keyed to spend per tenant.
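The per-user token caps and budget guard mentioned above might look roughly like the following. This is a minimal in-memory sketch, not a reference implementation: the class `TokenBudget`, the exception `BudgetExceeded`, and all three limits are hypothetical names and values chosen for illustration, and a real deployment would count tokens with the model's own tokenizer and keep usage in shared storage.

```python
from dataclasses import dataclass, field

# Hypothetical limits; tune them to the model and pricing actually in use.
MAX_INPUT_TOKENS = 4_000     # per-request input cap
MAX_OUTPUT_TOKENS = 1_000    # per-request output cap
DAILY_TOKEN_BUDGET = 50_000  # per-user budget across all requests


class BudgetExceeded(Exception):
    """Raised when a request would break a per-user limit."""


@dataclass
class TokenBudget:
    # Tokens consumed so far, keyed by user id (in memory only).
    used: dict = field(default_factory=dict)

    def admit(self, user_id: str, input_tokens: int) -> int:
        """Return the output-token cap for this request, or raise."""
        if input_tokens > MAX_INPUT_TOKENS:
            raise BudgetExceeded("input exceeds per-request cap")
        spent = self.used.get(user_id, 0)
        remaining = DAILY_TOKEN_BUDGET - spent - input_tokens
        if remaining <= 0:
            raise BudgetExceeded("daily budget exhausted")
        # Charge the input now; output is recorded once it is known.
        self.used[user_id] = spent + input_tokens
        return min(MAX_OUTPUT_TOKENS, remaining)

    def record_output(self, user_id: str, output_tokens: int) -> None:
        """Charge the tokens the model actually generated."""
        self.used[user_id] = self.used.get(user_id, 0) + output_tokens
```

The value returned by `admit` would be passed to the model as its maximum output length, so a user who is close to the budget gets a shorter answer instead of an unbounded one.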

Examples

  1. An attacker scripts thousands of requests that each fill the maximum allowed context window, running up a six-figure cloud bill before quotas trip.

  2. A prompt injection against an agent convinces the model to enter a tool-use loop that calls the expensive document-summarization API hundreds of times per session.
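The second example is the case that max-step limits and tool fan-out limits are meant to contain. The sketch below assumes a very simplified agent in which a planner callback names the next tool to run; `run_agent`, `plan_next`, `LoopLimitExceeded`, and both limits are hypothetical and stand in for whatever agent framework is actually used.

```python
from typing import Callable, Dict, List, Optional

MAX_STEPS = 8            # hypothetical cap on tool steps per session
MAX_CALLS_PER_TOOL = 3   # hypothetical cap on calls to any single tool


class LoopLimitExceeded(Exception):
    """Raised when an agent session hits a step or fan-out limit."""


def run_agent(
    plan_next: Callable[[List[str]], Optional[str]],
    tools: Dict[str, Callable[[], str]],
) -> List[str]:
    """Run tools until plan_next returns None or a limit trips."""
    history: List[str] = []
    calls: Dict[str, int] = {}
    for _ in range(MAX_STEPS):
        tool_name = plan_next(history)
        if tool_name is None:
            return history  # the planner decided it is done
        calls[tool_name] = calls.get(tool_name, 0) + 1
        if calls[tool_name] > MAX_CALLS_PER_TOOL:
            # Refuse before invoking the (possibly expensive) tool again.
            raise LoopLimitExceeded(f"{tool_name} called too often")
        history.append(tools[tool_name]())
    raise LoopLimitExceeded("step limit reached")
```

With these limits, an injected instruction to summarize the same document repeatedly is cut off after three tool invocations rather than hundreds, and the session can be logged or flagged when the exception is raised.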

Frequently asked questions

What is Model Denial of Service?

OWASP LLM04: driving an LLM application into runaway resource consumption (long contexts, infinite loops, expensive tool fan-out) so that it slows, becomes unavailable, or generates a ruinous cloud bill. It belongs to the AI and machine learning security category of cybersecurity.

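How do you defend against Model Denial of Service?

Defenses against Model Denial of Service typically combine technical controls with operational practices; see the full definition above for details.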
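As one concrete technical control, the exact-match caching named in the definition could be sketched as follows. The class `ExactMatchCache` and its `compute` callback are hypothetical, and normalizing by lowercasing and collapsing whitespace is an assumption that only suits prompts where case and spacing do not change the meaning. Semantic caching, which matches paraphrases, would need an embedding model and is not shown.

```python
import hashlib
from collections import OrderedDict
from typing import Callable


class ExactMatchCache:
    """Small LRU cache keyed on a normalized prompt (illustrative only)."""

    def __init__(self, max_entries: int = 1024) -> None:
        self.max_entries = max_entries
        self._store = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Assumption: case and runs of whitespace carry no meaning.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt: str, compute: Callable[[str], str]) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            self._store.move_to_end(key)  # mark as recently used
            return self._store[key]
        self.misses += 1
        answer = compute(prompt)  # the expensive model call
        self._store[key] = answer
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
        return answer
```

Tracking `hits` and `misses` per tenant also gives a cheap signal for the cache-defeating inputs described above: a tenant whose miss rate suddenly approaches 100% may be padding prompts with random text.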

What other names does Model Denial of Service go by?

Common aliases include LLM04, LLM DoS, and token-burn attack.

Related terms