Unbounded Consumption
Unbounded Consumption occurs when a Large Language Model (LLM) application allows excessive, uncontrolled inference requests, letting attackers exhaust compute resources or abuse them for their own purposes. The consequences include denial of service (DoS), financial loss from inflated operational costs, intellectual property theft through replication of the model's behaviour, and degraded service for legitimate users. Because LLM inference is computationally expensive, especially in cloud-based environments, this vulnerability poses a critical risk to service availability, security, and cost efficiency.
Remediation
- Implement strict rate limits on the number of requests a user can make within a defined time period.
- Enforce user-specific quotas to control the volume of inferences each user is allowed.
- Deploy monitoring systems to detect unusual patterns of activity, such as rapid bursts of requests or excessive usage by a single user.
- Integrate billing and cost controls, such as budget caps and spending alerts, to limit financial exposure from excessive resource consumption.
- Use caching to reduce redundant computations for repeated queries.
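As a rough illustration of the rate-limit, quota, and caching items above, the following Python sketch shows one possible shape for these controls. It is a minimal in-memory example under stated assumptions, not a production design: the names UsageGuard and CachedModel, their parameters, and the idea of passing an estimated token count per request are all hypothetical, and a real deployment would more likely keep this state in a shared store and pair it with the monitoring and billing controls listed above.

```python
import time
from collections import OrderedDict, defaultdict, deque


class UsageGuard:
    """Hypothetical helper: per-user sliding-window rate limit plus a token quota."""

    def __init__(self, max_requests, window_seconds, token_quota, clock=time.monotonic):
        self.max_requests = max_requests      # requests allowed per window
        self.window_seconds = window_seconds  # length of the sliding window
        self.token_quota = token_quota        # total tokens a user may consume
        self.clock = clock                    # injectable, so tests need not sleep
        self._requests = defaultdict(deque)   # user_id -> timestamps of recent requests
        self._tokens_used = defaultdict(int)  # user_id -> tokens consumed so far

    def allow(self, user_id, estimated_tokens):
        """Return (allowed, reason); usage is recorded only when allowed."""
        now = self.clock()
        recent = self._requests[user_id]
        # Discard timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False, "rate_limited"
        if self._tokens_used[user_id] + estimated_tokens > self.token_quota:
            return False, "quota_exceeded"
        recent.append(now)
        self._tokens_used[user_id] += estimated_tokens
        return True, "ok"


class CachedModel:
    """Hypothetical wrapper: bounded LRU cache in front of a model callable."""

    def __init__(self, model_fn, max_entries=1024):
        self.model_fn = model_fn
        self.max_entries = max_entries  # cap the cache so it cannot grow without bound
        self._cache = OrderedDict()
        self.calls = 0                  # number of real model invocations

    def complete(self, prompt):
        if prompt in self._cache:
            self._cache.move_to_end(prompt)  # mark as most recently used
            return self._cache[prompt]
        self.calls += 1
        result = self.model_fn(prompt)
        self._cache[prompt] = result
        if len(self._cache) > self.max_entries:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return result
```

In this sketch, a request handler would call guard.allow(user_id, estimated_tokens) before invoking the model and reject the request when the first element of the result is False. Exact-match caching only helps with literally repeated prompts, and the quota here never resets; both are simplifications chosen to keep the example short.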
Metadata
- Severity: high
- Slug: unbounded-consumption
CWEs
- 400: Uncontrolled Resource Consumption
OWASP
- LLM10:2025: Unbounded Consumption