Description
Is your feature request related to a problem? Please describe.
Currently, broom raises the memory limit after a single out-of-memory (OOM) event. When the OOM was caused by a temporary spike in memory usage, this immediate adjustment can allocate resources the workload does not need, and the resulting over-provisioning increases costs.
Describe the solution you'd like
Introduce configurable strategies for managing memory limits post-OOM events:
- Revert to Original Limit: After a defined cooldown period or a specified number of successful job runs, automatically revert the memory limit to its original value.
- Statistical-Based Adjustment: Calculate the new memory limit from historical usage, for example by taking the maximum memory usage observed over the last N runs and applying a safety margin (e.g., 120% of that maximum), while ensuring the result doesn’t fall below the original setting.
- Permanent Increase: Maintain the current behavior where the memory limit remains elevated after an OOM event.
Describe alternatives you've considered
Configurable Sidecar Container in Controller-Manager (Future Enhancement): Allow users to specify a custom sidecar container image for the controller-manager. The sidecar would implement tailored logic for recommending memory limits, giving organizations with unique requirements more flexibility.
Additional context
N/A