feat(config): Allow disabling or adjusting thresholds for specific events #108
Description
What would you like to be added:
- Ability to disable specific event types via addon configurationValues
- Ability to configure per-event thresholds
Example:
{
  "monitoringAgent": {
    "disabledEvents": ["LargeEnvironment"],
    "eventThresholds": {
      "LargeEnvironment": 2000
    }
  }
}
Why is this needed:
In Istio service mesh environments with enableServiceLinks: true (the Kubernetes default), most processes have more than ~1000 environment variables. As a result, LargeEnvironment events fire on every node, hundreds of times per day. These events are rarely actionable in this context, because the default threshold (~1000) does not align with real-world Kubernetes environments that use service meshes. They also drown out important events such as EBSVolumeIOPSExceeded or ForkFailedOutOfPIDs.
LargeEnvironment accounts for >90% of all agent events in our cluster.
Filtering downstream is inefficient and error-prone, as every consumer must implement the same exclusion logic while the agent continues to generate unnecessary events. Controlling this at the agent level provides a single source of control.
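To make the request concrete, here is a minimal sketch of how agent-level control could behave. It is an illustration only, not the agent's actual implementation: the EventPolicy class, its method names, the rule that disabledEvents takes precedence over eventThresholds, and the "fire when the observed value is strictly greater than the threshold" semantics are all assumptions. Only the monitoringAgent, disabledEvents and eventThresholds keys come from the example above.

```python
from dataclasses import dataclass, field
from typing import Any, FrozenSet, Mapping


@dataclass(frozen=True)
class EventPolicy:
    """Hypothetical per-event policy built from the proposed configurationValues."""

    disabled_events: FrozenSet[str] = frozenset()
    thresholds: Mapping[str, float] = field(default_factory=dict)

    @classmethod
    def from_config(cls, config: Mapping[str, Any]) -> "EventPolicy":
        # Missing keys fall back to "nothing disabled, no overrides".
        agent = config.get("monitoringAgent", {})
        return cls(
            disabled_events=frozenset(agent.get("disabledEvents", [])),
            thresholds=dict(agent.get("eventThresholds", {})),
        )

    def should_emit(self, event: str, observed: float, default_threshold: float) -> bool:
        # Assumption: a disabled event is never emitted, even if a
        # threshold override is also configured for it.
        if event in self.disabled_events:
            return False
        # Assumption: the per-event override replaces the built-in default,
        # and the event fires only when the observed value exceeds it.
        threshold = self.thresholds.get(event, default_threshold)
        return observed > threshold


if __name__ == "__main__":
    config = {"monitoringAgent": {"eventThresholds": {"LargeEnvironment": 2000}}}
    policy = EventPolicy.from_config(config)
    # A process with 1500 variables exceeds the ~1000 default but not the override.
    print(policy.should_emit("LargeEnvironment", observed=1500, default_threshold=1000))
```

With a check like this inside the agent, suppressed events would never be generated or published, so downstream consumers would not need their own exclusion rules.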
Are you currently working around this issue?:
Yes, filtering at every downstream consumer independently:
- Alloy log pipeline: |!= "LargeEnvironment"
- Event Exporter route exclusion
- Grafana dashboard query filters
Each new consumer must remember to add the same filter. The agent still generates and publishes these events regardless.
Additional context:
- EKS 1.33, eks-node-monitoring-agent v1.6.1
- Under 100 nodes, all with Istio sidecar injection
- The current monitoringAgent.additionalArgs only supports hostname-override and verbosity