Arbiter now running on redwood general interactive nodes
Date Posted: May 5th, 2020
The Fall 2018 CHPC Newsletter included an article on Arbiter, the service that monitors usage of the cluster login nodes, emails users when they make excessive use of these resources, and applies penalties for excessive usage (see details below). We have been running Arbiter on the general cluster login nodes for over a year, and this week we have taken it live on redwood1 and redwood2.
Major changes:
- Users are limited, via cgroups, to 4 cores and 8 GB of memory on both the redwood1 and redwood2 interactive nodes. Aggregate CPU usage can never exceed the cgroup core limit. When aggregate memory usage reaches the cgroup memory limit, however, the out-of-memory (OOM) killer will start to kill processes to reduce memory usage.
- Usage above the threshold (or trigger) levels of 1 core and 4 GB memory is tracked.
- The goal is to limit CPU usage to the equivalent of 15 core-minutes at or under 4 GB of memory usage.
- When these levels are reached, you will be put into a penalty condition, and your usage will be throttled to progressively lower levels.
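As a rough illustration of how the numbers above relate to one another, the following Python sketch labels a single usage sample against the cgroup limits and trigger levels. It is not Arbiter's code: the names `UsageSample` and `classify` are hypothetical, and the rule that exceeding either trigger level counts as tracked usage is an assumption made only for this example.

```python
from dataclasses import dataclass

# Values taken from the announcement above.
CGROUP_CORE_LIMIT = 4.0     # hard cap on aggregate CPU usage, in cores
CGROUP_MEM_LIMIT_GB = 8.0   # hard cap on aggregate memory usage, in GB
TRIGGER_CORES = 1.0         # CPU usage above this level is tracked
TRIGGER_MEM_GB = 4.0        # memory usage above this level is tracked


@dataclass
class UsageSample:
    """Hypothetical snapshot of one user's aggregate usage on a node."""
    cores: float    # CPU usage summed over all of the user's processes
    mem_gb: float   # memory usage summed over all of the user's processes


def classify(sample: UsageSample) -> str:
    """Return a rough label for one usage sample (illustrative only)."""
    # The cgroup never lets CPU usage exceed the core limit, so clamp it.
    cores = min(sample.cores, CGROUP_CORE_LIMIT)
    # Reaching the memory limit is what exposes processes to the OOM killer.
    if sample.mem_gb >= CGROUP_MEM_LIMIT_GB:
        return "at memory limit (OOM killer may act)"
    # Assumption: exceeding EITHER trigger level counts as tracked usage.
    if cores > TRIGGER_CORES or sample.mem_gb > TRIGGER_MEM_GB:
        return "tracked"
    return "below thresholds"


if __name__ == "__main__":
    for sample in (UsageSample(0.5, 1.0), UsageSample(2.0, 1.0), UsageSample(1.0, 8.0)):
        print(sample, "->", classify(sample))
```

For example, a single-core editor session using 1 GB of memory falls below both trigger levels, while a two-core compile job would count as tracked usage under the assumption stated above.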
For more detailed information, please see the CHPC General Login Node Policy:
https://www.chpc.utah.edu/documentation/policies/2.1GeneralHPCClusterPolicies.php#Pol2.1.1
If you receive any Arbiter violation messages and have questions, please send them to helpdesk@chpc.utah.edu.