CHPC DOWNTIME: Two targeted outages on February 17, 2021 starting at 8am

Posted January 28th, 2021
Updated February 17th, 2021

Update #2 - February 17th @11:41am

The second half of today's CHPC downtime has been completed.  

The Nvidia driver update on the notchpeak gpu compute nodes has been completed and the reservation has been removed. The nodes are now running jobs.

If you have any issues with these nodes, please send a report to helpdesk@chpc.utah.edu.  


Update #1 - February 17th @9:38am

The replacement of the infiniband switch serving the listed kingspeak interactive nodes has been completed. The nodes are back in service.  If you have any issues with these nodes, please send a report to helpdesk@chpc.utah.edu.  

Work is now starting on the notchpeak gpu compute nodes.


Original Announcement

On Wednesday, February 17, 2021, CHPC will have two targeted outages, described below.

The first outage is for select kingspeak interactive nodes. This outage will start at 8am and is expected to take 1-2 hours. During this downtime, CHPC will replace the infiniband switch whose problems have led to multiple outages of these nodes, the latest of which occurred on 1/15.

The kingspeak interactive nodes impacted by this outage are:
kingspeak[5-10,12,14-18,21-24] and elmo

The second outage is for notchpeak compute nodes with gpus. This downtime is needed to update the Nvidia drivers to support the new RTX30x0 gpus. The downtime will start at 9am and is expected to take several hours. A reservation is in place to drain the notchpeak gpu compute nodes of batch jobs before the start of the downtime. The non-gpu nodes will continue to run jobs as normal.

The impacted notchpeak compute nodes are:
notch[001-004,055,060,081-089,103,136,168-169,204,215,271,293-294]
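The node lists in this announcement use a compact bracket notation. For anyone who wants to check whether a particular node is included, here is a minimal Python sketch that expands that notation. The helper name expand_hostlist is hypothetical and not a CHPC tool, and it handles only the simple prefix[ranges] form shown here.

```python
import re

def expand_hostlist(expr):
    """Expand 'prefix[a-b,c]' into individual host names, keeping zero padding."""
    match = re.fullmatch(r"([A-Za-z_-]+)\[([0-9,\-]+)\]", expr)
    if match is None:
        return [expr]  # a plain host name such as "elmo"
    prefix, ranges = match.groups()
    hosts = []
    for part in ranges.split(","):
        start, _, end = part.partition("-")
        end = end or start          # a single number is a range of one
        width = len(start)          # preserve leading zeros, e.g. 001
        for n in range(int(start), int(end) + 1):
            hosts.append(f"{prefix}{n:0{width}d}")
    return hosts

# Node lists copied from this announcement.
notch_gpu = expand_hostlist(
    "notch[001-004,055,060,081-089,103,136,168-169,204,215,271,293-294]"
)
kingspeak = expand_hostlist("kingspeak[5-10,12,14-18,21-24]") + ["elmo"]

print("notch085" in notch_gpu)     # node inside the 081-089 range
print("kingspeak11" in kingspeak)  # node not in the list
```

On a system where Slurm is installed, the command scontrol show hostnames performs the same kind of expansion.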

Last Updated: 1/4/22