Storage Services in the Protected Environment
The Center for High Performance Computing (CHPC) offers four types of encrypted storage within the Protected Environment (PE) based on your project's needs: home directories, project space, scratch file systems, and an archive storage system.
See the Data Transfer Services page for information on moving data to and from the CHPC PE storage.
| Please remember that you should always keep additional copies of any critical data on independent storage systems. Storage systems built with data resiliency mechanisms (such as RAID and erasure coding, mentioned in the offerings listed below, or other similar technologies) tolerate multiple component failures, but they do not protect against large-scale hardware failures, software failures leading to corruption, or the accidental deletion or overwriting of data. Please take the necessary steps to protect your data to the level you deem necessary. |
Home Directories
The Center for High Performance Computing (CHPC) provides all people in the Protected Environment (PE) with a free 50 GB home directory. This space is backed up; for details on the backup schedule, see 3.1 File Storage Policies.
The CHPC does not offer larger home directories in the PE. Instead, users should make use of project spaces to store data.
The 50 GB cap on a person's home directory is enforced with a two-level quota. Exceeding the soft quota of 50 GB starts a grace period of at most seven days to bring the home directory back under 50 GB. Reaching the hard quota of 75 GB prevents any writes to the home directory until it is back under the 50 GB cap.
When over quota, you will not be able to start a FastX or Open OnDemand session, but an SSH session can be used to connect to the CHPC and clean up your home directory.
To find which files are taking up space in your home directory, run the command ncdu from your home directory.
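If a non-interactive listing is preferred over ncdu, a rough equivalent can be sketched with du and sort. This is only an illustration: the function name largest_entries is hypothetical, and the options assume GNU du as found on typical Linux systems.

```shell
#!/bin/bash
# Hypothetical helper: list the largest entries directly under a directory.
# Usage: largest_entries [directory] [count]
largest_entries() {
    local dir="${1:-$HOME}"
    local count="${2:-10}"
    # du -k reports sizes in KiB; -a includes files as well as directories;
    # -d 1 limits the report to one level below $dir (GNU du).
    # sort -rn puts the largest entries first; the first line is the
    # total for $dir itself.
    du -a -k -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example: show the 10 largest items in your home directory.
# largest_entries "$HOME" 10
```

Because this runs over SSH without a graphical session, it remains usable while the home directory is over quota.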
Project Storage
The Center for High Performance Computing (CHPC) offers project space, which is equivalent to group space in the General Environment, for groups needing to store project-specific research data that is sensitive in nature.
The CHPC offers a free 250 GB storage tier for all project storage within the PE. If your project requires more than 250 GB of project space, the CHPC can provide additional storage, sold by the TB at a rate of $150/TB. A single purchase of storage is good for 5 years.
If your project data requires automatic backups, the CHPC offers storage and automatic backups at a rate of $450/TB. A single purchase of storage and automatic backups is good for 5 years. For details on the current backup policy of the PE project space, see 3.1 File Storage Policies.
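To compare the two rates for a given purchase, the arithmetic can be sketched as below. The rates are the ones quoted above; the helper name project_storage_cost and the assumption that storage is bought in whole TB are illustrative only.

```shell
#!/bin/bash
# Rates quoted in the text above (USD per TB for one 5-year purchase).
RATE_STORAGE=150
RATE_STORAGE_WITH_BACKUP=450

# Hypothetical helper: cost in USD of a single purchase.
# Assumes a whole number of TB.
# Usage: project_storage_cost <terabytes> [backup]
project_storage_cost() {
    local tb="$1"
    local rate="$RATE_STORAGE"
    if [ "${2:-}" = "backup" ]; then
        rate="$RATE_STORAGE_WITH_BACKUP"
    fi
    echo $(( tb * rate ))
}

# project_storage_cost 4          # 4 TB without backups
# project_storage_cost 4 backup   # 4 TB with automatic backups
```

For example, 4 TB comes to $600 without backups and $1800 with automatic backups, in each case covering the 5-year term.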
Access to project space is restricted to people who are part of the project. Only the project PI, or their designated delegates, can add or remove persons from their CHPC-hosted projects.
| For IRB-governed projects, the persons given access must be listed as having access to data on the IRB record. |
Project space is only intended for storing data and data outputs, not for handling I/O from computational jobs. All computational jobs making use of data stored in project space should make use of the scratch space for the duration of the job. Methods for utilizing the scratch space are described here.
Purchasing Project Storage
If your project is already hosted in the CHPC PE, you can request additional storage (and backups) by filling out the storage request form in Portal. When submitting the request, please indicate the PE project this is in reference to.
If your project is new to the CHPC and does not yet have any project storage, please fill out a new project request form in Portal and, in that request, let us know what your storage requirements are.
Scratch File Systems
The Center for High Performance Computing (CHPC) provides a high-performance scratch space that is freely available to all persons with accounts in the CHPC Protected Environment (PE). There are two scratch file systems available:
- /scratch/general/pe-nfs1, a 280 TB NFS system accessible from all PE resources
  - There is a per-user quota of 100 TB on this scratch file system
- /scratch/general/pevast, a 100 TB flash-based file system available from all PE resources
  - There is a per-user quota of 10 TB on this scratch file system
| The scratch file systems are not backed up. |
Use the scratch space for the duration of all computational jobs. Transfer data from project space to scratch before running a job: the scratch systems are designed for better I/O performance, and this keeps project spaces from becoming overwhelmed.
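The stage-in, compute, stage-out pattern recommended here can be sketched as a small helper for a Slurm job script. Everything named below is an assumption for illustration: the helper run_in_scratch, the input/ and results/ subdirectory layout, the per-user scratch path, and the placeholder program my_analysis.

```shell
#!/bin/bash
#SBATCH --job-name=stage-example
#SBATCH --time=01:00:00
# Account and partition directives are site-specific and omitted here.

# Hypothetical helper: copy inputs to scratch, run a command there,
# then copy the results directory back to project space.
# Usage: run_in_scratch <project_dir> <scratch_dir> <command...>
run_in_scratch() {
    local project_dir="$1" scratch_dir="$2"
    shift 2
    mkdir -p "$scratch_dir/input" "$scratch_dir/results" || return 1
    # Stage input data from project space to scratch.
    cp -r "$project_dir/input/." "$scratch_dir/input/" || return 1
    # Run the workload with the scratch directory as working directory.
    ( cd "$scratch_dir" && "$@" ) || return 1
    # Scratch is not backed up, so copy the results back to project space.
    mkdir -p "$project_dir/results" || return 1
    cp -r "$scratch_dir/results/." "$project_dir/results/"
}

# Example call; the paths and my_analysis are placeholders.
# run_in_scratch /path/to/project \
#     "/scratch/general/pe-nfs1/$USER/$SLURM_JOB_ID" \
#     my_analysis input results
```

Keeping the copy-back step inside the job means results reach project space before the scratch copy can age out.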
| Scratch space is not intended for long-term file storage. Files in scratch spaces are deleted automatically after a period of inactivity. |
If you have questions about using the scratch file systems or IO-intensive jobs, please contact the CHPC at helpdesk@chpc.utah.edu.
Temporary File Systems
/scratch/local
Each node on the cluster has a local disk mounted at /scratch/local that can be used for storing intermediate files during calculation. Using /scratch/local is beneficial in some cases because I/O to a local disk can have lower latency.
Files on /scratch/local that are needed after job completion should be moved to a shared file system (home, project, scratch) before the end of the job.
Access permissions to /scratch/local have been set such that users cannot create directories in the top-level /scratch/local directory. Instead, as part of the Slurm job prolog (before the job is started), a job level directory, /scratch/local/$USER/$SLURM_JOB_ID, will be created. Only the job owner will have access to this directory. At the end of the job, in the Slurm job epilog, this job level directory will be removed.
All Slurm scripts that make use of /scratch/local must be adapted to accommodate this change. Additional updated information is provided on the CHPC Slurm page.
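A minimal sketch of how a job script might use the job-level directory follows. The directory path comes from the description above; the helper names job_local_dir and save_local_results, the destination path, and the program my_program are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the job-level directory created by the prolog.
job_local_dir() {
    echo "/scratch/local/${USER}/${SLURM_JOB_ID}"
}

# Hypothetical helper: copy files needed after the job from node-local
# disk to a shared file system before the epilog removes the directory.
# Usage: save_local_results <local_dir> <shared_dir>
save_local_results() {
    local local_dir="$1" shared_dir="$2"
    mkdir -p "$shared_dir" || return 1
    cp -r "$local_dir/." "$shared_dir/"
}

# Inside a Slurm job script (destination path is a placeholder):
# WORK="$(job_local_dir)"
# cd "$WORK" && my_program > output.log
# save_local_results "$WORK" "/path/to/project/results/$SLURM_JOB_ID"
```

The copy must happen inside the job itself, since the directory is removed in the epilog as soon as the job ends.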
/scratch/local is now software-encrypted. Each time a node is rebooted, the encryption is set up again, which purges everything stored in this space. A cron job also scrubs /scratch/local of content that has not been accessed for over 2 weeks. This scrub policy can be adjusted on a per-host basis: a group can ask us to disable it on a group-owned node, and it will not run on that host.
/tmp and /var/tmp
Linux provides temporary file systems at /tmp and /var/tmp. On CHPC cluster nodes, these are set up as a RAM disk with limited capacity. All interactive and compute nodes also have local spinning-disk storage at /scratch/local. If a program is known to need temporary storage, it is advantageous to set the environment variable TMPDIR to point to /scratch/local. Local disks range from 40 to 500 GB depending on the node, which is much more than the default /tmp size.
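Setting TMPDIR can be sketched as below. The helper name use_local_tmpdir is hypothetical, and pointing it at the job-level directory inside a Slurm job is an assumption based on the /scratch/local permissions described earlier.

```shell
#!/bin/bash
# Hypothetical helper: point TMPDIR at a directory on local disk if it
# is usable; otherwise leave TMPDIR unchanged and report the problem.
# Usage: use_local_tmpdir <directory>
use_local_tmpdir() {
    local dir="$1"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        export TMPDIR="$dir"
    else
        echo "warning: $dir is not writable; TMPDIR left unchanged" >&2
        return 1
    fi
}

# Inside a Slurm job, the job-level directory is a natural choice:
# use_local_tmpdir "/scratch/local/$USER/$SLURM_JOB_ID"
# mktemp    # honors TMPDIR, so the file is created on local disk
```

Programs that respect TMPDIR will then write their temporary files to local disk instead of the RAM-backed /tmp.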
Archive Storage
Elm
The CHPC offers an archive storage solution based on object storage, specifically Ceph, a distributed object store suite developed at UC Santa Cruz. With the current cluster configuration, the price is $150/TB for the 7-year lifetime of the hardware. In alignment with our current project space offering, we operate this space in a condominium-style model, reselling it in TB increments. A more detailed description of this storage offering is available.
One of the key features of the archive system is that users can manage the archive directly. Users can move data in and out of the archive storage as needed: they can archive milestone moments in their research, store an additional copy of crucial instrument data, and retrieve data as needed. Ceph presents the storage as an S3 endpoint, which allows the archive storage to be accessed via applications that use Amazon’s S3 API, such as s3cmd and rclone.
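As an illustration of S3-style access, the sketch below wraps rclone copy for archiving and retrieval. The remote name elm, the bucket and path names, and the helper functions are all hypothetical, and the endpoint and credentials are left as placeholders rather than real values.

```shell
#!/bin/bash
# Assumed rclone remote, defined in the rclone configuration file, e.g.:
#   [elm]
#   type = s3
#   provider = Ceph
#   endpoint = <S3 endpoint URL provided by the CHPC>
#   access_key_id = <your access key>
#   secret_access_key = <your secret key>

# RCLONE can be overridden (for example with "echo") to preview commands.
RCLONE="${RCLONE:-rclone}"

# Hypothetical helper: copy a local directory into a bucket on the remote.
# Usage: archive_put <local_dir> <remote> <bucket/path>
archive_put() {
    "$RCLONE" copy "$1" "$2:$3"
}

# Hypothetical helper: retrieve an archived directory.
# Usage: archive_get <remote> <bucket/path> <local_dir>
archive_get() {
    "$RCLONE" copy "$1:$2" "$3"
}

# archive_put /path/to/milestone elm my-bucket/milestones/run1
# archive_get elm my-bucket/milestones/run1 /path/to/restore
```

Setting RCLONE=echo before calling either helper prints the command that would run, which is a cheap way to check paths before moving data.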
This space is a standalone entity and is not mounted on other CHPC PE resources. Elm is currently the backend storage used for CHPC-provided automatic backups (e.g., backed-up project or home space); as such, groups looking for additional data resiliency that already have spaces backed up by the CHPC may want to look for other options.
User-Driven Backup Options
Campus-level options for a backup location include Box and Microsoft OneDrive.
| There is a UIT Knowledge Base article with information on the suitability of the campus level options for different types of data (public/sensitive/restricted). Please follow these university guidelines to determine a suitable location for your data. |
Owner backup to University of Utah Box: This is an option suitable for sensitive/restricted data. See the link above to get more information about the limitations. If using rclone, the credentials expire and have to be reset periodically.
Owner backup to University of Utah Microsoft OneDrive: As with Box, this option is suitable for sensitive/restricted data. See the link above to get more information about the limitations.
Owner backup to CHPC archive storage (Elm in the Protected Environment): This choice, described in the archive storage section above, requires that the group purchase the needed space on the CHPC's archive storage.
Owner backup to other storage external to CHPC: Some groups have access to other storage resources, external to the CHPC, whether at the University of Utah or at other sites. The tools that can be used for doing this are dependent on the nature of the target storage. It is the researcher's responsibility to ensure data is stored in a location appropriate for the type of data being stored.
A number of tools, described on our Data Transfer Services page, can be used to transfer data for backup. The tool best suited for transfers to object storage is rclone. Other tools include fpsync, a parallel version of rsync suited for transfers between typical Linux "POSIX-like" file systems, and Globus, best suited for transfers to and from resources outside of the CHPC.
If you are considering a user-driven backup option for your data, CHPC staff are available for consultation at helpdesk@chpc.utah.edu.
Additional Information
For more information on CHPC data policies, visit the File Storage Policies page.