Usage rights on the cluster are currently driven by fairshare allocations. These allocations are based on the number of computer cores paid for with a purchase or service fee option. Job scheduler policies help ensure that all researchers then receive their fair use of the cluster.
Finally, for optimal cluster operating efficiency, disk space must be managed carefully and efficiently. Please familiarize yourself with the ACCRE disk quota and backup policies. In addition to cluster storage and backup, ACCRE offers remote storage services to Vanderbilt departments and laboratories regardless of whether they also use the cluster.
Member and Guest Usage
ACCRE staff work with research groups to determine the best way for them to contribute to ACCRE and participate in this shared resource. Research groups have two options for being contributing members:
- Option 1: Pay a service fee for use of University purchased hardware that includes hardware and support costs. Service fees are paid monthly.
- Option 2: Buy hardware to contribute to the cluster by paying for the full cost of the hardware up front and then paying support costs. Support costs may be paid monthly.
Your contribution level, or fairshare, establishes your priority for how much of the cluster you can effectively use at any given time. Larger account fairshares result in more processing time and sometimes in shorter average job wait times in the idle queue.
Commitments for either Option 1 or 2 are made per fiscal year.
Please contact ACCRE Administration to determine the arrangement which best meets the needs of your research group. You may also begin with a guest account, and in fact, most users begin their affiliation with ACCRE as guest users.
Guest accounts are available for two purposes: to allow users to “try out” the environment and to encourage use by research groups at Vanderbilt that have not previously used the ACCRE cluster. Therefore, guest access is granted for the lesser of 240 compute core days (or 5,760 processing hours) or six months' duration per research group.
Guest accounts for staff, post-docs, students, and visiting scholars must be approved and sponsored by a faculty member. Before applying for an account, please verify which faculty member will be willing to sponsor your account. We will contact this faculty member to accept sponsorship of this account and complete the VAMPIRE Cluster Disclosure. Please note that groups with ITAR/EAR data cannot open guest accounts. We will also inform the sponsor they will be added to the ACCRE Forum mailing list (as will you when your account is opened) so they will also receive major news pertinent to their research group’s use of the ACCRE facility. On-campus guest users should also meet with ACCRE Staff before the account can be opened.
After all of the above steps are completed and ACCRE has completed their review of the request, we will notify the requestor if their guest account request has been approved.
These accounts have a small fairshare allocation which is shared by all guest account users, with no guarantees of actual compute time or priority in the job execution queue. The maximum number of processors each guest user can use at the same time is set to 20. (If you need to run a parallel job requiring more than 20 processors, please contact ACCRE Staff.) Most users, however, find this a highly effective way to begin using the cluster. Guest usage is reviewed on a quarterly basis.
Current Contact Information Required of all Users
Anyone with an ACCRE cluster account must notify ACCRE via a helpdesk ticket when there are changes to their contact information. Since helpdesk usage requires an email address, if your email address is changing please send ACCRE your new email address in advance of the old email address being discontinued. Users without a correct email address on file with ACCRE may have their accounts suspended or closed.
The job scheduler software determines which processors to send each job to and when. It monitors the entire job queue, prioritizing waiting jobs based on requested versus available resources and current usage versus fairshare. Assignments of fairshare allocations for the cluster are made at the level of an ACCRE account. Multiple groups may exist under an umbrella ACCRE account; users are subsequently assigned to a group. Obtaining account fairshare is explained in Purchasing and/or Service Use Fee to Obtain Cluster Fairshare.
The SLURM resource manager works to schedule the use of compute processors running batch/interactive jobs in the ACCRE cluster environment. Parameters and policy settings can be tuned to efficiently handle a wide range of system workloads (see Getting Started on the Cluster to learn more about submitting jobs to the queue). ACCRE has also developed its own SLURM documentation where you will find details for more advanced use cases.
SLURM Scheduler Limits and User Etiquette
- For groups with large fairshares that need to run large numbers of jobs, please submit jobs in batches of 3,000 jobs or fewer at a time. We highly encourage the use of job arrays for large batches of jobs, as this will improve the scheduling of the jobs as well as the overall responsiveness of the scheduler. Users in groups with large fairshares may have more than one batch of 3,000 jobs in the blocked queue, but only with advance approval. Users who submit large batches without advance approval may have their jobs deleted without advance notice and/or restrictions put on their accounts. To request approval, please submit a request via ACCRE’s helpdesk.
- Combine smaller jobs together into longer jobs when you need to run a large number of small jobs. It is better for the scheduler to submit 200 three-hour jobs than to submit 1,200 thirty-minute jobs, even though both combinations require the same total wall clock time. Likewise, it is much better to submit 2,000 three-hour jobs than to submit 12,000 thirty-minute jobs. Users who submit large batches of short jobs without advance approval may have their jobs deleted without advance notice and/or restrictions put on their accounts. If you have questions about how to combine your jobs, again, please submit a helpdesk ticket.
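The job-combining arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names `combine_tasks` and `batches_needed` are hypothetical, and the sketch assumes every task takes about the same time.

```python
import math

MAX_JOBS_PER_BATCH = 3000  # batch size limit stated in the etiquette list above


def combine_tasks(n_tasks, task_minutes, target_job_minutes):
    """Pack equal-length short tasks into longer jobs.

    Returns (tasks_per_job, n_jobs). Assumes all tasks take task_minutes.
    """
    tasks_per_job = max(1, target_job_minutes // task_minutes)
    n_jobs = math.ceil(n_tasks / tasks_per_job)
    return tasks_per_job, n_jobs


def batches_needed(n_jobs, batch_size=MAX_JOBS_PER_BATCH):
    """Number of separate submissions needed to stay within the batch limit."""
    return math.ceil(n_jobs / batch_size)


# 12,000 thirty-minute tasks packed into three-hour jobs
tasks_per_job, n_jobs = combine_tasks(12000, 30, 180)
print(tasks_per_job, n_jobs, batches_needed(n_jobs))  # 6 2000 1
```

Packing six thirty-minute tasks into each job leaves the total wall clock time unchanged while cutting the number of jobs the scheduler must track by a factor of six.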
Limits on the Number of Jobs in the Queue that are enforced by the scheduler
- The largest factor in determining limits on numbers of jobs is the Maximum Processor Second (MaxPS) for each account. The MaxPS is the number of processor core seconds for each account, based on the fairshare number times 2,592,000 seconds per month (one processor core running 24/7 for 30 days). This is best explained by an example. An account with a fairshare of five processor cores has a MaxPS of 12,960,000 seconds. This account could start fifteen 10-day one-processor-core jobs and use its entire MaxPS. Alternatively, it could start thirty 5-day one-processor-core jobs, or fifteen 5-day two-processor-core jobs, or any other combination adding up to the MaxPS for that account. The number of jobs allowed increases as the jobs get shorter. Once jobs are started, the maximum seconds in use decreases as the remaining time for the job to finish shortens. The MaxPS has several significant benefits:
- Jobs can run for up to 14 days on the cluster, so accounts are able to run jobs that use their entire fairshare.
- The MaxPS allows accounts to have burst usage onto processor cores that belong to the fairshare of other users as long as they are not having a long term negative impact on other users. The shorter the job length, the greater the burst.
- The MaxPS encourages all users to set reasonably accurate Job Wall Clock times. If a user requests 14 days when they only need a day, their usage will be limited by MaxPS. The scheduler cannot know if a job is to complete early, so it must schedule time based on the Job Wall Clock request. Users should also be aware that they should not set their Job Wall Clock too short since jobs are killed when they run out of Job Wall Clock time.
- The default maximum allowed number of processors in use at any one time is set for each group based on the group’s fairshare. Groups with higher fairshares have a higher maximum number of processors. Depending on the account fairshare, the MaxPS may limit a group to significantly fewer running cores. The Principal Investigator (PI) in charge of an individual account may also request upper limits on users in that account. New users will have lower job limits if they do not promptly attend the Introduction to the Cluster and Job Scheduler classes.
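The MaxPS example above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, it counts only freshly started identical jobs, and it ignores the way reserved seconds shrink as running jobs approach completion.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY  # 2,592,000, as stated above


def max_ps(fairshare_cores):
    """MaxPS for an account: fairshare cores times seconds in a 30-day month."""
    return fairshare_cores * SECONDS_PER_MONTH


def job_core_seconds(cores, wall_clock_days):
    """Core seconds a new job reserves, based on its requested wall clock time."""
    return cores * wall_clock_days * SECONDS_PER_DAY


def max_identical_jobs(fairshare_cores, cores_per_job, wall_clock_days):
    """How many identical new jobs fit within the account's MaxPS (whole days)."""
    return max_ps(fairshare_cores) // job_core_seconds(cores_per_job, wall_clock_days)


print(max_ps(5))                     # 12960000
print(max_identical_jobs(5, 1, 10))  # 15
print(max_identical_jobs(5, 1, 5))   # 30
print(max_identical_jobs(5, 2, 5))   # 15
```

The same account requesting one-day wall clock times could start 150 single-core jobs, which is why accurate, shorter wall clock requests allow a larger burst.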
Limits on the Length of Running Jobs
- The maximum allowed job length is 14 days (except when there are fewer than 14 days before a scheduled downtime).
- User jobs should be at least 30 minutes, though over an hour in length is preferable (exceptions will be made for a small number of test jobs). This minimum job length is required because for each and every job there is overhead time for job staging and tear down. This overhead is time that the processors remain idle and not utilized. Many very short jobs result in more wasted processing time (which results in wasted money).
Limits on the Memory Use of Running Jobs
- SLURM kills jobs that exceed their memory allocations by a significant amount. SLURM is currently configured to swap up to 10% of a job’s total memory allocation out to disk in the event that a job exceeds its memory allocation. This allows some leniency for running jobs; however, there is a performance penalty from reading/writing to disk, so it is advantageous for a job to stay under its memory allocation.
- Although you can request the maximum memory on any node, each node uses some memory to run the operating system. Learn more information about the nodes here.
- Learn how to monitor your memory usage as part of checking the status of a submitted job.
Bursting and Fairshare
Bursting and fairshare balancing are impacted by many factors including:
- Bursting is an intended feature of the cluster. Since users usually burst at different periods, users are allowed to burst or use substantially more than their fairshare to balance with periods when they are using less than their fairshares.
- If a group consistently runs substantially above its fairshare, it will be contacted to increase its fairshare. If the group chooses not to increase its fairshare, stricter limits on usage will be applied to balance its usage against its fairshare.
- Size of each account. For example, an account with 1 core of fairshare can easily burst to ten times fairshare level. However, an account that is 15% of the cluster can never burst to ten times its fairshare level. Nevertheless, in this case even a small percentage increase in bursting can represent a significant amount of additional usage.
- Multiple groups within an account. Each research group with fairshare is considered as an account for the scheduler. For some research groups, there is only one group within the account. For others, the account may be at a shared level, such as department fairshare, with multiple research groups within the account. In this situation, each research group may have limits below the account limits so that no group can use all the fairshare and therefore prevent other groups within the account from running jobs.
- Memory requirements for jobs. Currently, every 21 GB per core counts as one core against fairshare usage, rounding up if there is any memory remaining. For instance, if a user runs a 48 GB single core job, the job will count for three cores (48 GB divided by 21 GB, rounded up). Memory limits help prevent a user from running too many memory intensive jobs and memory starving the cluster.
- Total activity level of the cluster. The cluster has periods of different usage levels. When the cluster is in high demand, bursting limits are usually lower than during periods when the cluster demand is low.
- An account’s usage vs fairshare payment. Accounts that have high usage in comparison to their fairshare will have lower bursting levels.
- Usage patterns within an account. Some accounts can easily cover all of the group or individual user needs. Other accounts have more contention for resources. In the latter case, limits may be placed to allow all groups and users within an account to have reasonable use of the fairshare.
- Special requests. P.I.s may make requests for special bursting to meet research deadlines or special projects. When possible, we try to meet all reasonable requests made in advance. In some cases, groups may have limited bursting for a period of time while another group’s special request is met. Special requests by research groups are normally met with minimal impact on other research groups.
- Length of jobs. If users from an account run mostly long jobs (7-30 days), then the maximum processor second limit may reduce the number of jobs running simultaneously. If you are running for 14 to 30 days, you are no longer bursting but having steady usage at that level.
- Size of job. For groups that run multi-processor jobs, it is necessary to maintain a fairshare that is reasonable in light of their job size. Running jobs requesting 100 cores per job would not be reasonable for a group with a single core of fairshare.
- P.I. preferences. When P.I.s make special requests on how they want their fairshare split between groups and/or users, we work with them to achieve their requests.
- Wait time requirements. The cluster is in operation on a 24/7 basis. To utilize all computing time, jobs may run overnight or over the weekend. Groups that prefer their jobs to complete faster can increase their fairshare to primarily use Monday – Friday time.
- The cluster is dynamic. The mixture of running jobs is constantly changing so limits and settings must also change over time to meet researcher needs.
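The memory accounting rule in the list above (every 21 GB counts as one core, rounded up) can be sketched as follows. The function name `charged_cores` is hypothetical, and taking the larger of the requested cores and the memory-based count for multi-core jobs is an assumption; the policy text only gives a single-core example.

```python
import math

GB_PER_CORE = 21  # memory that counts as one core against fairshare, per the list above


def charged_cores(requested_cores, total_memory_gb):
    """Cores counted against fairshare for a job.

    Assumption: a job is charged the larger of its requested cores and
    its memory-based core count (total memory / 21 GB, rounded up).
    """
    memory_cores = math.ceil(total_memory_gb / GB_PER_CORE)
    return max(requested_cores, memory_cores)


print(charged_cores(1, 48))  # 3, matching the 48 GB single-core example
```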
For more information on the types of nodes available, see our detailed description of the cluster. Please be aware that restrictions may be placed at any time on a user account if jobs are causing any problem with the cluster hardware or are interfering with the jobs of other users. In such cases, we notify the user as soon as possible (although in extreme cases, we must sometimes kill problem jobs before we have made contact with the user). We then work with the user to monitor the progress of their jobs until they can be run normally on the system without causing problems. Many times this means restricting an account to one running job until each new job runs without encountering problems. We will then incrementally increase the number of jobs a user can have running simultaneously, only if their jobs cause no issues at that level. Eventually, we reset the account to its normal maximum. The main intent of our ACCRE Cluster Computing Classes is to educate users in order to help avoid such issues. Even if you are not a new user, if you have never attended our classes we invite you to do so.
Disk Quotas and Backups
Why We Impose Quotas
Disk usage policies on the ACCRE cluster establish individual account limits on both the amount of disk space used and the number of files stored on disk. Excessive numbers of files per unit of disk space have a negative impact on disk input/output procedures and significantly increase the tape backup load of files on home directories. For these reasons, users that need additional file quotas (without additional disk usage needs) must make special arrangements with ACCRE.
Automated enforcement of soft and hard limits has been established for both elements of disk usage. The following general definitions apply:
- Soft quota: Baseline value for general usage limits. Soft quotas may be exceeded for brief periods of time, after which write permissions are automatically suspended until the user deletes enough to bring the account back under quota. Further details below.
- Hard quota: Absolute maximum value. The system does not allow you to exceed this, i.e., you will receive errors and/or your jobs will die if you attempt further writes to disk.
To help keep the system running smoothly, you should be in the habit of checking your usage level. Our FAQ explains how a user may check current disk space usage and quotas.
A summary of the soft and hard limits on disk space and files is outlined below, followed by a more detailed presentation of those policies. Subsequently, options for purchasing additional disk space and file limits are presented. Disk space will be allocated for the home and data directories with the assumption that all of the space could theoretically be in use (i.e., home and data disk space will not be oversold). However, because scratch space use is (by definition) assumed to be sporadic and temporary in nature, we may allocate more hard quota scratch disk space than is actually available on the system (i.e., oversell hard quota scratch disk space). We will not oversell the soft quota allocations of scratch disk space. Because of the special nature of the scratch environment, we devote a specific section to detailing the management policies for that space below.
Cluster Tape Backups
The True incremental Backup System (TiBS) is used to back up data for these departments and labs. Currently an IBM TS3500 Tape Library is used to store backup data and is located in a different building than the ACCRE cluster for disaster protection. An important advantage of TiBS is that it minimizes the time and network resources required for backups, even full backups. After an initial full backup, TiBS performs incremental backups from the client. To create subsequent full backups, an incremental backup is taken from the client. Then, on the server side, all incrementals since the last full backup are merged into the previous full backup to create a new full backup. This takes the load off the client machine and network. The integrity of the previous full backup is also verified by this process. Backup data is stored in a fault-tolerant manner so the rare data integrity flaw can be corrected. TiBS support is available for all current operating systems. If you are interested, please see our Tape Backup Services page and contact ACCRE Administration for more details.
File Recovery from Backup
- ACCRE provides tape backup services for files stored on home and data. Files stored on scratch are not backed up. Backups are normally completed daily and restores are done upon request via Request Tracker. ACCRE does not guarantee that backups will occur daily and is not responsible for bad tapes.
- If demand is unusually high on a given day (resulting in the inability to complete the daily backup within 24 hours), the next day’s backup may not occur. However, prior incremental backups will continue to be available for restoration requests. This backup service is not designed for archival storage; rather, this service is for the recovery of user files lost due to corruption and/or accidental deletions.
- ACCRE offers restores of files on home for cluster users Monday through Friday, excluding Vanderbilt holidays, from 8:00 a.m. until 4:30 p.m. To have files restored, submit a request; requests are processed during these same hours. Unless notified otherwise, restores will occur by the end of the next business day. In some cases, a restore may be completed on the same business day.
- A user may receive one free one-hour restore in any given fiscal year. Additional restores in any given fiscal year cost $60 per one-hour restore.
Summary of Disk Storage and Backup Policies
|Quota Type|Soft|Grace Period|Hard|Backup|
|---|---|---|---|---|
|Disk storage space in /home|15 GB|7 days|20 GB|nightly beginning at 11PM|
|Number of files in /home|200,000|7 days|300,000|nightly beginning at 11PM|
|Disk storage space in /scratch|50 GB|14 days|200 GB|not backed up|
|Number of files in /scratch|200,000|14 days|1,000,000|not backed up|
Home (/home) Directory Disk Storage
Files in home directories are backed up nightly (beginning at 11pm). Three full backups are retained. Therefore, restorations of files are available for at least two months prior to the current date.
The following disk space usage limits are implemented by default:
- Soft quota per individual account: 15 GB
- Hard quota per individual account: 20 GB
The soft quota may be exceeded for up to 7 days. After 7 days, the current disk space used becomes the hard limit. That is, you will only be able to read and delete files on your account until the total amount of disk used falls below the soft quota for the account.
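The soft quota, grace period, and hard quota interact as described above; a minimal sketch of that logic for /home disk space follows. The function name `can_write` and the parameter `days_over_soft` (days since the soft quota was first exceeded without usage dropping back below it) are hypothetical, and the sketch does not reflect how the file system actually implements enforcement.

```python
HOME_SOFT_GB = 15     # default soft quota for /home
HOME_HARD_GB = 20     # default hard quota for /home
HOME_GRACE_DAYS = 7   # how long the soft quota may be exceeded


def can_write(used_gb, days_over_soft,
              soft_gb=HOME_SOFT_GB, hard_gb=HOME_HARD_GB,
              grace_days=HOME_GRACE_DAYS):
    """Return True if new writes would be accepted under the quota policy."""
    if used_gb >= hard_gb:
        # The hard quota can never be exceeded.
        return False
    if used_gb >= soft_gb and days_over_soft > grace_days:
        # Grace period expired: read and delete only until usage
        # falls below the soft quota.
        return False
    return True


print(can_write(17, 3))  # True: over soft quota but within the grace period
print(can_write(17, 8))  # False: grace period has expired
```

The same logic applies to the file-count quotas and to /scratch, with the limits and the 14-day grace period taken from the summary table.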
Please be aware that excessive numbers of files per unit of disk space have a negative impact on disk read/write procedures and significantly increase the tape backup load of files on home directories. The following limits on the maximum number of files in home directories are implemented by default:
- Soft quota per individual account: 200,000 files
- Hard quota per individual account: 300,000 files
The soft quota may be exceeded for up to 7 days. After 7 days, the current number of files becomes the hard limit. That is, you will only be able to read and delete files on your account until the total number of files in your home directories falls below the soft quota for the account.
One other important detail about quota is data replication. ACCRE currently has data replication set to two for /home and /data. This means that the disk usage of a file stored in /home will be approximately twice that of a file outside the cluster.
Scratch (/scratch) Disk Storage
Files on ACCRE scratch disk space are NOT backed up.
The following scratch disk space usage limits are implemented by default:
- Soft quota across all scratch directories owned by your account: 50 GB
- Hard quota across all scratch directories owned by your account: 200 GB
The soft quota may be exceeded for up to 14 days. After 14 days, the currently used scratch disk space becomes the hard limit. That is, you will only be able to read and delete files on your scratch disk space until the total amount you are using falls below the soft quota limit for that space.
As noted above, excessive numbers of files per unit of disk space have a negative impact on disk read/write procedures and significantly increase the tape backup load of files on home directories. The following limits on the maximum number of files in scratch directories are implemented by default:
- Soft quota across all scratch directories owned by your account: 200,000 files
- Hard quota across all scratch directories owned by your account: 1,000,000 files
The soft quota may be exceeded for up to 14 days. After 14 days, the total number of files currently in your scratch space becomes the hard limit. That is, you will only be able to read and delete files from that space until the total number of files falls below the soft quota limit for that space.
Options for Increasing Disk Space and File Limits
Options for increasing disk quotas and file limits, as well as their associated costs, are described below. Requests for increases in file limits are evaluated by ACCRE personnel on a case-by-case basis. Requests for increases in disk space are based on availability. Upon approval and receipt of funds, increases will be in effect for up to the end of the current fiscal year. In addition, groups with large fairshare may contact ACCRE to determine if they qualify for larger disk quotas. Please initiate your request via Request Tracker.
Data (/data) Directory Disk Storage
Users who need large allocations of backed-up space may pay for a quota on the data directory. Quota is sold in 1 TB increments; see our pricing page for details. Users pay for the hard quota allocation, and the soft quota is set at 5% less than the hard quota.
Files on data directories are backed up nightly (beginning at 11pm). Two full backups are retained. Therefore, restorations of files are available for one month prior to the current date.
The soft quota may be exceeded for up to 7 days. After 7 days, the current disk space used becomes the hard limit. That is, you will only be able to read and delete files on all accounts in your group until the total amount of disk used falls below the soft quota for the group.
Scratch (/scratch) Directory Disk Storage
Users who need large allocations of scratch disk space may pay for quota in 1 TB increments; see our pricing page for details. Users pay for the hard quota allocation, and the soft quota is set at 10% less than the hard quota.
The soft quota may be exceeded for up to 14 days. After 14 days, the current disk space used becomes the hard limit. That is, you will only be able to read and delete files in all accounts in your group until the total amount of disk usage falls below the soft quota for the group.
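The relationship between a purchased hard quota and its soft quota (5% less for /data, 10% less for /scratch) can be sketched as below. The function name `soft_quota_tb` and the `space` labels are hypothetical, chosen only for illustration.

```python
# Percentage by which the soft quota sits below the purchased hard quota
SOFT_REDUCTION_PERCENT = {"data": 5, "scratch": 10}


def soft_quota_tb(hard_tb, space):
    """Soft quota in TB for a purchased hard quota on 'data' or 'scratch'."""
    return hard_tb * (100 - SOFT_REDUCTION_PERCENT[space]) / 100


print(soft_quota_tb(1, "data"))     # 0.95
print(soft_quota_tb(4, "scratch"))  # 3.6
```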
Software Installation Policies
- Commitment to research involves a commitment to keep software up to date by both end users and ACCRE staff.
- End users should become reasonably familiar with the software they use. If it is open source, end users should know how to compile, install and maintain their software.
- ACCRE staff are responsible for ensuring that failure in software installed either by end users or ACCRE staff is not caused by the normal operation of the cluster. Prima facie evidence that the cluster is not the problem is exit_status=0 from the scheduler. There may be other indicators. All of these indicators should be in a FAQ entry.
- ACCRE staff will develop a FAQ entry for the most common software installs to include R, Perl, Python and Ruby.
- End users needing to compile and install software must make a good faith effort to install it locally themselves.
- End users needing to compile and install software should be required to meet with ACCRE staff. Moreover, ACCRE staff may ask end users to come to ACCRE offices where the staff may assist end users at particular problem points in the download, compile and install process.
- PIs may be asked to pay for ACCRE staff time and support for software compiled and installed for a single user or group.
- If it is recognized that some set of software is, or will be, widely used by the research community, the ACCRE staff should be responsible for installing and maintaining that set.