This feature was launched in AWS ParallelCluster 3.2.0 and can be turned on with the Slurm Memory Based Scheduling Enabled toggle in the HeadNode configuration screen. See Slurm memory-based scheduling in the AWS ParallelCluster docs for more info.
Slurm supports memory-based scheduling via the --mem or --mem-per-cpu flags provided at job submission time. This allows scheduling of jobs with high memory requirements and lets users guarantee a set amount of memory per job or per process.
For example, users can run:
sbatch --mem-per-cpu=64G -n 8 ...
to request 8 vCPUs with 64 GB of memory per vCPU (512 GB in total).
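For context, the two flags differ in scope: --mem requests memory per node, while --mem-per-cpu requests memory per allocated CPU. A quick sketch of the difference (job.sh is just a placeholder script name):
$ sbatch --mem=64G -n 8 job.sh        # 64 GB per node for the whole job
$ sbatch --mem-per-cpu=8G -n 8 job.sh # 8 GB per CPU, 64 GB total across 8 CPUs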
To add this memory information, we have a managed post-install script that can be set up with Pcluster Manager. This script sets the RealMemory of each compute node to 85% of the available system memory, leaving 15% for system processes.
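The calculation itself is simple. Here is a minimal sketch of the idea (this is not the contents of the managed mem.sh script, just an illustration):
#!/bin/bash
# Sketch only: compute 85% of this node's total memory in MiB,
# the units Slurm expects for RealMemory.
total_kib=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
real_memory_mib=$(( total_kib * 85 / 100 / 1024 ))
echo "RealMemory=${real_memory_mib}"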
When setting up a cluster with version >= 3.2.0, simply toggle Slurm Memory Based Scheduling Enabled to on:
Optionally you can set the specific amount of memory that Slurm configures on each node; however, I don’t recommend doing this as it may result in jobs over-allocating memory.
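If you manage the cluster YAML directly rather than through the Pcluster Manager screens, the toggle and the optional per-node value correspond to the EnableMemoryBasedScheduling and SchedulableMemory settings. The snippet below is a sketch reusing the queue from the example config further down; the SchedulableMemory value (in MiB) is only illustrative:
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    EnableMemoryBasedScheduling: true
  SlurmQueues:
    - Name: cpu
      ComputeResources:
        - Name: cpu-hpc6a48xlarge
          InstanceType: hpc6a.48xlarge
          MinCount: 0
          MaxCount: 100
          SchedulableMemory: 334233 # optional, in MiB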
To enable this in versions < 3.2.0, create a new cluster, and in the HeadNode configuration screen click on the “Advanced” dropdown and add the managed Memory script:
Then add the following managed IAM policy to the head node:
arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess
On the last screen your config should look similar to the following. Note that at a minimum you’ll need the AmazonEC2ReadOnlyAccess policy and the https://raw.githubusercontent.com/aws-samples/pcluster-manager/main/resources/scripts/mem.sh script.
HeadNode:
  InstanceType: c5a.xlarge
  Ssh:
    KeyName: keypair
  Networking:
    SubnetId: subnet-123456789
  Iam:
    AdditionalIamPolicies:
      - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
      - Policy: arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess
  CustomActions:
    OnNodeConfigured:
      Script: >-
        https://raw.githubusercontent.com/aws-samples/pcluster-manager/main/resources/scripts/multi-runner.py
      Args:
        - >-
          https://raw.githubusercontent.com/aws-samples/pcluster-manager/main/resources/scripts/mem.sh
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: cpu
      ComputeResources:
        - Name: cpu-hpc6a48xlarge
          MinCount: 0
          MaxCount: 100
          Instances:
            - InstanceType: hpc6a.48xlarge
          Efa:
            Enabled: true
      Networking:
        SubnetIds:
          - subnet-123456789
        PlacementGroup:
          Enabled: true
Region: us-east-2
Image:
  Os: alinux2
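If you save a config like the one above to a file, you can also create the cluster from the ParallelCluster CLI instead of the Pcluster Manager UI (the cluster and file names here are placeholders):
$ pcluster create-cluster --cluster-name memory-demo --cluster-configuration config.yaml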
Once the cluster has been created, you can check the memory settings for each instance:
$ scontrol show nodes | grep RealMemory
NodeName=cpu-dy-cpu-hpc6a48xlarge-1 CoresPerSocket=1
...
RealMemory=334233 AllocMem=0 FreeMem=N/A Sockets=96 Boards=1
...
You’ll see that for the hpc6a.48xlarge instance, which has 384 GiB of memory, RealMemory=334233. RealMemory is expressed in MiB, so this corresponds to 384 GiB * 1024 * .85 = 334233.6 MiB, or 85% of the instance’s memory.
To schedule a job with memory constraints you can use the --mem flag. See the Slurm sbatch docs for more info.
$ salloc --mem 8GB
You can see the requested memory for that job by running:
squeue -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %.5m %.5c %R"
JOBID PARTITION NAME USER ST TIME NODES MIN_M MIN_C NODELIST(REASON)
3 cpu interact ec2-user R 12:25 1 8G 1 cpu-dy-cpu-hpc6a48xlarge-1
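For batch jobs, the same memory constraint goes in the submission script itself. The script below is only a sketch; the job name and workload are placeholders:
#!/bin/bash
# Request 2 nodes with 16 GB of memory on each; Slurm will only place the
# job on nodes whose RealMemory can satisfy the request.
#SBATCH --job-name=mem-demo
#SBATCH --nodes=2
#SBATCH --mem=16GB
srun hostname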