-
Notifications
You must be signed in to change notification settings - Fork 37
Updates to use amazon linux 2023 ami and reflect the actual RealMemory of the compute nodes #34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: plugin-v2
Are you sure you want to change the base?
Conversation
cd /home/ec2-user/slurm-* | ||
/home/ec2-user/slurm-*/configure --prefix=/nfs/slurm | ||
make -j 4 | ||
tar -xf slurm-*.tar.bz2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if SlurmPackageUrl is not .tar.bz2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bollig - if its changes the script will break. Its an upstream packaging decision that devs normally doesn't change in a whim. The archive extension appears to be consistently using .tar.bz2. https://download.schedmd.com/slurm/.
Now we can try to handle some of them but will not be full proof.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something ugly like this
wget -q ${SlurmPackageUrl}
# Extract based on file extension
if ls slurm-*.tar.gz >/dev/null 2>&1; then
tar -xzf slurm-*.tar.gz
elif ls slurm-*.tar.bz2 >/dev/null 2>&1; then
tar -xjf slurm-*.tar.bz2
elif ls slurm-*.tgz >/dev/null 2>&1; then
tar -xzf slurm-*.tgz
elif ls slurm-*.tar >/dev/null 2>&1; then
tar -xf slurm-*.tar
else
echo "No recognized Slurm archive found"
exit 1
fi
# Change to the extracted directory, excluding any archive files
cd "$(ls -d /home/ec2-user/slurm-* | grep -v -E '\.tar\.gz$|\.tar\.bz2$|\.tgz$|\.tar$')"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair. this is ok as is, just thinking about people who may roll/distribute their own patch-fixed version of slurm.
@@ -47,6 +47,12 @@ Parameters: | |||
Default: 2 | |||
Description: Number of vCPUs for the compute node instance type | |||
|
|||
ComputeNodeMemory: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why statically define this? Why not auto-detect RealMemory and have users provide a SchedulableMemory percentage (see ParallelCluster for example)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SchedulableMemory
'll be something that needs further research and for my next PR. :)
IFAIK, SchedulableMemory
is a Slurm configuration that's shipped with AWS ParallelCluster. And we are not dealing with parallelcluster here. As for the auto-detect for RealMemory
; we need an ec2 describe call for the instance specified. This will also need more broader change which is beyond the scope of this PR.
Issue #, if available:
Update the quickstart
template.yaml
to use Amazon Linux 2023 and include realmemory in thepartition.json
Description of changes:
yum
todnf
RealMemory
in thepartitions.json
to reflect the available memory of the compute nodes. If not specified, slurm failed to fetch the actual available memory.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.