I like my job, so no screenshots. Sorry.

Notes:

  • sbatch is the Slurm command for submitting batch jobs to high-performance compute nodes
  • the huge-n128-512g node has 128 cores and 512 GiB of memory
  • this took place at a medical research nonprofit

User: Hello everyone, this is the first time I’m using GCP. I’m trying to run a job, but it keeps failing. These are the sbatch headers I’m using:

#SBATCH --partition=huge-n128-512g
#SBATCH --nodes=8
#SBATCH [email protected]
#SBATCH --mail-type=FAIL
#SBATCH --mem-per-cpu=32G

IT: Please make sure you actually need that node; each one costs $4,500/month to use. Can you describe the job you’re trying to run?

User: I’m doing high-depth genetic sequencing using 3 GB BAM files.

(additional note: there’s usually only one BAM file per chromosome, so 69 GB total. Nice.)

IT: Those BAM files are pretty small. I’d recommend starting with the med-n16-64g node and moving up if needed. We’re only billed for run time, so if the jobs take the same amount of time, it would be about 13% of the cost.
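For what it’s worth, a right-sized submission along the lines of IT’s suggestion might look something like this (the partition name and mail settings are taken from the thread; the node count and per-CPU memory are illustrative guesses chosen to fit inside a 64 GiB node, not a tested config):

```shell
#SBATCH --partition=med-n16-64g
#SBATCH --nodes=1
#SBATCH [email protected]
#SBATCH --mail-type=FAIL
#SBATCH --mem-per-cpu=3G    # 16 cores x 3 GiB = 48 GiB, within the node's 64 GiB
```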

The astute among you will notice that 8 nodes × 128 cores × 32 GiB of memory per core comes out to 32 TiB in total. The job was failing because --mem-per-cpu requested 4 TiB per node (128 cores × 32 GiB), far more than each node’s 512 GiB. Even without that flag, the swarm would have claimed all 8 nodes’ memory: 4 TiB. Holy overallocation, Batman!
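Running the numbers with shell arithmetic, using only the figures from the thread:

```shell
# Total memory requested across the swarm: 8 nodes x 128 cores x 32 GiB per core
echo $(( 8 * 128 * 32 )) GiB    # 32768 GiB = 32 TiB
# Requested per node vs what's installed: 128 cores x 32 GiB per core
echo $(( 128 * 32 )) GiB        # 4096 GiB requested, but each node has only 512 GiB
# Even without --mem-per-cpu, 8 whole nodes x 512 GiB each still claim
echo $(( 8 * 512 )) GiB         # 4096 GiB = 4 TiB
```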