From 2995c3c9b3cffcc47a0b8c5638173b1490ef0f19 Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Thu, 26 Mar 2026 11:24:56 -0400
Subject: [PATCH 1/2] preemption 1hr -> 30min

---
 docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md b/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
index 8c9a47aa17..eab8d9b13a 100644
--- a/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
+++ b/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
@@ -38,7 +38,7 @@ Jobs with low GPU utilization will be automatically canceled. The exact threshol
 On Torch, users may run "preemptible" jobs on stakeholder resources that their group does not own. This allows the stakeholder resources to be utilized by non-stakeholders which may otherwise be idle. To make the best use of these resources, you are encouraged to adopt checkpoint/restart to allow for resumption of the workload in subsequent jobs.
 
 :::warning Preemption Policy
-Jobs become eligible for preemption after 1 hour of runtime. Jobs will not be canceled within the first hour.
+Jobs become eligible for preemption after 1 hour of runtime. Jobs will not be canceled within the first 30 minutes.
 :::
 
 The preemption order is:

From 156e6453f2bb3ceb786800606ac2ca3a8173e1a7 Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Thu, 26 Mar 2026 11:25:43 -0400
Subject: [PATCH 2/2] preemption 1hr -> 30min

---
 docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md b/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
index eab8d9b13a..29a0cfc546 100644
--- a/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
+++ b/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
@@ -38,7 +38,7 @@ Jobs with low GPU utilization will be automatically canceled. The exact threshol
 On Torch, users may run "preemptible" jobs on stakeholder resources that their group does not own. This allows the stakeholder resources to be utilized by non-stakeholders which may otherwise be idle. To make the best use of these resources, you are encouraged to adopt checkpoint/restart to allow for resumption of the workload in subsequent jobs.
 
 :::warning Preemption Policy
-Jobs become eligible for preemption after 1 hour of runtime. Jobs will not be canceled within the first 30 minutes.
+Jobs become eligible for preemption after 30 minutes of runtime. Jobs will not be canceled within the first 30 minutes.
 :::
 
 The preemption order is: