
GPUH Cluster

This server is behind the campus firewall, so it is not directly accessible from off-campus; you will need to ssh into a jumphost first (e.g. alpine.cse.unr.edu).

ssh $CSE-ID@h1.cse.unr.edu
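If your SSH client is OpenSSH 7.3 or newer, the -J (ProxyJump) flag makes the hop in a single command; a minimal sketch, assuming your CSE credentials work on both hosts:

#Jump through alpine, then land on the cluster head node
ssh -J $CSE-ID@alpine.cse.unr.edu $CSE-ID@h1.cse.unr.edu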

If you are unable to run jobs across multiple nodes following the instructions below, please email ehelp@cse.unr.edu.

Libraries

OpenMPI > /opt/openmpi
Compiled with SLURM PMI and CUDA

CUDA > /usr/local/cuda

To check whether a particular package is installed on the nodes, query dpkg:

dpkg -l | grep $WHATEVER_YOU_ARE_LOOKING_FOR
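The OpenMPI and CUDA prefixes above are not necessarily on your default search paths; a minimal sketch of adding them in ~/.bashrc, assuming the usual bin/ and lib/ (lib64/ for CUDA) layout under those prefixes:

#Add OpenMPI and CUDA to your environment (e.g. in ~/.bashrc)
export PATH=/opt/openmpi/bin:/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/lib:/usr/local/cuda/lib64:$LD_LIBRARY_PATH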

Compiling SLURM Jobs

#!/bin/bash

#We store some example code from Lawrence Livermore National Lab in
#/opt/llnl/tutorials/mpi

#Copy it to your home directory
cp -r /opt/llnl/tutorials/mpi/samples/C ~/mpi

cd ~/mpi

#Compile an example
mpicc -lpmi -o mpi_hello mpi_hello.c

#Run the example
srun -n16 mpi_hello
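You can also verify that the binary linked against the cluster's OpenMPI in /opt/openmpi:

#Optional: list the MPI libraries the binary resolved
ldd ~/mpi/mpi_hello | grep -i mpi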

Output

$ srun -n16 mpi_hello

Running Tasks

SRUN

https://slurm.schedmd.com/srun.html

srun is synchronous and blocking. Use sbatch to submit a job to the queue.

#-n indicates the number of tasks to launch
#--mem indicates the memory required per node in megabytes
#--time indicates the wall-clock time limit of the job
$ srun -n16 --mem=2048 --time=00:05:00 ~/mpi/mpi_hello
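The nodes also carry GPUs (see Node Hardware below). A sketch of requesting one through SLURM's generic resources, assuming a gres/gpu resource is configured on this cluster:

#Request one task with one GPU and print the allocated device
srun -n1 --gres=gpu:1 --time=00:05:00 nvidia-smi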

SBATCH

https://slurm.schedmd.com/sbatch.html

$ cat ~/mpi/run.sh

#!/bin/bash
#SBATCH -n 16
#SBATCH --mem=2048M
#SBATCH --time=00:30:00
#SBATCH --mail-user=YOUR_EMAIL@DOMAIN.COM
#SBATCH --mail-type=ALL

srun ~/mpi/mpi_hello

Submit the job:

$ sbatch ~/mpi/run.sh 
Submitted batch job 536

$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               536      main   run.sh cse-admi  R       0:03      2 node[01-02]
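By default, sbatch writes the job's stdout to slurm-<jobid>.out in the submission directory, and a pending or running job can be cancelled by ID:

#Inspect the job's output, or cancel it if needed
cat slurm-536.out
scancel 536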

Check the Cluster status:

$ sinfo
PARTITION  AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*         up   infinite      2  alloc node[01-02]
main*         up   infinite      2   idle head,node03

Node Hardware:

The cluster consists of 4 nodes, each with 64 GB of RAM, two 10-core CPUs, four NVIDIA GTX 1080 GPUs, and a Mellanox Technologies MT27520 Family [ConnectX-3 Pro] network adapter.
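SLURM's own view of a node should match these specs; scontrol reports CPUs, real memory, and configured generic resources per node:

#Show SLURM's record of a node's hardware (node01 as an example)
scontrol show node node01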

HOWTO: Set up SLURM on your personal computer

https://source2.cse.unr.edu/w/cse/tutorials/slurm-mpi-setup/
