Running on Ranger



Overview

This page describes some of the steps needed to check out PHASTA and run it on Ranger.

Setting your account for BASH

  1. Log in to Ranger via SSH: 'ssh username@tg-login.ranger.tacc.teragrid.org'.
  2. To change your login shell to bash (for future sessions), use 'chsh -s /bin/bash'. To switch to bash in the current terminal, use '/bin/bash'.
  3. Copy ranger_scripts.tar into your home directory. This file contains the following files:
    • bash_profile
    • bashrc
    • svn_phasta.sh
    • make_phasta.sh
    • make_phasta_dbg.sh
    • pgi_mvapich-devel_evn.sh
    • ranger_phSolverIC_pgi_mvapich-devel.csh
    • ranger_phSolverIC_pgi_mvapich-devel_rerun.csh
  4. Decompress ranger_scripts.tar.
  5. Copy bashrc from the ranger_scripts directory into your home directory and rename it to .bashrc.
  6. Copy bash_profile from the ranger_scripts directory into your home directory and rename it to .profile_user (possibly .bash_profile instead; this has not been confirmed).
  7. Log out of the system and log in again. Your account should now use the bash shell at login.
     Instead of logging out and logging in again, you can source the bash files.
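
Steps 4 to 6 can be sketched as a small shell function. This is a minimal sketch and not part of ranger_scripts.tar: the function name install_ranger_dotfiles and its arguments are hypothetical, and it assumes the tarball unpacks into a directory named ranger_scripts, as steps 5 and 6 suggest.

```shell
#!/bin/bash
# Hypothetical helper (not part of ranger_scripts.tar).
# Usage: install_ranger_dotfiles <path-to-ranger_scripts.tar> [target-dir]
# Assumption: the tarball unpacks into a directory named ranger_scripts.
install_ranger_dotfiles() {
    local tarball="$1"
    local target="${2:-$HOME}"

    # Step 4: decompress the tarball inside the target directory.
    tar -xf "$tarball" -C "$target" || return 1

    # Step 5: bashrc becomes .bashrc
    cp "$target/ranger_scripts/bashrc" "$target/.bashrc" || return 1

    # Step 6: bash_profile becomes .profile_user
    cp "$target/ranger_scripts/bash_profile" "$target/.profile_user" || return 1
}
```

Called as 'install_ranger_dotfiles ~/ranger_scripts.tar', it writes into your home directory and overwrites any existing .bashrc and .profile_user, so back those up first if you have customized them.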

Installing PHASTA

  1. Create directory dev_pgi_mvapich-devel.
  2. Move to dev_pgi_mvapich-devel.
  3. Copy pgi_mvapich-devel_evn.sh from ranger_scripts and run it. This script will create three files: mpicc, mpicxx, and mpif90.
  4. Copy svn_phasta.sh from ranger_scripts and run it. You will need to enter your gforge username and password. This script will check out PHASTA from the top of the trunk.
  5. If you are running the incompressible code, you will need to place the LIBLES directory, which contains the ACUSIM libraries, in this directory before compiling PHASTA. If you are part of the PHASTA group, contact Onkar Sahni or Victor Marrero; otherwise, contact Prof. Kenneth Jansen. If you are running the compressible code, continue to the next step.
  6. Copy make_phasta_dbg.sh and/or make_phasta.sh from ranger_scripts and run it. The first script compiles PHASTA for debugging and the second for production runs. Once the script has finished, you should have an executable build.
    • NOTE: By default these scripts are set up for the incompressible code. For the compressible code, you will need to uncomment the line
export COMPRESSIBLE=1
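
The edit described in the note can also be scripted. This is a sketch only: it assumes the line is commented out with a leading '#' (for example '#export COMPRESSIBLE=1'), which should be checked against the actual make_phasta.sh before use. The function name enable_compressible is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: uncomment "export COMPRESSIBLE=1" in a build script.
# Assumption: the line is commented out with a leading '#'.
enable_compressible() {
    local script="$1"
    local tmp
    tmp="$(mktemp)" || return 1
    # Strip the leading '#' (and any spaces around it) from the matching line only.
    sed 's/^[[:space:]]*#[[:space:]]*\(export COMPRESSIBLE=1\)/\1/' "$script" > "$tmp" || return 1
    # Write the result back into the original file so it keeps its permissions.
    cat "$tmp" > "$script" && rm -f "$tmp"
}
```

For example, 'enable_compressible make_phasta.sh' edits the production build script in place; all other lines are left untouched.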

Running PHASTA

Once your case files are set up on Ranger (pre-processing), you will need to run ranger_phSolverIC_pgi_mvapich-devel.csh. The file looks like this:

#!/bin/csh
#$ -V                        # Inherit the submission environment
#$ -cwd                      # Start job in  submission directory
## COMMENTED  ranger_job     # Job Name
#$ -o $JOB_NAME.o$JOB_ID     # Name of the output file (eg. JobName.oJobID)
#$ -e $JOB_NAME.e$JOB_ID     # Name of the err. output file (eg. JobName.eJobID)
#$ -pe 16way 32              # Requests 16 cores/node, 32 cores total
#$ -q normal                 # Queue name
#$ -l h_rt=00:02:00          # Run time (hh:mm:ss)
## COMMENTED -M xyz@rpi.edu  # Email notification address
#$ -m e                      # Email at end of job
set echo                     #{echo cmds}
echo 'Starting Job'
date
# echo
# echo '==========================================='
# echo 'Record of solver.inp'
# cat solver.inp
# echo '==========================================='
# echo
 cp solver.inp solver.inp.$JOB_ID
 cp $JOB_NAME $JOB_NAME.$JOB_ID
# run with pgi
 module del mvapich2
 module load pgi/7.1
 module load mvapich-devel
 ibrun tacc_affinity $HOME/dev_pgi_mvapich-devel/phasta/phSolver/phSolver/bin/x86_64_linux-pgi/phastaIC.exe-O
 echo 'Job completed'
 date

A few settings have to be changed for different runs. For example, to run on 32 processors, set -pe 16way 32. Every node on Ranger has 16 cores, and it is recommended to use all the cores on every node. Another way to run on 32 processors is -pe 8way 64, which uses 8 cores per node on 4 nodes. The first argument is the number of cores used per node; the second is the total number of processors multiplied by 16 and divided by the cores used per node (here 32 × 16 / 8 = 64), which is the same as 16 times the number of nodes.
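
The arithmetic behind the -pe setting can be checked with a short shell function. This is an illustrative sketch: the name pe_request is hypothetical, and rounding up to a whole number of nodes is an assumption added here for processor counts that do not divide evenly by the cores per node.

```shell
#!/bin/bash
# Hypothetical helper: print the "-pe" request for a given number of
# cores per node (wayness) and total number of processors.
# Ranger nodes have 16 cores, so the second value is 16 * number_of_nodes.
pe_request() {
    local wayness="$1"
    local nprocs="$2"
    # Assumption: round up so a partially filled last node is still requested.
    local nodes=$(( (nprocs + wayness - 1) / wayness ))
    echo "-pe ${wayness}way $(( nodes * 16 ))"
}

pe_request 16 32   # prints: -pe 16way 32
pe_request 8 32    # prints: -pe 8way 64
```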

The second thing you can change is the run time for the simulation (-l h_rt). Set it according to the estimated time the simulation will take.

Once all these variables are set, you can run the code. On the command line, type the following:

qsub ranger_phSolverIC_pgi_mvapich-devel.csh

To check the status of your simulation, type:

qstat or showq -u

Multiple Jobs Submission

This section explains how to submit several PHASTA runs when the first run must finish before the second one starts. This is handled as follows:

qsub -N jobname1 script.csh
qsub -hold_jid jobname1 -N jobname2 script.csh

This submits jobname1 and holds jobname2 until jobname1 has finished.
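
The same pattern extends to longer chains. The sketch below only prints the qsub commands so they can be reviewed before anything is submitted. The function name print_job_chain and the numbered job names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the qsub commands for a chain of runs in which
# each job is held until the previous one has finished.
# Usage: print_job_chain <base-name> <number-of-jobs> <script>
print_job_chain() {
    local base="$1"
    local count="$2"
    local script="$3"
    local i=1
    while [ "$i" -le "$count" ]; do
        if [ "$i" -eq 1 ]; then
            # The first job has no dependency.
            echo "qsub -N ${base}${i} ${script}"
        else
            # Later jobs wait for the job with the previous name.
            echo "qsub -hold_jid ${base}$(( i - 1 )) -N ${base}${i} ${script}"
        fi
        i=$(( i + 1 ))
    done
}

print_job_chain jobname 3 script.csh
```

Once the printed commands look right, they can be pasted into the terminal or piped to a shell on the login node.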

Archiving results

This is a rough draft and will be updated in the next few days (with some scripts to show how to automate such tasks).

Large runs produce large data sets that need to be periodically archived and logged.

While archiving, one should try to be modular, organized, descriptive, and careful, for several reasons:

  1. Tar balls of limited size (around 10G as of 2008) are recommended, as fitting all simulation results into one giant tar ball of >> 10G is not reliable.
  2. Giant tar balls may run into problems if a transfer is interrupted by a lost connection or by a shutdown of the compute system, the tape system, or both, for emergency or maintenance (one may have to restart the transfer from the beginning).
  3. Giant tar balls can also be a problem when pulling archived data back at a later time. It is easier to have a few relatively small tar balls than one giant tar ball: the total transfer time for a giant tar ball will be very high, and one may want only a small subset of its data. Moreover, pulling a giant tar ball requires a lot of disk space on the receiving side (typically twice the size of the tar ball: one copy for the tar ball itself and one for its unpacked contents).
  4. One should also be careful when moving data around while creating modular folders during archiving, especially from a space with limited disk but no purge (referred to as $WORK) to an unlimited or extremely large space with purge (referred to as $SCRATCH). Simulation results may carry old date stamps and may get purged soon after being moved to $SCRATCH. One way to overcome this problem is to follow two steps within scripts: 'mv $WORK/RunABC/Arch-1024.110-200.1-512/file1 $SCRATCH/RunABC/Arch-1024.110-200.1-512/file1', followed by 'touch $SCRATCH/RunABC/Arch-1024.110-200.1-512/file1'.
  5. One should also check the commands in script files very carefully, to avoid overwriting other data through typos or copy-paste-and-partial-update errors (or previous-command-and-partial-update errors). For example, if one first archives certain data with 'archive $SCRATCH/RunABC/Arch-1024.110-200.1-512 $ARCHIVE/RunABC/Arch-1024.110-200.1-512.tar', then reuses that command for the data from the next steps and only partially updates it to 'archive $SCRATCH/RunABC/Arch-1024.210-300.1-512 $ARCHIVE/RunABC/Arch-1024.110-200.1-512.tar', it can overwrite the 110-200 dataset with the 210-300 dataset.
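
Points 4 and 5 can be made less error-prone with two small helpers. This is a sketch under stated assumptions, not one of the promised scripts: the function names move_and_touch and archive_target are hypothetical, and the 'archive' command itself is left exactly as quoted above.

```shell
#!/bin/bash
# Hypothetical helpers for points 4 and 5; the names are illustrative only.

# Move a file and refresh its time stamp at the destination, so that an
# age-based purge does not treat it as old data.
move_and_touch() {
    local src="$1"
    local dest="$2"
    mkdir -p "$(dirname "$dest")" || return 1
    mv "$src" "$dest" || return 1
    touch "$dest"
}

# Derive the tar ball name from the data directory name, so the two cannot
# drift apart after a copy-paste edit.
# Prints <archive-root>/<data-dir-name>.tar
archive_target() {
    local data_dir="$1"
    local archive_root="$2"
    echo "${archive_root}/$(basename "$data_dir").tar"
}
```

For instance, 'archive_target $SCRATCH/RunABC/Arch-1024.210-300.1-512 $ARCHIVE/RunABC' prints a path ending in Arch-1024.210-300.1-512.tar, which can then be passed as the second argument of the archiving command instead of being typed by hand.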

Useful links

Quick Start Guide for Experienced Users: http://www.tacc.utexas.edu/services/userguides/ranger/quickstart.php

Get pdf version at: http://www.tacc.utexas.edu/services/userguides/ranger/
