Cluster Overview
Tutorial and Overview of the SOM Clusters
The Yale School of Management High Performance Computing Cluster
SOM currently runs three distinct computing architectures: Solaris, Apple Xserve (OS X), and Linux.
These are divided into separate clusters; taken together, they are referred to as the SOM Computing Grid.
The Solaris Cluster
The first cluster is built around three primary systems that run Solaris. Several additional machines are available in the Solaris cluster (9-15 machines at any given time). These machines provide 28 processors for computational use, along with a complete 64-bit development/run-time environment. They use Solaris 9 as their operating system and Sun's Grid Engine to distribute the computational workload.
Jobs are submitted to the cluster by creating a script that specifies certain attributes of the job. This script is then submitted to a queue engine, which schedules the job (based in part on the attributes specified). Information about the job can then be mailed to the user and/or stored in a log file. For certain scientific and financial research applications at SOM, wrapper shell scripts have been created that run batch jobs easily from the UNIX shell. These wrappers handle all of the background Grid Engine requirements, so users do not need to write the Grid Engine script themselves when submitting those types of research jobs.
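As a rough illustration of what such a job script contains, the Python sketch below assembles one as text. It is only a sketch under stated assumptions: the job name, command, e-mail address, and log file name are hypothetical placeholders, and the SOM wrapper scripts may generate something quite different. The `#$` lines are standard Grid Engine directives (`-N` job name, `-cwd` run in the submission directory, `-j y` merge stderr into stdout, `-o` output file, `-M`/`-m e` mail on job end).

```python
def render_job_script(name, command, email=None, log_file=None):
    """Return the text of a Grid Engine batch script that runs one command."""
    lines = [
        "#!/bin/sh",
        "#$ -S /bin/sh",   # shell used to interpret the job
        "#$ -N " + name,   # job name shown by qstat
        "#$ -cwd",         # run from the directory the job was submitted from
        "#$ -j y",         # merge stderr into stdout
    ]
    if log_file:
        lines.append("#$ -o " + log_file)  # write job output to this file
    if email:
        lines.append("#$ -M " + email)     # address to notify
        lines.append("#$ -m e")            # send mail when the job ends
    lines.append("")
    lines.append(command)                  # the actual work (hypothetical here)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders, not real SOM settings.
    print(render_job_script("demo", "./my_analysis input.dat",
                            email="netid@yale.edu", log_file="demo.log"), end="")
```

Saving the printed text to a file (for example demo.sh) and running qsub demo.sh on a head node would submit it, assuming the default queue configuration accepts the job.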
Logging into the Solaris Cluster
In order to log in to any SOM Research machine, you will need an SSH client. Please [The_Yale_SOM_Support_Pack | visit the SOM Support Pack Page] for more info.
Logins to the Solaris cluster fall into two separate categories:
- Interactive jobs (such as X-Windows applications, graphing programs, or web browsing)
- Batch jobs (such as a Matlab or SAS job that requires no user input)
We have configured the three primary Solaris machines into this structure:
- tiamat.som.yale.edu (8 - 900MHz UltraSparc III Processors, 32GB RAM) - Primary Head Node
- moloch.som.yale.edu (4 - 400MHz UltraSparc II Processors, 8GB RAM) - Secondary Head Node
- baal.som.yale.edu (4 - 400MHz UltraSparc II Processors, 8GB RAM) - Secondary Head Node
These machines can be used for interactive work such as:
- Source Code Editing
- Source Compilation (C/C++/Fortran)
- Interactive Scientific Applications (Matlab, Mathematica, SAS, R, SPlus, Stata, etc.)
- Data Mining (via local data sets, or those transferred from WRDS)
For batch jobs within the Solaris cluster, you should log in to: tiamat.som.yale.edu
For interactive jobs within the Solaris cluster, you should log in to:
- tiamat.som.yale.edu
The Xserve Cluster
The second cluster is made up of 45 Apple Xserve rackmount servers. Each machine is equipped with two (2) 1.3GHz processors and 2GB of memory, and runs OS X version 10.3. The UNIX layer of OS X is derived in part from FreeBSD and is largely POSIX compliant. All machines in the cluster are equipped with the standard array of UNIX utilities and applications. If there is a utility or application you would like installed, please contact us.
Logging into the Xserve Cluster
If you are using scientific and financial computing applications for batch and interactive jobs, it is not necessary to log in to the Xserve cluster directly. Instead, you may log in to one of the primary research servers and then submit your job from a UNIX shell there. The Grid Engine software will automatically select the least-utilized machine that is capable of running the application for your job.
In order to log in to any SOM Research machine, you will need an SSH client. Please [The_Yale_SOM_Support_Pack | visit the SOM Support Pack Page] for more info.
Logins to the Xserve cluster fall into two separate categories:
- Interactive jobs (such as X-Windows applications, graphing programs, or web browsing)
- Batch jobs (such as a Matlab or Stata job that requires no user input)
The cluster resides on Yale's non-routable 172 subnet, so to access these machines you will need to connect to tiamat first.
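The two-step login through tiamat can be expressed as a single ssh invocation. The Python sketch below only builds the command line; the user name and the node name xserve01 are hypothetical placeholders, since this page does not list the actual Xserve host names. The -t option asks ssh to allocate a terminal on the first hop so that the second ssh session is interactive.

```python
HEAD_NODE = "tiamat.som.yale.edu"  # head node named on this page


def two_hop_ssh_command(user, node, head_node=HEAD_NODE):
    """Build an ssh command that reaches `node` by way of the head node."""
    # First hop: log in to the head node with a terminal (-t).
    # Second hop: from there, run ssh to the internal node.
    return ["ssh", "-t", f"{user}@{head_node}", "ssh", node]


if __name__ == "__main__":
    # "netid" and "xserve01" are placeholders, not real account or host names.
    print(" ".join(two_hop_ssh_command("netid", "xserve01")))
```

The printed command can be pasted into a terminal on a machine with an SSH client installed.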
These machines can be used for interactive jobs or for batch job submission:
- Source Code Editing
- Source Compilation (C/C++/Fortran)
- Interactive Scientific Applications (Matlab, Mathematica, R, Octave, etc.)
- Data Mining (via local data sets, or those transferred from WRDS)
- Job submission
- Job Management
- Parallel Application Execution
Once you are logged into one of the head node machines, you can rsh/ssh into a free node. To view a list of available nodes and queues, type: qstat -f
To use these machines to their fullest potential, please see the Grid Engine tutorial. If you would like to schedule a time for a tutorial, please email somresearch@yale.edu.
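When the qstat -f listing is long, a small script can pick out the queues that still have free slots. The Python sketch below is a minimal example, assuming the common column layout (queue name, type, used/total slots, load average, architecture); the sample text, queue names, and architecture label are made up for illustration, and the real output on this cluster may differ.

```python
import re

# Matches the trailing "used/total" part of the slots column, e.g. "1/2".
SLOTS = re.compile(r"(\d+)/(\d+)$")

# Invented sample in the assumed `qstat -f` layout; not real cluster output.
SAMPLE = """\
queuename            qtype used/tot. load_avg arch       states
----------------------------------------------------------------
node01.q             BIP   2/2       1.98     darwin
----------------------------------------------------------------
node02.q             BIP   0/2       0.05     darwin
----------------------------------------------------------------
node03.q             BIP   1/2       0.90     darwin
"""


def parse_qstat_f(text):
    """Return (queue, used, total, load) tuples from qstat -f style text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # separator or blank line
        match = SLOTS.search(fields[2])
        if not match:
            continue  # header or job line, not a queue line
        try:
            load = float(fields[3])
        except ValueError:
            load = None  # load may be reported as a non-numeric marker
        rows.append((fields[0], int(match.group(1)), int(match.group(2)), load))
    return rows


def free_queues(rows):
    """Queues with at least one unused slot, least loaded first."""
    free = [r for r in rows if r[1] < r[2]]
    return sorted(free, key=lambda r: (r[3] is None, r[3] or 0.0))


if __name__ == "__main__":
    for queue, used, total, load in free_queues(parse_qstat_f(SAMPLE)):
        print(queue, f"{total - used} free slot(s)", "load", load)
```

On a head node, the same functions could be fed the real listing (for example by piping qstat -f into the script and reading standard input) instead of the sample text.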
The Linux Cluster
The Linux cluster currently consists of two separate machines:
- ganesh.som.yale.edu
- hanuman.som.yale.edu
These machines run CentOS and are designed specifically to run SAS and Linux applications.
These three clusters allow you to run your jobs on multiple machines and multiple processors simultaneously. The system also allows you to submit your job(s) to a machine that is less busy.
Message Passing Interface Implementation
Our current default implementation for message passing is MPICH, a portable implementation of the MPI standard. LAM/MPI is also available for use.
--Djh29 10:04, 13 Oct 2005 (EDT)

