Server information

The compute servers are all multi-processor machines running Linux. See the current resource usage page for the current status of the machines. All machines run the same operating system (CentOS 7, 64-bit Linux) with the same configuration and software.

CPUs and memory

Compute nodes
Node     Features       CPUs  Cores  Memory
100plus  ht, avx, avx2  64    32     768 GB
hopper   ht, avx, avx2  64    32     512 GB
maxwell  ht, avx, avx2  56    28     256 GB
parzen   ht, avx, avx2  56    28     256 GB
viterbi  ht, avx, avx2  56    28     256 GB
watson   ht, avx        32    16     256 GB
gauss    ht, avx        32    16     192 GB
markov   ht, avx        32    16     192 GB
neumann  ht, avx        32    16     192 GB
sanger   -              8     8      144 GB
Login nodes
Node        Features  CPUs  Cores  Memory
fisher      -         8     8      32 GB
newton      -         8     8      32 GB
bayes       -         8     8      16 GB
insy-login  -         1     1      4 GB

The nodes in the INSY cluster are heterogeneous, i.e. they have different types of hardware (processors, memory, GPUs), different functionality (some more advanced than others) and different performance characteristics. If a program requires specific features, you need to explicitly request those features for that job (see the example below the feature list).

Features
ht: Hyper-threading processors (two CPUs per core, allocated in pairs)
avx: Advanced Vector Extensions (AVX) support
avx2: Advanced Vector Extensions 2 (AVX2) support
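For example, a job that needs AVX2 instructions should request nodes with the avx2 feature. A minimal sketch, assuming the job scheduler is Slurm (the program name my_program is a hypothetical placeholder):

    #!/bin/bash
    #SBATCH --job-name=avx2-job
    #SBATCH --constraint=avx2    # only run on nodes that have the avx2 feature
    #SBATCH --ntasks=1
    srun ./my_program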
CPUs
All machines have multiple central processing units (CPUs) that perform all the computations. Each CPU can process one thread (i.e. a separate stream of instructions) at a time. A computer program consists of one or more threads, and thus needs one or more CPUs simultaneously to do its computations.
Most programs use a fixed number of threads. Giving a program access to more CPUs than it has threads will not make it any faster, because it simply cannot use the extra CPUs. When a program has fewer CPUs available than it has threads, the threads have to time-share the available CPUs (i.e. each thread only gets part-time use of a CPU), and the program will run slower (and slower still because of the added overhead of switching between threads). So it is always necessary to match the number of CPUs to the number of threads, or the other way around.
The number of threads running simultaneously determines the load of a server. If the number of running threads equals the number of available CPUs, the server is loaded 100% (or 1.00). When the number of threads that want to run exceeds the number of available CPUs, the load rises above 100%.
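One way to keep CPUs and threads matched in a batch job is to pass the job's CPU allocation on to the program. A minimal sketch, assuming Slurm and an OpenMP program (my_openmp_program is a hypothetical name):

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=8    # request 8 CPUs for this job

    # Run the program with exactly as many threads as allocated CPUs
    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
    srun ./my_openmp_program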
Cores
The CPU functionality is provided by the cores in the processor chips in the machines. Traditionally, one physical core contained one logical CPU, so the CPUs operated completely independently. Most current chips feature hyper-threading: one core contains two logical CPUs. These CPUs share parts of the core and thus have some interdependencies; therefore the job scheduler always allocates these CPUs in pairs.
Memory
All machines have large main memories for performing computations on big data sets. All programs (and users) share this memory, and must (together) make sure that they do not try to use more than the available amount of memory.
Note that 32-bit programs can only address (use) up to 3 GB (gigabytes) of memory.
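In a batch job you make this explicit by requesting the amount of memory your program needs. A minimal sketch, again assuming Slurm (the 16 GB figure is only an illustration):

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --mem=16G    # reserve 16 GB of memory on the node for this job
    srun ./my_program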

GPUs

Type     Arch.    CUDA  Cores  Memory
GTX 680  Kepler   3.0   1536   2 GB
K2200    Maxwell  5.0   640    4 GB

Some nodes also have additional Graphics Processing Units (GPUs), which support the CUDA platform for General-Purpose computing on GPUs (GPGPU). Two different types of GPUs are available, so use the one that best matches the requirements of your program: the NVIDIA GeForce GTX 680 offers the best performance, while the NVIDIA Quadro K2200 has more advanced functionality.
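To run on a GPU node you request a GPU from the scheduler. A minimal sketch, assuming Slurm with GPUs configured as generic resources (the resource name gpu is an assumption; check the site configuration for the exact names, and my_cuda_program is a hypothetical placeholder):

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --gres=gpu:1    # request one GPU for this job
    srun ./my_cuda_program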

Architecture
The architecture defines the hardware functionality and performance characteristics of the GPU.
CUDA
There are several versions of CUDA, with each higher version supporting more advanced functionality. The CUDA Compute Capability specifies the highest CUDA version supported by the GPU.
Cores
The cores perform the computations. The more cores, the higher the potential parallelization of the algorithm.
Memory
The GPUs provide their own internal (fixed-size) memory for storing data for GPU computations. All required data needs to fit in the internal memory or your computations will suffer a big performance penalty.
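If you are unsure which GPU your job landed on, you can query these properties at run time. A minimal sketch in CUDA C (compile with nvcc; it only reads the device properties and performs no computation):

    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void) {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int dev = 0; dev < count; dev++) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, dev);
            /* Compute capability, e.g. 3.0 (Kepler) or 5.0 (Maxwell) */
            printf("GPU %d: %s, compute capability %d.%d, %zu MB memory\n",
                   dev, prop.name, prop.major, prop.minor,
                   prop.totalGlobalMem / (1024 * 1024));
        }
        return 0;
    }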