Whenever possible, I put idle computing power to use by taking part in distributed computing projects. I started off with the Einstein@Home project, but have since joined other projects, which can be seen below and in the links to the right of this page.
To quote the Einstein@Home article on Wikipedia, Einstein@Home "searches through data from the LIGO experiment for evidence of gravitational waves from continuous wave sources, which may include pulsars." The article adds that "as of July 2009, over 234,000 volunteers in 209 countries have participated in the project, making it the third most popular BOINC project. About 39,000 active users contribute about 190 teraFLOPS of computational power."
Getting Involved?
Most home users should simply be able to install BOINC and attach to a project. If you are running BOINC on a cluster, first make sure that the compute nodes can run it successfully. There are 32-bit and 64-bit versions of BOINC for most of the common operating systems.
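On a headless machine such as a compute node, the attach step can also be done from the command line. The following is a minimal sketch, assuming the boinccmd tool that ships with the BOINC client is on your PATH. The attach_project helper and its DRY_RUN switch are purely illustrative and not part of BOINC, and the project URL and account key are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: attach the local BOINC client to a project.
# With DRY_RUN=1 (the default here) it only prints the command it would run.
attach_project() {
    url=$1
    key=$2
    if [ -z "$url" ] || [ -z "$key" ]; then
        echo "usage: attach_project PROJECT_URL ACCOUNT_KEY" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "boinccmd --project_attach $url $key"
    else
        # Needs a running BOINC client that boinccmd can reach.
        boinccmd --project_attach "$url" "$key"
    fi
}
```

For example, DRY_RUN=0 attach_project http://project.example/ YOUR_ACCOUNT_KEY would perform the attach for real; both arguments are placeholders for your own project URL and key.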
My own setup was more involved: for some complicated reasons, I ran Condor and PBS-based schedulers simultaneously on HTC and HPC Beowulf clusters. The configuration below shows how I backfilled BOINC-based jobs using Condor and stole CPU cycles from spare designated PBS nodes.
After testing BOINC, make the following amendments to the condor_config file on your local node.
Backfill Condor with BOINC:
ENABLE_BACKFILL = TRUE
BACKFILL_SYSTEM = BOINC
START_BACKFILL = TRUE
EVICT_BACKFILL = FALSE
BOINC_HOME = /path/to/boinc
BOINC_Executable = /path/to/boinc/boinc
BOINC_Universe = vanilla
BOINC_InitialDir = $(BOINC_HOME)
BOINC_Output = $(BOINC_HOME)/boinc.out
BOINC_Error = $(BOINC_HOME)/boinc.err
BOINC_Owner = nobody
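Before restarting anything, it can help to confirm that the backfill settings actually made it into the file. The following is a rough sketch, assuming a plain "NAME = value" layout. The check_backfill_config function is a hypothetical helper, not a Condor tool, and it only looks for the macro names rather than validating their values.

```shell
#!/bin/sh
# Hypothetical helper: report which backfill macros are missing from a
# condor_config file. It checks for names only, not for sensible values.
check_backfill_config() {
    file=$1
    missing=0
    for key in ENABLE_BACKFILL BACKFILL_SYSTEM START_BACKFILL EVICT_BACKFILL \
               BOINC_HOME BOINC_Executable BOINC_Universe BOINC_InitialDir \
               BOINC_Output BOINC_Error BOINC_Owner
    do
        # Match "NAME = value" at the start of a line; Condor macro names
        # are case-insensitive, hence -i.
        if ! grep -Eiq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it as check_backfill_config /path/to/condor/etc/condor_config. On a live node, condor_config_val ENABLE_BACKFILL shows the value Condor itself has resolved.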
Steal CPU Cycles from PBS using Condor:
ENABLE_RUNTIME_CONFIG = TRUE
STARTD_SETTABLE_ATTRS_OWNER = PBSRunning
PBSRunning = False
START_NOPBS = ($(PBSRunning) == False)
START = $(START) && $(START_NOPBS)
Add this prologue script to the Torque directory: /path/to/torque/mom_priv
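#!/bin/sh
# Torque prologue: evict Condor work from this node before a PBS job starts.
condor="/path/to/condor/"
CONDOR_CONFIG="${condor}/etc/condor_config"
export CONDOR_CONFIG
# HOSTNAME is not set by every /bin/sh, so fall back to hostname(1).
HOSTNAME="${HOSTNAME:-$(hostname)}"
# Only act if Condor is actually running on this node.
if ps uawx | grep -v grep | grep -q "condor_master"
then
    # Is this node currently running a Condor job or BOINC backfill?
    if "${condor}/bin/condor_status" | grep "${HOSTNAME}" | grep -Eq "(Claimed|Backfill)"
    then
        # Vacate the work, flag the node as busy with PBS, then reconfigure the startd.
        "${condor}/sbin/condor_vacate" >/dev/null
        "${condor}/bin/condor_config_val" -rset -startd PBSRunning=True >/dev/null
        "${condor}/sbin/condor_reconfig" -startd >/dev/null
        sleep 2
    fi
fi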
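One thing to watch in the prologue is that a plain grep on the hostname also matches longer names, so node1 would match node10. The following is a sketch of a stricter check. It assumes that condor_status prints the machine name in the first column, optionally prefixed with a slot name such as slot1@, and that the state appears later on the same line. The node_is_busy function and the host names in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read condor_status-style text on stdin and succeed
# only if the named host has a slot in the Claimed or Backfill state.
node_is_busy() {
    host=$1
    awk -v host="$host" '
        {
            name = $1
            sub(/^[^@]*@/, "", name)    # drop a leading "slotN@" if present
            # Accept an exact match, or the short name followed by a domain.
            if ((name == host || index(name, host ".") == 1) &&
                $0 ~ /Claimed|Backfill/)
                found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

In the prologue this would replace the two grep stages, along the lines of condor_status | node_is_busy "$(hostname)". With this check, a busy node10 no longer triggers an eviction on node1.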
Add this epilogue script to the Torque directory: /path/to/torque/mom_priv
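#!/bin/sh
# Torque epilogue: hand the node back to Condor once the PBS job has finished.
condor="/path/to/condor/"
CONDOR_CONFIG="${condor}/etc/condor_config"
export CONDOR_CONFIG
if [ -x "${condor}/bin/condor_config_val" ]
then
    # Clear the PBS flag, then tell the startd to re-read its configuration.
    "${condor}/bin/condor_config_val" -rset -startd PBSRunning=False >/dev/null
    "${condor}/sbin/condor_reconfig" -startd >/dev/null
fi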
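Torque's pbs_mom looks for these scripts under the names prologue and epilogue in mom_priv and, as far as I know, will only run them if they are owned by root and not writable by anyone else. The following is a sketch of the install step. It assumes the two scripts were saved as prologue.sh and epilogue.sh (hypothetical file names) and that install(1) is available; install_mom_scripts is an illustrative helper.

```shell
#!/bin/sh
# Hypothetical helper: copy the two scripts into a mom_priv directory with
# owner-only read/execute permissions (mode 0500).
# Run it as root so that the copies end up owned by root.
install_mom_scripts() {
    src_dir=$1
    mom_priv=$2
    for name in prologue epilogue
    do
        install -m 0500 "${src_dir}/${name}.sh" "${mom_priv}/${name}" || return 1
    done
}
```

For example, install_mom_scripts . /path/to/torque/mom_priv copies both scripts from the current directory.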
Use High-End Nodes As A Single Memory Resource Per Node:
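For example, a node has 8 CPUs and 8GB of RAM, but you want a job to have all of the RAM rather than all of the CPUs. Telling Condor that the node has a single CPU makes it advertise a single slot with all of the node's memory.
Add the following to the local condor_config on each such node: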
NUM_CPUS = 1
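By default, Condor creates one slot per CPU and divides the node's memory evenly between them, so the effect of this setting can be sketched with some simple arithmetic. The memory_per_slot helper is hypothetical, and the figures are rough: the memory Condor detects on a real 8GB node is usually a little under 8192MB.

```shell
#!/bin/sh
# Hypothetical helper: memory (in MB) each slot gets when a node's RAM is
# divided evenly across one slot per advertised CPU.
memory_per_slot() {
    total_mb=$1
    num_cpus=$2
    echo $(( total_mb / num_cpus ))
}

memory_per_slot 8192 8   # default, eight slots: prints 1024
memory_per_slot 8192 1   # NUM_CPUS = 1, one slot: prints 8192
```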
Finally, you may find you have to restart Condor and/or PBS.
12:24:51 BST Thursday 08 April 2010
Copyright © Gerald Davies.

