A cluster makes a collection of two or more computers run as a single
supercomputer. Clusters can be used to increase reliability,
performance, or the resources available. A Beowulf cluster is a group
of usually identical PCs that are networked together on a TCP/IP LAN
and have libraries and programs installed which allow processing to be
shared among them.
Now before you get all happy here, it's important to know that
applications need to be written against MPI and compiled with mpicc in
order to utilize a cluster resource. You can consult the LAM
(http://lam-mpi.org/) website for information and tutorials on it.
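To give a rough idea of what "written for MPI" means, here's a sketch that writes a minimal MPI hello-world in C to a file and compiles it with mpicc if that wrapper is on the PATH. The file name hello.c and the message text are my own choices, and the mpirun line assumes the cluster has already been installed and booted as described in the rest of this howto, so it is left commented out.

```shell
# Write a minimal MPI program. The name hello.c is an arbitrary choice.
cat > hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* join the MPI run */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which copy am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many copies in total? */
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile only where the MPI compiler wrapper is installed.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o hello hello.c || echo "compile failed"
    # mpirun N hello    # one copy per node; needs a booted LAM cluster
else
    echo "mpicc not found; install LAM first"
fi
```

Because /home is shared over NFS in this setup, building the program somewhere under /home means every node sees the same binary.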
Let's begin this quick and dirty howto.
The first thing you need to take care of is that each node on the
cluster needs a DNS name. If you're not running a DNS server, using the
/etc/hosts file will work just fine. I'm not going to get into the
configuration of bind; I'll save that for a later date.
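If you go the /etc/hosts route, every node needs the same entries. Here's a rough sketch that prints them. The 10.0.5.x addressing follows the server address used later in this howto, but giving node2 the address 10.0.5.101, the short aliases node1/node2, and the helper name hosts_entries are all assumptions of mine.

```shell
# Print /etc/hosts lines for nodes numbered 1..count.
# Usage: hosts_entries <network-prefix> <first-host-octet> <count> <domain>
hosts_entries() {
    prefix=$1; first=$2; count=$3; domain=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        n=$((i + 1))
        # address, fully qualified name, short alias
        printf '%s.%s\tnode%s.%s node%s\n' \
            "$prefix" "$((first + i))" "$n" "$domain" "$n"
        i=$((i + 1))
    done
}

# Two nodes starting at 10.0.5.100 (hypothetical addressing).
hosts_entries 10.0.5 100 2 yourdomain.com
```

Check the output, then append it to /etc/hosts on each box by hand.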
Next, our server needs to be configured as an NFS server.
Server:
($:~)=> vi /etc/rc.conf
nfs_server_flags="-u -t -n 4 -h 10.0.5.100" #Replace with your internal ip address.
mountd_enable="YES"
mountd_flags="-l -r"
rpcbind_enable="YES"
rpcbind_flags="-l -h 10.0.5.100" #Replace with your internal ip address.
nfs_server_enable="YES"
Then our client nodes need to be configured as NFS clients.
Client:
($:~)=> vi /etc/rc.conf
nfs_client_enable="YES"
Next we need to export our /home directory.
Server:
($:~)=> vi /etc/exports
/home -maproot=0:0 -network 10.0.5.0 -mask 255.255.255.0
Now each client needs to mount it.
Client:
($:~)=> vi /etc/fstab
10.0.5.100:/home /home nfs rw 0 0
Make sure your NFS share is working properly before continuing.
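One way to check the share on a client is to look for it in the output of mount. This is only a sketch: the helper name nfs_mounted is made up, and it assumes mount prints lines of the form "source on directory (options)", which is what I'd expect on FreeBSD.

```shell
# Succeed if the `mount` output on stdin shows <source> mounted on <dir>.
# Usage: mount | nfs_mounted 10.0.5.100:/home /home
nfs_mounted() {
    awk -v src="$1" -v dir="$2" '
        $1 == src && $2 == "on" && $3 == dir { found = 1 }
        END { exit !found }
    '
}

if mount 2>/dev/null | nfs_mounted 10.0.5.100:/home /home; then
    echo "/home is mounted from the server"
else
    echo "/home is NOT mounted from the server"
fi
```

A file created under /home on one box should then show up on the others.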
Now we install the lam-mpi clustering software. Do this for all
computers on the cluster.
All:
($:~)=> cd /usr/ports/net/lam
($:~)=> make install clean
Next let's install some software to help us monitor the cluster.
All:
($:~)=> cd /usr/ports/sysutils/ganglia-monitor-core
($:~)=> make install clean
On the server, we need the web interface for this. You should already
have a web server set up with PHP installed and configured with GD
graphics library support.
Server:
($:~)=> cd /usr/ports/sysutils/ganglia-webfrontend
($:~)=> make install clean
Now on to the configuration.
All:
($:~)=> cp /usr/local/etc/gmond.conf.sample /usr/local/etc/gmond.conf
($:~)=> vi /usr/local/etc/gmond.conf
There are 2 important areas to change in this file. The rest I'll leave
to your discretion.
The first is your cluster name:
All:
name "ClusterName"
Next is the interface we wish to use for the cluster.
On the server, the data source line for gmetad carries the same name:
make sure ClusterName matches the name in the gmond.conf configuration
file. The 10 is the polling interval, followed by the computers in the
cluster.
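The line itself isn't shown above. As a hedged sketch, assuming it is the data_source entry that gmetad reads (normally from gmetad.conf), the helper below prints such a line from a cluster name, a polling interval and a list of hosts. The helper name gmetad_data_source is made up for this example, and the host names are the ones used in the rest of this howto.

```shell
# Print a gmetad data_source line: name, polling interval, then hosts.
# Usage: gmetad_data_source <cluster-name> <interval-seconds> <host>...
gmetad_data_source() {
    name=$1; interval=$2; shift 2
    printf 'data_source "%s" %s' "$name" "$interval"
    for host in "$@"; do
        printf ' %s' "$host"
    done
    printf '\n'
}

gmetad_data_source ClusterName 10 node1.yourdomain.com node2.yourdomain.com
# prints: data_source "ClusterName" 10 node1.yourdomain.com node2.yourdomain.com
```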
Now that our monitoring software is configured, let's configure the
cluster software.
All:
($:~)=> vi /usr/local/etc/lam-bhost.def
Configuration for this is easy. Just put in the fully qualified domain
name of each box, one per line.
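For the two example boxes used in this howto, the file could be produced like this. It's a sketch: it writes to ./lam-bhost.def in the current directory so it's harmless to run, and you'd copy the result to /usr/local/etc/lam-bhost.def yourself once it looks right.

```shell
# Write a LAM boot schema: one fully qualified host name per line.
# A scratch copy in the current directory, not the live file.
bhost_file=./lam-bhost.def
printf '%s\n' node1.yourdomain.com node2.yourdomain.com > "$bhost_file"
cat "$bhost_file"

# Once the file is installed on the nodes, the LAM runtime is started
# from the server with lamboot (left commented out here):
# lamboot -v /usr/local/etc/lam-bhost.def
```

Commands such as tping, lamexec and lamnodes only work after lamboot has started the LAM daemons. lamboot needs to be able to reach each node over rsh or ssh without a password prompt.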
Congratulations, you're clustered. You may open up your browser and
view /usr/local/www/data-dist/ganglia and ultimately set up a point on
your web server to view it.
So how do I use this cluster?
Some commands that I commonly use are:
Server:
($:~)=> tping N
1 byte from 1 remote node and 1 local node: 0.002 secs
1 byte from 1 remote node and 1 local node: 0.001 secs
1 byte from 1 remote node and 1 local node: 0.001 secs
The tping command is the same as ping, but it's used to ping the nodes
in the cluster. The N (uppercase) means all nodes in the cluster. If I
just wanted to ping node2.yourdomain.com, I would use the lamnodes
command to find out the number associated with that node, then run
tping n1 (n1 being node2.yourdomain.com).
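The lookup can be scripted. Here's a sketch with a made-up helper, node_id_for, that reads lamnodes output and prints the node id for a host name. It assumes each lamnodes line starts with the node id, followed by the host name and a colon-separated suffix. The sample text below is my guess at that format, so compare it against what your lamnodes actually prints.

```shell
# Print the LAM node id (n0, n1, ...) for a host name.
# Reads `lamnodes` output on stdin.
node_id_for() {
    awk -v host="$1" '{
        split($2, parts, ":")          # host name is the part before the first colon
        if (parts[1] == host) print $1
    }'
}

# Sample lamnodes output (assumed format), so this runs without LAM.
sample='n0	node1.yourdomain.com:1:origin,this_node
n1	node2.yourdomain.com:1:'

printf '%s\n' "$sample" | node_id_for node2.yourdomain.com
# prints: n1
```

On a booted cluster the same thing would be `lamnodes | node_id_for node2.yourdomain.com`, and the result can be handed straight to tping.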
Another benefit is that I can sit on one machine and tell the cluster
to start applications on the other machines and return the output to
the monitor I'm on. Let's try it, shall we:
Server:
($:~)=> lamexec N echo "hi"
hi
hi
Since I used the uppercase N, meaning all nodes, it ran echo "hi" on
both PCs, returning the results to the one machine. I would suggest
reading up on lamexec for other information and tips on what you can
do with it. So how can you be sure it's running these processes on
both PCs? Watch this:
Server:
($:~)=> lamexec N hostname
node1.yourdomain.com
node2.yourdomain.com
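If you'd rather have a number than eyeball the list, the output can be piped through a small counter. This is a sketch with a made-up helper name, distinct_hosts, fed with the two host names from the run above so it works without a cluster.

```shell
# Count the distinct host names arriving on stdin, one per line.
distinct_hosts() {
    sort -u | wc -l | tr -d ' '
}

# Same two lines that lamexec N hostname returned above.
printf '%s\n' node1.yourdomain.com node2.yourdomain.com | distinct_hosts
# prints: 2
```

On the live cluster that would be `lamexec N hostname | distinct_hosts`, and the count should equal the number of boxes in lam-bhost.def.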
Also read the lamd man page; it describes other useful programs for
your cluster. Enjoy and happy crunching.
Some Links Of Interest:
Computer Clusters Profiles on TechTV
http://www.techtv.com/screensavers/answerstips/story/0,24330,2554333,00.html
Offmyserver building a Beowulf cluster
http://www.offmyserver.com/cgi-bin/store/cluster.html
Brooks paper on building a FreeBSD cluster
http://people.freebsd.org/~brooks/papers/bsdcon2003/