In the previous post I talked about setting up the physical hardware for a homemade cluster using Raspberry Pis, with the goal of creating a hardware and software stack for learning how to use clustering technologies.
The hardware for the Raspberry Pi cluster is in a usable state, so the next step is to install software on it.
I’m aiming for a basic setup that I can keep building upon to get the desired result: a load-balanced set of applications that can be scaled.
The first step is to install an application into the cluster that will show the user some information. In this case it will be a simple PHP application, but it could be anything, such as NodeJS, Python, Ruby etc.
PHP is just being used as an example.
One of the Raspberry Pis will need a web server installed to serve the PHP. First of all, install Raspbian from the official website. I used the Lite version, as the cluster does not need a full desktop environment, only a lightweight server. Instructions on how to do this can be found here.
Once that is installed, installing PHP and a web server is next. Again, this does not have to be httpd; it could be nginx, WEBrick or anything else.
For simplicity I added an index.php to /var/www/html/ which just displays the Raspberry Pi’s hostname.
I did the same on another of the Raspberry Pis so that I had 2 different outputs.
Now we have 2 nodes in the cluster, both running the same application but outputting different content, so we can identify which one we have connected to via our web browser.
The load balancer is what distributes the traffic among the different applications. The end user just goes to a single address and all the magic happens behind that.
For this I am going to use something called haproxy. It will take HTTP requests from a web browser on port 80 and forward them internally to the different Raspberry Pis.
The configuration for the proxy is stored in “/etc/haproxy/haproxy.cfg” and is very simple. You have to define a front-end and bind it to a port; we will use port 80. The front-end definition can be put at the bottom of the configuration file.
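A minimal sketch of such a front-end is shown here. The name “web-frontend” is a placeholder of my choosing, and “web-backend” is the back-end name used in the next step; adjust both to suit your setup.

```
# Placeholder front-end name; any unique name works.
frontend web-frontend
    # Listen for HTTP traffic on port 80 on all interfaces.
    bind *:80
    mode http
    # Hand every request to the back-end defined below.
    default_backend web-backend
```

Setting “mode http” explicitly is optional if your defaults section already sets it, but it does no harm to state it here.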
Now we need to create the web-backend that is referenced in the front-end.
I have custom DNS in my setup so my Raspberry Pis have domain names; you can replace the domain names with the IP address of each of your Pis.
The parts to replace are “rpi-b” and “rpi-c”.
The back-end configuration can be added underneath the front-end one.
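A back-end along these lines should do the job. This is a sketch that assumes both web servers listen on port 80 and that “rpi-b” and “rpi-c” resolve on your network; the “check” option is an optional extra that asks haproxy to health-check each server.

```
backend web-backend
    mode http
    # Cycle through the servers in turn.
    balance roundrobin
    # server <label> <address>:<port> [options]
    server rpi-b rpi-b:80 check
    server rpi-c rpi-c:80 check
```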
The same back-end can also be written using IP addresses instead of domain names.
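As a sketch, the IP-based version might look like this. The addresses 192.168.1.11 and 192.168.1.12 are made-up placeholders, not the ones from my network, so substitute the real addresses of your Pis.

```
backend web-backend
    mode http
    balance roundrobin
    # Placeholder addresses; replace with the IPs of your own Pis.
    server rpi-b 192.168.1.11:80 check
    server rpi-c 192.168.1.12:80 check
```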
The type of load balancing we are using here is “roundrobin”, which cycles through the servers in turn, one per incoming request. There are other algorithms such as “leastconn”, but I will cover those in a later post.
Once this is done you have to reload haproxy:
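To make the idea of round robin concrete, here is a small Python sketch of the rotation. It only illustrates the concept and is not how haproxy implements it internally (haproxy also takes server weights into account); the function name is my own.

```python
from itertools import cycle


def round_robin(servers, request_count):
    """Return the server chosen for each request, cycling through them in order."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(request_count)]


# Five requests spread over the two web nodes from the back-end.
print(round_robin(["rpi-b", "rpi-c"], 5))
# ['rpi-b', 'rpi-c', 'rpi-b', 'rpi-c', 'rpi-b']
```

With two servers, every odd request lands on the first server and every even request on the second.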
sudo service haproxy reload
In your web browser, go to the address of the Raspberry Pi you just configured with haproxy and keep reloading it. As you refresh the page, you should see the response cycle between your applications in order, due to the roundrobin configuration.
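If you would rather not sit hitting refresh, the same check can be scripted. The following is a rough Python sketch using only the standard library; the function names are my own, and it assumes each page body is just the hostname text.

```python
from collections import Counter
from urllib.request import urlopen


def fetch_body(url):
    """Fetch a URL and return its body as stripped text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", errors="replace").strip()


def poll(url, count, fetch=fetch_body):
    """Request the URL `count` times and tally how often each response appears."""
    return Counter(fetch(url) for _ in range(count))
```

Calling something like poll("http://rpi-a/", 10), where “rpi-a” is a hypothetical name for the Pi running haproxy, should return a roughly even tally across the two hostnames if the balancer is rotating as expected.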
In summary, we configured 3 devices: 2 of them serve content and the 3rd balances load between the applications. In my next post I will talk about data storage, how we can balance load across data nodes and how to keep the data in sync.