Recently in Web Category

The EU Directive on Privacy and Electronic Communications caused quite a stir 12 months ago, but the UK's Information Commissioner's Office (ICO) stepped in and said that UK firms would have a year to comply with the regulations.  That year is up on 26th May and people are starting to talk about the EU Cookie Law again.  However, no-one seems to be exactly sure what the implications will be, and the ICO is not offering answers to the questions people are asking.

What is the directive about?

The intention of the directive was to combat "tracking cookies" and other similar techniques used by advertising networks to analyse your online behaviour and offer targeted ads to you.  Cookies are small text files, stored on your computer by a website, that contain short pieces of information.  These can range from the contents of your shopping basket to a more-or-less unique identifier used by large ad networks to track your browsing history.  Whilst the files themselves are harmless, many privacy groups object to the non-consensual tracking of an internet user's browsing habits.  The "unique" identifiers do not contain any real personal information and cannot track you across different computers, or even across different browsers on the same machine.  They do, however, allow ad networks to build up a profile of the person using that computer based on their browsing habits.  By analysing which of the sites you visit contain their adverts, the networks can make an educated guess at your age and gender and get an insight into what you read about.  This allows them to show you adverts that have more relevance to you, which in turn allows them to charge more for the placement of those adverts.
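To make the mechanism a little more concrete, here is a minimal sketch of how a server might issue and recognise such an identifier, using Python's standard `http.cookies` module.  The cookie name `visitor_id`, the one-year lifetime and the function name are our own assumptions for illustration; real ad networks use their own names and formats.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Tuple


def build_tracking_cookie(existing_header: str = "") -> Tuple[str, str]:
    """Return (visitor_id, Set-Cookie value) for a hypothetical ad network.

    existing_header is the raw Cookie header sent by the browser, if any.
    """
    jar = SimpleCookie()
    jar.load(existing_header)

    if "visitor_id" in jar:
        # Returning browser: reuse the identifier it already holds.
        visitor_id = jar["visitor_id"].value
    else:
        # New browser: mint a random identifier. It holds no personal data.
        visitor_id = uuid.uuid4().hex

    out = SimpleCookie()
    out["visitor_id"] = visitor_id
    out["visitor_id"]["path"] = "/"
    out["visitor_id"]["max-age"] = 60 * 60 * 24 * 365  # assumed: one year
    return visitor_id, out["visitor_id"].OutputString()


visitor_id, set_cookie = build_tracking_cookie()
print(set_cookie)
```

Because the identifier lives in one browser's cookie store, a second browser on the same machine would simply be handed a fresh one, which is why the tracking does not follow you between browsers.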
Google recently announced that they are expanding their use of SSL encryption to more local domains around the globe in an effort to "increase the privacy and security of your web searches".  Whilst this seems a noble intention, it will affect every site owner who uses a stats package such as Google Analytics or Piwik, and every site that uses keyword data to enhance the user experience for its visitors.

Before embarking on a MySQL Cluster installation, it is important to remember that MySQL Cluster is 'just' a storage engine for your existing MySQL database servers.  It stores data at the table level, not for the whole database, so it sits on the same functional level as MyISAM or InnoDB.  You still need a standard MySQL server to access the table data and store the database information.  This has the fringe benefit that you can target specific tables to be saved to the cluster, rather than the whole database, if some of your tables are more important or more heavily used than others.
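As a rough illustration of that per-table choice, the sketch below generates the `ALTER TABLE ... ENGINE=NDBCLUSTER` statements you would run against a standard MySQL server to move selected tables onto the cluster.  The helper name and the table names are hypothetical, and the statements are only printed, not executed.

```python
def engine_statements(tables, engine="NDBCLUSTER"):
    """Build one ALTER TABLE statement per table to switch its storage engine."""
    return [f"ALTER TABLE `{name}` ENGINE={engine};" for name in tables]


# Hypothetical example: only the busiest tables go to the cluster,
# everything else stays on its existing engine.
for statement in engine_statements(["sessions", "orders"]):
    print(statement)
```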

In this article we are going to cover MySQL Cluster, its installation on Debian (Lenny), Master-Master replication and how to tie all this together with HAProxy for a very high availability solution.
NB: At present you cannot request or share additional internal IP addresses with Rackspace Cloud, so you are going to have to use the external addresses.  With large databases this will incur additional bandwidth charges, so be sure to weigh this extra cost against the benefits provided by high availability.  When Rackspace allow the allocation of additional internal IPs, this article will be updated to reflect that.

Through two related tools and some cheap Rackspace Cloud servers, you can provide a front-end for your database that will balance between multiple replicated database servers and automatically fail over if one of your balancers develops a fault.

This article will cover setting up Heartbeat and Pacemaker to handle the transfer of an IP from one machine to another on a failure, and then installing HAProxy on both boxes to balance the load between your database servers and cope with either of them failing.
Our Load Balanced Web Cluster provides good availability for your web services, but high availability for your database is also likely to be an important requirement.  In most modern web apps, there's not much use having your webservers available constantly if your database is down.

There are a number of solutions to this with MySQL, and every situation will require a different response.  There are a lot of good articles out there to help you decide which solution is best.  In this article we will be covering MySQL Master-Master replication and installation on Debian (Lenny) using Rackspace Cloud Servers.

The standard model of MySQL replication is a single master with multiple slaves, which provides you with very good read reliability, but writes can only be made to the master node.  This means that if the master fails, you can't just switch to another node and carry on as before, because your slaves will fall out of sync.  Additionally, you can't load balance both reads and writes across your nodes.  By using a multiple master configuration, you can drop a node at any point and either switch your connection string to use the remaining master, or use load balancing and failover with HAProxy.
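The simpler of those two options, switching your connection to whichever master is still up, can be sketched in a few lines of Python.  This is only an illustration under our own assumptions: the host names are hypothetical, and the health check is a plain TCP connection to port 3306 rather than a real MySQL query.

```python
import socket


def is_reachable(host, port=3306, timeout=1.0):
    """Crude health check: can we open a TCP connection to the MySQL port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_master(hosts, check=is_reachable):
    """Return the first master that passes the health check."""
    for host in hosts:
        if check(host):
            return host
    raise RuntimeError("no master reachable")


# Hypothetical usage with a stubbed check, as if db1 had just failed:
print(pick_master(["db1.example.com", "db2.example.com"],
                  check=lambda host: host == "db2.example.com"))
```

Doing this in every application is fiddly, which is part of the appeal of putting HAProxy in front of the masters and letting it handle the health checks and failover instead.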
Upon creation, most Rackspace Cloud Servers have a completely open firewall policy that will allow any computer to connect to any port on your machine.  Linux uses IPTables to firewall connections into and out of your server, but it needs a fair amount of configuration to get it working and to keep it working after a reboot.

In this article we will cover basic saving and loading of IPTables rules in Debian/Ubuntu on shutdown and boot up as well as common rules that go with our other guides.
When we first wanted to load balance web servers, we followed Rackspace Cloud's articles on the subject, which recommended using mod_proxy with Apache.  This took a little while to set up and, even after countless config changes, every now and then requests would get lost and you'd have to refresh your browser to get connected again.  This was not a problem in a development environment, but it was unacceptable once we wanted to go live with the service.  So we looked for a different solution and found HAProxy, which was not only easier to set up than mod_proxy_balancer, but is far more reliable and quicker too.