Everyone loves getting loads of traffic to their website; however, websites time out or respond very slowly when many users try to access them at the same time. WordPress does a pretty good job of handling lots of users at once, but it cannot manage all that traffic by itself: you need a proper server and caching setup to scale and accommodate more users.
WP Super Cache does a pretty neat job of handling heavy traffic too; however, there will come a time when the cache by itself will not solve your problems and you will need to make changes to your server setup to ensure that your website keeps loading properly and fast.
One of my pet projects has been to scale my personal website so that around 70,000–100,000+ users a day can access it seamlessly without suffering slow load times.
I tweaked my server quite a bit, and that handled the traffic up to a point, but alas the traffic kept increasing and I eventually ran out of tweaks.
That is when I started to scale the system itself to handle more traffic. Though I am not fully done yet and will make more changes in the near future to scale it further, I thought it best to document my changes so that others could benefit from them. So here is the first part: a tutorial on how you can scale WordPress by using a master-slave (replication) setup for MySQL.
Note: This guide is aimed at users running a standalone WordPress installation on a VPS or dedicated server. Shared hosting providers do not allow users to change system settings, and this implementation requires making a few such changes.
What Is MySQL Replication or Master-Slave Setup?
MySQL replication is, in essence, the automatic copying of data from one MySQL database server to another, located either on the same machine or on a different one. In this scenario you set up a master MySQL database to which data is written; once data is written, it is automatically replicated across all the slaves you have set up.
Let me give you a practical example: you have database A where inserts/updates happen; in a master-slave setup this is your master server. With replication you keep the master database as is, but you add new slave databases which constantly check back with the master for changes and then automatically apply those changes to themselves, without you having to run manual SQL insert/update statements.
So when you insert new data into the master database, the slaves pick up the change from the master's binary log and automatically copy the new data over to themselves; this replication usually happens quickly enough that you may not even notice it. If you are still not clear about what this means, have a look at the replication chapter of the MySQL manual.
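To make that concrete, here is a minimal sketch (the table name and values are hypothetical, purely for illustration): a row written on the master shows up on a slave a moment later, and the slave itself can tell you how far behind it is.

```sql
-- On the master: write as usual (hypothetical table and values)
INSERT INTO wp_comments_example (comment_author, comment_content)
VALUES ('Alice', 'Great post!');

-- On a slave, a moment later, the same row is already readable
SELECT comment_author, comment_content
FROM wp_comments_example
WHERE comment_author = 'Alice';

-- Still on the slave: Seconds_Behind_Master shows the replication lag (0 = caught up)
SHOW SLAVE STATUS\G
```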
How Does MySQL Replication Help?
In a production website, database reads constitute the major chunk of total database accesses. For example, if 1,000 users visit your website, there might be around 1,000 or more hits to the database, depending on how your site works. The latency in responding to those requests will be higher than on a site that gets less traffic, since all those 1,000+ hits have to be processed by a single database.
With MySQL replication you can distribute the reads between the master itself and several slaves, so the latency in responding to requests will be much lower, since the load is spread across several servers. (Writes still have to go to the master, but on a typical blog writes are only a small fraction of the total queries.)
In addition, since you are replicating data, you also have copies of it in several places, which you can use as failovers if your main database ever has a problem.
How Can You Replicate MySQL Data In A Master-Slave Environment?
Replicating data from a master to one or more slaves is very easy; all it takes is a few configuration changes. A master server has to be configured only once, and so does each slave. However, you may have to repeat certain steps on the master in order to grant each additional slave the privileges it needs to access the data.
I would rather not repeat the steps required to create a master-slave environment here, as an excellent guide already exists on HowtoForge: How To Set Up Database Replication In MySQL.
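For reference, here is a condensed sketch of what those steps boil down to on MySQL 5.x; the IP addresses, user name, password, and log coordinates are placeholders, and the HowtoForge guide covers the surrounding details such as enabling the binary log and giving each server a unique server-id in my.cnf.

```sql
-- On the master (my.cnf already has server-id = 1 and log_bin enabled):
-- create a replication account; repeat these two statements for each additional slave's IP
CREATE USER 'repl_user'@'10.0.0.2' IDENTIFIED BY 'secret';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'10.0.0.2';

-- note the File and Position values; the slave needs them below
SHOW MASTER STATUS;

-- On the slave (my.cnf has a different server-id, e.g. 2):
CHANGE MASTER TO
    MASTER_HOST     = '10.0.0.1',
    MASTER_USER     = 'repl_user',
    MASTER_PASSWORD = 'secret',
    MASTER_LOG_FILE = 'mysql-bin.000001',  -- File from SHOW MASTER STATUS
    MASTER_LOG_POS  = 98;                  -- Position from SHOW MASTER STATUS
START SLAVE;

-- Verify: Slave_IO_Running and Slave_SQL_Running should both say Yes
SHOW SLAVE STATUS\G
```

Adding another slave later only means repeating the CREATE USER/GRANT pair on the master for the new slave's address and running the CHANGE MASTER TO / START SLAVE steps on the new box.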
In my setup I have used a master-slave environment, but you can also use a master-master environment. I will not go into that here, but you can always look it up on the internet.
How Do I Use A MySQL Master-Slave Setup With WordPress?
After you have created a master-slave setup you have multiple database servers at your disposal, but how do you actually tell WordPress to read from multiple databases?
Don’t worry, you do not have to scratch your head or hire programmers to do it. There is a solution in the form of HyperDB, which is written by the WordPress team and is actively used in WordPress MU and on WordPress.com.
Just download HyperDB and follow the instructions to set up the different databases you have created. Once that is done, HyperDB will internally take care of distributing the load across your MySQL servers.
HyperDB is not like a regular plugin, so you will need to follow its installation instructions and also edit its DB settings to specify the different MySQL servers you have. Once you have made the changes, your WordPress install should load faster than it did with a single MySQL database.
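To give you an idea of what that configuration looks like, here is a minimal sketch of db-config.php (HyperDB installs as a drop-in: db.php goes into wp-content and db-config.php sits next to wp-config.php). The host addresses below are placeholders; HyperDB's bundled sample file documents the full set of options.

```php
<?php
// db-config.php -- lives alongside wp-config.php; the db.php drop-in goes into wp-content/
// Placeholder hosts: replace with your own master and slave addresses.

// The master: the only server that accepts writes; it can serve reads as well.
$wpdb->add_database( array(
    'host'     => '10.0.0.1',   // master
    'user'     => DB_USER,
    'password' => DB_PASSWORD,
    'name'     => DB_NAME,
    'write'    => 1,
    'read'     => 1,
) );

// A read-only slave: 'write' => 0 means HyperDB will never send writes here.
$wpdb->add_database( array(
    'host'     => '10.0.0.2',   // slave
    'user'     => DB_USER,
    'password' => DB_PASSWORD,
    'name'     => DB_NAME,
    'write'    => 0,
    'read'     => 1,
) );

// Add further slaves with additional add_database() calls as you grow.
```

With this in place, inserts and updates go to the master while SELECTs are spread across every server that has reads enabled.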
That’s it: with two simple steps you now have multiple databases serving your website. This will definitely ease the load on your servers and will also allow you to scale easily by adding more slaves to your setup.
In my scenario the cost of scaling came to only around $20 per box a month, since I added servers with minimal specifications; they are only DB servers and do not need many resources to serve reads. If I need to scale further, I can throw in a few more boxes and handle more traffic easily.
Just an afterthought: it does make a difference whether your slave servers are on the same network as the master or on an outside network; as far as possible, try to add boxes on the internal network itself.
Did you find this guide interesting? Do you think this will help you scale your system? Do let me know through your comments. Oh and also do tell me if you have used any other method to scale your system or would prefer to do it differently.
It is good to use it in conjunction with Ryan’s Memcached Object Cache and Andy’s batcache.
Nice write-up. I hope I get a chance to try it out soon.
Keith – great story, and I just moved to a VPS, so I hope to try this out in the future.
BTW – can you contact me via email?
Thanks.
Just a question about your afterthought. What if the boxes are outside the network? Wouldn’t that use up more bandwidth? Plus there is the advantage of having backups in multiple locations.
Your article made me want to try that. I’m currently serving from my home computer and connection, and I would like to scale using my family’s computers and connections… wish me luck 😉
Hi Keith, thanks for this post. I am a web designer and not an IT person. But there have been situations where Joomla was suggested to us instead of WP, with the claim that WP breaks once a site passes a certain size. They probably need to read this.
As Lester mentioned, most scaling discussions should include a shared cache system. WP Super Cache is nice, but only if you have one web server. Once you even consider more than one web server you need to look at a shared cache system like memcached. Ryan’s memcached object cache plugin and batcache (from Andy) are a great combination that leverages memcached to provide a network-available cache system for all of your web servers.
Yeap, a central memcached cluster is a must for scaling
@Joseph and @Lester – I agree that memcached is essential, and I will be covering it in future parts of the series.
Very cool. Thanks Keith. Now I just have to find a new host so that I can go back to using a VPS. Media Temple was a nightmare.
The tough part about minimizing load times when you have traffic on a WordPress install is getting third-party plugins to play nicely with each other, with the theme you’re using, and with any customizations you make.
I have enjoyed WP-Minify, WP Super Cache, and XCache as my opcode cache, with Mystique as my theme of choice to customize, but then one of the three developers makes an upgrade and there goes compatibility.
Open source software like WordPress and its community is a good reminder that to really manage your own site you need to learn just about everything from client-side to server-side, unless you want to suffer a heart attack.
I have set up a VPS from scratch and realized that I should have used a 64-bit version (my mistake).
If you need a setup like THIS, don’t forget MySQL’s limitation of 2 GB per table on 32-bit Linux. So if you expect a lot of visitors and comments, you may want to choose the 64-bit version so you can go above the 2 GB limit.
Then the sky is the limit: actually 4 terabytes for one table, not the whole database. Just your comments table!
By the time you fill that up…
Source: http://dev.mysql.com/doc/refma.....table.html
Enjoy!
Yes, but doesn’t replication in this manner allow you to add in a 64-bit MySQL database for the reads, assuming you NEED more than 2 GB for your table?
Thanks for this very useful post. Will be bookmarking this; will definitely use it before I run out of tweaks 😉
Oh, here is a question I have been toying with: would it speed things up to replicate databases on the same box? Having one database for writes and one (or more) for reads, particularly on a blog that has a number of authors, might make things quicker if the database is the bottleneck?
Hi there,
Really good info and it clarified a lot regarding the master/slave usage.
Will all my content be going into one massive database?
Is that the best way, or would it be better to separate content into many databases?
For example: one site – one DB.
I am a bit confused, as I don’t think WordPress.com is using one massive database for all 60 million people.
Thanks for any help, and I hope I make sense.
Regards,
Jerome
So this is what HyperDB does. 😀 At first I really couldn’t understand what the use of it was, but now I know that it’s very useful. 😀
Part 2?
Hi
Thanks for this post. I’m going to implement it on my own site. I’m going to try to provide high availability over multiple geographical sites.
Right now my “single point of failure” is the internet connection at site A. In the next couple of months, I’m going to create site B and set up the replication. (Both sites will be connected with a site-to-site VPN.)
Do you know how the MySQL master/slave setup reacts to a connection issue between the two databases? I mean, what if the link between the master and the slave is down? I do a create and an update on the master database; the change can’t reach the slave since the link is down. Ten minutes later the link is back up. Will the master retry? Will it push the previous updates when a new insert/update is done?
Kr.