Before you read ahead, the problems that occurred to cause me headaches was partially my own mistake. However, I have not fully investigated other causes and will update the post once I get the chance to fully investigate the issue.
As a developer it is always easy to overlook minor things, some of them may not cause you much headaches, but some of them could eventually build up to give you nightmares because they are hard to spot.
Before I go ahead, please take a look at the screenshot below.
If you just saw a console with some SQL queries, let me point you to two big problems in the above screenshot.
- The total rows in the table wp_postmeta (which holds meta info for posts) is more than 10 million, 10140712 to be precise.
- The total number of rows in wp_postmeta without post_id is again more than 10 million, 10113548 to be precise.
To tell you the truth, some of these no post_id records were inserted by a custom code I wrote to shorten URLs using the Bit.ly API. Due to a small mistake in the code, the $post_id was written as $postt_id, not too hard to spot, but yes a mistake which several of us tend to overlook many times.
Due to this several records with empty post_id were inserted into the above table. However, this was quickly rectified within a week, when a Bit.ly rep emailed me to tell me that I was using their API excessively. I quickly checked the code, spotted the mistake and rectified it before it escalated. The reason why the API was called multiple times was because I was generating short URLs for posts which did not have them in the wp_postmeta table, and since it was entering an empty post_id into the table, it ran over and over again.
That problem was sorted out a couple of months back. However, recently my DB backups for which I checked size were 550MB+, this left me wondering as to what was wrong, as that huge a DB for WordPress is certainly abnormal, unless you have posts in the multiples of 5 figures. In addition to that, some other issues started to crop up with the server. I was firefighting it and did not give much heed to the DB size. After two days of firefighting I came back to the huge DB size. I even went and opened a thread on WP forums earlier today to see if anyone had a similar issue.
Finally the server issues were really becoming an headache, so I decided to run a query on the table to see what was happening. To my surprise I saw 10 million+ rows in it. A further query told me that more than 10 million of those rows were practically useless.
I went ahead and deleted the rows and now everything is back to normal. However, this raised several questions in my mind as to whether certain things need change or not.
Of course, I as a developer am partly responsible for this. I do check code I release as plugins several times, have it tested and then release it. Though something I write for myself hardly gets the same treatment.
That aside such small mistakes can quickly bring down a high traffic site, as it did with mine, and should certainly not happen with anyone else.
What Could be Done to Avoid This?
Like I said earlier, empty post id values in the post_meta are useless, however, due to some flaws they do get entered into the database. This should be avoided.
I have few suggestions for this.
- As a developer make sure that you don’t add empty values when they are not allowed and will not be useful.
- Another thing plugin developers should ensure is that, when they are updating records for a meta value, they should explicitly delete the old values for the same meta, or it will quickly escalate into the above issue.
- WordPress could either have a foreign key check on the post_meta table or include a check to return errors for empty post_id.
Neither of the above suggestions are hard to implement. However, like I said such small things can easily be overlooked and could eventually build up into a big issue, like the unwanted 10 million records I had. There might be other plugins which might have the same problem. With so many of them, it is really hard to figure out which one’s have them. If I would not have been a developer and known my stuff, it would be really hard to even detect any of the above problems.
And I do accept that it was my mistake, but I am certain that my code alone did not add up to add 10 million useless records in the DB. I am yet to fully investigate the problem, will update the post when I complete it.
This post is more of an eye-opener to me, and gives me another insight to troubleshoot problems with WordPress. I thought it best to share this with people on WLTC, as it might be beneficial to you, or while troubleshooting problems for others.
It would be interesting to know if anyone of you have faced this problem in the past.
Ohhh, So you got big Db because of your bit.ly api plugin customization.
Dude test & research your plugins not your main site, like create a testing place somewhere, so if something weird happens doesn’t affects your techie-buzz
I too was looking for plugin to automatically make short urls for each post & make it available below posts by php call function
Dude share me that plugin. I prefer stumble upon links
Holy crap on a cracker! 10 million rows? 🙂
One more tip:
define('WP_DEBUG', true);
should be the best developer’s friend 🙂Yes, for development environment WordPress should always be in debug mode and with that you can clean up all minor warning and most important avoid all kinds of big errors that are not easy to spot.
Good idea, you have reminded me to do that on my localhost installation.
Thank you for posting this. Though I have never encounter this, it will make me more careful in the future.
I’ve opened a ticket to prevent WordPress from inserting rows into wp_postmeta if post_id is 0.
This is a very interesting article. I’ve never run into this problem, but I still find it a great article and a really informative post. Thanks!
Isn’t that the purpose of the $unique parameter in add_post_meta? Set it to true and it won’t add a new key or over-write the old one. The function returns false in that case so you know to use update_post_meta.
A post_id of 0 is perfectly valid and a logical way of storing default post meta values – the code that reads meta data will automatically get those default values when a new post is created (where a plug-in provides custom editing on the post editor).
If WP removed the ability to use post_id=0 I’d open a ticket to report a bug.
WOW i guess it is the little thing that can become BIG problems. I am not a techie but I can see this bringing down a site.