One simple trick to clean up your WordPress database

Most WordPress sites likely run Akismet, which comes bundled with every new copy. What they probably don’t know is that, over time, Akismet will cause your wp_commentmeta table to balloon in size.

Use wp db size --tables to see the size of all tables:

$ wp db size --tables
+-----------------------+--------+
| Name                  | Size   |
+-----------------------+--------+
| wp_users              | 9 KB   |
| wp_usermeta           | 66 KB  |
| wp_posts              | 274 MB |
| wp_comments           | 48 MB  |
| wp_links              | 3 KB   |
| wp_options            | 41 MB  |
| wp_postmeta           | 25 MB  |
| wp_terms              | 796 KB |
| wp_term_taxonomy      | 621 KB |
| wp_term_relationships | 905 KB |
| wp_termmeta           | 48 KB  |
| wp_commentmeta        | 687 MB |
+-----------------------+--------+

Whoa! wp_commentmeta is way larger than I’d expect it to be. What’s going on there?

Let’s take a look at what keys are used:

$ wp db query "SELECT DISTINCT meta_key FROM wp_commentmeta"
+-----------------------+
| meta_key              |
+-----------------------+
| akismet_result        |
| akismet_history       |
| ERRating              |
| akismet_user_result   |
| akismet_user          |
| akismet_rechecking    |
| akismet_as_submitted  |
| akismet_pro_tip       |
| _wp_trash_meta_status |
| is_customer_note      |
| rating                |
+-----------------------+

Hm. Can’t quite tell what might be a large one from that set. Let’s look at a random comment:

$ wp comment meta list 659968
+------------+----------------------+-----------------------------------------------------------------------------------------+
| comment_id | meta_key             | meta_value                                                                              |
+------------+----------------------+-----------------------------------------------------------------------------------------+
| 659968     | akismet_result       | true                                                                                    |
| 659968     | akismet_history      | {"time":1503414726.1497,"event":"check-spam"}                                           |
| 659968     | akismet_as_submitted | {"comment_author":"coach handbag tassels embroidery","comment_author_email":"[email protected] |
|            |                      | il.com","comment_author_url":"http:\/\/www.bestcoachbag.store\/accordion-zip-wallet-in- |
|            |                      | signature-embossed-leather-p-37.html","comment_content":"Hi, i think that i saw you vis |
|            |                      | ited my site thus i came to return the favor?I'm attempting to find things to enhance m |
|            |                      | y website!I suppose its ok to use some of your ideas!!","comment_type":"","user_ip":"58 |
|            |                      | .19.83.5","user_agent":"Mozilla\/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Mozill |
|            |                      | a\/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.0.3705)","blog":"http:\ |
|            |                      | /\/example.com","blog_lang":"en_US","blog_charset":"UTF-8","permalink":"http:\/\/exa |
|            |                      | ple.com\/one-two-three"}                                           |
+------------+----------------------+-----------------------------------------------------------------------------------------+

What on Lord’s earth? Why is Akismet storing a copy of every comment in comment meta? I wonder how large that is in total:

$ wp db query "SELECT sum(char_length(meta_value)) FROM wp_commentmeta WHERE meta_key='akismet_as_submitted'"
+------------------------------+
| sum(char_length(meta_value)) |
+------------------------------+
|                    567045828 |
+------------------------------+

Ah, I see. Akismet’s akismet_as_submitted entry is 567 MB of the 1 GB total database size. That makes sense.

PS I tried to list the total comments but then I ran into a WP-CLI bug. Que será, será.

Questions to ask when determining why a WordPress site is going down

Working on a WordPress site that’s crashing all of the time? Here are some qualifying questions you can use to determine why it might be going down:

  • When does the site go down? Is it a particular time of day that could be correlated to traffic patterns?
  • When the site goes down, what do you do to bring it back up?
  • When did the problem start happening? Is it closely related to some other change that was made to the site?
  • What have you done in previous attempts to fix the problem?
  • How are the servers configured (RAM, etc.)? Does adding more capacity remediate the problem?
  • What caching plugins are you using? Do you have other caching strategies in place (e.g. Varnish or Cloudflare)?
  • What plugins are you running? How customized is your theme?

Importantly, these qualifying questions should give you better visibility into the nature of the problem, to help guide further debugging.

Markup normalization

A small selection of the Vim statements required to normalize every possible variant of shitty markup entered by copy and paste online editors into 35,000 articles over the last eight years:

:%s/,|||,/=nr2char(11)/g
:%s/,|||/=nr2char(11)/g
:%s/""/=nr2char(21)/g
:%s/"//g
:exe '%s/' . nr2char(11) . '/","/g'
:exe '%s/' . nr2char(21) . '/"/g'
:exe '%s/"$//g' # add a
:%s/^/"/g
:%s/<br /><p><br />/</p><p>/g
:%s/<p></p><p>/<p>/g
:%s/<BR><br />/<br />/g
:%s/<br /><p></p><p>/<p>/g
:%s/<p><br />/<p>/g
:%s/</p></p><p>/</p><p>/g
:%s/</p><br /><p>/</p><p>/g
:%s/<br /><p>/</p><p>/g
:%s/<p><p>/<p>/g
:%s/<P>/</p><p>/g
:%s/</p></p>/</p>/g
:%s/<br /></p><p>/</p><p>/g

Two more things: 1) Anyone who’s ever tried to tell you to use find and replace in bbEdit for large files is dead wrong. 2) College Publisher, you suck ****. ‘,|||,’ is not a valid delimiting character. Quit being malicious.

Lastly, if I’ve thought ahead, I would’ve tracked invalid markup against prevalence and date range. That would’ve made for a fascinating anthropological study.

#wcbos: Advanced Theme Performance Techniques

Frederick Townes is the founder of W3 Edge, CTO at Mashable and author of W3 Total Cache. He’s presenting today on WordPress theme performance best practices.First, he recommends contributing back to the WordPress Codex because everyone in the room thinks it could be improved.

Pay lots of attention to the hierarchy with page templates.

Think about how many files you’re loading into memory, and the overall footprint they end up consuming. You can track this down using xdebug.

Fundamentals:

  • The larger the heap, the greater the execution time.
  • “Graduate” groups functions to plugins.
  • The fewer files the better.
  • Explore and use microformats for reviews, businesses & organizations, products, and people.
  • Use external services and fail gracefully.

W3 Total Cache has a debug mode that will show you what’s being cached on a request and what’s being missed.

Trick to debug on production:

define( 'WP_DEBUG', true );
// log to wp-content/debug.log, useful tests on production
define( 'WP_DEBUG_LOG', true );
define( 'WP_DEBUG_DISPLAY', false );