Skip to:
Content

Opened 22 months ago

Closed 22 months ago

Last modified 22 months ago

#1885 closed defect (fixed)

WP_Meta_Query look up on thread_id causing severe slowness, linear growth in query time with post count

Reported by: vibol Owned by: johnjamesjacoby
Milestone: 2.1 Priority: normal
Severity: normal Version: 2.0
Component: Actions/Filters Keywords:
Cc:

Description

bbPress uses wp_postmeta.meta_key to store post <-> thread relationships in the form of thread_ids assigned to a post. This is causing severe slowness on forums with many millions of posts.

The following information is from my original thread on this. I've consolidated the information below for a quick read. Original link to diagnosis: http://bbpress.org/forums/topic/bbpress-202-sql_calc_found_rows-and-16mm-posts-painfully-slow

===

I was troubleshooting an issue with a rather large migration to bbPress (1.6MM posts) and bbPress 2.0.2 is very slow to load thread pages.

It looks like bbPress (WP_Query) is trying to do an unindexed query lookup:

CAST(wp_postmeta.meta_value AS CHAR) = '787852'

The query takes quite some time and does a full table scan:

# Time: 120518 19:23:53
# User@Host: db[db] @ localhost []
# Query_time: 9.688145  Lock_time: 0.000050 Rows_sent: 3  Rows_examined: 1623367
SET timestamp=1337387033;
SELECT SQL_CALC_FOUND_ROWS  wp_posts.* FROM wp_posts  INNER JOIN wp_postmeta ON (wp_posts.ID = wp_postmeta.post_id) WHERE 1=1  AND wp_posts.post_type IN ('topic', 'reply') AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') AND ( (wp_postmeta.meta_key = '_bbp_topic_id' AND CAST(wp_postmeta.meta_value AS CHAR) = '787852') ) GROUP BY wp_posts.ID ORDER BY wp_posts.post_date ASC LIMIT 0, 25;

Stack trace:

/home/site/public_html/wp-includes/query.php   2498   get_sql ()   -- /home/site/public_html/wp-includes/
/home/site/public_html/wp-includes/query.php   2913   get_posts ()   -- /home/site/public_html/wp-includes/
/home/site/public_html/wp-includes/query.php   3000   query ()   -- /home/site/public_html/wp-includes/
/home/site/public_html/wp-content/plugins/bbpress/bbp-includes/bbp-reply-template.php   129   __construct ()   -- /home/site/public_html/wp-content/plugins/bbpress/bbp-includes/
/home/site/public_html/wp-content/themes/site/bbpress/content-single-topic.php   35   bbp_has_replies ()   -- /home/site/public_html/wp-content/themes/site/bbpress/
/home/site/public_html/wp-includes/theme.php   1117   require ()   -- /home/site/public_html/wp-includes/
/home/site/public_html/wp-includes/theme.php   1091   load_template ()   -- /home/site/public_html/wp-includes/
/home/site/public_html/wp-includes/general-template.php   128   locate_template ()   -- /home/site/public_html/wp-includes/
/home/site/public_html/wp-content/plugins/bbpress/bbp-includes/bbp-core-compatibility.php   144   get_template_part ()   -- /home/site/public_html/wp-content/plugins/bbpress/bbp-includes/
/home/site/public_html/wp-content/themes/site/single-topic.php   33   bbp_get_template_part ()   -- /home/site/public_html/wp-content/themes/site/
/home/site/public_html/wp-includes/template-loader.php   43   include ()   -- /home/site/public_html/wp-includes/
/home/site/public_html/wp-blog-header.php   16   require_once ()   -- /home/site/public_html/
/home/site/public_html/index.php   17   require ()   -- /home/site/public_html/

Here's the $bbp_r query passed into WP_Query:

Array
(
    [meta_query] => Array
        (
            [0] => Array
                (
                    [key] => _bbp_topic_id
                    [value] => 1623928
                    [compare] => =
                )

        )

    [post_type] => Array
        (
            [0] => topic
            [1] => reply
        )

    [orderby] => date
    [order] => ASC
    [posts_per_page] => 25
    [paged] => 1
    [s] =>
    [post_status] => publish,closed
)

The generated query fragment that is problematic is this:

wp_postmeta.meta_key = '_bbp_topic_id'
AND CAST(wp_postmeta.meta_value AS CHAR) = '787852'

This looks like an issue with WP_Meta_Query issuing a CAST for all rows in wp_postmeta during the query.

Relevant LOCs in meta.php:

$where[$k] = ' (' . $where[$k] . $wpdb->prepare("CAST($alias.meta_value AS {$meta_type}) {$meta_compare} {$meta_compare_string})", $meta_value);

Attachments (1)

#1885.patch (620 bytes) - added by johnjamesjacoby 22 months ago.

Download all attachments as: .zip

Change History (7)

comment:1 johnjamesjacoby22 months ago

  • Milestone changed from Awaiting Review to 2.1
  • Owner set to johnjamesjacoby

Have been meaning to fix the meta queries for a while. I think it's important to get this one fixed for 2.1.

Fix incoming.

comment:2 johnjamesjacoby22 months ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [4059]) Performance:

  • Add 'numeric' to applicable meta-queries to avoid casting as char.
  • Remove meta-queries, and use post_parent where possible.
  • Introduce _bbp_has_replies_where() filter, attached to 'posts_where' which is responsible for adding the lead topic to the results. This avoids having to use a costly meta-query, potentially resulting in full table scans.
  • Audit meta-queries, and tweak where needed.
  • Fixes #1885.
  • Props vibol for investigation.

comment:3 follow-up: vibol22 months ago

John, I did some preliminary tests and the results aren't looking good. Here are the slow queries that came through.

The first is the Forum index page that lists all the threads. The second is the Thread index page that lists all the posts.

For the first query, last_active_time now looks to be the culprit. It's a postmeta lookup with a sort.

For the second query, it looks like wp_post.post_parents is the culprit. The type_status_date index in wp_post does not have post_parents as an index field so MySQL does not have a compatible index for the lookup.

# Time: 120705  2:37:05
# Query_time: 12.126533  Lock_time: 0.000091 Rows_sent: 25  Rows_examined: 313770
SET timestamp=1341473825;
SELECT SQL_CALC_FOUND_ROWS  wp_posts.* FROM wp_posts  INNER JOIN wp_postmeta ON (wp_posts.ID = wp_postmeta.post_id) WHERE 1=1  AND wp_posts.post_parent = 17  AND wp_posts.post_type = 'topic' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') AND (wp_postmeta.meta_key = '_bbp_last_active_time' ) GROUP BY wp_posts.ID ORDER BY wp_postmeta.meta_value DESC LIMIT 0, 25;
# Time: 120705  2:37:24
# Query_time: 11.814214  Lock_time: 0.000102 Rows_sent: 6  Rows_examined: 1623851
SET timestamp=1341473844;
SELECT SQL_CALC_FOUND_ROWS  wp_posts.* FROM wp_posts  WHERE 1=1  AND (wp_posts.ID = 1623928 OR wp_posts.post_parent = 1623928)  AND wp_posts.post_type IN ('topic', 'reply') AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') GROUP BY wp_posts.ID ORDER BY wp_posts.post_date ASC LIMIT 0, 25;

comment:4 in reply to: ↑ 3 johnjamesjacoby22 months ago

Replying to vibol:

John, I did some preliminary tests and the results aren't looking good. Here are the slow queries that came through.

For the first query, last_active_time now looks to be the culprit. It's a postmeta lookup with a sort.

Meta queries are always going to be a slow like this. There isn't an available index we can use here to avoid the meta query. We *could* hack in repurposing post_date as the freshness time, but we lose the actual date the topic was posted. It would take some rearchitecting to move the original post creation date to postmeta, to make the index available.

For the second query, it looks like wp_post.post_parents is the culprit. The type_status_date index in wp_post does not have post_parents as an index field so MySQL does not have a compatible index for the lookup.

What happens if you change the 'orderby' in bbp_has_replies() from 'date' to 'ID'? That will take the weight off of using the type_status_date index, and instead should use the primary and post_parent ones.

Odd that the previous meta-query method was 2 seconds faster.

johnjamesjacoby22 months ago

comment:5 vibol22 months ago

Got it, for the postmeta issue. Sounds like it's a bit involved.

On the bbp_has_replies() function, sorting by ID is a good optimization. I tried it and while it causes MySQL to use the PRIMARY index key, MySQL does not appear to be using the index for post_parent. Here's the EXPLAIN:

EXPLAIN SELECT wp_posts.* FROM wp_posts WHERE 1=1  AND (wp_posts.ID = 1218973 OR wp_posts.post_parent = 1218973)  AND wp_posts.post_type IN ('topic', 'reply') AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') GROUP BY wp_posts.ID ORDER BY wp_posts.ID ASC LIMIT 0, 25 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wp_posts
         type: index
possible_keys: PRIMARY,type_status_date,post_parent,type_status_id_parent
          key: PRIMARY
      key_len: 8
          ref: NULL
         rows: 44
        Extra: Using where
1 row in set (0.00 sec)

From this, MySQL *should* be using an index merge on PRIMARY and post_parent, but it's not. Changing the query to use FORCE INDEX solves the performance problem:

EXPLAIN SELECT wp_posts.* FROM wp_posts FORCE INDEX (PRIMARY, post_parent) WHERE 1=1  AND (wp_posts.ID = 1218973 OR wp_posts.post_parent = 1218973)  AND wp_posts.post_type IN ('topic', 'reply') AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') GROUP BY wp_posts.ID ORDER BY wp_posts.ID ASC LIMIT 0, 25 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wp_posts
         type: index_merge
possible_keys: PRIMARY,post_parent
          key: PRIMARY,post_parent
      key_len: 8,8
          ref: NULL
         rows: 54
        Extra: Using union(PRIMARY,post_parent); Using where; Using filesort
1 row in set (0.00 sec)

comment:6 vibol22 months ago

Just to add, on the performance improvement side:

SELECT wp_posts.* FROM wp_posts WHERE 1=1  AND (wp_posts.ID = 1218973 OR wp_posts.post_parent = 1218973)  AND wp_posts.post_type IN ('topic', 'reply') AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') GROUP BY wp_posts.ID ORDER BY wp_posts.ID ASC LIMIT 0, 25;
...
25 rows in set (12.60 sec)
SELECT wp_posts.* FROM wp_posts FORCE INDEX (PRIMARY, post_parent) WHERE 1=1  AND (wp_posts.ID = 1218973 OR wp_posts.post_parent = 1218973)  AND wp_posts.post_type IN ('topic', 'reply') AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') GROUP BY wp_posts.ID ORDER BY wp_posts.ID ASC LIMIT 0, 25;
...
25 rows in set (0.00 sec)
Note: See TracTickets for help on using tickets.