Opened 10 months ago
Last modified 7 weeks ago
#1925 new defect
Performance: Optimize _bbp_last_active_time to use an indexed field for very large deployments
| Reported by: |
|
Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | 2.4 |
| Component: | General | Version: | 2.1.2 |
| Severity: | normal | Keywords: | dev-feedback |
| Cc: | vibol, georgemamadashvili@…, wordpress@…, jared@…, stephen@…, nashwan.doaqan@… |
Description
Background ticket: http://bbpress.trac.wordpress.org/ticket/1885
Background Conversation:
Vibol: For the first query, last_active_time now looks to be the culprit. It's a postmeta lookup with a sort.
JJJ: Meta queries are always going to be a slow like this. There isn't an available index we can use here to avoid the meta query. We *could* hack in repurposing post_date as the freshness time, but we lose the actual date the topic was posted. It would take some rearchitecting to move the original post creation date to postmeta, to make the index available.
Problem: The _bbp_last_active_time query is doing a full table scan and sort against the postmeta table when retrieving forum index pages. The slowdown is linear with size of table scan. Here's an example slow query:
# Time: 120804 19:52:17 # Query_time: 21.460766 Lock_time: 0.000051 Rows_sent: 25 Rows_examined: 313770 SET timestamp=1344127937; SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts INNER JOIN wp_postmeta ON (wp_posts.ID = wp_postmeta.post_id) WHERE 1=1 AND wp_posts.post_parent = 17 AND wp_posts.post_type = 'topic' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') AND (wp_postmeta.meta_key = '_bbp_last_active_time' ) GROUP BY wp_posts.ID ORDER BY wp_postmeta.meta_value DESC LIMIT 0, 25;
Expected behavior: Should be fast if the data is indexed, but the postmeta table does not have indexes against it.
Guidance: I'm looking for someone to fix this, or, I am happy to put in the time to fix it with guidance on desired approach. This is a non-trivial issue, so someone with more experience in WP/BBP is ideal.
Attachments (2)
Change History (18)
comment:2
johnjamesjacoby — 9 months ago
As I dig through this, it seems like there are more issues that crop up. See the following set of queries that happen from forum view to thread reply.
# Query_time: 14.640429 Lock_time: 0.000133 Rows_sent: 25 Rows_examined: 313770 SET timestamp=1345373401; SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts INNER JOIN wp_postmeta ON (wp_posts.ID = wp_postmeta.post_id) WHERE 1=1 AND wp_posts.post_parent = 17 AND wp_posts.post_type = 'topic' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'closed') AND (wp_postmeta.meta_key = '_bbp_last_active_time' ) GROUP BY wp_posts.ID ORDER BY wp_postmeta.meta_value DESC LIMIT 0, 25; # Time: 120819 5:51:22 # Query_time: 11.424189 Lock_time: 0.000054 Rows_sent: 1 Rows_examined: 1623849 SET timestamp=1345373482; SELECT COUNT( DISTINCT post_author ) FROM wp_posts WHERE ( post_parent = 1623928 AND post_status = 'publish' AND post_type = 'reply' ) OR ( ID = 1623928 AND post_type = 'topic' ); # Time: 120819 5:51:50 # Query_time: 27.115751 Lock_time: 0.000070 Rows_sent: 55996 Rows_examined: 56000 SET timestamp=1345373510; SELECT ID FROM wp_posts WHERE post_parent = 17 AND post_status IN ( 'publish', 'closed' ) AND post_type = 'topic' ORDER BY ID DESC;
The last one I'll have to try to attach because it's too big to fit in this field.
WP posts have four date fields. post_date, post_date_gmt, post_modified and post_modified_gmt. Feels hacky but we could use the post_modified or post_modified_gmt in the forums and topics records for the last_active_time. Even without index, not having to join against postmeta will increase the performance a lot
comment:6
johnjamesjacoby — 6 months ago
Sadly, since all of the core database columns are used by WordPress already, reusing a column for our own purposes is probably dangerous, and will take more work to override the way WordPress naturally writes to those columns than it's probably worth.
post_modified *almost* makes sense, but we should be fully aware of what WordPress core uses that for before hijacking it. If it's used for comparing revisions, we're out of luck; if it's not used at all, we're at risk of it being deprecated and disappearing completely (like post_category).
comment:7
johnjamesjacoby — 5 months ago
- Milestone changed from 2.3 to 2.4
Query times are the average of 20 runs. First couple of runs ignored as to get MySQL "hot".
DB Size:
forums 5
topics 29208
replies 95755
Main query now (ordering by meta_value):
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
INNER JOIN wp_postmeta
ON ( wp_posts.ID = wp_postmeta.post_id )
WHERE 1 = 1
AND wp_posts.post_parent = 4
AND wp_posts.post_type = 'topic'
AND ( wp_posts.post_status = 'publish'
OR wp_posts.post_status = 'closed'
OR wp_posts.post_status = 'private' )
AND ( wp_postmeta.meta_key = '_bbp_last_active_time' )
GROUP BY wp_posts.ID
ORDER BY wp_postmeta.meta_value DESC
LIMIT 0, 15
Avg. time: 162ms
Main query without the join:
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
WHERE 1 = 1
AND wp_posts.post_parent = 4
AND wp_posts.post_type = 'topic'
AND ( wp_posts.post_status = 'publish'
OR wp_posts.post_status = 'closed'
OR wp_posts.post_status = 'private' )
GROUP BY wp_posts.ID
ORDER BY wp_posts.post_modified DESC
LIMIT 0, 15
Avg. time: 14.9ms
I'm cleaning the patch to switch all uses of _bbp_last_active_time meta to post_modified. Will post asap.
First pass
1) The batch update script in tools.php is not working. Need to fix.
2) I need to change the importer in includes/admin/converters/bbPress1.php, but I don't have a clue how. Need to research how it works. (Never played with the importer)
comment:10
vibol — 3 months ago
Finally swinging back around to this. MZAWeb, are you using your changes in production?
Another field that has come to my attention is menu_order. It's primarily used for menu items, but is seldom used for Posts. There are already many plugins that use it to allow users to manually order their Pages and Posts. It's an int(11) so it should be able to fit a timestamp until about 2033.
In either case (post_modified or menu_order), altering the index on the table to add the newly referenced field should help speed things up as well since post_modified and menu_order are not in the indices.
I'd be curious to hear your thoughts.
comment:11
vibol — 3 months ago
An update.
First off, thank you MZAWeb for contributing your patch. The patch makes a significant difference in performance for my use case, enough so that I intend to use it in production.
The only concern would be curiosity into where WordPress updates last_modified--I believe it does this if a Post is edited (in this case, a Topic). I'm still mulling over the idea of using menu_order, simply because that is a field that is seldom changed by WordPress.
I'll put up changes I made to MZAWeb's patch that prevents an infinite recursion when a Forum is created. I'll also include an index that I've added to further speed up the query.
comment:12
jaredatch — 8 weeks ago
- Cc jared@… added
comment:13
in reply to:
↑ 9
netweb — 8 weeks ago
- Cc stephen@… added
Replying to MZAWeb:
2) I need to change the importer in includes/admin/converters/bbPress1.php, but I don't have a clue how. Need to research how it works. (Never played with the importer)
This part should be pretty straightforward (for me), I'll test your patch and add a patch for the importers here next week.
comment:14
follow-up:
↓ 15
alex-ye — 7 weeks ago
- Cc nashwan.doaqan@… added
I really like this ticket it will help to get a better performance , I read your conversation about using last_modified or menu_order but It seems not a good idea for me , And you know why ! ...
There are some other options we can use :
1 - Create a new table "wp_forum_data" like an example ,
with an indexed columns for the most needed things .
2 - Extend the wp_posts table columns .
comment:15
in reply to:
↑ 14
;
follow-up:
↓ 16
MZAWeb — 7 weeks ago
Replying to alex-ye:
I really like this ticket it will help to get a better performance , I read your conversation about using last_modified or menu_order but It seems not a good idea for me , And you know why ! ...
Actually, I'm not sure why it doesn't seem a good idea for you. Care to elaborate? I'm on the fence myself and would love to hear arguments.
There are some other options we can use :
1 - Create a new table "wp_forum_data" like an example ,
with an indexed columns for the most needed things .
2 - Extend the wp_posts table columns .
I totally get the arguments in favor of that, I really do. But I'm quite sure adding a new table or field it's not gonna happen.
comment:16
in reply to:
↑ 15
alex-ye — 7 weeks ago
Replying to MZAWeb:
Care to elaborate? I'm on the fence myself and would love to hear arguments.
The main arguments is :
1 - WordPress is still use the post_modified, post_modified_gmt and menu_order , So using this fields by bbPress in another meaning can cause many bugs as my expectation .
2 - _bbp_last_active_time is not the only meta value which make SQL queries slow , bbPress use many meta fields , and it may be add more in the future
3 - What if the keymaster want to know the really last modified forum ?! , last_active_time and post_modified have a confuse meaning ! , This is not logical to me
I totally get the arguments in favor of that, I really do. But I'm quite sure adding a new table or field it's not gonna happen.
I think adding a new table can be good idea if we have many meta to index , but can you explain why you refuse to add a new column ?
So... We're sort of stuck. The approach you quoted from our previous conversation is the only available index in the posts table that makes sense to try to use. We could also, maybe, try a subquery using the ID and post_parent indexes, but that only works if we can guarantee that newer replies will always have a higher ID. This *should* be the case, but it's hard to know for sure given the way the importer works, and how other forum software stores its data.
It's a problem not easily solved due to our limited schema and a lack of indexes on the postmeta table. Happy that you're around looking into it too.