Opened 4 years ago
Closed 3 years ago
#1146 closed defect (fixed)
Recount functions are incredibly poor on performance/resources and timeout
| Reported by: |
|
Owned by: |
|
|---|---|---|---|
| Priority: | high | Milestone: | 1.0.3 |
| Component: | Administration | Version: | 1.0.2 |
| Severity: | critical | Keywords: | has-patch tested |
| Cc: |
Description
All of the recount functions need to be rewritten to work completely internal with mysql instead of external loops. There is a serious growing problem with query counts that can go into the MILLIONS.
If a forum has half a million topics, many of the recount functions first fetch ALL the topic id's and THEN loop through them, writing to meta for each one separately. Updatemeta uses at least 3 queries each.
So to use just one of the recount functions on a forum with a half million topics, you currently pull an array of the topic id's several megabytes in size into ram, then proceed to use 500,000 x 3 = 1.5 MILLION QUERIES to update the meta.
Then you repeat that for many of the recount functions. By the time you are done, you've used over 10 MILLION QUERIES and melted even the most robust of mysql servers into a puddle (but probably timed out most PHP setups well before that).
I have an internal recount function in Post Count Plus that does the recount completely in mysql and handles a half million posts in 6 seconds. The manual loop method can't even finish in the 30 second timeout on most PHP installs.
Attachments (3)
Change History (28)
This little query will recount all posts for all topics, 100% in mysql, no external php looping requires and only one query from php
INSERT INTO bb_topics (topic_id, topic_posts) (SELECT topic_id, count(post_status) as topic_posts FROM bb_posts WHERE post_status = '0' GROUP BY topic_id) ON DUPLICATE KEY UPDATE topic_posts=VALUES(topic_posts)
in addition to recounting posts in topics above:
recount topics in forums in one query
INSERT INTO bb_forums (forum_id, topics) (SELECT forum_id, count(topic_status) as topics FROM bb_topics WHERE topic_status = '0' GROUP BY forum_id) ON DUPLICATE KEY UPDATE topics=VALUES(topics)
recount posts in forums in one query
INSERT INTO bb_forums (forum_id, posts) (SELECT forum_id, count(post_status) as posts FROM bb_posts WHERE post_status = '0' GROUP BY forum_id) ON DUPLICATE KEY UPDATE posts=VALUES(posts)
- Milestone set to 1.1
- Owner set to sambauers
- Priority changed from normal to high
- Severity changed from normal to critical
- Status changed from new to assigned
These are a great start. Recounting is currently a dangerous affair for anyone with any reasonable amount of data.
Prioritising this for 1.1 and maybe before that if I can get time. I'd appreciate more of your SQL query writing if you have time.
It would be nice to just have a single recount button with no options if we can get these fast enough.
One small limitation on this is that sometimes we can't do queries which query more than one table very reliably.
For example, I was just rewriting the recount of topics replied for each user in this way, but my user tables are in a separate DB from the rest of the tables.
As a result, the sub-query wouldn't work. The solution was to manually build the values to insert from the sub-query that is run separately.
Three queries (DELETE, SELECT, INSERT) instead of two (DELETE, INSERT WITH SUB QUERY).
Still multiple times faster than before though.
A really (really, really) big table might run into a MAX_QUERY_SIZE limit too, we can probably work around that quite simply though.
Oh sorry, I forgot to update this ticket, I've now rewritten a few of the recounting functions for bbPress 0.9 - though not all of them yet and a couple of them will have to be changed for 1.0
Oh and I highly recommend you keep them separate and not just one big recount button.
Even the faster versions take several seconds on very large forums and it's possible that PHP will still timeout if they are all done at once. I would NOT even put in a "check all" button, people should look and select only what they need and do it in multiple passes if necessary to keep server loads down.
comment:10
sambauers — 4 years ago
comment:11
sambauers — 4 years ago
comment:12
sambauers — 4 years ago
comment:13
sambauers — 4 years ago
comment:14
follow-up:
↓ 15
Nightgunner5 — 4 years ago
bb_recount_topic_tags is missing a global $bbdb;
comment:15
in reply to:
↑ 14
Gautam Gupta — 3 years ago
Replying to Nightgunner5:
bb_recount_topic_tags is missing a global $bbdb;
That's fixed. Check [2391].
- Milestone changed from 1.1 to 1.5
comment:17
GautamGupta — 3 years ago
- Keywords needs-patch added
- Milestone changed from 1.5 to 1.1
- Summary changed from recount functions are incredibly poor on performance/resources and timeout to Recount functions are incredibly poor on performance/resources and timeout
- Version changed from 1.0 to 1.0.2
Now, only 3 functions are left to be optimized.
comment:18
sambauers — 3 years ago
I added a first attempt at faster tag recounting. Unfortunately it was necessary to bypass the taxonomy API completely. It was also necessary to change the way each function returns. Recount functions used to return a message, now they return an array with the first value being an "exit" code and the second being the message. Recount facilities added by plugins can still just return a message if they like or an array similar to the core functions.
comment:19
sambauers — 3 years ago
- Keywords needs-testing added; needs-patch removed
comment:20
sambauers — 3 years ago
Just reattached patch to fix a bad message being displayed in some cases when requesting a "Remove deleted topics from users' favorites" recount.
comment:21
sambauers — 3 years ago
- Keywords has-patch added
comment:22
GautamGupta — 3 years ago
- Keywords tested added; needs-testing removed
The patch works great, I just tested it on a high activity forum. Here's the log that came back:
Counting the number of posts in each topic… Complete! Counting the number of voices in each topic… Complete! Counting the number of deleted posts in each topic… Complete! Counting the number of topics in each forum… Complete! Counting the number of posts in each forum… Complete! Counting the number of topics to which each user has replied… Complete! Counting the number of topic tags in each topic… Complete! Counting the number of topics in each topic tag… Complete! Deleting topic tags with no topics… Already run with success. Removing deleted topics from user favorites… Nothing to remove! Cleaning up bbPM messages…
comment:23
sambauers — 3 years ago
- Milestone changed from 1.1 to 1.0.3
This should just be added to the next minor release.
comment:24
GautamGupta — 3 years ago
I conducted a statistical test, and here are the results (without any plugins active):
- No. of queries taken to normally load the recount page: 7
- No. of queries taken by the functions which were not optimized (not including 7 above): 39
- No. of queries taken by the functions which were optimized by the patch (not including 7 above): 15
That makes a difference of 24 queries! (More than 50% which is great)
IMO, this can go into the core.
comment:25
chrishajer — 3 years ago
- Resolution set to fixed
- Status changed from assigned to closed
Example:
Here is how you are doing the topic_voices recount in 1.0
It uses 3-4 queries per topic because updatemeta reads, then writes, then reads, as well as flush the cache each time and other memory management. On a large forum this function might not even finish, hitting the php timeout and exhausting mysql resources:
if ( $topics = (array) $bbdb->get_results("SELECT topic_id FROM $bbdb->topics ORDER BY topic_id") ) { $messages[] = __('Counted voices'); foreach ($topics as $topic) { $topic_id = (int) $topic->topic_id; if ( $voices = $bbdb->get_col( $bbdb->prepare( "SELECT DISTINCT poster_id FROM $bbdb->posts WHERE topic_id = %s AND post_status = '0';", $topic_id ) ) ) { $voices = count( $voices ); bb_update_topicmeta( $topic_id, 'voices_count', $voices ); } } unset($topics, $topic, $topic_id); }Here is how I do it in only 2 queries in my Topic Views plugin for 0.9
It is lightning fast as it's done 100% inside of mysql.
$query="SELECT 'voices_count',topic_id,count(DISTINCT poster_id) as meta_value FROM $bbdb->posts WHERE post_status = 0 GROUP BY topic_id"; $bbdb->query("DELETE FROM $bbdb->topicmeta WHERE meta_key = 'voices_count'"); $bbdb->query("INSERT INTO $bbdb->topicmeta (meta_key, topic_id, meta_value) $query");