Welcome to the Question2Answer Q&A. There's also a demo if you just want to try it out.
+2 votes
1.9k views
in Plugins by

Hi guys, having a blob table of close to 400 MB makes my list all uploads plugin (image manager) very slow when using "Show only unused images", and thus causing a 504 gateway time-out of the server:

"Gateway Time-out
The gateway did not receive a timely response from the upstream server or application.
Apache/2.2.22 Server at www.gute-mathe-fragen.de Port 80"

Even though that I specify to check only the images of the last 3 days. So I think that it is also due to the amount of posts (currently about 89 000) that are checked for containing the blob ID.

Has anyone an idea how I can improve the code so that there is no timeout?

Thanks a lot!
Kai

 

The php code of the listing page: https://github.com/echteinfachtv/q2a-list-uploads-page/blob/master/qa-list-uploads-page.php

by
Indeed it smells like a query taking too much. Enable the Q2A DEBUG flag. Check the queries the plugin is running. Paste them in the question.

1 Answer

+1 vote
by
I've just taken a very quick look at the link and glanced at the queries. You have a main query $queryRecentUploads that shouldn't take long to process for a fixed amount of days (although if you add an index on created field it will run in no time). The issue I see is that for each of those returned results you are running more queries. That is killing the DB. To make things worse this part "WHERE `content` LIKE '%"." will pretty sure burn your hard drive :) This is because it is checking the whole content of all matched posts... that is too much text processing.
 
It seems you're using it to look for a blob id. Most likely you can find that somewhere else and, in case it is not present (I don't know much about blobs) then the best approach (by far) will be to create your own table to track those posts and update it whenever a post is created with an event module. So your query will run against that table and it doesn't require any text processing.
 
Add the index on created first (now). That will improve performance, probably enough to get rid of the error. That will give you some time to, meanwhile, fix the plugin.
 
Just throwing some ideas.
by
Hi pupi1985, thanks for the input. I have not found any other way to check if the blob-ID (which is just a number) is contained in the posts, yet.

The plugin worked great up to 50000 posts or so. Then the gateway timeout appeared.

I could create an extra table, where I store uploaded images (their blob IDs). But when a user edits his questions and deletes the image later on. The blobID would be still in the table.

Today I have also thought of implementing the "processing" that q2a uses for "reindex content" and "recalculate user points" - then a timeout could be avoided. But I have not found time to look into the mechanism.

Of course, the best solution would be to check the currently uploaded images against the current (created, edited) posts so that redundant images can be removed. But here we would miss avatar images and possible images in pages. All seems a bit messy.

"Add the index on created first (now)" ← What do you mean?
by
"The blobID would be still in the table" -> Actually not. Use the event module to catch the question edit/removal/hiding/whatever event and then perform the text check only in that post. Then update your table accordingly to keep it up to date.

"reindex content": Be careful with that. It will very much depend on how you implement it but you should take extra attention to make it performant. Otherwise, the indexing might take more than the actual query and, hence, not be able to finish because of a timeout.

Add the index on created first (now): In order to keep the service (your plugin) up and running for as much time as possible, you should get this fixed ASAP (now). The plugin will require some refactorings that will take you some time to perform. The index on CREATED to speed the first query will only take a few minutes. You can do something like this:

ALTER TABLE qa_blob ADD INDEX img_man_idx1 (created);

Check the MySQL manual for more information: http://dev.mysql.com/doc/refman/5.0/es/alter-table.html

If you're lucky that will speed things up enough for you to win some time to fix the plugin keeping the service up and running the minimum amount of time :)
by
I wondered why I can list all images by visiting the default plugin page (./listuploads) but get a timeout when deleting.

Just now I see that I specified *30 days* for the delete action and not 3 days as for the listing. Changing $lastdays = 3; solves it - at least for now...
by
I've added some comments in github to improve performance and a bit of readability. Then, it is just a matter of getting those hands dirty :)
by
Hi pupi1985, thanks for your code suggestions. I will have a look over the next days. Thanks a lot :)
by
I finally rewrote the entire code but with a different approach - that seems to be more efficient: I go over all posts, extract all blobids, save them in a big array. Then I take the blobids of the last 3 days and check if they are in the array, if not, they are not used and can be deleted (frontend by the admin). In other words, only two database queries!

I have also implemented a rotate feature because too many users upload pictures rotated :)

The plugin will be released in the latest version after some more testing, say two months (Christmas?!).

Greetings,
Kai
...