Scalability Performance - Thousands of entries

Random stuff about serendipity. Discussion, Questions, Paraphernalia.
steve
Regular
Posts: 18
Joined: Sun Nov 30, 2003 3:42 am

Scalability Performance - Thousands of entries

Post by steve »

We're kinda pushing the envelope and have hit the wall now. Our Serendipity installation -0.71- has slowed down long enough where the index page load just times out!

The slow query is :

SELECT e.id, e.title, e.timestamp, e.comments, e.exflag, e.author, e.authorid, e.trackbacks, e.isdraft, e.allow_comments, e.last_modified, a.username, a.email , e.body, e.extended FROM blogDB_entries AS e LEFT JOIN blogDB_authors a ON e.authorid = a.authorid LEFT JOIN blogDB_entrycat ec ON e.id = ec.entryid LEFT JOIN blogDB_category c ON ec.categoryid = c.categoryid WHERE isdraft = 'false' AND e.timestamp <= '1105911050' GROUP BY e.id ORDER BY timestamp DESC LIMIT 15

This query never finishes. Any optimizing tips or tricks for large entry-count environments?
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Scalability Performance - Thousands of entries

Post by garvinhicking »

Hm, I'm personally working with an article base of about 600 entries and my query still executes within 0.1x seconds.

Did you check your SQL index keys, do they exist? Which MySQL version are you using, did you enable the query cache?

Serendipity 0.8 has a set of performance improvements, but this particular query you posted can't be much more improved. Did you try a "EXPLAIN SELECT ... " and look at that output?

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
steve
Regular
Posts: 18
Joined: Sun Nov 30, 2003 3:42 am

Scalability Performance - Thousands of entries

Post by steve »

We checked our index keys and yes they do exist.

We are stuck with this install at MySQL 3.23; we plan on migrating that server to 4.02 soon.

Explain results of the query are:

*** row 1 ***
table: e
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 40507
Extra: where used; Using temporary; Using filesort
*** row 2 ***
table: a
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 1
Extra: NULL
*** row 3 ***
table: ec
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 40509
Extra: NULL
*** row 4 ***
table: c
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: ec.categoryid
rows: 1
Extra: Using index

We do however- replicate to MySQL 4. When we execute that same query on that server (w/query cache 20mb), it also never completes. The query cache won't save the query and result set unless it does, so IMHO query cache is not our solution.

It * would appear* trivial to pull the latest 15 posts; mebbe we need some switch that can be thrown to use an alt query to do so -?

G, our entry table has more than 20 times your article base so we are working a huge entry table.

Regards;

Steve
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Scalability Performance - Thousands of entries

Post by garvinhicking »

Hi steve!

I'm not a big MySQL performance tuning guru. But reading this:

Code: Select all

*** row 1 ***
          table:  e
           type:  ALL
  possible_keys:  NULL
            key:  NULL
        key_len:  NULL
            ref:  NULL
           rows:  40507
          Extra:  where used; Using temporary; Using filesort
Looks bad - possible_keys should be used, right?

If you can provide a ZIPped SQL dump of your entries, I could try it on my server and see if I face the same issues; other than that, maybe you can get help from someone who knows how to read the EXPLAIN SELECTs properly? (because I can't :-))

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
tomsommer
Core Developer
Posts: 240
Joined: Tue Sep 02, 2003 6:43 pm
Location: Denmark
Contact:

Post by tomsommer »

filesort is the worst thing you can get...

It looks like your table might have lost its keys, try doing an OPTIMIZE on all of them (might take a while)
Tom Sommer (Serendipity Core Developer)
http://blog.dreamcoder.dk
tired_one
Regular
Posts: 25
Joined: Thu Jan 13, 2005 11:51 pm
Location: USA

Post by tired_one »

You may also want to run check table before doing optimize or analyze - it outputs basically the same info as myisamchk but doesn't require your tables to be out of comission while its running. In your case, if you run check table you probably want to use the option extended as it does a full key lookup for each row...but this can take a very long time.

Just so you know - when the Explain output says possibly_keys: NULL that means that Explain couldn't find any usable keys. And, using filesort is ugly - it means that the database engine is first doing the select part, then scanning all the rows returned a second time to index them and then going through the rows again to sort them based on the indexes created. its another indication that your tables aren't indexed properly.

So, it looks like your main problem child is the blogDB_entries table. Start with show index from blogDB_entries; - and compare the output of indices with your join statements and order by clause. Maybe you just need to add an index on your timestamp field since the only index that exists on that table by default is on the id field...and you are sorting on that timestamp field. ;) If that doesn't help things along, then do check table and/or optimize table.

-Robyn
steve
Regular
Posts: 18
Joined: Sun Nov 30, 2003 3:42 am

Everyone had good clues to point us in the right direction

Post by steve »

and thanks!

We changed some of the default install index creations:
_authors - added index on authorid
_entrycat - default installs no indexes, we indexed both the entryid and categoryid
_entries - added index on timestamp

the explain wound up looking like:

+-------+--------+------------------+-----------+---------+---------------+-------+---------------------------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+--------+------------------+-----------+---------+---------------+-------+---------------------------------------------+
| e | range | timestamp | timestamp | 5 | NULL | 40507 | where used; Using temporary; Using filesort |
| a | ALL | PRIMARY,authorid | NULL | NULL | NULL | 1 | |
| ec | ref | entryid | entryid | 4 | e.id | 1 | |
| c | eq_ref | PRIMARY | PRIMARY | 4 | ec.categoryid | 1 | Using index |
+-------+--------+------------------+-----------+---------+---------------+-------+---------------------------------------------+

Now the blog loads! it's running 3-6 seconds.

Prior to changing the indexes, we did optimize the table which *did not* help.

We see we are still getting filesort in the explain - any inputs will be gladly heard.

We tested most of the added indexex performance by deleting them one at a time and re-running the query. It seems each one was needed.

Once again a big thanks to the s9y dev team.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Everyone had good clues to point us in the right directi

Post by garvinhicking »

Hi steve!

About your index keys: An index on authorid should not be needed; the authorid key is already a primary key...

About entrycat: Our db.sql defaults to this:

Code: Select all

CREATE UNIQUE INDEX entryid_idx ON {PREFIX}entrycat (entryid, categoryid);
So there should already be an index for those.

Finally, the timestamp index on entries already exists regarding to our db.sql file:

Code: Select all

CREATE INDEX date_idx ON {PREFIX}entries (timestamp);
So I suppose when your Serendipity tables where created, the SQL user you created was not privileged to create index keys...

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
tired_one
Regular
Posts: 25
Joined: Thu Jan 13, 2005 11:51 pm
Location: USA

Post by tired_one »

I think there's something buggy about your db.sql file. ;) In the several installations I have going those indicies that you reference were never created by the install script...and I have goddess-like powers on all my servers and all my databases...lol.

So, I ran the commands manually using the same login I used in the configuration for s9y and the indices were created at that time.

-Robyn
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

On my server a s9y installation always creates the index keys. I know of at least 15 installations I made that it worked there as well (different machines every time, different mysql versions, different php versions).

Are you sure you installed Serendipity with the "goddess-like" useraccount? I can't imagine why it would not create keys, because here it does....

Do you have a reproducible installation where the creation of the indices fails? I'd love to have a look at that setup.

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
tired_one
Regular
Posts: 25
Joined: Thu Jan 13, 2005 11:51 pm
Location: USA

Post by tired_one »

Yes - I'm sure I used my 'goddess-like' account...on one box its the only account that exists for the database.

Do you have VMWare or something similar? I could send you a virtual machine that is set up exactly like my production machine where the index creation fails.

Honestly, I didn't even know that these indexes existed since I never bothered to look in the db.sql file prior to today. Maybe now that I created them manually my load times will improve.

-Robyn
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Tired one, do you use a Linux environment on that specific VMWare machine?

Could it be that you use an old PHP version? (below 4.0.5?)

I suppose you're using Serendipity 0.7.1 - could you please edit the serendipity_db_mysql.inc.php file and change the whole function serendipity_db_schema_import() with this one:

Code: Select all

function serendipity_db_schema_import($query) {
    static $search  = array('{AUTOINCREMENT}', '{PRIMARY}',
        '{UNSIGNED}', '{FULLTEXT}', '{FULLTEXT_MYSQL}', '{BOOLEAN}');
    static $replace = array('int(11) not null auto_increment', 'primary key',
        'unsigned'  , 'FULLTEXT', 'FULLTEXT', 'enum (\'true\', \'false\') NOT NULL default \'true\'');

    echo 'Input query: <pre>' . $query . '</pre><br />';
    echo 'Transformed query: <pre>' . str_replace($search, $replace, $query) . '</pre><br />';
    $result = serendipity_db_query(str_replace($search, $replace, $query));
    echo 'Result of query: <pre>' . print_r($result, true) . '</pre>';
    echo 'Possible errors: <pre>' . mysql_error() . '</pre>';
    echo '<br /><br />';
    return $result;
}
This one tracks all the queries send by importing...I'd love to see the output of this!

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
tired_one
Regular
Posts: 25
Joined: Thu Jan 13, 2005 11:51 pm
Location: USA

Post by tired_one »

Yes, I use Linux - on that particular box it is Red Hat Enterprise AS 3.0, and yes the version of s9y we are using is 0.7.1. But, our PHP is 4.3.4 and mySQL is 4.0.16.

So - here's the output of the clean install I just did:

Code: Select all

Attemping to setup database...
Checking to see if the database is already in place. If you see a database query error here, ignore it... Done
Creating default database setup...Input query: 

create table s9ynew_authors (username varchar(20) default null,password varchar(32) default null,authorid {AUTOINCREMENT} {PRIMARY},mail_comments int(1) default '1',mail_trackbacks int(1) default '1',email varchar(128) not null default '',userlevel int(4) {UNSIGNED} not null default '0',right_publish int(1) default '1');

Transformed query: 
create table s9ynew_authors (username varchar(20) default null,password varchar(32) default null,authorid int(11) not null auto_increment primary key,mail_comments int(1) default '1',mail_trackbacks int(1) default '1',email varchar(128) not null default '',userlevel int(4) unsigned not null default '0',right_publish int(1) default '1');

Result of query: 
1
Possible errors: 



Input query: 
create table s9ynew_comments (id {AUTOINCREMENT} {PRIMARY},entry_id int(10) {UNSIGNED} not null default '0',parent_id int(10) {UNSIGNED} not null default '0',timestamp int(10) {UNSIGNED} default null,title varchar(150) default null,author varchar(80) default null,email varchar(200) default null,url varchar(200) default null,ip varchar(15) default null,body text,type varchar(100) default 'regular',subscribed {BOOLEAN},status varchar(50) not null);

Transformed query: 
create table s9ynew_comments (id int(11) not null auto_increment primary key,entry_id int(10) unsigned not null default '0',parent_id int(10) unsigned not null default '0',timestamp int(10) unsigned default null,title varchar(150) default null,author varchar(80) default null,email varchar(200) default null,url varchar(200) default null,ip varchar(15) default null,body text,type varchar(100) default 'regular',subscribed enum ('true', 'false') NOT NULL default 'true',status varchar(50) not null);

Result of query: 
1
Possible errors: 



Input query: 
CREATE {FULLTEXT} INDEX body_idx on s9ynew_comments (body);

Transformed query: 
CREATE FULLTEXT INDEX body_idx on s9ynew_comments (body);

Result of query: 
1
Possible errors: 



Input query: 
create table s9ynew_entries (id {AUTOINCREMENT} {PRIMARY},title varchar(200) default null,timestamp int(10) {UNSIGNED} default null,body text,comments int(4) {UNSIGNED} default '0',trackbacks int(4) {UNSIGNED} default '0',extended text,exflag int(1) default null,author varchar(20) default null,authorid int(11) default null,isdraft {BOOLEAN},allow_comments {BOOLEAN},last_modified int(10) {UNSIGNED} default null,moderate_comments {BOOLEAN});

Transformed query: 
create table s9ynew_entries (id int(11) not null auto_increment primary key,title varchar(200) default null,timestamp int(10) unsigned default null,body text,comments int(4) unsigned default '0',trackbacks int(4) unsigned default '0',extended text,exflag int(1) default null,author varchar(20) default null,authorid int(11) default null,isdraft enum ('true', 'false') NOT NULL default 'true',allow_comments enum ('true', 'false') NOT NULL default 'true',last_modified int(10) unsigned default null,moderate_comments enum ('true', 'false') NOT NULL default 'true');

Result of query: 
1
Possible errors: 



Input query: 
create table s9ynew_references (id {AUTOINCREMENT} {PRIMARY},entry_id int(10) {UNSIGNED} not null default '0',link text,name text);

Transformed query: 
create table s9ynew_references (id int(11) not null auto_increment primary key,entry_id int(10) unsigned not null default '0',link text,name text);

Result of query: 
1
Possible errors: 



Input query: 
CREATE TABLE s9ynew_exits (entry_id int(11) NOT NULL default '0',day date NOT NULL,count int(11) NOT NULL default '0',scheme varchar(5),host varchar(128) NOT NULL,port varchar(5),path varchar(255),query varchar(255),PRIMARY KEY  (host,day,entry_id));

Transformed query: 
CREATE TABLE s9ynew_exits (entry_id int(11) NOT NULL default '0',day date NOT NULL,count int(11) NOT NULL default '0',scheme varchar(5),host varchar(128) NOT NULL,port varchar(5),path varchar(255),query varchar(255),PRIMARY KEY  (host,day,entry_id));

Result of query: 
1
Possible errors: 



Input query: 
CREATE TABLE s9ynew_referrers (entry_id int(11) NOT NULL default '0',day date NOT NULL,count int(11) NOT NULL default '0',scheme varchar(5),host varchar(128) NOT NULL,port varchar(5),path varchar(255),query varchar(255),PRIMARY KEY  (host,day,entry_id));

Transformed query: 
CREATE TABLE s9ynew_referrers (entry_id int(11) NOT NULL default '0',day date NOT NULL,count int(11) NOT NULL default '0',scheme varchar(5),host varchar(128) NOT NULL,port varchar(5),path varchar(255),query varchar(255),PRIMARY KEY  (host,day,entry_id));

Result of query: 
1
Possible errors: 



Input query: 
create table s9ynew_config (name varchar(255) not null,value text not null,authorid int(11) default '0');

Transformed query: 
create table s9ynew_config (name varchar(255) not null,value text not null,authorid int(11) default '0');

Result of query: 
1
Possible errors: 



Input query: 
CREATE TABLE s9ynew_suppress (ip varchar(15) default NULL,scheme varchar(5),host varchar(128),port varchar(5),path varchar(255),query varchar(255),last timestamp NOT NULL);

Transformed query: 
CREATE TABLE s9ynew_suppress (ip varchar(15) default NULL,scheme varchar(5),host varchar(128),port varchar(5),path varchar(255),query varchar(255),last timestamp NOT NULL);

Result of query: 
1
Possible errors: 



Input query: 
CREATE unique INDEX url_idx on s9ynew_suppress (host, ip);

Transformed query: 
CREATE unique INDEX url_idx on s9ynew_suppress (host, ip);

Result of query: 
1
Possible errors: 



Input query: 
CREATE TABLE s9ynew_plugins (name varchar(128) not null,placement varchar(6) not null default 'right',sort_order int(4) not null default '0',authorid int(11) default '0',PRIMARY KEY(name));

Transformed query: 
CREATE TABLE s9ynew_plugins (name varchar(128) not null,placement varchar(6) not null default 'right',sort_order int(4) not null default '0',authorid int(11) default '0',PRIMARY KEY(name));

Result of query: 
1
Possible errors: 



Input query: 
CREATE TABLE s9ynew_category (categoryid {AUTOINCREMENT} {PRIMARY},category_name varchar(255) default NULL,category_icon varchar(255) default NULL,category_description text,authorid int(11) default NULL,category_left int(11) default '0',category_right int(11) default '0',parentid int(11) DEFAULT '0' NOT NULL);

Transformed query: 
CREATE TABLE s9ynew_category (categoryid int(11) not null auto_increment primary key,category_name varchar(255) default NULL,category_icon varchar(255) default NULL,category_description text,authorid int(11) default NULL,category_left int(11) default '0',category_right int(11) default '0',parentid int(11) DEFAULT '0' NOT NULL);

Result of query: 
1
Possible errors: 



Input query: 
CREATE TABLE s9ynew_images (id {AUTOINCREMENT} {PRIMARY},name varchar(255) not null default '',extension varchar(5) not null default '',mime varchar(255) not null default '',size int(11) not null default '0',dimensions_width int(11) not null default '0',dimensions_height int(11) not null default '0',date int(11) not null default '0',thumbnail_name varchar(255) not null default '',authorid int(11) default '0',path text);

Transformed query: 
CREATE TABLE s9ynew_images (id int(11) not null auto_increment primary key,name varchar(255) not null default '',extension varchar(5) not null default '',mime varchar(255) not null default '',size int(11) not null default '0',dimensions_width int(11) not null default '0',dimensions_height int(11) not null default '0',date int(11) not null default '0',thumbnail_name varchar(255) not null default '',authorid int(11) default '0',path text);

Result of query: 
1
Possible errors: 



Input query: 
CREATE {FULLTEXT} INDEX pathkey_idx on s9ynew_images (path);

Transformed query: 
CREATE FULLTEXT INDEX pathkey_idx on s9ynew_images (path);

Result of query: 
1
Possible errors: 



Input query: 
CREATE TABLE s9ynew_entrycat (entryid int(11) not null,categoryid int(11) not null);

Transformed query: 
CREATE TABLE s9ynew_entrycat (entryid int(11) not null,categoryid int(11) not null);

Result of query: 
1
Possible errors: 



Input query: 
CREATE UNIQUE INDEX entryid_idx ON s9ynew_entrycat (entryid, categoryid);

Transformed query: 
CREATE UNIQUE INDEX entryid_idx ON s9ynew_entrycat (entryid, categoryid);

Result of query: 
1
Possible errors: 



Done
Done creating database


Serendipity was successfully installed on your system.
It did create the indices noted in the output, but notice how it didn't transform the following lines:

Code: Select all

CREATE {FULLTEXT_MYSQL} INDEX entry_idx on {PREFIX}entries (title,body,extended);
CREATE INDEX date_idx ON {PREFIX}entries (timestamp);
CREATE INDEX mod_idx ON {PREFIX}entries (last_modified);
So, I had to manually create them. But, those are the only 3 that were skipped...which is much better than it was.

-Robyn
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Tired_one - thanks a lot. Much.

With your help I was finally able to find the problem, which was happening on fresh installations. I didn't notice them because I often upgrade my blogs and upgrade their structure with intermediate DB files.

There was a bug in our SQL parsing routine which did not catch those SQL index creation keys.

I have fixed this in our latest 0.8 development CVS (0.8-alpha10). Those index keys were missing:

Code: Select all

CREATE {FULLTEXT_MYSQL} INDEX entry_idx ON {PREFIX}entries (title,body,extended);
CREATE INDEX date_idx ON {PREFIX}entries (timestamp);
CREATE INDEX mod_idx ON {PREFIX}entries (last_modified);
CREATE INDEX configauthorid_idx ON {PREFIX}config (authorid);
CREATE INDEX pluginauthorid_idx ON {PREFIX}plugins (authorid);
CREATE INDEX imagesauthorid_idx ON {PREFIX}images (authorid);
CREATE INDEX entrypropid_idx ON {PREFIX}entryproperties (entryid);
Thanks a lot,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
steve
Regular
Posts: 18
Joined: Sun Nov 30, 2003 3:42 am

tired_one - Garvin - Tom - tnx

Post by steve »

thanks for all the detective work in uncovering the problem. We implemented the sql statements and created the indexes.

tnx again!

We are running a shared installation (just migrated to it) and reviewed the other blogs' tables and note the indexes weren't created as well. We used the shared installation instruction set located at http://www.s9y.org/41.html for a number of these.

On one installation in particular, we followed Garvin's instruction set for upgrading -then converting to a shared installation (http://www.s9y.org/forums/viewtopic.php ... highlight=)

The entries table for this installation has these indexes:

+-------------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Comment |
+-------------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+---------+
| table entries | 0 | PRIMARY | 1 | id | A | 338 | NULL | NULL | |
| table entries | 1 | date_idx | 1 | timestamp | A | 338 | NULL | NULL | |
| table entries| 1 | mod_idx | 1 | last_modified | A | 338 | NULL | NULL | |
+-------------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+---------+


and is missing the fulltext index.

If we upgrade to the soon stable .8 release, this should add the proper indexes, am i right?

anyway tnx for all the brainpower
Post Reply