Performance - adding indexes to stats_downloads table

Project:RUcore Statistics
Version:8.1.1
Component:Code
Category:bug report
Priority:normal
Assigned:chadmills
Status:closed
Description

A summary is below. I think adding these two indexes to dev and then test and getting some new performance figures is worthwhile. The building of the indexes took minutes and I used last night's production MySQL dump. Here is a summary, using real world queries performed by the software.

Query 1
-------
SELECT DISTINCT datastreamID FROM stats_downloads;

Added Index
-----------
ALTER TABLE `stats`.`stats_downloads` ADD INDEX `datastreamID` (`datastreamID`);

Performance
-----------
Before INDEX added: 152 rows in set (4.01 sec)
After INDEX added: 152 rows in set (0.05 sec)

Query 2; a typical example
--------------------------
SELECT COUNT(a.datastreamID) AS count, a.datastreamID as datastreamID FROM stats_downloads a, stats_identity b WHERE a.identity=b.iid AND a.object="rutgers-lib:31143" GROUP BY datastreamID order by datastreamID;

Added Index
-----------
ALTER TABLE `stats`.`stats_downloads` ADD INDEX `object` (`object`);

Performance
-----------
Before INDEX added: 1 row in set (4.27 sec)
After INDEX added: 1 row in set (0.00 sec)

Comments

#1

On dev these are the performance gains using the two queries listed in original comment. This was run against the current stats database on dev and resource rutgers-lib:4291 had the most entries that is why it was chosen as an example. After the index the performance increase is notable.

Query 1
-------
SELECT DISTINCT datastreamID FROM stats_downloads;

Added Index
-----------
ALTER TABLE stats_downloads ADD INDEX `datastreamID` (`datastreamID`);

Performance
-----------
Before INDEX added: 324 rows in set (0.10 sec)
After INDEX added: 324 rows in set (0.01 sec)

Query 2; a typical example
--------------------------
SELECT COUNT(a.datastreamID) AS count, a.datastreamID as datastreamID FROM stats_downloads a, stats_identity b WHERE a.identity=b.iid AND a.object="rutgers-lib:4291" GROUP BY datastreamID order by datastreamID;

Added Index
-----------
ALTER TABLE stats_downloads ADD INDEX `object` (`object`);

Performance
-----------
Before INDEX added: 2 rows in set (0.15 sec)
After INDEX added: 2 rows in set (0.01 sec)

#2

On test these are the performance gains using the two queries listed in comment 1. This was run against the current stats database on dev and resource rutgers-lib:24203 had the most entries that is why it was chosen as an example. After the index the performance increase is notable.

Query 1
-------
SELECT DISTINCT datastreamID FROM stats_downloads;

Added Index
-----------
ALTER TABLE stats_downloads ADD INDEX `datastreamID` (`datastreamID`);

Performance
-----------
Before INDEX added: 347 rows in set (0.09 sec)
After INDEX added: 347 rows in set (0.01 sec)

Query 2; a typical example
--------------------------
SELECT COUNT(a.datastreamID) AS count, a.datastreamID as datastreamID FROM stats_downloads a, stats_identity b WHERE a.identity=b.iid AND a.object="rutgers-lib:24203" GROUP BY datastreamID order by datastreamID;

Added Index
-----------
ALTER TABLE stats_downloads ADD INDEX `object` (`object`);

Performance
-----------
Before INDEX added: 2 rows in set (0.09 sec)
After INDEX added: 2 rows in set (0.02 sec)

#3

Version:8.1» 8.1.1
Status:active» fixed

#4

Status:fixed» closed

Back to top