Uploaded image for project: 'Sakai'
  1. Sakai
  2. SAK-28594

Quartz Job Very Slow (with MySQL)

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: CLOSED
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.5.6
    • Component/s: Sitestats
    • Labels:
      None
    • Environment:
      RHEL 4, Sun Java 1.5, MySQL 4.1
    • Previous Issue Keys:
      STAT-58

      Description

      Running the Quartz job on a MySQL-backed Sakai instance with several million rows in a database on a server-class machine takes an extemely long time. SHOW STATUS LIKE 'Created_tmp_%' shows an extemely high number of large (~13GB) temporary tables being created on disk.

      The root cause (per MySQL's slow-query-log) appears to be the MYSQL_GET_EVENT query contained in StatsAggregateJobImpl.java.
      The slow query log reports this query taking ~ 5min / 1000 records processed. A full-table scan combined with creation of a temporary table on disk. This temporary table in disk (13G on my test) is then created for each 1000 rows retrieved from the database.

      When explain is run on the the results contain type=ALL for SAKAI_SESSION as s – the MySQL manual suggests that that indicates probably performance problems. In effect, what's happening appears to be a creation of a large temporary join table, which is then recreated for each group.

      I have 2 thoughts on possible solutions:

      1. For MySQL simply remove the limit on the results returned by this query, and process the entire join at once – which should remove the continual recreation of the temporary table.

      2. Reformulate the query to only join portions of the session table for each chunk of data, e.g.

      === original query ===

      select EVENT_ID as EVENT_ID,EVENT_DATE as EVENT_DATE,EVENT as EVENT,REF as REF,SESSION_USER as SESSION_USER,e.SESSION_ID as SESSION_ID
      from SAKAI_EVENT e join SAKAI_SESSION s on e.SESSION_ID=s.SESSION_ID
      where EVENT_ID >= 1
      order by EVENT_ID asc
      limit 1000 offset 0

      === possible revised query ===

      select EVENT_ID, EVENT_DATE, EVENT, REF, SESSION_USER, e.SESSION_ID as SESSION_ID
      from SAKAI_EVENT e join SAKAI_SESSION s on e.SESSION_ID=s.SESSION_ID
      where EVENT_ID >= 1 and EVENT_ID < 1001 order by EVENT_ID asc

      Note: this query potentially returns a slightly different (e.g. less than 1000 in this example) number of records, if you're skipping admin events.

        Gliffy Diagrams

          Zeplin

            Attachments

              Activity

                People

                Assignee:
                nfernandes Nuno Fernandes (Inactive)
                Reporter:
                jayshao Jason Shao (Inactive)
                Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                  Dates

                  Created:
                  Updated:
                  Resolved:

                    Git Integration