  1. Sakai
  2. SAK-33838 GBNG: improve Import/Export processes
  3. SAK-33864

GBNG: Improve performance of Gradebook Import process

    Details

    • Type: Sub-task
    • Status: Verified
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 11.4
    • Fix Version/s: 12.1, 19.0
    • Component/s: Gradebook
    • Labels:
      None
    • 12 status:
      Resolved
    • Test Plan:
      Please add a Test Plan here.

      Description

      This PR contains several changes to the back-end code that performs the Gradebook Import:

      • Reduce the amount of logging performed during the saving routine, change the log level appropriately
        • Currently an INFO level log message is produced for each grade, and each comment, for each user in each "sheet" that is processed
        • For large imports, this produces a very large number of log statements, which bloats the logs and likely degrades performance
        • It could also be argued that logging grades and user identifiers at INFO level out of the box is a security concern
      • Bulk save grades and comments per assignment/gradebook item
        • The current process saves each grade and each comment for each student in each item serially, and waits for a return code from each save/update
        • This slows down the import process immensely; massive improvements in processing time are achieved by saving grades and comments in bulk, one assignment/grade item at a time
        • Also, the return value was never used. It was probably designed this way to give real-time feedback in case someone else was editing the same item/grade/comment; however, there is no code that actually does this. Even if we implemented it, what should happen? Should the Import abort or fail? Should the item that was concurrently edited be skipped? In our opinion, concurrent edits don't really matter in the context of an Import; the last saved value should always take precedence, and the grade log always shows who made changes and when.
        • Instead of returning one return code per grade/student/assignment, return one code per saveGradesAndComments() call
      • Eliminate String->Double conversion where appropriate
      • Reduce double/triple/etc querying
        • In multiple places the same data is queried repeatedly and then discarded, rather than being passed down the line for use elsewhere. Refactoring these places to reuse the already retrieved data yields large performance gains
      • Swap the CreateGradeItemStep/GradeImportConfirmationStep panels into place via AJAX to avoid a page refresh and the associated loading/rendering time between each step
      • Fixed a bug that caused loss of user data when navigating back and forth through the Import wizard:
        • To reproduce, import a gradebook item with no points, assign points in the import wizard, hit next, hit back, and see that your points are now missing.
        • Also, when going back and forth in the wizard and changing your mind about things like points, you are presented with "Title required and must be unique" when you finally try to save. The item is still created, but only with the first version you set up (i.e., if you picked 10 points and then went back to make it 15 points, you get the title error, but the item is created behind the scenes with 10 points, even if you just leave the wizard).
      • Fixed a bug where if you import an item with no points possible in the header, grading data would not be imported for this column. The item would be created, but all the grades in the corresponding spreadsheet column would be ignored.
        • To reproduce, import a spreadsheet containing one new gradebook item and a corresponding comment column. Do not give the item a points possible value in the header, but give some students both grades and comments. Import the file and notice that the wizard forces you to give the item a points possible value in the next step. Finish the wizard and notice that the item was created and all comments were imported, but none of the grades were.
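
      The bulk-save change described above (one save call and one return code per gradebook item, with the last saved value taking precedence) can be illustrated with a minimal sketch. Everything in it is hypothetical: Cell, FakeStore, SaveResult and this saveGradesAndComments() signature are stand-ins invented for illustration and are not Sakai's actual classes or service API. The fake store only counts round trips so the difference in call volume is visible.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types or methods are Sakai's real API.
public class BulkImportSketch {

    /** One student's grade and comment for a single gradebook item. */
    static final class Cell {
        final String studentId;
        final String grade;
        final String comment;

        Cell(String studentId, String grade, String comment) {
            this.studentId = studentId;
            this.grade = grade;
            this.comment = comment;
        }
    }

    /** One outcome per bulk call, instead of one return code per grade. */
    enum SaveResult { OK, NOTHING_TO_SAVE }

    /** Stand-in for the persistence layer; counts round trips to make the cost visible. */
    static final class FakeStore {
        int roundTrips = 0;
        final Map<String, Cell> saved = new LinkedHashMap<>();

        // Old shape: one call (and one round trip) per student per item.
        void saveOne(long itemId, Cell cell) {
            roundTrips++;
            saved.put(itemId + ":" + cell.studentId, cell);
        }

        // New shape: one call per item; a later value simply overwrites an earlier one.
        void saveAll(long itemId, List<Cell> cells) {
            roundTrips++;
            for (Cell cell : cells) {
                saved.put(itemId + ":" + cell.studentId, cell);
            }
        }
    }

    /** Saves every grade and comment for one item and reports a single result. */
    static SaveResult saveGradesAndComments(FakeStore store, long itemId, List<Cell> cells) {
        if (cells.isEmpty()) {
            return SaveResult.NOTHING_TO_SAVE;
        }
        store.saveAll(itemId, cells);
        return SaveResult.OK;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            cells.add(new Cell("student" + i, "85", "Good work"));
        }

        // Serial approach: one save per student.
        FakeStore serial = new FakeStore();
        for (Cell cell : cells) {
            serial.saveOne(1L, cell);
        }

        // Bulk approach: one save for the whole item.
        FakeStore bulk = new FakeStore();
        SaveResult result = saveGradesAndComments(bulk, 1L, cells);

        System.out.println("serial round trips: " + serial.roundTrips);
        System.out.println("bulk round trips:   " + bulk.roundTrips + " (" + result + ")");
    }
}
```

      In this sketch both approaches end up with the same stored data; only the number of calls differs, which is the property the import relies on when it treats the last saved value as authoritative.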

      Performance Metrics

      The following 5 scenarios were tested both before and after these modifications, using the same site, same users, same gradebook states, and same import files. The test site has the following characteristics:

      • Course site
      • 5000 student enrolments

      The 5 scenarios/import files are as follows:

      • Scenario #1:
        • We'll call this file "import1000_allNew.csv"
        • Start with empty gradebook (no gradebook items exist)
        • Import file has 1000 student rows
        • Import file has 2 gradebook item columns
        • Import file has 2 corresponding comment columns
        • Grading/comment data is only present in the first ~55 student rows
      • Scenario #2:
        • We'll call this file "import1000_updates.csv"
        • Start with the gradebook state after the import from Scenario #1
        • Import file has 1000 student rows
        • Import file has the same 4 gradebook item/comment columns as the Import file from Scenario #1
        • Grading/comment data is only present (updated) in the first ~55 rows
      • Scenario #3:
        • We'll call this file "import2000_allNew.csv"
        • Start with empty gradebook (no gradebook items exist)
        • Import file has 2000 student rows
        • Import file has 5 gradebook item columns
        • Import file has 5 corresponding comment columns
        • Does not contain grading or comment data for any of the student rows (it's essentially just creating the gradebook items)
      • Scenario #4:
        • We'll call this file "import2000_updates.csv"
        • Start with the gradebook state after the import from Scenario #3
        • Import file has 2000 student rows
        • Import file has the same 10 gradebook item/comment columns as the Import file from Scenario #3
        • Grading/comment data is present for most rows/columns
      • Scenario #5:
        • We'll call this file "import5000.csv"
        • Start with empty gradebook (no gradebook items exist)
        • Import file has 5000 student rows
        • Import file has 72 gradebook item columns
        • Import file does not have any corresponding comment columns
        • Grading data is present for most rows/columns

      Performance Results

      Scenario #1
        Pre-Modifications Results:
          • createNewItems block = 104 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 56817 ms
          • saveComment() called 2000 times; total elapsed = 13012 ms; average elapsed = 6 ms
          • saveGrade() called 2000 times; total elapsed = 31043 ms; average elapsed = 15 ms
          • total = 56923 ms
        Post-Modifications Results:
          • createNewItems block = 165 ms
          • itemsToMod block = 1 ms
          • itemsToSave block = 175 ms
          • saveGradesAndCommentsForImport() called 2 times; total elapsed = 138 ms; average elapsed = 69 ms
          • total = 349 ms
        Performance Increase: ~99.3%

      Scenario #2
        Pre-Modifications Results:
          • createNewItems block = 0 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 56988 ms
          • saveComment() called 2000 times; total elapsed = 12998 ms; average elapsed = 6 ms
          • saveGrade() called 2000 times; total elapsed = 31550 ms; average elapsed = 15 ms
          • total = 56989 ms
        Post-Modifications Results:
          • createNewItems block = 0 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 254 ms
          • saveGradesAndCommentsForImport() called 4 times; total elapsed = 220 ms; average elapsed = 55 ms
          • total = 261 ms
        Performance Increase: ~99.5%

      Scenario #3
        Pre-Modifications Results:
          • createNewItems block = 238 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 213340 ms
          • saveComment() called 0 times
          • saveGrade() called 10000 times; total elapsed = 148363 ms; average elapsed = 14 ms
          • total = 213579 ms
        Post-Modifications Results:
          • createNewItems block = 261 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 152 ms
          • saveGradesAndCommentsForImport() called 5 times; total elapsed = 109 ms; average elapsed = 21 ms
          • total = 420 ms
        Performance Increase: ~99.8%

      Scenario #4
        Pre-Modifications Results:
          • createNewItems block = 0 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 473194 ms
          • saveComment() called 10000 times; total elapsed = 64246 ms; average elapsed = 6 ms
          • saveGrade() called 10000 times; total elapsed = 393776 ms; average elapsed = 39 ms
          • total = 473196 ms
        Post-Modifications Results:
          • createNewItems block = 0 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 8646 ms
          • saveGradesAndCommentsForImport() called 10 times; total elapsed = 8571 ms; average elapsed = 857 ms
          • total = 8653 ms
        Performance Increase: ~98.2%

      Scenario #5
        Pre-Modifications Results:
          • createNewItems block = 3613 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 7566676 ms
          • saveComment() called 0 times
          • saveGrade() called 360000 times; total elapsed = 5729909 ms; average elapsed = 15 ms
          • total = 7570292 ms (~126 minutes)
        Post-Modifications Results:
          • createNewItems block = 3232 ms
          • itemsToMod block = 0 ms
          • itemsToSave block = 30970 ms
          • saveGradesAndCommentsForImport() called 72 times; total elapsed = 30480 ms; average elapsed = 423 ms
          • total = 34206 ms
        Performance Increase: ~99.5%
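
      The Performance Increase figures can be recomputed from the reported totals. The sketch below assumes the figure is the relative reduction in total elapsed time, (pre - post) / pre, which is not stated explicitly in this issue but agrees with the listed approximate percentages to within about 0.1 percentage point. The class and method names are made up for illustration; the totals are copied from the results above.

```java
import java.util.Locale;

// Hypothetical helper; the formula is an assumption about how the percentages were derived.
public class PerformanceIncrease {

    /** Relative reduction in elapsed time, as a percentage of the pre-modification total. */
    static double percentReduction(long preMs, long postMs) {
        if (preMs <= 0) {
            throw new IllegalArgumentException("preMs must be positive");
        }
        return (preMs - postMs) * 100.0 / preMs;
    }

    public static void main(String[] args) {
        // Totals in milliseconds for Scenarios #1 through #5, as reported in this issue.
        long[] pre = {56923L, 56989L, 213579L, 473196L, 7570292L};
        long[] post = {349L, 261L, 420L, 8653L, 34206L};

        for (int i = 0; i < pre.length; i++) {
            double pct = percentReduction(pre[i], post[i]);
            System.out.println(String.format(Locale.ROOT,
                    "Scenario #%d: %d ms -> %d ms, reduction %.2f%%",
                    i + 1, pre[i], post[i], pct));
        }
    }
}
```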


                People

                • Assignee:
                  bjones86 Brian Jones
                  Reporter:
                  bjones86 Brian Jones