
Key: SAK-7407
Type: Task
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Unassigned
Reporter: Ray Davis
Votes: 0
Watchers: 1
Sakai

Analyze and reduce data contention in Gradebook

Created: 15-Dec-2006 10:20   Updated: 23-Oct-2008 10:46
Component/s: Gradebook
Affects Version/s: 2.3.0
Fix Version/s: 2.4.0

Time Tracking:
Not Specified


2.6.x Status: None
2.5.x Status: None
2.4.x Status: None


Description
For some time, the Gradebook development team has anticipated the possibility of serious data contention when updating assignment scores and course grades for large course sites, particularly when there are multiple graders and when programmatic clients such as Samigo and Assignments let students update scores.

Although no one has as yet been able to reliably reproduce any issues, we seem to be getting early reports of chickens coming home to roost.


Ray Davis added a comment - 15-Dec-2006 10:25
Here are the potential subtasks I've come up with so far:

* Add more log messages specifically to help in analyzing deadlocks.

* When it's fairly certain that the Gradebook will be updating a selected record, use SELECT FOR UPDATE to reduce the chances of deadlock. We can make that decision much smarter if we change the application logic for Assignment Details and Course Grade to:
- Remember the old score/grade values in the bean.
- Compare the form's old values to the form's new values and use that to decide what has to be updated (rather than comparing the form's new values to what's currently in the DB).

* Consider breaking the multiple-update transactions into multiple single-update transactions. (Ask our local DBAs for advice about this.)

* This is more of an annoyance when analyzing logs than a real performance drain, but I see truly ridiculous numbers of read-only transactions coming from getGradebookUid and isSiteMemberInRole, probably because JSF evaluates expressions like rendered="#{overviewBean.userAbleToGradeAll}" over and over. Look into some simple request-scoped caching to take care of this.

* Hardest one, and possibly the most effective one: Redesign our logic to postpone the DB updating of CourseGradeRecord's PointsEarned and SortGrade as long as possible.
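
To make the "compare the form's old values to the form's new values" idea concrete, here is a minimal sketch. It is not the Gradebook code: the class name, the method name, and the use of plain maps keyed by student ID are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of the Gradebook code base.
final class ScoreChangeDetector {
    /**
     * Returns only the entries whose submitted value differs from the value
     * the form originally displayed. Keys are hypothetical student IDs.
     * The database is never consulted to make this decision.
     */
    static Map<String, Double> changedScores(Map<String, Double> shownInForm,
                                             Map<String, Double> submitted) {
        Map<String, Double> changed = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : submitted.entrySet()) {
            Double before = shownInForm.get(entry.getKey());
            // Objects.equals treats "no score before, no score now" as unchanged.
            if (!Objects.equals(before, entry.getValue())) {
                changed.put(entry.getKey(), entry.getValue());
            }
        }
        return changed;
    }
}
```

Only the records named in the returned map would need to be selected for update, so untouched rows are never locked.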

Ray Davis added a comment - 15-Dec-2006 13:47
In revision 19616, I added a special logger to look for grade table data contention instances and causes.

It can be selectively turned on with the following log4j properties:

log4j.logger.org.sakaiproject.tool.gradebook.business.impl.GradebookManagerHibernateImpl.GB_DATA=debug
log4j.logger.org.sakaiproject.component.gradebook.GradebookServiceHibernateImpl.GB_DATA=debug

The messages will also show up if the parent class's logger is at a "debug" level or lower.
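
The inheritance rule can be illustrated with a small sketch. Note that it uses java.util.logging rather than log4j so that it runs on the JDK alone; FINE stands in for log4j's debug level, and the logger names are hypothetical placeholders, not the real Gradebook loggers.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Analogy only: java.util.logging, not log4j. Names are hypothetical.
final class LoggerInheritanceSketch {
    static final String PARENT = "example.gradebook.ManagerImpl";
    static final String CHILD = PARENT + ".GB_DATA";

    /** Reports whether the child logger emits FINE messages when only the parent's level is set. */
    static boolean childEmitsFineWhenParentIs(Level parentLevel) {
        Logger parent = Logger.getLogger(PARENT);
        Logger child = Logger.getLogger(CHILD);
        child.setLevel(null);         // no explicit level: inherit from the nearest ancestor
        parent.setLevel(parentLevel); // only the parent is configured
        return child.isLoggable(Level.FINE);
    }
}
```

With the parent at FINE or a more verbose level the child logs its messages; with the parent at INFO it stays silent unless it is configured explicitly, which is what the two properties above do.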

Ray Davis added a comment - 22-Dec-2006 13:10
After comparing the benefits of various strategies, I finally decided to try the most radical approach and completely remove calculated course grades from the database. Instead, they're now calculated on the fly when they're needed.
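
As a rough illustration of what "calculated on the fly" means, the sketch below derives a percentage from a student's assignment scores at read time instead of reading a stored total. The class, the Score type, and the rule that ungraded items are skipped are assumptions for this example, not the actual Gradebook calculation.

```java
import java.util.List;

// Hypothetical sketch, not the Gradebook business logic.
final class CourseGradeSketch {
    /** One gradable item: points earned (null if not yet graded) out of points possible. */
    static final class Score {
        final Double pointsEarned;
        final double pointsPossible;

        Score(Double pointsEarned, double pointsPossible) {
            this.pointsEarned = pointsEarned;
            this.pointsPossible = pointsPossible;
        }
    }

    /** Returns the percentage earned across graded items, or null if nothing is graded yet. */
    static Double percentEarned(List<Score> scores) {
        double earned = 0;
        double possible = 0;
        for (Score score : scores) {
            if (score.pointsEarned == null) {
                continue; // assumption: ungraded items do not count against the student
            }
            earned += score.pointsEarned;
            possible += score.pointsPossible;
        }
        if (possible == 0) {
            return null;
        }
        return 100.0 * earned / possible;
    }
}
```

Because nothing derived is written back, saving one assignment score no longer has to touch a shared course grade row.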

I moved this change into the Sakai trunk in revision 20001.

If there are no bugs, the switch should be invisible to existing users. Behind the scenes, however, this is a major modification of all business logic relating to course grades.

Before this check-in, users could receive contention errors (and lose work) in scenarios such as the following:

* Grader A is scoring Student X on Quiz 1 at around the same time Grader B is scoring Student X on Quiz 2.

* Grader A is scoring Student X on Quiz 1 at around the same time Grader B is scoring Student Y on Quiz 1.

* Grader A is scoring Student X on Quiz 1 at around the same time Instructor is overriding Student X's course grade.

* Student X is submitting a Gradebook-linked assignment at around the same time Grader A is scoring Student X on a different assignment.

After this change, the ONLY time data contention should occur is when two graders are trying to change the same score for the same student at around the same time.
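
The remaining case can be pictured with a toy in-memory store. This sketch assumes that conflicts are detected by a per-record version check (optimistic locking), which the comments in this issue do not confirm; the class and the key format are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model under stated assumptions; not the Gradebook persistence layer.
final class VersionedScoreStore {
    private static final class Row {
        double points;
        int version;
    }

    private final Map<String, Row> rows = new HashMap<>();

    /** Returns the current version of a score record (0 if it does not exist yet). */
    synchronized int readVersion(String key) {
        Row row = rows.get(key);
        return row == null ? 0 : row.version;
    }

    /** Writes only if no one else has written the record since expectedVersion was read. */
    synchronized boolean update(String key, int expectedVersion, double points) {
        Row row = rows.computeIfAbsent(key, k -> new Row());
        if (row.version != expectedVersion) {
            return false; // someone else changed this same score first
        }
        row.points = points;
        row.version++;
        return true;
    }
}
```

Two graders writing "studentX/quiz1" and "studentX/quiz2" touch different records and both succeed. Only two writes to the same key that started from the same version collide, and the later one is rejected.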

By rearranging the business logic, I've tried to minimize the downside of the change. However, two views will now require more reading from the database: the student's view of automatically calculated course grades (when they've been released to students but have not been explicitly overridden by the instructor), and the display of gradebook-wide mean values in the instructor's Course Grade Details view. Given students' anxiety over discovering final grades as quickly as possible, the student view is much more likely to become a performance issue, and it would be good to plan for some serious load testing over the next month or so. (And some serious QA testing, too.)

(As a little background, the original Gradebook design was influenced by our belief that we'd soon be adding grade curving as a feature, and by our experiences dealing with rabid student curiosity in the UC Berkeley pilot project. However, with the addition of Gradebook programmatic clients such as Samigo, Assignments, and Message Center, and with the addition of support for course sections and GSIs/TAs, reducing data contention when scoring has turned out to be a much higher priority than supporting curved grading.)

Ray Davis added a comment - 22-Dec-2006 13:17
If we run into performance problems around the grades, the most obvious next step I see is to break the course grades and assignment scores into separate database tables, and to break the course grade definitions and assignment definitions into separate database tables. Leaving them all together might make them too difficult to optimize.

I didn't take that step now, though, because that would require massive changes to existing production databases. The changes I checked in are fully compatible with existing data.

Megan May added a comment - 08-Feb-2007 08:35
Updating fix version from nightly2/trunk to 2.4.0.001 in prep for first QA tag (Prelim testing pre-code freeze)

Ray Davis added a comment - 26-Feb-2007 10:39
Quick update: A team at Indiana performed load testing to replicate the unnecessary deadlocking, and then to verify the fix.

Megan May added a comment - 12-Apr-2007 13:45
Tasks for 2.4 have been completed. If you find a problem related to a task, please create a bug report and use the Link feature to create a connection between the two issues.