|
Sorry to upgrade this to blocker, but this bug will have the end effect of losing student and faculty work. We have seen a catalina.out grow to 33 GB in an hour. Once the disk is full, students will be unable to submit assignments, because Sakai attempts to write each upload to tmp space first.
Hi Zhen, Yes, our catalina.out was full of the same text as Charles reported. We use logrotate to truncate the catalina.out on a daily basis, but if the catalina.out is able to grow without bound before logrotate runs, users will be unable to upload files once the disk is full.
Thanks for the quick fix. I will try to replicate and test on trunk. Just to confirm, we had a server failure here at UM within the last hour because of the same problem!
From Matthew Jones: "Potpie going down looks to be related to 2009-09-23 15:38:30,027 [TP-Processor101] WARN org.sakaiproject.assignment.impl.BaseAssignmentService - org.sakaiproject.assignment.impl.DbAssignmentService@48262730 zipSubmissions --IOException unable to create the zip file for userIOE 265_ Lab 1/Chatlapalli . . . At least that's the tail of the log. The file is too big to get off at the moment. The error looks pretty crazy! (Here's a screenshot, the numbers go on until 11803). The log file is almost 2 gigabytes, with the other log files being in the double digit megabyte range."

I have tested on trunk and Download All continues to work as expected. I have not been able to replicate the issue on the latest trunk.
Matthew Jones has provided a script that can be used for testing. Basically, it slows down the HTTP traffic. To test the Download All feature, one can run the script on one's own computer and force the IOException during the zip download.
The instructions from Matthew are as follows:

"- Uncompress it, sudo to root, and move it to root's home directory (/var/root/.bashrc).
- Now you need to change root's shell to use bash (as root, run chsh and edit the line to say Shell: /bin/bash instead of Shell: /bin/sh).
- Source the file (source /var/root/.bashrc).
- Now these commands will be available either as root or via sudo <command>.

It has a few commands currently in it that are active on that terminal (you can use 0 for the bit/s, or you can use something with a kilo value [like 300k or 1500k]):

#oraset <speed> = Sets oracle to a certain speed in bit/s
#httpset <speed> = Slows down http traffic (port 80 and 443) to speed
#orachange <speed> = Changes speed of oracle
#httpchange <speed> = Changes speed of http
#pipereset = Resets everything back to default"

Note: httpset 0 will actually unlimit the bit/s, so "httpset 1" should be good enough for slowing down the transfer. The related bash file is attached.

The change should work, but it's not ideal. If a problem occurs, it stops output for that user but continues with the next. The problem is that you then have an incomplete ZIP file, but the user doesn't know it. I'd abort at the first failure and produce an error that is visible to the user.
Based on the thread below, I will change the code to (1) abort the thread after the first IOException; and (2) generate a UI alert, same as the log message.
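A minimal Java sketch of the two changes proposed above, under simplifying assumptions. The names `ZipAbortSketch`, `zipSubmissions`, `Result`, and `FailingOutputStream` are hypothetical and exist only for illustration. This is not Sakai's actual code or the committed fix. It stops at the first IOException instead of moving on to the next user, logs once, and hands back a message that the caller could show as a UI alert.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-in for the Download All logic; not Sakai's real classes.
class ZipAbortSketch {

    /** Outcome of one download attempt; alert is null when everything was written. */
    static final class Result {
        final int entriesWritten;
        final String alert;

        Result(int entriesWritten, String alert) {
            this.entriesWritten = entriesWritten;
            this.alert = alert;
        }
    }

    /** Simulates a dropped connection: every write fails. */
    static final class FailingOutputStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("connection reset (simulated)");
        }
    }

    /**
     * Writes one zip entry per user (user name -> submission text).
     * The single try/catch wraps the whole loop, so the first IOException
     * ends the run instead of being logged once per remaining user.
     */
    static Result zipSubmissions(Map<String, String> submissions, OutputStream out) {
        int written = 0;
        String currentUser = null;
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, String> e : submissions.entrySet()) {
                currentUser = e.getKey();
                zip.putNextEntry(new ZipEntry(currentUser + ".txt"));
                zip.write(e.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
                written++;
            }
        } catch (IOException ex) {
            // Abort: one log line, and the same text goes back to the UI.
            String alert = "Unable to create the zip file"
                    + (currentUser != null ? " for user " + currentUser : "");
            System.err.println("zipSubmissions --IOException " + alert);
            return new Result(written, alert);
        }
        return new Result(written, null);
    }

    public static void main(String[] args) {
        Map<String, String> submissions = new LinkedHashMap<>();
        submissions.put("alice", "Lab 1 report");
        submissions.put("bob", "Lab 1 report");

        Result ok = zipSubmissions(submissions, new ByteArrayOutputStream());
        System.out.println("entries written: " + ok.entriesWritten);

        Result failed = zipSubmissions(submissions, new FailingOutputStream());
        System.out.println("alert: " + failed.alert);
    }
}
```

Because the loop sits inside a single try block, a broken connection produces one log line and one alert rather than one message per remaining user, which is what kept catalina.out from growing without bound in this sketch's model of the problem.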
"On Sep 24, 2009, at 3:42 PM, Charles Hedrick wrote: I agree it's likely that they can't. I still think it's safest to abort at the first error. On Sep 24, 2009, at 3:34 PM, Zhen Qian wrote: Chuck: I think the problem happens most likely when the user has network connection problems. So do you think the user will be able to view the alert message within the tool? Thanks, - Zhen |
I could not remember why the looping was there in the first place. It doesn't seem right to me, either. I will remove it for now.