  1. Contrib: Evaluation System
  2. EVALSYS-369

competing servers in a cluster trying to do the same initialization


    Details

    • Type: Bug
    • Status: CLOSED
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.1 Minor fixes
    • Fix Version/s: 1.2
    • Component/s: Database / DAOs
    • Labels:
      None
    • Environment:
      cluster

      Description

      The idea here is that we need a way to ensure that preloads, table fixes, etc. run on a single server rather than on every server in the cluster. Basically, a general locking strategy.

      This is critical because, without it, multiple servers starting up for the first time will load duplicate data.

      ========================================================================
      The thread on this, in reverse chronological order:

      If you depend on the session factory then that will init first,
      followed by the service. Hence it's safe to update tables in the service.

      Otherwise it would not be safe to use the session factory in the init
      of the service... or am I just lucky with wiki?
      Ian

      These suggestions are interesting but we are using Hibernate and as
      far as I know there is not a way to check for the init in Hibernate as
      the table init happens outside the control of your code.

      As far as best practices go there are a few pages that mention them:
      http://confluence.sakaiproject.org/confluence/display/BOOT/Sakai+Programming+Best+Practices
      http://confluence.sakaiproject.org/confluence/display/SAKDEV/Best+Practices+for+High+Quality+Code
      http://confluence.sakaiproject.org/confluence/display/SAKDEV/Best+Practices+for+Kernel+code

      I think there may be others.
      -AZ

      I agree with this recommendation, and would add that it also seems to
      be a reasonable pattern to recommend for Quartz jobs: for most jobs it
      would automatically enforce "run at most once" without requiring
      configuration or scheduling tricks.

      If the job takes longer than a few seconds, it's probably also
      reasonable, where possible, to chunk the job and allow multiple nodes
      to do multiple chunks. Admittedly this is probably more common in the
      Quartz scenario than for initializing tables.
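
A minimal sketch of the chunking idea, assuming the chunks are rows in a shared table and each node claims one with a conditional update. The table `job_chunk`, its columns, and the function name are hypothetical, and SQLite (in autocommit mode) stands in for the cluster's shared database purely so the example can run on its own.

```python
import sqlite3


def claim_next_chunk(conn, node_id):
    """Claim one unowned chunk for node_id; return its id, or None if none remain."""
    while True:
        row = conn.execute(
            "SELECT chunk_id FROM job_chunk "
            "WHERE owner IS NULL ORDER BY chunk_id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # every chunk has already been claimed
        # The "owner IS NULL" guard makes the claim atomic: if another node
        # claimed this chunk after our SELECT, no row is updated.
        cur = conn.execute(
            "UPDATE job_chunk SET owner = ? WHERE chunk_id = ? AND owner IS NULL",
            (node_id, row[0]),
        )
        if cur.rowcount == 1:
            return row[0]
        # Lost the race for this chunk; loop and try the next one.
```

Each node would call `claim_next_chunk` in a loop and process whatever it gets back until the function returns `None`, so no chunk is handled twice.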

      Jason

      P.S. Did we make progress on the best practices library to the point
      where once we get consensus on this issue we have an authoritative
      place to document it?

      IMHO,

      1. On each node, check for evidence of the initialization already
      being present. (sql select)
      if not
      2. open a transaction
      3. take out a lock on a flag indicating initialization (checked in
      step1) (sql update)
      4. Run a sequence of sql statements
      5. commit the transaction.

      Provided the time between 1 and 5 is not more than 10s

      Other nodes will either block at 1 if there is a node doing 2-5 or
      not perform the initialization.
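
The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the RWiki or Evaluation System code: the flag table `init_flag`, the function name, and the use of SQLite as a stand-in for the shared cluster database are all hypothetical. The connection is assumed to be opened with `isolation_level=None` so that the transaction is controlled explicitly.

```python
import sqlite3

FLAG_DDL = (
    "CREATE TABLE IF NOT EXISTS init_flag "
    "(name TEXT PRIMARY KEY, done INTEGER NOT NULL)"
)


def init_once(conn, name, statements):
    """Run statements at most once per flag name; return True if this caller ran them."""
    # Make sure the flag row exists (both statements are safe to repeat).
    conn.execute(FLAG_DDL)
    conn.execute("INSERT OR IGNORE INTO init_flag (name, done) VALUES (?, 0)", (name,))

    # Step 1: check for evidence that initialization has already happened.
    done = conn.execute(
        "SELECT done FROM init_flag WHERE name = ?", (name,)
    ).fetchone()[0]
    if done:
        return False

    # Step 2: open a transaction (IMMEDIATE takes SQLite's write lock up front).
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Step 3: lock the flag with an update; the "done = 0" guard re-checks
        # step 1 in case another node finished between our SELECT and BEGIN.
        cur = conn.execute(
            "UPDATE init_flag SET done = 1 WHERE name = ? AND done = 0", (name,)
        )
        if cur.rowcount != 1:
            conn.execute("ROLLBACK")
            return False
        # Step 4: run the initialization statements.
        for sql in statements:
            conn.execute(sql)
        # Step 5: commit, which publishes the data and the flag together.
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Because the flag update and the initialization statements commit together, a node that fails partway through leaves the flag unset and another node can retry. On a database with row-level locking, the same shape would lock only the flag row rather than the whole database.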

      We have used this sequence in RWiki since 2.2

      2 - 5 should only be run if auto.ddl is true and

      If auto.ddl is false and you need to run 2-5, you should
      probably throw an exception to prevent startup.
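
One way to read the auto.ddl rule is as a small startup decision. The sketch below is only that: the function name, the exception class, and the returned strings are hypothetical, and it models only the auto.ddl condition itself, since the message above does not spell out any further condition.

```python
class InitializationRequired(RuntimeError):
    """Raised to stop startup when tables need initializing but auto.ddl is off."""


def decide_startup_action(auto_ddl, already_initialized):
    """Return 'skip' or 'initialize', or raise if initialization is not permitted."""
    if already_initialized:
        return "skip"  # nothing to do on this node
    if auto_ddl:
        return "initialize"  # allowed to run steps 2-5
    # auto.ddl is false but the data is missing: refuse to start so a DBA
    # can run the initialization script manually.
    raise InitializationRequired(
        "Initialization is required but auto.ddl is false; "
        "run the initialization SQL script manually."
    )
```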

      You should also provide the initialization SQL script so that DBAs
      can run it manually, but

      HTH
      Ian

      Seems like this is worth sending out to the larger developer list.
      -AZ

      > Dick,
      >
      > To avoid the issue of having competing servers in a cluster trying to do the
      > same initialization of the eval tables how about adding a new eval property
      > which is the name of the server that should do the evaluation. In that way
      > most servers can be set to wait until the data is initialized and the chosen
      > one will be free to initialize the tables without fear of colliding with
      > other servers.
      >
      > - Dave
      >
      >
      > David Haines
      > CTools Developer
      > Digital Media Commons
      > University of Michigan
      > dlhaines@umich.edu
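
For comparison, the designated-server proposal in the quoted message could look roughly like this. The property name `eval.init.server` and the function name are made up for illustration; the thread does not name the property.

```python
def should_initialize(properties, this_server_id):
    """Return True only on the server named by the (hypothetical) init property."""
    designated = properties.get("eval.init.server")  # hypothetical property name
    if designated is None:
        return False  # no server designated, so nobody initializes
    return designated.strip() == this_server_id
```

This avoids collisions without any database locking, but it depends on every node being configured consistently, which is the trade-off against the lock-based sequence discussed in the later replies.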


                  People

                  Assignee:
                  aaronz Aaron Zeckoski (Inactive)
                  Reporter:
                  rwellis Richard Ellis (Inactive)

                      Time Tracking

                      Estimated: Not Specified
                      Remaining: 0m
                      Logged: 1d 7h
