The idea here is that we need a way to run preloads, table fixes, and similar one-time tasks on a single server rather than on every server in the cluster. Basically, a general locking strategy.
This is critical because, without it, data gets loaded more than once when multiple servers start up for the first time.
The thread on this, in reverse chronological order:
If you depend on the session factory then that will init first,
followed by the service. Hence it's safe to update tables in the service.
Otherwise it would not be safe to use the session factory in the init
of the service... or am I just lucky with wiki?
These suggestions are interesting but we are using Hibernate and as
far as I know there is not a way to check for the init in Hibernate, as
the table init happens outside the control of your code.
As far as best practices go there are a few pages that mention them:
I think there may be others.
I agree with this recommendation, and would add that it also seems to
be a reasonable pattern to recommend for Quartz jobs – it would allow
most jobs to automatically enforce "run at most 1 time" without
requiring configuration or scheduling tricks.
If the job takes longer than a few seconds, it's probably also
reasonable, where possible, to chunk the job and allow multiple nodes
to each do some of the chunks. Admittedly this is probably more
common in the Quartz scenario than for initializing tables.
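A rough sketch of the chunking idea, in Python with sqlite3 only so
that it runs anywhere. The job_chunk table, its columns, and the
claim_next_chunk function are made up for illustration; nothing here
is Quartz or Sakai API. Each node claims one unowned chunk inside a
transaction, so no chunk ends up with two owners.

```python
import sqlite3


def claim_next_chunk(conn, node_id):
    """Claim one unowned chunk for node_id; return its id, or None if none are left.

    Assumes a hypothetical table job_chunk(chunk_id, owner) and a connection
    opened with isolation_level=None so transactions are controlled explicitly.
    """
    # BEGIN IMMEDIATE takes the write lock up front, so two nodes cannot
    # both select the same unowned chunk.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT chunk_id FROM job_chunk "
            "WHERE owner IS NULL ORDER BY chunk_id LIMIT 1"
        ).fetchone()
        if row is None:
            conn.execute("ROLLBACK")
            return None
        conn.execute(
            "UPDATE job_chunk SET owner = ? WHERE chunk_id = ?",
            (node_id, row[0]),
        )
        conn.execute("COMMIT")
        return row[0]
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```

A node would call claim_next_chunk in a loop, process the chunk it
gets back, and stop when the function returns None.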
P.S. Did we make progress on the best practices library to the point
where once we get consensus on this issue we have an authoritative
place to document it?
1. On each node, check for evidence of the initialization already
being present. (sql select)
2. open a transaction
3. take out a lock on a flag indicating initialization (checked in
step 1) (sql update)
4. Run a sequence of sql statements
5. commit the transaction.
Provided the time between 1 and 5 is not more than 10s, other nodes
will either block at 1 if there is a node doing 2-5 or not perform
the initialization.
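For anyone who wants to see the five steps in one place, here is a
minimal sketch in Python with sqlite3, chosen only because it runs
without a server. It is not the RWiki code. The init_flag table, its
name/done columns, and the run_once function are assumptions, and the
flag row is assumed to be seeded with done = 0 when the table is
created. Note that sqlite locks the whole database on write, whereas
a server database would normally lock just the flag row.

```python
import sqlite3


def run_once(conn, flag, statements):
    """Run `statements` at most once among nodes sharing the database.

    Returns True if this caller did the initialization, False otherwise.
    Assumes a hypothetical table init_flag(name, done) with a seeded row,
    and a connection opened with isolation_level=None.
    """
    # Step 1: check for evidence that initialization is already present.
    row = conn.execute(
        "SELECT done FROM init_flag WHERE name = ?", (flag,)
    ).fetchone()
    if row is not None and row[0]:
        return False
    # Step 2: open a transaction.
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Step 3: lock the flag with an update. The "done = 0" guard means a
        # node that lost the race updates zero rows and backs off.
        claimed = conn.execute(
            "UPDATE init_flag SET done = 1 WHERE name = ? AND done = 0",
            (flag,),
        ).rowcount
        if claimed != 1:
            conn.execute("ROLLBACK")
            return False
        # Step 4: run the sequence of SQL statements.
        for sql in statements:
            conn.execute(sql)
        # Step 5: commit, which publishes the data and the flag together.
        conn.execute("COMMIT")
        return True
    except Exception:
        # A failure undoes the flag update too, so a later start can retry.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```

Because the flag update and the data load share one transaction, a
failure partway through leaves the flag unset rather than marking a
half-loaded table as done.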
We have used this sequence in RWiki since 2.2
Steps 2-5 should only be run if auto.ddl is true. If auto.ddl is
false and you need to run 2-5, you should probably throw an exception
to prevent startup.
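The auto.ddl rule can be written down as a small decision function.
This is only a sketch in Python; the startup_action name and its
return values are invented for illustration, and how the auto.ddl
setting is actually read is left out.

```python
def startup_action(auto_ddl, initialized):
    """Decide what a node does at startup under the auto.ddl rule.

    Returns "skip" when the data is already there, "initialize" when this
    node may run steps 2-5, and raises when initialization is needed but
    auto.ddl forbids it. Names and return values are hypothetical.
    """
    if initialized:
        return "skip"
    if auto_ddl:
        return "initialize"
    # Initialization is required but not permitted: refuse to start.
    raise RuntimeError(
        "tables are not initialized and auto.ddl is false; "
        "run the initialization SQL script manually"
    )
```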
You should also provide the initialization SQL script so that DBAs
can run it manually, but
Seems like this is worth sending out to the larger developer list.
> To avoid the issue of having competing servers in a cluster trying to do the
> same initialization of the eval tables how about adding a new eval property
> which is the name of the server that should do the evaluation. In that way
> most servers can be set to wait until the data is initialized and the chosen
> one will be free to initialize the tables without fear of colliding with
> other servers.
> - Dave
> David Haines
> CTools Developer
> Digital Media Commons
> University of Michigan
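For comparison, Dave's designated-server proposal quoted above could
look roughly like the following Python sketch. The function name, the
callbacks, and the polling/timeout parameters are all assumptions made
for illustration; the actual property name and how it would be read
are not specified in the thread.

```python
import time


def initialize_or_wait(this_server, init_server, is_initialized, initialize,
                       poll_seconds=1.0, timeout_seconds=60.0):
    """Let only the configured server initialize; every other server waits.

    is_initialized and initialize are caller-supplied callables
    (hypothetical). Returns "initialized" or "waited"; raises TimeoutError
    if the data never shows up.
    """
    if this_server == init_server:
        # The chosen server does the work, unless an earlier start already did.
        if not is_initialized():
            initialize()
        return "initialized"
    # All other servers poll until the chosen server has finished.
    deadline = time.monotonic() + timeout_seconds
    while not is_initialized():
        if time.monotonic() >= deadline:
            raise TimeoutError("initialization by %s not seen" % init_server)
        time.sleep(poll_seconds)
    return "waited"
```

The trade-off against the lock-based sequence is that this needs
per-server configuration, and nothing gets initialized if the named
server is not running.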