Sakai / SAK-10366

Branch to develop a JSR-170 Backed Content Hosting Service

    Details

      Description

      We are going to make a branch to modify Content Hosting Service so that it will work on top of JSR-170 (probably the JCRService in contrib).

      Ian Boston
      Jim Eng
      Linda M Place

      will be involved in this patch.

      We intend to test it with the UoM dataset to ensure that it can cope with 1.5TB of data.
      Aiming to be available in 2.5

        Activity

        Ian Boston added a comment -
        Branch has been created: https://source.sakaiproject.org/svn//content/branches/SAK-10366/
        Ian Boston added a comment -
        First pass at a base JCR layer is now in the branch. It extends DbContentService and creates a new implementation of Storage that uses a LiteStorageUser, which is unbound from the XML DOM, for creating Resources.

        This layer binds to the JCRService, which stacks the whole of CHS on JCR rather than on the DB.

        Ian Boston added a comment -
        I have moved the Storage layer out of the JCRContentService, partly to discover where the hidden internal links to protected class members of parent classes are. There appear to be quite a number.

        It also looks like storage is split between the Storage layer and DbContentService, with direct access in DbContentService in a number of places, especially with commits midstream.

        The other area of concern is setting the UUID on an item, which should be immutable, as part of the move operations. It may be necessary to bind down to the move operation in the JCR, as just moving the node will be far faster than serializing all the content into the JVM and then out again to perform the move.
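
The difference between the two move strategies described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `MoveSketch`, `Item`, `move` and `copyThenDelete` are hypothetical names, not Sakai or JCR classes. In a real JSR-170 repository the re-linking variant would correspond to `javax.jcr.Session.move(srcAbsPath, destAbsPath)`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical in-memory stand-in for a repository; not Sakai or JCR code. */
final class MoveSketch {

    /** A stored item: a UUID assigned once at creation, plus its content bytes. */
    static final class Item {
        final String uuid = UUID.randomUUID().toString();
        final byte[] content;

        Item(byte[] content) {
            this.content = content;
        }
    }

    private final Map<String, Item> itemsByPath = new HashMap<>();

    void put(String path, byte[] content) {
        itemsByPath.put(path, new Item(content));
    }

    Item get(String path) {
        return itemsByPath.get(path);
    }

    /** Move by re-linking: the same Item (same UUID, same bytes) is reachable under the new path. */
    void move(String from, String to) {
        Item item = itemsByPath.remove(from);
        if (item == null) {
            throw new IllegalArgumentException("No item at " + from);
        }
        itemsByPath.put(to, item);
    }

    /** Copy-then-delete: the content is read and written again, and the new item gets a new UUID. */
    void copyThenDelete(String from, String to) {
        Item old = itemsByPath.remove(from);
        if (old == null) {
            throw new IllegalArgumentException("No item at " + from);
        }
        itemsByPath.put(to, new Item(old.content.clone()));
    }

    public static void main(String[] args) {
        MoveSketch repo = new MoveSketch();
        repo.put("/a/file.txt", new byte[] {1, 2, 3});
        String before = repo.get("/a/file.txt").uuid;

        repo.move("/a/file.txt", "/b/file.txt");
        System.out.println("UUID kept by move: " + before.equals(repo.get("/b/file.txt").uuid));

        repo.copyThenDelete("/b/file.txt", "/c/file.txt");
        System.out.println("UUID kept by copy: " + before.equals(repo.get("/c/file.txt").uuid));
    }
}
```

With the re-linking variant there is no need to set a UUID on the moved item, because the item itself never changes; only the copy-based variant has to restore the old UUID afterwards.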

        Ian Boston added a comment -
        resolveUUID, getUUID, findUUID and findInternalUUID all appear to do the same thing. The first three are in the API. These are being routed to a single getUUID with simple semantics. The catch of Throwable in resolve never gets triggered unless the DB is down.
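
A minimal sketch of that routing, assuming all four methods take an id and return a UUID string. The class name, the signatures, the backing map and the "null when unknown" behaviour are assumptions made for illustration; they are not taken from the Content Hosting Service API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: several overlapping lookups delegating to one method. */
final class UuidLookupSketch {

    // Stand-in for the real storage; maps an entity id to its UUID.
    private final Map<String, String> uuidById = new HashMap<>();

    void register(String id, String uuid) {
        uuidById.put(id, uuid);
    }

    /** The single lookup with simple semantics: the UUID for the id, or null if unknown (assumed). */
    String getUUID(String id) {
        return uuidById.get(id);
    }

    // The remaining entry points only delegate, so there is one code path to maintain.
    String resolveUUID(String id) {
        return getUUID(id);
    }

    String findUUID(String id) {
        return getUUID(id);
    }

    String findInternalUUID(String id) {
        return getUUID(id);
    }

    public static void main(String[] args) {
        UuidLookupSketch lookup = new UuidLookupSketch();
        lookup.register("/group/site/file.txt", "uuid-1");
        System.out.println(lookup.resolveUUID("/group/site/file.txt"));
        System.out.println(lookup.findUUID("/group/site/missing.txt"));
    }
}
```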
        Ian Boston added a comment -

        The layer is now working, but there are some issues that need sorting out.

        1. CHS relies heavily on the trailing / in entity names to identify collections. JCR uses a node type, so the path separator is not significant. CHS is inconsistent about this, as it also relies on the BaseContentCollectionEdit class and the IS_COLLECTION property. In some places there is code that uses all three of these, with certain paths that can never execute.

        2. The Tool layer appears to make repeated demands for the same entity within the same request cycle. For instance, for each control on the UI (the name, the dropdown, the member count) it individually asks all the way down to the storage layer to findCollection(id). Without caching of the ContentCollectionEdit in the storage layer, this object is created 10 or more times per request when it is really the same object. This behaviour comes from the tool.

        3. The caching layer in the thread local (inside BaseContentService) appears to re-create the object by copying it, which leaves the request cycle in an unpredictable state. You might (and do) end up with 10 or more copies of a ContentCollectionEdit (or resource) in memory with only one of them in edit mode. This must lead to inconsistencies.

        I do not intend to fix all these problems, since that would mean major surgery on the Resources Tool, which Jim is working on, but I do have to work around them in the CHS-JCR layer.
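
One way to work around points 2 and 3 is a request-scoped cache in the storage layer that hands back the same instance for repeated lookups instead of a fresh copy. The following is a minimal sketch under stated assumptions: `RequestCacheSketch`, `CollectionStub`, `endRequest` and `loadCount` are hypothetical names, and the real layer may be structured quite differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical request-scoped cache; not the actual CHS or JCR storage code. */
final class RequestCacheSketch {

    /** Stand-in for a ContentCollectionEdit. */
    static final class CollectionStub {
        final String id;

        CollectionStub(String id) {
            this.id = id;
        }
    }

    // One map per thread, which approximates "one map per request" in a servlet container.
    private final ThreadLocal<Map<String, CollectionStub>> cache =
            ThreadLocal.withInitial(HashMap::new);
    private final Function<String, CollectionStub> loader;
    private int loads = 0;

    RequestCacheSketch(Function<String, CollectionStub> loader) {
        this.loader = loader;
    }

    /** Returns the cached instance for this request, loading it at most once. */
    CollectionStub findCollection(String id) {
        return cache.get().computeIfAbsent(id, key -> {
            loads++;
            return loader.apply(key);
        });
    }

    /** Must be called when the request ends so that state does not leak into the next request. */
    void endRequest() {
        cache.remove();
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        RequestCacheSketch storage = new RequestCacheSketch(CollectionStub::new);
        CollectionStub first = storage.findCollection("/group/site/");
        CollectionStub second = storage.findCollection("/group/site/");
        System.out.println("Same instance: " + (first == second));
        System.out.println("Loads: " + storage.loadCount());
        storage.endRequest();
    }
}
```

Because every caller in the request sees the same object, there is only one candidate to put into edit mode, which avoids the situation of many copies with only one of them being edited.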
        Ian Boston added a comment -
        The SAK-10366 branch is now synchronized with trunk content hosting and works (somewhat; I haven't tested everything).

        This branch replaces the database storage layer in Content Hosting Service with a JCR/JSR-170 storage layer.

        To try it out (if you are feeling brave):


        ---------------------------------------------------------------------------

        Check out a JCRService implementation
         
        svn co https://source.sakaiproject.org/contrib//tfd/trunk/jackrabbitservice
        cd jackrabbitservice
        mvn -Dmaven.tomcat.home=/Users/ieb/mytomcat clean install sakai:deploy

        (The 1060 unit tests in the jackrabbit build may need a MySQL DB. I think they use Derby, but I can't remember.)

        Then check out SAK-10366:

        svn co https://source.sakaiproject.org/svn/content/SAK-10366
        cd SAK-10366
        mvn -Dmaven.tomcat.home=/Users/ieb/mytomcat clean install sakai:deploy


        Configure sakai.properties to make the JCRService use MySQL:


        dbDialect@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=mysql
        dbUser@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=sakai22
        dbPass@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=sakai22
        dbDriver@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=com.mysql.jdbc.Driver
        dbURL@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=jdbc:mysql://127.0.0.1:3306/sakai22?useUnicode=true&characterEncoding=UTF-8
        contentOnFilesystem@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=false
        journalLocation@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=/Users/ieb/Caret/sakai22/shared/jcr-journal/
        dataSourcePersistanceManager@org.sakaiproject.jcr.api.JCRService.repositoryBuilder=false

        Start up Tomcat.

        You should see lots of JCR messages fly past with no exceptions up to the Tomcat Started point.

        There is a jcr-webdav installation at
        http://localhost:8080/jcr-webdav/repository/default/sakai

        Log in as admin / admin.

        The normal WebDAV and Resources will also work. (When I last checked, I was having some problems with Dublin Core metadata, reported file sizes and the MIME type in jcr-webdav.)

        --------------------------------------------------------------------------

        I will continue working on this over the next 2 weeks.

        Ian
        David Haines added a comment -
        I'm not quite sure where to find out whether JSR 170 will make it easy to implement an un-delete / soft delete capability. That would make a lot of people happy.
        Ian Boston added a comment -
        Soft delete: JSR-170 has versioning of everything, so soft delete and undelete should be relatively easy.


        Ian Boston added a comment -
        The branch is now merged into trunk.

        In the process I have separated out the JCR dependencies from the main build so that you don't have to have JCR running in trunk/2.5.

        In the base pom.xml for content there are 2 new projects: content-impl-jrc/impl and content-impl-jrc/pack.

        impl builds the JCR extension jar that provides JCR storage and depends on the Sakai JCR API.
        pack is an alternative pack to the standard DB CHS pack. This pack contains both the DB CHS and the JCR CHS.

        For the moment, we will build the impl but not perform the pack, so the JCR jar will not appear in the running Sakai.

        As we migrate, we will continue to build the DB CHS (and the JCR CHS jar) but will move to the JCR pack.

        There is no need to change what is checked out, since build configuration is now all done with the pom.xml and not as a result of the code being present on disk.
        Peter A. Knoop added a comment -
        [Bulk Change] This issue is currently Unresolved, however, it has a Fix Version set. In keeping with the newly added Target Version, Fix Versions should only be set for Resolved Issues, and only after it has been merged to that version specifically. The Fix Version is being reset to Unknown for this issue. Please use the Target Version to indicate when you plan to address this issue.
        Ian Boston added a comment -
        The branch is now retired and the code merged into CHS in trunk. Closing item.

          People

          • Assignee: Ian Boston
          • Reporter: Ian Boston
          • Votes: 0
          • Watchers: 4
