Search had an issue where documents were repeatedly being reindexed.
To fix this, a new field called “indexed” was added to the “sakai_index” index. When a document is added, it is marked indexed=false. When the indexing thread runs, it indexes all documents that have indexed=false and then sets them to indexed=true. This is a better approach than relying on the EntityContentProducer to produce valid searchable data.
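The flag-based flow described above can be sketched as follows. This is a minimal in-memory illustration only, not the actual Sakai search code: the class name IndexedFlagSketch, the methods addDocument and runIndexingPass, and the use of a map in place of the real index are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the "indexed" flag; not the real Sakai classes.
final class IndexedFlagSketch {

    // Maps a document reference to its "indexed" flag.
    private final Map<String, Boolean> indexedFlags = new LinkedHashMap<>();

    // Adding a document marks it as pending (indexed=false).
    void addDocument(String reference) {
        indexedFlags.put(reference, Boolean.FALSE);
    }

    // One pass of the indexing thread: process every pending document,
    // then set its flag to true so later passes skip it.
    List<String> runIndexingPass() {
        List<String> processed = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : indexedFlags.entrySet()) {
            if (!entry.getValue()) {
                // Real content extraction and index writing would happen here.
                entry.setValue(Boolean.TRUE);
                processed.add(entry.getKey());
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        IndexedFlagSketch index = new IndexedFlagSketch();
        index.addDocument("/content/a.txt");
        index.addDocument("/content/b.txt");
        System.out.println("first pass:  " + index.runIndexingPass());
        System.out.println("second pass: " + index.runIndexingPass());
    }
}
```

Because each document is flipped to indexed=true once it has been processed, a second pass finds nothing to do unless new documents have been added in the meantime.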
The important thing to know is that when this update is installed, it will automatically begin to reindex all documents that are currently in your index. Depending on the size of your index, this may increase load while the documents are being indexed.
Upgrade recommendations:
- Upgrade during a low-usage time to minimize the impact of the indexing activity. Note that depending on the size of your index and the number of nodes, this could take a long time; e.g. an index containing 500,000 documents, of which 50% ended up being indexed, can take 18 hours on 2 nodes.
- Delete your current index and begin indexing only newly added files. The downside is that files that were previously indexed are no longer in the index and are therefore not searchable.
- Delete your current index and use the “Rebuild Whole Index” option at a later scheduled time, e.g. during a low-usage period (holidays, intersession).
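
As a rough planning aid for the options above, the single data point given (500,000 documents, 50% indexed, 2 nodes, 18 hours) can be extrapolated to other index sizes. The sketch below assumes that reindex time scales linearly with the number of documents per node, which is an assumption and not something measured here; the class and method names are hypothetical.

```java
// Hypothetical estimator; assumes throughput per node matches the one
// reference figure quoted in this note and that scaling is linear.
final class ReindexEstimateSketch {

    // Reference case: 500,000 documents, 50% indexed, 2 nodes, 18 hours.
    private static final double REF_DOCS_PER_NODE_PER_HOUR =
            (500_000 * 0.5) / 2 / 18.0;

    // Estimated hours to reindex, given the index size, the fraction of
    // documents that actually get indexed, and the number of nodes.
    static double estimateHours(long totalDocuments, double fractionIndexed, int nodes) {
        double docsPerNode = totalDocuments * fractionIndexed / nodes;
        return docsPerNode / REF_DOCS_PER_NODE_PER_HOUR;
    }

    public static void main(String[] args) {
        System.out.println(estimateHours(1_000_000, 0.5, 2) + " hours (estimate)");
    }
}
```

Treat the result as an order-of-magnitude guide only; actual times depend on document types, hardware, and concurrent load.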