Sakai / SAK-42872

Update Apache Tika to 1.23

    Details

    • Status:
      Resolved
    • Test Plan:
      Please add a Test Plan here.

      Description

      Release 1.23 - 12/02/2019
      https://www.apache.org/dist/tika/CHANGES-1.23.txt

      • NOTE: The PDFParser now relies on OCRDPI to render page images when
        users configure OCR on rendered page images. This will have the effect
        of increasing rendered image size (TIKA-2624).
      • NOTE: tika-server no longer returns 415 for file types for which there
        is no parser.
      • Fix bug in AUTO OCR strategy in the PDFParser (TIKA-3002).
      • Fix incorrect height and width metadata extraction from JPEG images (TIKA-2630).
      • Upgrade to POI 4.1.1 (TIKA-2851).
      • Upgrade to PDFBox 2.0.17 (TIKA-2951).
      • Ensure that the PDFParser respects custom configuration of Tesseract
        from tika-config.xml via Eric Pugh (TIKA-2970).
      • Add parser for XLIFF v1.2 files (TIKA-2975).
      • Add mime type detection support for WebAssembly (TIKA-2894),
        HEIF / HEIC images (TIKA-2942), Digilite FDF (TIKA-2988);
        and xml-root detection for XFDF (TIKA-2990) and XDP (TIKA-2989).
      • Add an XLZ Parser (TIKA-2976).
      • Fix deadlock with ForkParser when InputStream throws IOException (TIKA-2892).
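
For reference, a version bump like this usually comes down to a change in the Maven build. The fragment below is only a sketch: the tika.version property name and its placement are assumptions, not taken from the Sakai poms, and Sakai may declare Tika in a different module or manage the version another way. The org.apache.tika coordinates for tika-core and tika-parsers are the published ones for the 1.x line.

```xml
<!-- Hypothetical sketch: the property name "tika.version" is an assumption,
     not copied from the Sakai build files. -->
<properties>
  <tika.version>1.23</tika.version>
</properties>

<dependencies>
  <!-- Core detection and facade classes -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>${tika.version}</version>
  </dependency>
  <!-- Parser implementations (PDF, Office formats, images, and so on) -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>${tika.version}</version>
  </dependency>
</dependencies>
```

Because 1.23 moves to POI 4.1.1 and PDFBox 2.0.17, any versions of those libraries pinned elsewhere in the build may need to be checked for conflicts after the bump.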

    People

    • Assignee:
      David Horwitz (dhorwitz)
    • Reporter:
      David Horwitz (dhorwitz)
