We are pleased to share a new version of Aspose.Pdf for Java with the following improvements:
New Features:
PDFNEWJAVA-33818 EPUB to PDF support
PDFNEWJAVA-33678 Converting non searchable PDF to searchable PDF document
Fixed Bugs:
PDFNEWJAVA-34558 TextFragment Underline formatting is not working
PDFNEWJAVA-34616 PDF to PDF/A - Resultant file is not correct
PDFNEWJAVA-34614 PDF file is not properly being converted to PDF/A_1a format
PDFNEWJAVA-34441 TextFragment Annotation characters overlap issue
PDFNEWJAVA-34566 PDF Table cells rowspan not working, when page breaks
PDFNEWJAVA-34575 NullPointerException is being generated while creating PDF file
PDFNEWJAVA-34567 Systematic NullPointerException while using API
PDFNEWJAVA-34491 When replacing text, contents overlap in resultant file
PDFNEWJAVA-34259 Saving back to the same document throws InvalidPdfFileFormatException exception
PDFNEWJAVA-33934 PDF to DOC: Table rendered incorrectly
PDFNEWJAVA-33932 PDF to DOC: Image rendering issue
PDFNEWJAVA-33998 PCL to PDF: Some text is broken
PDFNEWJAVA-34550 PDF to HTML: incorrect Encoding of output HTML
PDFNEWJAVA-33993 PDF to DOC - Page contents are not properly being rendered
PDFNEWJAVA-34167 PDF to DOC: bullet list is rendered incorrectly
PDFNEWJAVA-33476 Aspose PDF for Java 4.0 throwing exceptions on Linux
PDFNEWJAVA-33464 Problem while adding EMF image to PDF
PDFNEWJAVA-33507 ImportXML() method is not working
PDFNEWJAVA-33592 PdfFileInfo class is malfunctioning
PDFNEWJAVA-34049 NullPointerException while adding TextStamp to PDF file
PDFNEWJAVA-34468 PDF to Image: 200 resolution trims output image
PDFNEWJAVA-34472 PDF to DOC - Missing Header text and formatting issues
PDFNEWJAVA-34501 Epub to PDF: NoClassDefFoundError
PDFNEWJAVA-33926 Wrong borders generates in the resulting file
Public API and Backwards Incompatible Changes
Added method:
com.aspose.facades.Form.importXml(String inputXml)
Changes:
com.aspose.pdf.generator.legacyxmlmodel.Heading:
isAutoSequence(boolean) -> setAutoSequence(boolean)
isInList(boolean) -> setInList(boolean)
PDFNEWJAVA-33818 EPUB to PDF support
An EPUB to PDF converter is implemented. Example:
EpubLoadOptions optionsepub = new EpubLoadOptions();
Document docepub = new Document("Alice.epub", optionsepub);
docepub.save("Alice.pdf");
PDFNEWJAVA-33678 Converting non searchable PDF to searchable PDF document
This feature is implemented. It recognizes the text in the images of a PDF. For recognition you can use any external OCR engine that supports the hOCR standard (http://en.wikipedia.org/wiki/HOCR). This example uses the free Google Tesseract OCR (http://en.wikipedia.org/wiki/Tesseract_%28software%29). Install it on your computer from http://code.google.com/p/tesseract-ocr/downloads/list; after that you will have the tesseract.exe console application. To inspect the resulting PDF visually, open output971.pdf in Adobe Reader and press Ctrl+A to select all the text. Example:
initLicense();
final String myDir = "D:\\LocalTesting\\";
Document doc = new Document(myDir + "input.pdf");
// Create the callback: it recognizes the text in the PDF's images using an external OCR engine that supports the hOCR standard (http://en.wikipedia.org/wiki/HOCR).
// This example uses the free Google Tesseract OCR (http://en.wikipedia.org/wiki/Tesseract_%28software%29)
CallBackGetHocr cbgh = new CallBackGetHocr()
{
    @Override
    public String invoke(java.awt.image.BufferedImage img)
    {
        // write the page image to a temporary JPEG file for tesseract
        File outputfile = new File(myDir + "test.jpg");
        try
        {
            ImageIO.write(img, "jpg", outputfile);
        } catch (IOException e1)
        {
            e1.printStackTrace();
        }
        // run tesseract on the image and wait for it to finish
        try
        {
            java.lang.Process process = Runtime.getRuntime().exec("tesseract" + " " + myDir + "test.jpg" + " " + myDir + "out hocr");
            System.out.println("tesseract" + " " + myDir + "test.jpg" + " " + myDir + "out hocr");
            process.waitFor();
        } catch (IOException e)
        {
            e.printStackTrace();
        } catch (InterruptedException e)
        {
            e.printStackTrace();
        }
        // reading out.html to string
        File file = new File(myDir + "out.html");
        StringBuilder fileContents = new StringBuilder((int) file.length());
        Scanner scanner = null;
        try
        {
            scanner = new Scanner(file);
            String lineSeparator = System.getProperty("line.separator");
            while (scanner.hasNextLine())
            {
                fileContents.append(scanner.nextLine() + lineSeparator);
            }
        } catch (FileNotFoundException e)
        {
            e.printStackTrace();
        } finally
        {
            if (scanner != null)
                scanner.close();
        }
        // deleting temp files
        File fileOut = new File(myDir + "out.html");
        if (fileOut.exists())
        {
            fileOut.delete();
        }
        File fileTest = new File(myDir + "test.jpg");
        if (fileTest.exists())
        {
            fileTest.delete();
        }
        return fileContents.toString();
    }
};
// End CallBack
doc.convert(cbgh);
doc.save(myDir + "output971.pdf");
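The callback above launches tesseract with a single concatenated command string, which Runtime.exec splits on whitespace, so a working directory containing spaces would break the call. A minimal sketch of a more defensive variant follows, using only the Java standard library. The class TesseractHocr and its methods are hypothetical helpers for illustration and are not part of Aspose.Pdf; the sketch assumes tesseract is on the PATH and that the caller knows which file tesseract writes its hOCR output to (out.html in the example above).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Aspose.Pdf) for running tesseract in hOCR mode. */
public class TesseractHocr {

    /** Builds the argument list "tesseract <image> <outputBase> hocr"; each path stays one argument. */
    public static List<String> buildCommand(Path image, Path outputBase) {
        return Arrays.asList("tesseract", image.toString(), outputBase.toString(), "hocr");
    }

    /** Reads a text file as UTF-8 (an assumption about the output encoding) and then deletes it. */
    public static String readAndDelete(Path file) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        Files.deleteIfExists(file);
        return content;
    }

    /**
     * Runs tesseract on the image and returns the hOCR markup.
     * hocrFile is the file tesseract is expected to produce; it is passed in
     * explicitly because the extension depends on the installed version.
     */
    public static String recognize(Path image, Path outputBase, Path hocrFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(image, outputBase))
                .inheritIO() // forward tesseract's console output to this process
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return readAndDelete(hocrFile);
    }
}
```

Inside invoke(), the body after writing test.jpg could then be reduced to a call such as TesseractHocr.recognize(Paths.get(myDir, "test.jpg"), Paths.get(myDir, "out"), Paths.get(myDir, "out.html")), with the checked exceptions handled as in the original example.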