Codeplex - hocrtopdf.codeplex.com - hOcr2Pdf.NET

Latest News:

Source code checked in, #16372 2 Oct 2012 | 01:46 am

Upgrade: New Version of LabDefaultTemplate.xaml. To upgrade your build definitions, please visit the following link: http://go.microsoft.com/fwlink/?LinkId=254563

Source code checked in, #16371 2 Oct 2012 | 01:44 am

Checked in by server upgrade

Closed Issue: Mangled text format in resulting PDF? [236] 12 May 2012 | 03:27 am

The issue I'm still having is that when I cut-and-paste the text out of the resulting pdf, the formatting is completely messed up. A basic Tesseract text conversion of the same .tif file yields th...

Closed Issue: ExtractText skipping the last page [412] 12 May 2012 | 02:40 am

hocrtopdf\PdfReader.cs line 57 reads: for (int i = 1; i < pdf.NumberOfPages; i++). Page numbers start at 1, so the "int i = 1" is correct. However, "i < pdf.NumberOfPages" ignores the last page. For exa...
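The off-by-one described in this issue can be sketched as follows, assuming 1-based page numbers as the report states. The `PageLoops` class and its methods are hypothetical stand-ins and are not code from hocrtopdf\PdfReader.cs; they only record which page numbers each loop bound visits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of hOcr2Pdf.NET.
public static class PageLoops
{
    // Loop bound as quoted in the issue: stops one page early.
    public static List<int> VisitedWithLessThan(int numberOfPages)
    {
        var visited = new List<int>();
        for (int i = 1; i < numberOfPages; i++)
        {
            visited.Add(i);
        }
        return visited;
    }

    // Inclusive bound: visits pages 1 through numberOfPages.
    public static List<int> VisitedWithLessThanOrEqual(int numberOfPages)
    {
        var visited = new List<int>();
        for (int i = 1; i <= numberOfPages; i++)
        {
            visited.Add(i);
        }
        return visited;
    }

    public static void Main()
    {
        // For a 3-page document the strict bound never reaches page 3.
        Console.WriteLine(string.Join(",", VisitedWithLessThan(3)));        // 1,2
        Console.WriteLine(string.Join(",", VisitedWithLessThanOrEqual(3))); // 1,2,3
    }
}
```

With a strict `<` bound, a single-page document yields no pages at all, which is the quickest way to reproduce the symptom.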

Updated Wiki: Home 8 May 2012 | 03:00 pm

Project Description hOcr2Pdf.NET is a .NET library to convert .hocr html produced by Tesseract or Cuneiform into searchable pdfs using HtmlAgilityPack and iTextSharp. It is written in C#. NOTE I have...

Updated Wiki: Home 8 May 2012 | 02:59 pm

Project Description hOcr2Pdf.NET is a .NET library to convert .hocr html produced by Tesseract or Cuneiform into searchable pdfs using HtmlAgilityPack and iTextSharp. It is written in C# and inspired ...

Created Issue: Need to use page size and rotation of source pdf when compressing [600] 5 May 2012 | 03:59 am

When compressing pdf files that contain random page sizes and layouts, I need to get the page size and rotation of the page that I'm compressing and set the destination size and rotation to be the...

Updated Wiki: Documentation 4 May 2012 | 08:45 am

hOcr2Pdf.NET is a library that programmers can use to create highly compressed, searchable PDFs for applications. Requirements: .NET 4.0 or higher; Tesseract 3.0 w/ the ability to produce hOcr fi...

Released: hoct2pdf.net - 08032011 (Aug 03, 2011) 4 May 2012 | 08:44 am

8/3/2011: Added Jpeg2000 for compressing color images. The enum PdfImageType.Auto in the PDFSettings class now converts b/w and greyscale images to JBIG2 and color images to Jpeg2000. Added PdfCompres...
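The selection rule stated for PdfImageType.Auto might be pictured like this. It is only a sketch under assumptions: `SourceImageKind`, `ChosenCodec`, and `AutoImageType.Choose` are hypothetical names invented for illustration, and the library's actual types and signatures are not shown in this feed.

```csharp
using System;

// Hypothetical types for illustration; the release note names only
// PdfImageType.Auto and the PDFSettings class.
public enum SourceImageKind { BlackAndWhite, Greyscale, Color }
public enum ChosenCodec { Jbig2, Jpeg2000 }

public static class AutoImageType
{
    // Mirrors the stated rule: b/w and greyscale go to JBIG2, color goes to Jpeg2000.
    public static ChosenCodec Choose(SourceImageKind kind)
    {
        return kind == SourceImageKind.Color ? ChosenCodec.Jpeg2000 : ChosenCodec.Jbig2;
    }

    public static void Main()
    {
        foreach (SourceImageKind kind in Enum.GetValues(typeof(SourceImageKind)))
        {
            Console.WriteLine(kind + " -> " + Choose(kind));
        }
    }
}
```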
