Links: How to save an HTML file to PDF

I want to save an HTML file generated by ASP.NET as a PDF.

I was pointed to the open source iTextSharp project.

I found a few links, discussing how to do it:

 iTextSharp Tutorial Chapter 7: XML and (X)HTML

 iTextSharp Demo (2.0): introduces HtmlParser.Parse (see the source code here)

We tried to use it.

HtmlParser.Parse does not throw an error, but the PDF file it generates can be blank.
If the HTML file has an invalid structure, the parser's messages appear only in the debug output.

This is a big problem: HtmlParser.Parse is very strict, and any minor mistake in the HTML causes either an exception or the almost silent creation of an empty PDF file.

The post Creating pdf in .NET from html has a lot of interesting comments, including a suggestion to use HTML Agility Pack.

We are going to test how tolerant HtmlParser.Parse is of HTML regenerated by HTML Agility Pack.

The thread [ 1819614 ] Error parsing images in HTML files has a description of the fix.

Another option is to always use XML-compliant HTML that has been validated, but it could take some time to tidy up the HTML generated by ASP.NET.
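To get a feel for what a strict parser will reject, a quick well-formedness check can be run on the generated page before it goes anywhere near the PDF step. The following is only a minimal sketch in Python, used here as a standalone script outside the ASP.NET application. It relies on the standard library XML parser, which is not the same parser as HtmlParser.Parse and may accept or reject slightly different input. The function name check_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return None if the markup parses as XML, otherwise a short error message.

    Hypothetical helper: it only tests XML well-formedness and says nothing
    about whether a particular PDF library will render the page correctly.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        line, column = err.position  # position of the first problem found
        return "line %d, column %d: %s" % (line, column, err)
    return None


# Sample inputs (invented for this sketch).
good = "<html><body><p>Hello<br/>world</p></body></html>"
bad = "<html><body><p>Hello<br>world</p></body></html>"  # <br> is never closed

for name, markup in (("good", good), ("bad", bad)):
    problem = check_well_formed(markup)
    print(name, "->", "well-formed" if problem is None else problem)
```

Typical browser-tolerated HTML, such as an unclosed br tag or an entity like &nbsp; that is not declared for an XML parser, fails this kind of check, which gives an idea of how much tidying the ASP.NET output might need.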

 Links to other products:  

Generate PDF from ASP.NET gives a few references to different products including iTextSharp

 Dynamically Generating PDFs in .NET

 Another option is to try (and possibly buy) the commercial product ABCpdf.


I saw a suggestion to use the command-line version of HTMLDoc to convert HTML to PDF, but it is not well suited to programmatic access.