
Generating PDF files from HTML content

Although most web applications today integrate email and SMS services, users sometimes need documents they can keep for future reference and reuse later. We had such a requirement in one of our web applications: we wanted to let users generate single or bulk PDF files from any of the HTML templates in their account. There were several features around PDF generation, but most of the work lay in converting HTML to PDF, which is not a difficult task given the rich Ruby gems available.

Several gems can generate PDF files, and choosing one depends on your exact requirements: what you need and what you don't. I compared three gems that are commonly used for PDF generation in Ruby. Below I briefly explain the key differences I found among them, which gem I used, and a few tricks I applied in my PDF generation feature -

  1. Prawn -
    Prawn is a Ruby library for PDF generation. It is a complete DSL: you supply the text and call DSL methods on it to generate the PDF. It is rich in features, with good support for fonts, images, vector drawing, graphics and many other things. The one thing we can't do directly with Prawn is HTML to PDF conversion. You can read about Prawn in detail in its documentation.

  2. Wicked PDF -
    This gem converts HTML to PDF and provides several utility methods for doing so directly. It depends on the wkhtmltopdf shell utility. Because the conversion is performed by that utility at the machine level, you have to be careful with the fonts, CSS and external JavaScript libraries you use: every resource must be referenced by an absolute path. The gem also provides a few helpers for including stylesheets and JavaScript files in layouts.
    If you want to use assets from the Rails asset pipeline, you may have to use the gem's helpers, which Base64-encode your assets and add them inline to the page. This is faster for small assets, but for larger assets it may cause performance issues that you should consider.
    The major advantage of this gem is that you can render PDF views just like you render HTML views in your app. You can read more about this gem in its documentation.

  3. PDFKit -
    This gem is similar to Wicked PDF, as it also relies on the wkhtmltopdf shell utility to convert HTML to PDF. The two gems differ slightly in the options they offer for generating PDFs, which you can check in the PDFKit documentation. The only major difference is in how PDF views are rendered: PDFKit lets you add its middleware to your app and then set up routes that serve PDF views.

That covers the differences between the gems. Of these, I used Wicked PDF to convert HTML templates to PDF files. I did have to use a few tricks in some cases, which I explain below -

A few tricks and workarounds I applied

  • Character encoding - As mentioned above, with Wicked PDF your assets need absolute paths if you want to include them through the Rails asset pipeline. In my case I didn't have many assets to load for the templates, so I simply avoided referencing any, which ruled out unexpected asset-loading issues.

    But to keep the character encoding uniform (UTF-8), I prepended a meta tag to the HTML content before converting it to PDF -

    META_TAG = "<meta charset='utf-8'/>"

    WickedPdf.new.pdf_from_string(META_TAG + html_content_for_pdf)

  • Page break - Similarly, to start the next content on a new page, I inserted a page-break HTML tag with inline CSS at the required place -

    PAGE_BREAK_HTML = "<p style='page-break-after:always;'></p>"

    html_content_for_pdf = html_templates_array_for_pdf_file.join(PAGE_BREAK_HTML)

    WickedPdf.new.pdf_from_string(META_TAG + html_content_for_pdf)

  • Setting up page margins -
    The gem documentation describes how to specify page margins in millimeters when rendering from a controller. Alternatively, you can convert the page margins from pixels to millimeters yourself and pass them as the 'margin' option to the 'pdf_from_string' method of a WickedPdf instance.

    WickedPdf.new.pdf_from_string(pdf_content, margin: { top: '25.4mm', right: '25.4mm', left: '25.4mm', bottom: '25.4mm' })

  • Convert videos to clickable image links -
    In our application we have a BombBomb integration that lets users insert short videos into templates they send by email. When converting these templates to PDF, the requirement was to keep those videos as clickable image links, so that clicking an image opens the video in a new tab.

    The GIF image link referencing the video URL was generated correctly inside the template, but once the HTML was converted to PDF the image was no longer clickable! Not to worry, a small tweak to the inline CSS made it work. Hurray!

    <p><a href='videoURL' target='_blank' rel='noopener noreferrer' style='display:block;'><img src='imgURL'/></a></p>

    Yes, styling the hyperlink with display: block made the image clickable. Again, I had to add the style inline because I didn't use a stylesheet for PDF generation. In your case it can go in a stylesheet if you reference one for your PDF files.

  • Issue with fonts -
    Because Wicked PDF generates PDF files through a machine-level utility, the fonts used in a template were not retained in the PDF and were reset to a default font. To make the required fonts available, I had to install them on the server that runs the PDF generation task. I followed a guide to install Microsoft fonts on our server, and after that they showed up in the generated PDF pages!
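
The string-handling tricks above (charset tag, page breaks, margins in millimeters) can be pulled together in a few small helpers. This is only a minimal sketch: the method names build_pdf_html, px_to_mm, margins_in_mm and render_pdf are hypothetical and not part of Wicked PDF, and the pixel conversion assumes CSS pixels at 96 per inch, which may not match how your templates are measured.

```ruby
# Hypothetical helpers; none of these names come from the wicked_pdf gem.
META_TAG        = "<meta charset='utf-8'/>"
PAGE_BREAK_HTML = "<p style='page-break-after:always;'></p>"
MM_PER_INCH     = 25.4
CSS_PX_PER_INCH = 96.0 # assumption: margins are given in CSS pixels

# Joins the templates with a page break between them and
# prepends the charset meta tag.
def build_pdf_html(templates)
  META_TAG + templates.join(PAGE_BREAK_HTML)
end

# Converts a pixel length to a millimeter string such as "25.4mm".
def px_to_mm(px)
  "#{(px * MM_PER_INCH / CSS_PX_PER_INCH).round(2)}mm"
end

# Builds the hash passed as the 'margin' option from pixel values.
def margins_in_mm(top:, right:, bottom:, left:)
  { top: px_to_mm(top), right: px_to_mm(right),
    bottom: px_to_mm(bottom), left: px_to_mm(left) }
end

# Needs the wicked_pdf gem and the wkhtmltopdf binary, so it is only
# defined here and should be called from inside the app.
def render_pdf(templates, margins_px)
  WickedPdf.new.pdf_from_string(build_pdf_html(templates),
                                margin: margins_in_mm(**margins_px))
end
```

Inside the app, something like render_pdf(html_templates_array_for_pdf_file, top: 96, right: 96, bottom: 96, left: 96) would then produce the same 25.4mm margins used in the example above.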

That covers the key things I found worth sharing about HTML to PDF conversion. I hope you find it useful. If you have any suggestions or anything else to share on this topic, please feel free to mention it in the comments section.

Thank you & happy coding...!!!

Tushar Titame

I am a software enthusiast. I like to work on web frameworks, services and trending things in the JS world. As a hobby I love visiting beautiful places and capturing them through my eyes...


Subscribe to Engineering At Kiprosh