pdf2image
TypeScript declarations are provided by the separate @types/pdf2image package.

1.2.3 • Public • Published

This package converts the desired pages of a PDF file into images of a given size and quality.

All convert operations return a Promise. On success, it returns a list of the processed pages; on error, it returns the list of errors.

Requirements

The following programs need to be installed for this module to work:

  • ImageMagick
  • pdfinfo

Example

var pdf2image = require('pdf2image');
 
//converts all the pages of the given pdf using the default options 
pdf2image.convertPDF('example.pdf').then(
    function(pageList){
        console.log(pageList);
    }
);

On success, the Promise returns:

[
    {
        "page": 1,
        "index": 1,
        "name": "page_1.jpg",
        "path": "path/to/page_1.jpg"
    },
    {
        "page": 2,
        "index": 2,
        "name": "page_2.jpg",
        "path": "path/to/page_2.jpg"
    },
    {
        "page": 5,
        "index": 3,
        "name": "page_5.jpg",
        "path": "path/to/page_5.jpg"
    }
]

Note: the result list is always sorted.
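
As a small illustration of working with this result, the sketch below builds a page-to-path lookup from a list in the shape shown above. The helper name pathsByPage is hypothetical and not part of pdf2image; the sample data simply mirrors the documented output.

```javascript
// Sample result in the shape documented above.
var pageList = [
    { page: 1, index: 1, name: 'page_1.jpg', path: 'path/to/page_1.jpg' },
    { page: 2, index: 2, name: 'page_2.jpg', path: 'path/to/page_2.jpg' },
    { page: 5, index: 3, name: 'page_5.jpg', path: 'path/to/page_5.jpg' }
];

// Hypothetical helper (not part of pdf2image): maps each page number
// to the path of its generated image.
function pathsByPage(list) {
    var lookup = {};
    list.forEach(function (entry) {
        lookup[entry.page] = entry.path;
    });
    return lookup;
}

var lookup = pathsByPage(pageList);
console.log(lookup[5]); // path/to/page_5.jpg
```

In real use, the same helper could be called on the list passed to the success callback of convertPDF.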

Options

The converter accepts the following options:

  1. density - this defines the density of the generated image. (default: 96)
  2. width - this defines the width of the generated image.
  3. height - this defines the height of the generated image.
  4. quality - this defines the quality of the generated image. (only for jpg, default: 100)
  5. backgroundColor - this defines the color to be used as the background for the pages that contain transparency.
    1. The color must be defined as hex string (e.g. '#FF0000' - red)
    2. If the output is png and no background is defined, the transparency is kept; if the output is jpg, backgroundColor defaults to white ('#FFFFFF').
  6. outputType - the type of image to be generated (default: 'jpg'). Options: jpg, png.
  7. pages - this defines the pages to be converted using the page option syntax. (default: '*')
  8. singleProcess - makes the system process only one page at a time so as not to hog all available resources.
  9. outputFormat - this option defines the full path of the generated images. This option can be defined in one of two ways:
    1. As a string that can contain several tokens. A token always starts with the character '%'. If an invalid token is detected it will be ignored. The list of tokens is the following:

      1. d - Represents the page number.
      2. D - Represents the page number considering the first page is page 0.
      3. i - Represents the order of processing of the page.
      4. I - Represents the order of processing of the page considering the first processed page is page 0.
        • e.g. if only pages 1, 5, 7 and 8 are processed, the respective values are:
          • Page 1: d = 1, D = 0, i = 1, I = 0.
          • Page 5: d = 5, D = 4, i = 2, I = 1.
          • Page 7: d = 7, D = 6, i = 3, I = 2.
          • Page 8: d = 8, D = 7, i = 4, I = 3.
      5. t - Represents the total number of pages in the pdf.
      6. T - Represents the total number of processed pages.
      7. s - Represents the name of the pdf file.
      8. p - Represents the path of the directory containing the pdf file.
        • file : "/home/user/file1.pdf", p = "/home/user/"
        • file : "../file2.pdf", p = "../"
        • file : "file3.pdf", p = ""
      9. % - Inserts a '%' character.
      10. {...} - a piece of code that has access to the previously mentioned values.
        • e.g. "example_%{d+10}" will generate for page 1 the string "example_11".
    2. As a function, with the following parameters (pageNum, pageIndex, totalPagesProcessed, totalPDFPages, name, path) where:

      • pageNum - Is the page number (same as the token 'd').
      • pageIndex - Is the order of processing of the page (same as the token 'i').
      • totalPagesProcessed - Is the total number of processed pages (same as the token 'T').
      • totalPDFPages - Is the total number of pages in the PDF (same as the token 't').
      • name - Is the name of the pdf file (same as the token 's').
      • path - Is the path of the directory containing the pdf file (same as the token 'p').
    • Note: The file extension will be appended automatically.
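
To illustrate the function form of outputFormat, here is a minimal sketch of a formatter that zero-pads the page number so the generated files sort correctly by name. The function name paddedPageName is hypothetical; the parameter order follows the list above. It also assumes that name comes without the '.pdf' extension and that path ends with a separator, as the token examples suggest.

```javascript
// Hypothetical formatter (not part of pdf2image). Parameters follow the
// documented order: pageNum, pageIndex, totalPagesProcessed,
// totalPDFPages, name, path.
function paddedPageName(pageNum, pageIndex, totalPagesProcessed, totalPDFPages, name, path) {
    // Pad to the number of digits in the total page count.
    var width = String(totalPDFPages).length;
    var padded = String(pageNum);
    while (padded.length < width) {
        padded = '0' + padded;
    }
    // No extension here: the note above says it is added automatically.
    return path + name + '_page_' + padded;
}

console.log(paddedPageName(5, 2, 4, 120, 'example', '/home/user/'));
// /home/user/example_page_005
```

Such a function would be passed in the options object as outputFormat : paddedPageName.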

Notes

  1. Any invalid option will be ignored.
  2. Either density or width and/or height can be used, but not both at once. If neither is given, a default density of 96 is used.
  3. If width and height are both defined, the final dimensions are the largest ones that fit within the given dimensions while keeping the original image ratio.
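
The fitting rule in note 3 can be sketched as follows. This is only an illustration of the arithmetic, not the package's own code: the helper name fitWithin is hypothetical, and rounding to whole pixels is an assumption.

```javascript
// Hypothetical helper (not part of pdf2image): returns the largest size
// that fits inside maxWidth x maxHeight while keeping the source ratio.
function fitWithin(srcWidth, srcHeight, maxWidth, maxHeight) {
    // The smaller of the two scale factors is the one that fits both limits.
    var scale = Math.min(maxWidth / srcWidth, maxHeight / srcHeight);
    return {
        // Assumption: dimensions are rounded to whole pixels.
        width: Math.round(srcWidth * scale),
        height: Math.round(srcHeight * scale)
    };
}

console.log(fitWithin(1000, 500, 200, 200)); // { width: 200, height: 100 }
```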

Page option syntax

The pages option defines the pages to be converted using any combination of the rules below, separated by commas.

Rules

(X and Y represent positive integers)

  1. * : converts all pages. This is a special rule because it cannot be used with any other rule.
  2. X : converts the page X.
  3. X-Y : converts all pages between X and Y (X and Y included).
  4. -X : converts all pages between 1 and X (same as 1-X)
  5. X- : converts all pages between X and the last page (X included)
  6. /X : converts all pages that are multiples of X
  7. even : converts all even pages
  8. odd : converts all odd pages

Notes

  1. Any invalid rule will simply be ignored.
  2. Any page will only be converted once, even if more than one rule selects it.
  3. Any page that does not exist will be ignored.

Example: '1,3,7-9' will convert pages 1,3,7,8 and 9.
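
The rules above can be sketched as a small standalone parser. This is an illustrative re-implementation, not the package's own code: the function name expandPages is hypothetical, and it assumes that '*' is only honoured when it is the sole rule.

```javascript
// Hypothetical helper (not part of pdf2image): expands a pages string
// into a sorted list of page numbers for a PDF with `totalPages` pages.
function expandPages(spec, totalPages) {
    var rules = String(spec).split(',').map(function (r) { return r.trim(); });
    var selected = new Set(); // a Set keeps each page only once

    function addRange(from, to) {
        // Clamp to existing pages so nonexistent pages are ignored.
        for (var p = Math.max(1, from); p <= Math.min(totalPages, to); p++) {
            selected.add(p);
        }
    }

    // Assumption: '*' only counts when it is the sole rule.
    if (rules.length === 1 && rules[0] === '*') {
        addRange(1, totalPages);
    }

    rules.forEach(function (rule) {
        var m;
        if ((m = /^(\d+)$/.exec(rule))) {               // X
            addRange(+m[1], +m[1]);
        } else if ((m = /^(\d+)-(\d+)$/.exec(rule))) {  // X-Y
            addRange(+m[1], +m[2]);
        } else if ((m = /^-(\d+)$/.exec(rule))) {       // -X
            addRange(1, +m[1]);
        } else if ((m = /^(\d+)-$/.exec(rule))) {       // X-
            addRange(+m[1], totalPages);
        } else if ((m = /^\/(\d+)$/.exec(rule))) {      // /X
            var step = +m[1];
            for (var p = step; step > 0 && p <= totalPages; p += step) {
                selected.add(p);
            }
        } else if (rule === 'even' || rule === 'odd') {
            for (var q = (rule === 'even' ? 2 : 1); q <= totalPages; q += 2) {
                selected.add(q);
            }
        }
        // Anything else is an invalid rule and is skipped.
    });

    return Array.from(selected).sort(function (a, b) { return a - b; });
}

console.log(expandPages('1,3,7-9', 10)); // [ 1, 3, 7, 8, 9 ]
```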

Example using the options

var pdf2image = require('pdf2image');
 
//converts the pages 1,3,5,6,7,9 and above with quality 100, density 200, and with the name "example_page_X.jpg"
pdf2image.convertPDF('example.pdf',{
    density : 200,
    quality : 100,
    outputFormat : '%s_page_%d',
    outputType : 'jpg',
    pages : '1,3,5-7,9-'
});

Compiling a converter

Compiling a converter lets you define a set of options once and reuse it across multiple PDFs.

var pdf2image = require('pdf2image');
 
//The converter uses the same options as the convertPDF function
var converter = pdf2image.compileConverter({
    density : 200,
    quality : 100,
    outputFormat : '%s_page_%d',
    outputType : 'png',
    pages : 'even'
});
 
//Converts a single pdf
converter.convertPDF('example.pdf');
 
//Converts multiple pdfs
converter.convertPDFList(['example1.pdf','example2.pdf']);
 
 

Tests

Tested on Ubuntu 16.10 with node v6.10.0

Install

npm i pdf2image

License

MIT

Collaborators

  • ricardoc