Installation
npm i node-pdf-ocr
Install Tesseract from GitHub
Install Ghostscript from GitHub
Edit the .env file and set the following variables:
TESSERACT_PATH = /path/to/tesseract/tessdata
GHOSTSCRIPT_PATH = /path/to/ghostscript
Note: You can omit GHOSTSCRIPT_PATH if Ghostscript is installed on your system and its executable is on your PATH.
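A misconfigured path usually only shows up once OCR is attempted, so it can help to check the variables up front. The following is a minimal sketch and not part of node-pdf-ocr: `checkOcrEnv` is a hypothetical helper name, and it assumes only what is stated above, namely that TESSERACT_PATH is required and GHOSTSCRIPT_PATH is optional.

```javascript
const fs = require('fs');

// Hypothetical helper (not provided by node-pdf-ocr).
// Returns a list of human-readable problems; an empty list means the
// variables described above look usable.
function checkOcrEnv(env = process.env) {
  const problems = [];

  // TESSERACT_PATH is treated as required.
  const tesseractPath = (env.TESSERACT_PATH || '').trim();
  if (!tesseractPath) {
    problems.push('TESSERACT_PATH is not set');
  } else if (!fs.existsSync(tesseractPath)) {
    problems.push(`TESSERACT_PATH does not exist: ${tesseractPath}`);
  }

  // GHOSTSCRIPT_PATH may be omitted when Ghostscript is on PATH,
  // so it is only checked when a value is present.
  const ghostscriptPath = (env.GHOSTSCRIPT_PATH || '').trim();
  if (ghostscriptPath && !fs.existsSync(ghostscriptPath)) {
    problems.push(`GHOSTSCRIPT_PATH does not exist: ${ghostscriptPath}`);
  }

  return problems;
}
```

One way to use it is to call `checkOcrEnv()` right after `require('dotenv').config()` and log or throw if the returned array is not empty.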
Usage
require('dotenv').config(); // To load executable paths from .env file
const PdfOcr = require('node-pdf-ocr');
PdfOcr('/path/to/pdf/file.pdf')
  .then((text) => console.log(text))
  .catch((err) => console.error(err));
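To process several files, the promise returned by PdfOcr can be awaited in a loop. The sketch below is illustrative only: `extractAll` is a hypothetical helper, and the OCR function is passed in as a parameter, on the assumption that it takes a file path and returns a promise for the extracted text, as in the usage above. Files are handled one at a time, which is a conservative assumption since OCR tends to be resource-intensive.

```javascript
// Hypothetical helper (not provided by node-pdf-ocr).
// `ocr` is any function shaped like PdfOcr: (pdfPath) => Promise<string>.
async function extractAll(pdfPaths, ocr) {
  const results = [];

  for (const pdfPath of pdfPaths) {
    try {
      // Wait for each file before starting the next one.
      const text = await ocr(pdfPath);
      results.push({ pdfPath, text, error: null });
    } catch (err) {
      // Record the failure and continue with the remaining files.
      results.push({ pdfPath, text: null, error: err });
    }
  }

  return results;
}
```

With the library loaded as shown above, a call might look like `extractAll(['/path/to/a.pdf', '/path/to/b.pdf'], PdfOcr)`, followed by a check of each entry's `error` field.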
License
MIT License