JavaScript port of gjtorikian/html-proofer (Hash: 93ba616eb45b7ba844197fc29824995e8fcd2279, Version: 4.0.1)
Currently, the library is fully working and I'm using it internally for my projects.
I've used the original html-proofer on many projects to check link integrity in static documentation. I ran it with a locally installed Ruby, locally with Ruby in Docker, and on CI runners, and it worked fine.
In some cases, however, I needed to create a custom set of checks with fairly complex logic, including integration with non-public services. Ruby is not a widespread programming language in my working environment, so it was hard to find someone who either knew Ruby or was willing to learn a new language just to maintain or occasionally write new checks.
I decided to port html-proofer from Ruby to JavaScript because JavaScript was already part of the stack I was using and it complements other automated tests for static sites well. JavaScript is a mainstream language and well known within the dev community, so finding developers with the required skills is no longer a problem.
I've tried to keep the original html-proofer API as much as possible, but I had to change some APIs so they can be consumed more naturally from JavaScript.
Regardless of how you use it, the library must be installed first.
Install in the current folder:
npm install html-proofer.js
or install globally (it will then be available in any folder):
npm install -g html-proofer.js
Run it for the current folder:
npx htmlproofer .
The output looks like the following if no issues are detected:
Running 3 checks (Links, Images, Scripts) in . on *.html files...
Ran on X files!
HTML-Proofer finished successfully.
You can import the library and implement custom checks, or just run the default set of checks on a desired file or folder.
Let's assume we want to check that our HTML files do not contain mailto links to octocat@github.com.
mailto_octocat.html
<h1>Hello</h1>
<a href="mailto:octocat@github.com">hey!</a>
<a href="mailto:someoneelse@github.com">ho!</a>
You can create a custom check class:
const {HTMLProofer, Check, DummyReporter} = require('html-proofer.js')
class MailToOctocat extends Check {
  internalRun() {
    for (const node of this.html.css('a')) {
      const link = this.createElement(node)
      if (link.isIgnore()) {
        continue
      }
      if (this.isMailtoOctocat(link)) {
        this.addFailure(`Don't email the Octocat directly!`, link.line)
      }
    }
  }

  isMailtoOctocat(link) {
    return link.url.rawAttribute === 'mailto:octocat@github.com'
  }
}
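The strict equality in isMailtoOctocat only matches the exact string mailto:octocat@github.com. As a minimal sketch, assuming you also want to catch variants such as different letter case, several recipients, or a ?subject= suffix, the comparison could be moved into a plain helper. The name isMailtoFor is hypothetical and not part of html-proofer.js; it works on a plain string, so it can be tried without the library.

```javascript
// Hypothetical helper, not part of html-proofer.js.
// Returns true when rawHref is a mailto: link whose recipient list contains address.
function isMailtoFor(rawHref, address) {
  if (typeof rawHref !== 'string') {
    return false
  }
  const href = rawHref.trim()
  if (!/^mailto:/i.test(href)) {
    return false
  }
  // Drop the scheme and anything after "?" (subject, body, cc, ...).
  const recipients = href.slice('mailto:'.length).split('?')[0]
  // Compare each comma-separated recipient case-insensitively.
  return recipients
    .split(',')
    .map((recipient) => recipient.trim().toLowerCase())
    .includes(address.toLowerCase())
}

console.log(isMailtoFor('mailto:Octocat@GitHub.com?subject=Hi', 'octocat@github.com'))
console.log(isMailtoFor('mailto:someoneelse@github.com', 'octocat@github.com'))
```

Inside the check, isMailtoOctocat(link) could then return isMailtoFor(link.url.rawAttribute, 'octocat@github.com') instead of comparing strings directly.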
Now we are ready to pass our custom check to HTMLProofer:
const reporter = new DummyReporter()
const options = {
  checks: [MailToOctocat],
}
const path = '<directory>'
// Declare main with const so it does not become an implicit global.
const main = async () => {
  const proofer = HTMLProofer.checkDirectory(path, options, reporter)
  await proofer.run()
  console.log(proofer.failedChecks)
}
main()
As a result, it should report something like this:
Running 1 check (MailToOctocat) in <directory> on *.html files...
Ran on 1 file!
HTML-Proofer found 1 failure!
[
  Failure {
    path: '<directory>/mailto_octocat.html',
    checkName: 'MailToOctocat',
    description: "Don't email the Octocat directly!",
    line: 3,
    status: null,
    content: null
  }
]
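For CI runs it can help to condense the reported failures into a per-file summary. The sketch below relies only on the failure fields printed above (path, checkName, description, line). The helpers groupFailuresByPath and formatFailures are hypothetical and not library functions, and the handling of a missing line is an assumption.

```javascript
// Hypothetical helpers; they only rely on the failure fields shown in the output above.
function groupFailuresByPath(failures) {
  const grouped = new Map()
  for (const failure of failures) {
    if (!grouped.has(failure.path)) {
      grouped.set(failure.path, [])
    }
    grouped.get(failure.path).push(failure)
  }
  return grouped
}

function formatFailures(failures) {
  const lines = []
  for (const [path, items] of groupFailuresByPath(failures)) {
    lines.push(`${path}:`)
    for (const item of items) {
      // Assumption: line may be null or undefined for failures not tied to a line.
      const where = item.line == null ? '' : ` (line ${item.line})`
      lines.push(`  [${item.checkName}] ${item.description}${where}`)
    }
  }
  return lines.join('\n')
}

// Sample data mirroring the failure printed above.
const sampleFailures = [
  {
    path: '<directory>/mailto_octocat.html',
    checkName: 'MailToOctocat',
    description: "Don't email the Octocat directly!",
    line: 3,
  },
]

console.log(formatFailures(sampleFailures))
```

Assuming proofer.failedChecks is an array, as the printed output suggests, setting process.exitCode to 1 when it is non-empty would make a CI job fail whenever a check reports a failure.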
The HTMLProofer constructor takes an optional hash of additional options:
| Option | Description | Default |
|---|---|---|
| `allow_hash_href` | If `true`, assumes `href="#"` anchors are valid | `true` |
| `allow_missing_href` | If `true`, does not flag `a` tags missing `href`. In HTML5, this is technically allowed, but could also be human error. | `false` |
| `assume_extension` | Automatically add specified extension to files for internal links, to allow extensionless URLs (as supported by most servers) | `.html` |
| `checks` | An array of Strings indicating which checks you want to run | `Links`, `Images`, `Scripts` |
| `check_external_hash` | Checks whether external hashes exist (even if the webpage exists) | `true` |
| `check_sri` | Check that `<link>` and `<script>` external resources use SRI | `false` |
| `directory_index_file` | Sets the file to look for when a link refers to a directory. | `index.html` |
| `disable_external` | If `true`, does not run the external link checker | `false` |
| `enforce_https` | Fails a link if it's not marked as `https`. | `true` |
| `extensions` | An array of Strings indicating the file extensions you would like to check (including the dot) | `['.html']` |
| `ignore_empty_alt` | If `true`, ignores images with empty/missing alt tags (in other words, `<img alt>` and `<img alt="">` are valid; set this to `false` to flag those) | `true` |
| `ignore_files` | An array of Strings or RegExps containing file paths that are safe to ignore. | `[]` |
| `ignore_empty_mailto` | If `true`, allows `mailto:` `href`s which do not contain an email address. | `false` |
| `ignore_missing_alt` | If `true`, ignores images with missing alt tags | `false` |
| `ignore_status_codes` | An array of numbers representing status codes to ignore. | `[]` |
| `ignore_urls` | An array of Strings or RegExps containing URLs that are safe to ignore. This affects all HTML attributes, such as alt tags on images. | `[]` |
| `log_level` | Sets the logging level. One of `debug`, `info`, `warn`, or `error` | `info` |
| `only_4xx` | Only reports errors for links that fall within the 4xx status code range. | `false` |
| `root_dir` | The absolute path to the directory serving your HTML files. | `""` |
| `swap_attributes` | JSON-formatted config that maps element names to the preferred attribute to check | `{}` |
| `swap_urls` | A hash containing key-value pairs of `RegExp => String`. It transforms URLs that match `RegExp` into `String` via `gsub`. | `{}` |
| `ancestors_ignorable` | Check ancestor elements for the `data-proofer-ignore` attribute. This could cause performance degradation for large sites (disable it if not required). | `true` |
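
As a rough illustration of how several of these options combine, the sketch below builds an options object using keys from the table, passed the same way as checks in the custom check example. All values are made up for illustration. The helper matchesIgnoreList is hypothetical and shows one plausible reading of "an array of Strings or RegExps" (exact match for strings, a test call for RegExps); the library's actual matching rules may differ.

```javascript
// Example values only; the keys come from the options table above.
const exampleOptions = {
  disable_external: true,
  extensions: ['.html', '.htm'],
  ignore_files: [/\/vendor\//],
  ignore_urls: ['https://example.com/private', /^https?:\/\/localhost/],
  log_level: 'warn',
}

// Hypothetical helper, not part of html-proofer.js: strings must match exactly,
// RegExps are tested against the value.
function matchesIgnoreList(value, patterns) {
  return patterns.some((pattern) =>
    pattern instanceof RegExp ? pattern.test(value) : pattern === value
  )
}

console.log(matchesIgnoreList('http://localhost:4000/docs', exampleOptions.ignore_urls))
console.log(matchesIgnoreList('https://example.com/public', exampleOptions.ignore_urls))
```

Keeping the patterns free of the g flag matters in a sketch like this, because a global RegExp keeps state between test calls and can give inconsistent results.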