This is a project made by DerpDevs. While working on other things, we realized that we were going to need a web scraper/sitemap creator. There were existing options, but none of them fulfilled the Object-Oriented style we preferred, so we decided to create our own Object-Oriented SiteMap Generator.
This uses an interesting method of progress, due to our knowledge limits.
The way this works is:
- It scrapes the website
- It is a live generator, with a "done" event for when the scraper is done and an "add" event for every URL it scrapes.
- It's formatted like this:
URL: "https://example.com"
{
  aPath: {
    '/': 'https://example.com/aPath',
    subPath: {
      '/': 'https://example.com/aPath/subPath'
    }
  },
  otherPath: {
    '/': 'https://example.com/otherPath'
  }
}
The nesting continues like that for deeper paths.
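To make the format concrete, here is a minimal sketch of how a list of scraped URLs could be folded into that nested shape. `insertUrl` is a hypothetical helper written for illustration and is not part of oositemap. It also assumes that every intermediate path gets its own '/' entry, which may not match what the generator does internally.

```javascript
// Hypothetical helper, not part of oositemap: adds one URL to a nested
// sitemap object in the format shown above.
function insertUrl(tree, url) {
  const { origin, pathname } = new URL(url);
  // Split "/aPath/subPath" into ["aPath", "subPath"], dropping empty parts.
  const segments = pathname.split('/').filter(Boolean);

  let node = tree;
  let current = origin;
  for (const segment of segments) {
    current += '/' + segment;
    // Create the branch on first visit. Assumption: every intermediate
    // path gets its own '/' entry, even if it was never scraped itself.
    if (!Object.prototype.hasOwnProperty.call(node, segment)) {
      node[segment] = { '/': current };
    }
    node = node[segment];
  }
  return tree;
}

// Build a small tree from three example URLs.
const tree = {};
[
  'https://example.com/aPath',
  'https://example.com/aPath/subPath',
  'https://example.com/otherPath',
].forEach((url) => insertUrl(tree, url));

console.log(JSON.stringify(tree, null, 2));
```

Each path segment becomes a key, and the '/' key inside it holds the full URL for that path.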
This is how you create a sitemap:
(Options are shown in this README, excluding the filePath option.)
const SiteMapGenerator = require('oositemap')
var SiteMap = new SiteMapGenerator('https://example.com', options)
// Every time a URL is scraped
SiteMap.on('add', (url) => {console.log(`SCRAPED: ${url}`)})
// When it's done scraping
SiteMap.on('done', (generated) => {console.log(generated)})
// If it errors
SiteMap.on('error', (err) => {console.error(err)})
// Every time something gets ignored due to the "ignore" option
SiteMap.on('ignore', (url) => {console.log(`IGNORED: ${url}`)})
SiteMap.start()
SiteMap.start()
Start the scraper.
SiteMap.stop()
End the scraper (runs the done event).
(Takes at least 2500ms before it calls the event, just to make sure it's done scraping.)
SiteMap.on(/* Event */,/* Function */)
Registers a handler for an event (one per event).
SiteMap.on('add', (url) => {/* Code Here */})
Every time a URL is scraped.
SiteMap.on('done', (generated) => {/* Code Here */})
When it's done scraping.
SiteMap.on('error', (error) => {/* Code Here */})
If it errors.
SiteMap.on('ignore', (url) => {/* Code Here */})
Every time something gets ignored due to the "ignore" option.
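Since the object passed to the 'done' handler is nested, it can be handy to flatten it back into a plain list of URLs. The sketch below shows one way to do that. `collectUrls` is a hypothetical helper, not part of oositemap, and the sample object stands in for the real `generated` value, assuming it follows the format described earlier in this README.

```javascript
// Hypothetical helper, not part of oositemap: collects every URL stored
// under a '/' key in a nested sitemap object.
function collectUrls(node, found = []) {
  for (const [key, value] of Object.entries(node)) {
    if (key === '/') {
      found.push(value); // the URL of this path
    } else if (value && typeof value === 'object') {
      collectUrls(value, found); // descend into a sub-path
    }
  }
  return found;
}

// Sample data in the documented shape, standing in for `generated`.
const generated = {
  aPath: {
    '/': 'https://example.com/aPath',
    subPath: {
      '/': 'https://example.com/aPath/subPath',
    },
  },
  otherPath: {
    '/': 'https://example.com/otherPath',
  },
};

const urls = collectUrls(generated);
console.log(urls);
```

Inside a real handler this would be called as SiteMap.on('done', (generated) => {console.log(collectUrls(generated))}).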