elastic-web-crawler

1.0.2 • Public • Published

An unofficial client for the Elastic Web Crawler API. The package is isomorphic, so it can be used in both browser and Node.js environments.

Installation

Using npm:

  npm install elastic-web-crawler

Using yarn:

  yarn add elastic-web-crawler

Usage/Examples

Here is an example of a wrapper component that sets up the ElasticWebCrawler instance and passes it down to its child components:

import React, {
  PropsWithChildren,
  createContext,
  useContext,
  useMemo,
} from 'react';
import {
  ElasticWebCrawler,
  ElasticWebCrawlerRequiredArguments,
} from 'elastic-web-crawler';

export const ElasticWebCrawlerContext = createContext<ElasticWebCrawler | null>(
  null,
);

export const useElasticWebCrawler = (): ElasticWebCrawler => {
  // No type assertion here: the context value stays `ElasticWebCrawler | null`,
  // so the null check below is meaningful to the type checker.
  const elasticWebCrawler = useContext(ElasticWebCrawlerContext);

  if (!elasticWebCrawler) {
    throw new Error(
      'ElasticWebCrawler not found. Make sure to use the ElasticWebCrawlerProvider at the top level of your application.',
    );
  }
  return elasticWebCrawler;
};

export interface ElasticWebCrawlerProviderProps {
  elasticWebCrawlerArguments: ElasticWebCrawlerRequiredArguments;
}

export const ElasticWebCrawlerProvider = ({
  children,
  elasticWebCrawlerArguments,
}: PropsWithChildren<ElasticWebCrawlerProviderProps>) => {
  const elasticWebCrawler = useMemo(
    () => new ElasticWebCrawler(elasticWebCrawlerArguments),
    [elasticWebCrawlerArguments],
  );

  return (
    <ElasticWebCrawlerContext.Provider value={elasticWebCrawler}>
      {children}
    </ElasticWebCrawlerContext.Provider>
  );
};

Then, at the top level of your application, wrap your components with the ElasticWebCrawlerProvider component and pass in the constructor arguments; the provider creates the client instance from them. Any component that needs the ElasticWebCrawler instance can get it through the useElasticWebCrawler hook.

import {
  ElasticWebCrawlerProvider,
  useElasticWebCrawler,
} from './ElasticWebCrawlerContext';

const App = () => (
  <ElasticWebCrawlerProvider
    elasticWebCrawlerArguments={{
      engineName: 'engineName',
      baseUrl: 'baseUrl',
      token: 'token',
    }}
  >
    <YourComponent />
  </ElasticWebCrawlerProvider>
);

const YourComponent = () => {
  const elasticWebCrawler = useElasticWebCrawler();
  // use the crawler instance here
  return null; // a component must return something renderable (or null)
};
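
One thing to watch in the App example: the provider memoizes the client on the identity of elasticWebCrawlerArguments, and an object literal written inline in JSX is a new object on every render. The sketch below illustrates the effect without React. It assumes dependencies are compared with Object.is, which is how React documents its dependency comparison; depsEqual and makeArgs are hypothetical names used only for this illustration.

```typescript
// Hypothetical helper: compares two dependency arrays element by element
// with Object.is, mirroring how React decides whether to recompute useMemo.
const depsEqual = (a: unknown[], b: unknown[]): boolean =>
  a.length === b.length && a.every((value, i) => Object.is(value, b[i]));

// Builds the same placeholder arguments used in the examples above.
const makeArgs = () => ({
  engineName: 'engineName',
  baseUrl: 'baseUrl',
  token: 'token',
});

// Inline literal: each "render" produces a fresh object, so the dependency
// is considered changed and the client would be constructed again.
const inlineChanged = !depsEqual([makeArgs()], [makeArgs()]);

// Stable reference: created once (for example at module scope) and reused,
// so the dependency is considered unchanged between renders.
const STABLE_ARGS = makeArgs();
const stableChanged = !depsEqual([STABLE_ARGS], [STABLE_ARGS]);

console.log({ inlineChanged, stableChanged });
```

In practice this suggests defining the arguments object once outside the component (or memoizing it) and passing that reference to ElasticWebCrawlerProvider.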

Additionally, you can use it in Node.js.

// ElasticWebCrawler is a named export (see the import in the React example above).
const { ElasticWebCrawler } = require('elastic-web-crawler');

(async () => {
  const elasticWebCrawler = new ElasticWebCrawler({
    engineName: 'engineName',
    baseUrl: 'baseUrl',
    token: 'token',
  });

  const crawler = await elasticWebCrawler.crawler();
})();
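
In a Node.js script you will probably not want to hard-code the token. The following is a minimal sketch of one way to assemble the three constructor arguments from environment variables. The CrawlerArguments interface is a local stand-in that only mirrors the fields shown in the examples above (the package's own ElasticWebCrawlerRequiredArguments type may differ), and the variable names ELASTIC_ENGINE_NAME, ELASTIC_BASE_URL and ELASTIC_TOKEN are assumptions, not something the package defines.

```typescript
// Local stand-in for the fields used in the examples above (assumption).
interface CrawlerArguments {
  engineName: string;
  baseUrl: string;
  token: string;
}

type Env = Record<string, string | undefined>;

// Returns the named variable, or throws if it is missing or blank.
const requireVar = (env: Env, name: string): string => {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
};

// Hypothetical variable names; adjust them to your own deployment.
const loadCrawlerArguments = (env: Env): CrawlerArguments => ({
  engineName: requireVar(env, 'ELASTIC_ENGINE_NAME'),
  baseUrl: requireVar(env, 'ELASTIC_BASE_URL'),
  token: requireVar(env, 'ELASTIC_TOKEN'),
});
```

The result of loadCrawlerArguments(process.env) could then be passed to new ElasticWebCrawler(...) in place of the placeholder strings.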

Appendix

This package is still under development. The following features are planned for future updates:

1. Crawl rules
2. Entry points
3. User Agent
4. Extracting content from a URL
5. Tracing a URL

🔗 Links

github

License

MIT

Weekly Downloads

5

Unpacked Size

362 kB

Total Files

157

Collaborators

  • armantakmazyan