Early development stage: this project is still under early development and many necessary features are not done yet. Use it at your own risk.
A Node.js version of Spark, without Hadoop or the JVM.
You should read the tutorial first; then you can learn Spark, but using this project instead.
Any API that takes an RDD and generates a result is async, like `count`, `take`, `max`, ...

Any API that creates an RDD is a deferred API, which is not async, so you can chain them like this:
```ts
await dcc
  .parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) // deferred: creates an RDD
  .map(v => v + 1)                              // deferred
  .filter(v => v % 2 === 0)                     // deferred
  .take(10); // take is not a deferred API, but it is async
```
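As one more minimal sketch, reusing the `dcc` context from the sample above: the deferred calls only describe the pipeline, and nothing runs until an async result API such as `count` (named above) is awaited.

```ts
// Deferred APIs only build the pipeline; no computation happens here.
const evens = dcc
  .parallelize([1, 2, 3, 4, 5, 6])
  .filter(v => v % 2 === 0);

// count is an async result API: it triggers the work and returns a promise.
console.log(await evens.count()); // 3
```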
- [x] local master.
- [x] rdd & partition creation & release.
- [x] map & reduce
- [x] repartition & reduceByKey
- [x] disk storage partitions
- [x] cache (see the sketch after this list)
- [x] file loader & saver
- [x] export module to npm
- [x] decompressor & compressor
- [x] use the `debug` module for info/error output
- [x] provide a progress bar.
- [ ] sampler
- [x] sort
- [ ] object hash (for keys) method
- [ ] MEMORY_OR_DISK storage, and use it in sort
- [ ] MEMORY_SER storage: in memory but off the V8 heap
- [ ] configurable default partition count
- [ ] distributed master
- [ ] runtime sandbox
- [ ] plugin system
- [ ] remote dependency management
- [ ] Aliyun OSS loader
- [ ] HDFS loader
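Since cache is checked off above, here is a hedged sketch of what it enables; the `cache()` method name is an assumption modeled on Spark's API and may differ in this project:

```ts
// Assumption: RDDs expose a Spark-like cache() so computed partitions
// are stored and reused instead of being recomputed for every action.
const squares = dcc
  .parallelize([1, 2, 3, 4, 5])
  .map(v => v * v)
  .cache();

console.log(await squares.count()); // first action computes and caches
console.log(await squares.max());   // later actions reuse the cached partitions
```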
```bash
npm install -g dcf
# or
yarn global add dcf
```
Then you can use the `dcf-shell` command.
```bash
npm install --save dcf
# or
yarn add dcf
```
Then you can use dcf with JavaScript or TypeScript.
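As a minimal sketch, a program using dcf as a library could look like the following; the import and context construction are assumptions (the exported names may differ, so consult the tutorial), while the chained calls mirror the API described above:

```ts
// Hypothetical import: the actual exported names may differ.
import { Context } from 'dcf';

async function main() {
  // Assumption: a local context is constructed like this, as in the samples.
  const dcc = new Context();

  const result = await dcc
    .parallelize(['a', 'b', 'a'])
    .map((v): [string, number] => [v, 1])
    .reduceByKey((a, b) => a + b) // assumption: Spark-like reduce over values
    .take(10); // async result API

  console.log(result); // e.g. [['a', 2], ['b', 1]]
}

main();
```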
Download this repo and install the dependencies:
```bash
npm install
# or
yarn
```
Run samples:
```bash
npm run ts-node src/samples/tutorial-0.ts
npm run ts-node src/samples/repartition.ts
```
Run the interactive CLI:
```bash
npm start
```