Nodestream
Streaming library for binary data transfers
Description
This library aims to provide a unified API for the major storage systems (filesystem, AWS S3, Google Cloud Storage, etc.). It also provides an easy way to manipulate data streams as they are uploaded to or downloaded from those storage systems (compression, checksum calculation, encryption, etc.).
Use cases
- Single API to rule them all
- Easy way to transform incoming/outgoing data
- Work with filesystem storage during development and AWS S3 in production, without changing code
- Insert your idea here
Available adapters
Amazon S3 | GridFS | Google Cloud Storage | Local Filesystem |
---|---|---|---|
Available transforms
See the Pipelines and Transforms section for more info.
checksum | compress | progress | crypto (WIP) |
---|---|---|---|
Checksum Calculator | Stream (de)compressor | Progress monitor | Stream (en/de)cryption |
Usage
Installation
The first step is to install nodestream into your project:
npm install --save nodestream
The next thing is to decide which adapter you want to use. An adapter is the interface through which nodestream interacts with a particular storage system. Let's use the local filesystem to start with:
npm install --save nodestream-filesystem
Configuration
Let's create and configure a nodestream instance with which your application can then interact:
    // Require the main Nodestream class
    const Nodestream = require('nodestream')
    const nodestream = new Nodestream({
      // This tells nodestream which storage system it should interact with
      // Under the hood, it will try to require `nodestream-filesystem` module
      adapter: 'filesystem',
      // This object is always specific to your adapter of choice - always check
      // the documentation for that adapter for available options
      config: {
        // The `filesystem` adapter requires a `root` configuration option, so let's provide one
        root: [__dirname, '.storage']
      }
    })
Great! At this point, nodestream is ready to transfer some bytes!
Actions
Uploading
You can upload any kind of readable stream. Nodestream does not care where that stream comes from, whether it's an HTTP upload, a file from your filesystem or something totally different.
For this example, we will upload a file from our filesystem.
We will be uploading the file to our local filesystem as well as reading it from the same filesystem. Normally you would probably use a source different from the target storage, but Nodestream does not really care.
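    const fs = require('fs')
    // This is the file we will upload - create a readable stream of that file
    const profilePic = fs.createReadStream('/path/to/picture.png') // placeholder path

    // Per-file options go in the second argument
    nodestream.upload(profilePic, {})
      .then(results => {
        // Save the location - you will need it to retrieve the file later
        console.log(results.location)
      })
      .catch(err => {
        console.error(err)
      })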
Congratulations, you just uploaded your first file!
Downloading
Downloading a file is quite straightforward - all you need is the file's location as returned by the upload() method and a destination stream to which you want to send the data. This can be any valid writable stream. Again, Nodestream does not care where you are sending the bytes, be it the local filesystem, an HTTP response or even a different Nodestream instance (e.g. an S3 to GridFS transfer).
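    // Let's create a destination for the download
    const fs = require('fs')
    const destination = fs.createWriteStream('/path/to/downloaded-picture.png') // placeholder path

    // We are hardcoding the location here, but you will probably want to
    // retrieve the file's location from a database
    nodestream.download('avatars/user-123.png', destination) // placeholder location
      .then(() => {
        // The destination has received all the data
      })
      .catch(err => {
        console.error(err)
      })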
Removing
Just pass the file's location to the .remove() method.
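    nodestream.remove('avatars/user-123.png') // placeholder location
      .then(() => {
        // The file at this location has been removed
      })
      .catch(err => {
        console.error(err)
      })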
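To make the contract of the three actions concrete, here is a minimal in-memory stand-in written with Node's standard library only. It is not Nodestream's implementation: `MemoryStorage`, its `name` option and the generated locations are hypothetical. It only mirrors what this section describes: upload accepts any readable stream and resolves with a `location`, download sends the bytes to any writable stream, and remove takes a location.

```javascript
const { Readable, Writable } = require('stream')
const { pipeline: pipeStreams } = require('stream/promises')

// Hypothetical in-memory stand-in; NOT how Nodestream or its adapters work internally.
class MemoryStorage {
  constructor () {
    this.files = new Map()
  }

  // Consumes any readable stream and resolves with an object holding `location`
  async upload (source, options = {}) {
    const location = options.name || `file-${this.files.size + 1}`
    const chunks = []
    for await (const chunk of source) chunks.push(Buffer.from(chunk))
    this.files.set(location, Buffer.concat(chunks))
    return { location }
  }

  // Sends the stored bytes to any writable stream
  async download (location, destination) {
    if (!this.files.has(location)) throw new Error(`Not found: ${location}`)
    await pipeStreams(Readable.from([this.files.get(location)]), destination)
  }

  // Deletes the stored bytes and resolves with the removed location
  async remove (location) {
    this.files.delete(location)
    return location
  }
}

async function demo () {
  const storage = new MemoryStorage()
  const source = Readable.from(['hello ', 'world'])
  const { location } = await storage.upload(source, { name: 'greeting.txt' })

  // Collect the downloaded bytes in memory
  const received = []
  const destination = new Writable({
    write (chunk, encoding, callback) {
      received.push(chunk)
      callback()
    }
  })
  await storage.download(location, destination)
  await storage.remove(location)

  return {
    location,
    text: Buffer.concat(received).toString(),
    remaining: storage.files.size
  }
}

const demoResult = demo()
demoResult.then(result => console.log(result))
```

Because the source and destination are plain streams, the same calling code would work unchanged with a file stream or an HTTP request/response in their place.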
Pipelines and Transforms
Nodestream supports two features which are meant to be used together - pipelines and transforms.
- Transform: A plugin which takes an input stream and produces an output stream
- Pipeline: A re-usable, ordered collection of transforms
The real power of pipelines is that you only have to create a single pipeline, tell it which transforms it should use and then you just keep sending files to or retrieving files from it - all files will be processed in exactly the same way.
Here are some ideas for what a transform can be used for. With pipelines, you can combine them however you like:
- Calculating checksums
- Compressing/decompressing data
- Modifying the data completely, e.g. appending headers or footers
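As an illustration of the "input stream in, output stream out" idea, the sketch below builds a checksum calculator on top of Node's built-in `stream.Transform`. It forwards every chunk unchanged while hashing it. The class name `ChecksumSketch`, the `checksumOf` helper and the choice of SHA-256 are assumptions for this example; the actual Nodestream transform interface may look different, so treat this as a generic stream example rather than a Nodestream plugin.

```javascript
const crypto = require('crypto')
const { Readable, Transform, Writable } = require('stream')
const { pipeline: pipeStreams } = require('stream/promises')

// Pass-through transform: forwards each chunk untouched while feeding it to a hash.
class ChecksumSketch extends Transform {
  constructor (algorithm = 'sha256') {
    super()
    this.hash = crypto.createHash(algorithm)
    this.value = null
  }

  _transform (chunk, encoding, callback) {
    this.hash.update(chunk)
    callback(null, chunk)
  }

  // Called once all input has been seen - the digest is final at this point
  _flush (callback) {
    this.value = this.hash.digest('hex')
    callback()
  }
}

// Hypothetical helper: streams `text` through the transform and resolves with the digest
async function checksumOf (text) {
  const checksum = new ChecksumSketch()
  const sink = new Writable({
    write (chunk, encoding, callback) {
      callback() // discard the data, we only care about the checksum here
    }
  })
  await pipeStreams(Readable.from([text]), checksum, sink)
  return checksum.value
}

const checksumResult = checksumOf('hello world')
checksumResult.then(value => console.log(value))
```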
Registering a transform
All transforms must be registered with your Nodestream instance before you can use them in a pipeline. Register them right after your application starts: requiring a module is a synchronous, blocking operation, so it is best done before the application begins handling real work.
Once you configure your Nodestream instance, you can register a transform using the .registerTransform() function.
    // Let's register a compression transform! The following will try to require
    // `nodestream-transform-compress` package.
    nodestream.registerTransform('compress')

    // You can also register an actual implementation of the transform!
    const compress = require('nodestream-transform-compress')
    nodestream.registerTransform(compress)
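The package names mentioned in this README follow a simple pattern: the `filesystem` adapter lives in `nodestream-filesystem` and the `compress` transform in `nodestream-transform-compress`. The two helpers below are hypothetical and only spell that convention out; they are not part of Nodestream's API.

```javascript
// Hypothetical helpers that mirror the package naming convention used in this README.
const adapterPackage = name => `nodestream-${name}`
const transformPackage = name => `nodestream-transform-${name}`

console.log(adapterPackage('filesystem')) // nodestream-filesystem
console.log(transformPackage('compress')) // nodestream-transform-compress
```

This is handy when working out which package to install before passing a short name to the constructor or to .registerTransform().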
Using pipelines
To use a pipeline, you must first create one! Once you have your pipeline, you can then go on and tell it to use any of the registered transforms. Pipelines are reusable, so the general practice is to create one pipeline and use it for all uploads/downloads.
You may want to create multiple pipelines per project to accommodate different processing needs for your files, e.g. you might have one pipeline for image uploads (with a transform plugin to calculate checksums and one to crop the images) and another pipeline for other files (with just the checksum transform). Any combination can be achieved.
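    // Let's create our first pipeline
    const pipeline = nodestream.pipeline()

    // Now, we can tell the pipeline to use any of the registered transforms
    // The second parameter is specific to each transform so always check the
    // transform's docs to see what you can set
    // (the option names and values below are illustrative)
    pipeline
      .use('checksum', { algorithm: 'md5' })
      .use('compress', { algorithm: 'gzip' })

    // You can use a single pipeline for multiple file uploads/downloads
    // Aaand, you can also pass per-file, transform-specific options here
    // (`file` is any readable stream)
    pipeline.upload(file, { compress: { mode: 'compress' } })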
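The idea of a reusable, ordered collection of transforms can be sketched with plain Node streams. In the example below, `PipelineSketch`, its `process()` method and the `upperCase` transform are all hypothetical; this is not Nodestream's pipeline class. The sketch stores factories instead of stream instances because a Node stream can only be consumed once, so every file needs a fresh set of transforms.

```javascript
const zlib = require('zlib')
const { Readable, Transform, Writable } = require('stream')
const { pipeline: pipeStreams } = require('stream/promises')

// Hypothetical sketch of a reusable, ordered collection of transforms.
class PipelineSketch {
  constructor () {
    this.factories = []
  }

  // Each factory must return a fresh stream, because streams are single-use
  use (factory) {
    this.factories.push(factory)
    return this
  }

  // Sends `source` through every transform, in the order they were added
  async process (source) {
    const chunks = []
    const collector = new Writable({
      write (chunk, encoding, callback) {
        chunks.push(chunk)
        callback()
      }
    })
    const transforms = this.factories.map(create => create())
    await pipeStreams(source, ...transforms, collector)
    return Buffer.concat(chunks)
  }
}

// A toy transform, used here only to make the effect visible
const upperCase = () => new Transform({
  transform (chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase())
  }
})

// One pipeline, configured once...
const sketch = new PipelineSketch()
  .use(upperCase)
  .use(() => zlib.createGzip())

// ...and reused for several files
const processed = Promise.all([
  sketch.process(Readable.from(['first file'])),
  sketch.process(Readable.from(['second file']))
]).then(buffers => buffers.map(buffer => zlib.gunzipSync(buffer).toString()))

processed.then(texts => console.log(texts))
```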
WARNING!
The order in which transforms are added to a pipeline using .use() matters! Transforms are applied to the stream in the order they were added. For example, if you first add a checksum calculation transform and then a compress transform, the checksum will be calculated from the uncompressed data. Switching the order would cause the checksum to be calculated from the compressed data. There are use cases for both situations, so the choice of ordering is completely up to you.
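The effect of ordering is easy to demonstrate with Node's built-in `crypto` and `zlib` modules. The sketch below uses in-memory buffers instead of streams to keep it short, and SHA-256 and gzip are arbitrary choices for the example; it does not use Nodestream itself.

```javascript
const crypto = require('crypto')
const zlib = require('zlib')

const sha256 = data => crypto.createHash('sha256').update(data).digest('hex')
const data = Buffer.from('hello world '.repeat(100))

// Checksum first, then compress: the hash describes the original bytes
const checksumThenCompress = {
  checksum: sha256(data),
  stored: zlib.gzipSync(data)
}

// Compress first, then checksum: the hash describes the compressed bytes
const compressed = zlib.gzipSync(data)
const compressThenChecksum = {
  checksum: sha256(compressed),
  stored: compressed
}

console.log(checksumThenCompress.checksum === compressThenChecksum.checksum) // false
```

With the first ordering you can verify a file after decompressing it; with the second you can verify the stored bytes without decompressing them.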
Accessing transform results
You might have noticed that Nodestream returns a Promise which is resolved with a results object. In addition to the location property, this object also contains the results of all applied transforms (if there are any).
Each transform declares its "identity" - a string which is used as a "scope" under which the transform publishes its results and under which you can customise the transform's options. Check each transform's documentation to learn its identity string.
For the checksum transform, its identity is, unsurprisingly, "checksum". When used, the calculated checksum can be obtained as follows:
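    // `file` is any readable stream
    nodestream.upload(file, {})
      .then(results => {
        // The transform's results are published under its identity
        console.log(results.checksum)
      })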
There is no limit to the number of transforms which can be registered per Nodestream instance, although there are some practical limitations which restrict you to only one transform of each type per Nodestream instance (i.e. only one checksum transform and only one compress transform).
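One way to picture identity-scoped results is a plain object keyed by identity. The sketch below is an illustration under that assumption, not a description of Nodestream's internals: `collectResults` is hypothetical and all values are placeholders. It also shows why, in such a scheme, two transforms sharing an identity would collide.

```javascript
// Hypothetical: merges each transform's output under its identity string.
function collectResults (location, transformOutputs) {
  const results = { location }
  for (const { identity, output } of transformOutputs) {
    results[identity] = output // a repeated identity overwrites the earlier entry
  }
  return results
}

// Placeholder values, for illustration only
const sampleResults = collectResults('avatars/user-123.png', [
  { identity: 'checksum', output: { value: 'abc123' } },
  { identity: 'compress', output: { algorithm: 'gzip' } }
])
console.log(sampleResults.checksum.value)

// Two transforms with the same identity: only the last output survives
const collided = collectResults('avatars/user-123.png', [
  { identity: 'checksum', output: { value: 'first' } },
  { identity: 'checksum', output: { value: 'second' } }
])
console.log(collided.checksum.value)
```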
Contributing
Looking for suggestions, improvements, bug reports... any kind of contribution is welcome!
License
This software is licensed under the BSD-3-Clause License. See the LICENSE file for more information.