WowYow Vision Node.js SDK
```sh
npm install '@wowyow/vision-node'
```
```js
const Vision = require('@wowyow/vision-node');
```

A WowYow API key can be obtained from WowYow AI Studio.

```js
let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc');
```
There are several ways of initializing the process. As soon as the Vision class is instantiated, processing begins. The entire process consists of several subprocesses, depending on the file type or link: upload, frame extraction & detection.
From a link:

```js
let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc');
```

Supported sites: YouTube, Dailymotion, Vimeo, Discovery, FOX, Instagram, Facebook & more.
From a local file:

```js
let vision = new Vision('./people_dancing.mp4');
```

Only mp4 & webm file types are supported.
The SDK offers a range of configuration options to optimize performance or help with debugging. The configuration object is passed as the second argument to the Vision constructor.
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| models | Array | ['DETECT_SCENE', 'DETECT_PEOPLE'] | List of models to use. Available models: DETECT_SCENE, DETECT_PEOPLE, DETECT_CLOTHING |
| fps | Integer | 4 | Processing FPS. Values can be between 4 and 30. This option is ignored if a stream is used. See RTSP. |
| target | String | '' | URL of the machine used for processing |
| directConnection | Boolean | false | Whether or not the client should establish a direct connection to the target |
| preview | Boolean | false | If true, response data will contain preview images with a skeleton for the DETECT_PEOPLE model and index numbers for pathing. |
| logPerf | Boolean | false | If true, performance metrics will be logged to the console. |
| skipFrames | Boolean | true | Used only for streams. If true, processing will skip as many frames as necessary to keep up with video playback. If false, frames will be queued. |
| footage | Object | | Include to optimize detection for a specific camera type. |
| footage.lens | String | 'standard' | Type of camera lens. Options are: standard, fisheye, wide |
| footage.mod | String | '' | Video modification applied. Options are: dewarped |
| footage.movement | String | '' | Movement of the camera. Options are: static, handheld, pan |
| ffmpeg | String | '' | If you are experiencing issues with RTSP streams, include the path to your own ffmpeg binary via this parameter. If ffmpeg is available in your system PATH, you can simply pass 'ffmpeg'. For more help with ffmpeg installation, visit the Official FFMPEG Download Page. |
```js
let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc', {
    models: ['DETECT_SCENE', 'DETECT_PEOPLE']
});
```
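For reference, a fuller configuration sketch combining several of the options above; the option names come from the table, while the specific values are illustrative assumptions only:

```js
// Illustrative configuration only: option names are from the table above,
// the values are assumptions for demonstration.
let vision = new Vision('./people_dancing.mp4', {
    models: ['DETECT_SCENE', 'DETECT_PEOPLE', 'DETECT_CLOTHING'],
    fps: 10,              // processing FPS (ignored for streams)
    preview: true,        // include preview images with model overlays
    logPerf: true,        // log performance metrics to the console
    footage: {
        lens: 'wide',
        movement: 'handheld'
    }
});
```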
The Vision instance emits the following events:

```js
vision.on('progress', ({ event, progress }) => {}); // events are: upload, frame-extraction & detection
vision.on('data', (data) => {});                    // see the Data Schema section for details
vision.on('start', () => {});
vision.on('pause', () => {});
vision.on('resume', () => {});
vision.on('end', () => {});
vision.on('error', (err) => {});
vision.on('metadata', (metadata) => {});            // see the Metadata section for details
vision.on('stream', (stream) => {});                // see the Stream section for details
```
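A minimal sketch of wiring a few of these events together; the logging itself is illustrative:

```js
// Log each subprocess's progress and basic lifecycle events.
vision.on('progress', ({ event, progress }) => {
    console.log(`${event}: ${progress}`);
});
vision.on('start', () => console.log('processing started'));
vision.on('end', () => console.log('processing finished'));
vision.on('error', (err) => console.error('processing failed:', err));
```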
Processing can be controlled and inspected with the following methods and properties:

```js
vision.pause();
vision.resume();
vision.stop();

vision.started;
vision.ended;
vision.paused;
vision.metadata;
vision.stream;
```
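A small control-flow sketch, assuming a hypothetical requirement to pause for five seconds shortly after processing starts:

```js
// Pause right after processing starts, then resume five seconds later.
// pause()/resume() and the started/paused/ended properties are documented above;
// the timing itself is purely illustrative.
vision.on('start', () => {
    vision.pause();
    console.log('paused:', vision.paused); // true

    setTimeout(() => {
        vision.resume();
        console.log('running again:', vision.started && !vision.ended);
    }, 5000);
});
```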
The Vision class also exposes static helpers:

```js
// check if there is an available server
let available = Vision.available({
    target: ''
});
console.log(available); // true, false

// fetch the complete list of algorithms
let algorithms = Vision.algorithms({
    target: ''
});
console.log(algorithms); // list of algorithms

// fetch available nodes
let nodes = Vision.nodes({
    target: ''
});
console.log(nodes); // list of nodes available to be used in algorithms
```
## Data Schema

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| data | Object | Top-level object | |
| data.{model} | Object | | |
| data.{model}.mediaId | String | WowYow Media Identifier | |
| data.{model}.frame | Integer | Frame number | |
| data.{model}.timestamp | Decimal | Frame timestamp | * Not used in RTSP |
| data.{model}.width | Integer | Frame width | |
| data.{model}.height | Integer | Frame height | |
| data.{model}.model | String | Model name | |
| data.{model}.predictions | Array | List of predictions generated by this model. | * Check [Model Schema](#model-schema) to see the schema for each model. |
| data.{model}.source | Object | | |
| data.{model}.source.base64 | String | Base64 link of frame | |
| data.{model}.source.jpeg | String | JPEG link of frame | |
| data.{model}.preview | Object | | * Included only if the preview config is used. |
| data.{model}.preview.base64 | String | Base64 link of frame including model overlay information | |
| data.{model}.preview.jpeg | String | JPEG link of frame including model overlay information | |
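A sketch of reading these fields inside the data handler; field names follow the schema above, and the keys of the top-level object are assumed to be the model names:

```js
// Log basic frame information and the frame's JPEG link for every model result.
vision.on('data', (data) => {
    for (const result of Object.values(data)) {
        console.log(`[${result.model}] frame ${result.frame} at ${result.timestamp}`);
        console.log(`  size: ${result.width}x${result.height}`);
        console.log(`  predictions: ${result.predictions.length}`);
        console.log(`  jpeg: ${result.source.jpeg}`);
    }
});
```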
## Metadata

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| duration | Number | Duration of the video in seconds. In the case of an RTSP stream the value is Infinity. | |
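A short sketch of reading the duration; the Infinity check mirrors the note above:

```js
vision.on('metadata', (metadata) => {
    if (metadata.duration === Infinity) {
        console.log('live RTSP stream, no fixed duration');
    } else {
        console.log(`video duration: ${metadata.duration} seconds`);
    }
});
```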
## Stream

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| url | String | URL of the streamable video file on the server, which can be added directly as the source of an HTML video element | |
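A sketch of consuming the stream event; assigning the URL to a video element would happen in a browser context, shown here only as a comment:

```js
vision.on('stream', (stream) => {
    console.log('streamable video available at:', stream.url);
    // In a browser, the URL can be used directly as the element source:
    // document.querySelector('video').src = stream.url;
});
```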
## Model Schema

### DETECT_SCENE

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction.index | Integer | Scene identifier | * Do not expect index to always be incremented by 1 |
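A sketch that uses the scene identifier to report scene changes; it assumes the key in the data payload matches the model name DETECT_SCENE:

```js
// Report whenever the scene identifier changes between processed frames.
let lastSceneIndex = null;

vision.on('data', (data) => {
    const scene = data.DETECT_SCENE;
    if (!scene || scene.predictions.length === 0) return;

    const { index } = scene.predictions[0];
    if (lastSceneIndex !== null && index !== lastSceneIndex) {
        console.log(`scene change at frame ${scene.frame}`);
    }
    lastSceneIndex = index;
});
```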
### DETECT_PEOPLE

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction | Object | Top-level object | |
| prediction.score | Decimal | | |
| prediction.index | Integer | Person identifier used for pathing | * Do not expect index to always be incremented by 1 |
| prediction.keypoints | Array | List of keypoints | |
| prediction.keypoints[].part | String | Name of the body part. Options are: nose, leftEye, rightEye, leftEar, rightEar, leftShoulder, rightShoulder, leftElbow, rightElbow, leftWrist, rightWrist, leftHip, rightHip, leftKnee, rightKnee, leftAnkle, rightAnkle | |
| prediction.keypoints[].position | Object | | |
| prediction.keypoints[].position.x | Decimal | Position of the keypoint on the x axis | |
| prediction.keypoints[].position.y | Decimal | Position of the keypoint on the y axis | |
| prediction.segments | Object | Object containing each person segment | |
| prediction.segments.body | Object | | |
| prediction.segments.body.bbox | Object | Bounding box of the person's body segment | |
| prediction.segments.body.bbox.x0 | Integer | Left offset of the top-left corner | * Check BBox. |
| prediction.segments.body.bbox.y0 | Integer | Top offset of the top-left corner | * Check BBox. |
| prediction.segments.body.bbox.x1 | Integer | Left offset of the bottom-right corner | * Check BBox. |
| prediction.segments.body.bbox.y1 | Integer | Top offset of the bottom-right corner | * Check BBox. |
| prediction.segments.body.preview | Object | | * Included only if the preview config is used. |
| prediction.segments.body.preview.base64 | String | Base64 link of segment cut-out. | |
| prediction.segments.body.preview.jpeg | String | JPEG link of segment cut-out. | |
| prediction.segments.face | Object | | |
| prediction.segments.face.bbox | Object | Bounding box of the person's face segment | |
| prediction.segments.face.bbox.x0 | Integer | Left offset of the top-left corner | * Check BBox. |
| prediction.segments.face.bbox.y0 | Integer | Top offset of the top-left corner | * Check BBox. |
| prediction.segments.face.bbox.x1 | Integer | Left offset of the bottom-right corner | * Check BBox. |
| prediction.segments.face.bbox.y1 | Integer | Top offset of the bottom-right corner | * Check BBox. |
| prediction.segments.face.preview | Object | | * Included only if the preview config is used. |
| prediction.segments.face.preview.base64 | String | Base64 link of segment cut-out. | |
| prediction.segments.face.preview.jpeg | String | JPEG link of segment cut-out. | |
| prediction.position | String | Human position detection. Options are: sitting, standing | |
| prediction.clothing | Object | Object containing clothing | * Only if the DETECT_CLOTHING model is used. Schema is the same as that model's main schema. |
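A sketch of consuming DETECT_PEOPLE predictions; it assumes the key in the data payload matches the model name and that (x0, y0) / (x1, y1) are the top-left and bottom-right corners of the bounding box:

```js
// Count people per frame and derive a body bounding-box size for each person.
vision.on('data', (data) => {
    const people = data.DETECT_PEOPLE;
    if (!people) return;

    console.log(`frame ${people.frame}: ${people.predictions.length} people`);

    for (const person of people.predictions) {
        const { x0, y0, x1, y1 } = person.segments.body.bbox;
        const nose = person.keypoints.find((kp) => kp.part === 'nose');

        console.log(`  person ${person.index}: ${person.position}, body ${x1 - x0}x${y1 - y0}` +
                    (nose ? `, nose at (${nose.position.x}, ${nose.position.y})` : ''));
    }
});
```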
### DETECT_CLOTHING

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction | Object | Top-level object | |
| prediction.score | Decimal | | |
| prediction.index | Integer | Clothing identifier used for pathing | * Do not expect index to always be incremented by 1 |
| prediction.type | String | Type of the clothing. Options are: Leggings, Jodhpurs, Capris, Shorts, Jeans, Joggers, Skirt, Gauchos, Culottes, Sweatshorts, Trunks, Cutoffs, Sarong, Sweatpants, Chinos, Halter, Hoodie, Henley, Parka, Cardigan, Tank, Bomber, Peacoat, Top, Poncho, Button-Down, Anorak, Sweater, Blouse, Turtleneck, Blazer, Jacket, Jersey, Tee, Flannel, Jeggings | |
| prediction.bbox | Object | Bounding box of the clothing item | |
| prediction.bbox.x0 | Integer | Left offset of the top-left corner | * Check BBox. |
| prediction.bbox.y0 | Integer | Top offset of the top-left corner | * Check BBox. |
| prediction.bbox.x1 | Integer | Left offset of the bottom-right corner | * Check BBox. |
| prediction.bbox.y1 | Integer | Top offset of the bottom-right corner | * Check BBox. |
| prediction.preview | Object | | * Included only if the preview config is used. |
| prediction.preview.base64 | String | Base64 link of segment cut-out. | |
| prediction.preview.jpeg | String | JPEG link of segment cut-out. | |
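A sketch that tallies detected clothing types per frame, again assuming the payload key matches the model name DETECT_CLOTHING:

```js
// Tally clothing types seen in each processed frame.
vision.on('data', (data) => {
    const clothing = data.DETECT_CLOTHING;
    if (!clothing) return;

    const counts = {};
    for (const item of clothing.predictions) {
        counts[item.type] = (counts[item.type] || 0) + 1;
    }
    console.log(`frame ${clothing.frame} clothing:`, counts);
});
```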
## RTSP

RTSP processing differs from processing videos with a fixed duration. Several things need to be kept in mind when using it (a configuration sketch follows this list):

- Response data frame and timestamp use the start of processing as the reference point, both starting from 0.
- By default, frames are dropped whenever processing lags behind video playback for any reason, whether because of a low processing FPS or network issues.
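A minimal sketch of processing an RTSP stream under the options documented above; it assumes an RTSP URL can be passed to the constructor like the link and file examples earlier, and the URL itself is a placeholder:

```js
// The RTSP URL is a placeholder. skipFrames keeps processing in sync with playback
// (the default), and ffmpeg points at the binary on the system PATH.
let vision = new Vision('rtsp://camera.example.com/live', {
    models: ['DETECT_PEOPLE'],
    skipFrames: true,
    ffmpeg: 'ffmpeg'
});

vision.on('metadata', (metadata) => {
    console.log(metadata.duration); // Infinity for RTSP streams
});
```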
