WowYow Vision Node.js SDK
```sh
npm install '@wowyow/vision-node'
```
```js
const Vision = require('@wowyow/vision-node');
```

A WowYow API key can be obtained from WowYow AI Studio.

```js
let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc');
```
There are several ways of initializing the process. As soon as the Vision class is instantiated, processing begins. The entire process consists of several subprocesses, depending on the file type or link: upload, frame extraction & detection.
From a link:

```js
let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc');
```

Supported sites: YouTube, Dailymotion, Vimeo, Discovery, FOX, Instagram, Facebook & more.
From a local file:

```js
let vision = new Vision('./people_dancing.mp4');
```

Only mp4 & webm file types are supported.
The SDK offers a range of configuration options to optimize performance or help with debugging. The configuration object is passed as the second argument to the Vision constructor.
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| models | Array | ['DETECT_SCENE', 'DETECT_PEOPLE'] | List of models to use. Available models: DETECT_SCENE, DETECT_PEOPLE, DETECT_CLOTHING |
| fps | Integer | 4 | Processing FPS. Values can be between 4 and 30. This option is ignored if a stream is used. See RTSP. |
| target | String | '' | URL of the machine used for processing |
| directConnection | Boolean | false | Whether or not the client should establish a direct connection to the target |
| preview | Boolean | false | If true, response data will contain preview images with a skeleton for the DETECT_PEOPLE model and index numbers for pathing. |
| logPerf | Boolean | false | If true, performance metrics will be logged to the console. |
| skipFrames | Boolean | true | Used only for streams. If true, processing will skip as many frames as necessary to keep up with video playback. If false, frames will be queued. |
| footage | Object | | Include to optimize detection for a specific camera type. |
| footage.lens | String | 'standard' | Type of camera lens. Options are: standard, fisheye, wide |
| footage.mod | String | '' | Video modification applied. Options are: dewarped |
| footage.movement | String | '' | Movement of the camera. Options are: static, handheld, pan |
| ffmpeg | String | '' | If you are experiencing issues with RTSP streams, include the path to your own ffmpeg binary via this parameter. If ffmpeg is available in your system PATH, you can simply pass 'ffmpeg'. For more help with ffmpeg installation, visit the Official FFMPEG Download Page. |
```js
let vision = new Vision('https://www.youtube.com/watch?v=IgiB3dyxUvc', {
    models: ['DETECT_SCENE', 'DETECT_PEOPLE']
});
```
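For reference, a fuller configuration sketch combining several of the options above; the option names come from the table, while the specific values are illustrative assumptions only:

```js
// Illustrative configuration only: option names are from the table above,
// the values are assumptions for demonstration.
let vision = new Vision('./people_dancing.mp4', {
    models: ['DETECT_SCENE', 'DETECT_PEOPLE', 'DETECT_CLOTHING'],
    fps: 10,              // processing FPS (ignored for streams)
    preview: true,        // include preview images with model overlays
    logPerf: true,        // log performance metrics to the console
    footage: {
        lens: 'wide',
        movement: 'handheld'
    }
});
```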
The Vision instance emits the following events:

```js
vision.on('progress', ({ event, progress }) => {}); // events are: upload, frame-extraction & detection
vision.on('data', (data) => {});                    // see the Data Schema section for details
vision.on('start', () => {});
vision.on('pause', () => {});
vision.on('resume', () => {});
vision.on('end', () => {});
vision.on('error', (err) => {});
vision.on('metadata', (metadata) => {});            // see the Metadata section for details
vision.on('stream', (stream) => {});                // see the Stream section for details
```
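A minimal sketch of wiring a few of these events together; the logging itself is illustrative:

```js
// Log each subprocess's progress and basic lifecycle events.
vision.on('progress', ({ event, progress }) => {
    console.log(`${event}: ${progress}`);
});
vision.on('start', () => console.log('processing started'));
vision.on('end', () => console.log('processing finished'));
vision.on('error', (err) => console.error('processing failed:', err));
```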
Processing can be controlled and inspected with the following methods and properties:

```js
vision.pause();
vision.resume();
vision.stop();

vision.started;
vision.ended;
vision.paused;
vision.metadata;
vision.stream;
```
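A small control-flow sketch, assuming a hypothetical requirement to pause for five seconds shortly after processing starts:

```js
// Pause right after processing starts, then resume five seconds later.
// pause()/resume() and the started/paused/ended properties are documented above;
// the timing itself is purely illustrative.
vision.on('start', () => {
    vision.pause();
    console.log('paused:', vision.paused); // true

    setTimeout(() => {
        vision.resume();
        console.log('running again:', vision.started && !vision.ended);
    }, 5000);
});
```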
The Vision class also exposes static helpers:

```js
// check if there is an available server
let available = Vision.available({
    target: ''
});
console.log(available); // true, false

// fetch the complete list of algorithms
let algorithms = Vision.algorithms({
    target: ''
});
console.log(algorithms); // list of algorithms

// fetch available nodes
let nodes = Vision.nodes({
    target: ''
});
console.log(nodes); // list of nodes available to be used in algorithms
```
## Data Schema

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| data | Object | Top-level object | |
| data.{model} | Object | | |
| data.{model}.mediaId | String | WowYow Media Identifier | |
| data.{model}.frame | Integer | Frame number | |
| data.{model}.timestamp | Decimal | Frame timestamp | * Not used in RTSP |
| data.{model}.width | Integer | Frame width | |
| data.{model}.height | Integer | Frame height | |
| data.{model}.model | String | Model name | |
| data.{model}.predictions | Array | List of predictions generated by this model. | * Check [Model Schema](#model-schema) to see the schema for each model. |
| data.{model}.source | Object | | |
| data.{model}.source.base64 | String | Base64 link of frame | |
| data.{model}.source.jpeg | String | JPEG link of frame | |
| data.{model}.preview | Object | | * Included only if the preview config is used. |
| data.{model}.preview.base64 | String | Base64 link of frame including model overlay information | |
| data.{model}.preview.jpeg | String | JPEG link of frame including model overlay information | |
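A sketch of reading these fields inside the data handler; field names follow the schema above, and the keys of the top-level object are assumed to be the model names:

```js
// Log basic frame information and the frame's JPEG link for every model result.
vision.on('data', (data) => {
    for (const result of Object.values(data)) {
        console.log(`[${result.model}] frame ${result.frame} at ${result.timestamp}`);
        console.log(`  size: ${result.width}x${result.height}`);
        console.log(`  predictions: ${result.predictions.length}`);
        console.log(`  jpeg: ${result.source.jpeg}`);
    }
});
```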
## Metadata

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| duration | Number | Duration of the video in seconds. In the case of an RTSP stream the value is Infinity. | |
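A short sketch of reading the duration; the Infinity check mirrors the note above:

```js
vision.on('metadata', (metadata) => {
    if (metadata.duration === Infinity) {
        console.log('live RTSP stream, no fixed duration');
    } else {
        console.log(`video duration: ${metadata.duration} seconds`);
    }
});
```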
## Stream

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| url | String | URL of the streamable video file on the server, which can be added directly as the source of an HTML video element | |
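A sketch of consuming the stream event; assigning the URL to a video element would happen in a browser context, shown here only as a comment:

```js
vision.on('stream', (stream) => {
    console.log('streamable video available at:', stream.url);
    // In a browser, the URL can be used directly as the element source:
    // document.querySelector('video').src = stream.url;
});
```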
## Model Schema

### DETECT_SCENE

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction.index | Integer | Scene identifier | * Do not expect index to always be incremented by 1 |
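A sketch that uses the scene identifier to report scene changes; it assumes the key in the data payload matches the model name DETECT_SCENE:

```js
// Report whenever the scene identifier changes between processed frames.
let lastSceneIndex = null;

vision.on('data', (data) => {
    const scene = data.DETECT_SCENE;
    if (!scene || scene.predictions.length === 0) return;

    const { index } = scene.predictions[0];
    if (lastSceneIndex !== null && index !== lastSceneIndex) {
        console.log(`scene change at frame ${scene.frame}`);
    }
    lastSceneIndex = index;
});
```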
### DETECT_PEOPLE

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction | Object | Top-level object | |
| prediction.score | Decimal | | |
| prediction.index | Integer | Person identifier used for pathing | * Do not expect index to always be incremented by 1 |
| prediction.keypoints | Array | List of keypoints | |
| prediction.keypoints[].part | String | Name of the body part. Options are: nose, leftEye, rightEye, leftEar, rightEar, leftShoulder, rightShoulder, leftElbow, rightElbow, leftWrist, rightWrist, leftHip, rightHip, leftKnee, rightKnee, leftAnkle, rightAnkle | |
| prediction.keypoints[].position | Object | | |
| prediction.keypoints[].position.x | Decimal | Position of the keypoint on the x axis | |
| prediction.keypoints[].position.y | Decimal | Position of the keypoint on the y axis | |
| prediction.segments | Object | Object containing each person segment | |
| prediction.segments.body | Object | | |
| prediction.segments.body.bbox | Object | Bounding box of the person's body segment | |
| prediction.segments.body.bbox.x0 | Integer | Left offset of the top-left corner | * Check BBox. |
| prediction.segments.body.bbox.y0 | Integer | Top offset of the top-left corner | * Check BBox. |
| prediction.segments.body.bbox.x1 | Integer | Left offset of the bottom-right corner | * Check BBox. |
| prediction.segments.body.bbox.y1 | Integer | Top offset of the bottom-right corner | * Check BBox. |
| prediction.segments.body.preview | Object | | * Included only if the preview config is used. |
| prediction.segments.body.preview.base64 | String | Base64 link of segment cut-out. | |
| prediction.segments.body.preview.jpeg | String | JPEG link of segment cut-out. | |
| prediction.segments.face | Object | | |
| prediction.segments.face.bbox | Object | Bounding box of the person's face segment | |
| prediction.segments.face.bbox.x0 | Integer | Left offset of the top-left corner | * Check BBox. |
| prediction.segments.face.bbox.y0 | Integer | Top offset of the top-left corner | * Check BBox. |
| prediction.segments.face.bbox.x1 | Integer | Left offset of the bottom-right corner | * Check BBox. |
| prediction.segments.face.bbox.y1 | Integer | Top offset of the bottom-right corner | * Check BBox. |
| prediction.segments.face.preview | Object | | * Included only if the preview config is used. |
| prediction.segments.face.preview.base64 | String | Base64 link of segment cut-out. | |
| prediction.segments.face.preview.jpeg | String | JPEG link of segment cut-out. | |
| prediction.position | String | Human position detection. Options are: sitting, standing | |
| prediction.clothing | Object | Object containing clothing | * Only if the DETECT_CLOTHING model is used. Schema is the same as that model's main schema. |
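A sketch of consuming DETECT_PEOPLE predictions; it assumes the key in the data payload matches the model name and that (x0, y0) / (x1, y1) are the top-left and bottom-right corners of the bounding box:

```js
// Count people per frame and derive a body bounding-box size for each person.
vision.on('data', (data) => {
    const people = data.DETECT_PEOPLE;
    if (!people) return;

    console.log(`frame ${people.frame}: ${people.predictions.length} people`);

    for (const person of people.predictions) {
        const { x0, y0, x1, y1 } = person.segments.body.bbox;
        const nose = person.keypoints.find((kp) => kp.part === 'nose');

        console.log(`  person ${person.index}: ${person.position}, body ${x1 - x0}x${y1 - y0}` +
                    (nose ? `, nose at (${nose.position.x}, ${nose.position.y})` : ''));
    }
});
```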
### DETECT_CLOTHING

| Field | Type | Description | Note |
| --- | --- | --- | --- |
| prediction | Object | Top-level object | |
| prediction.score | Decimal | | |
| prediction.index | Integer | Clothing identifier used for pathing | * Do not expect index to always be incremented by 1 |
| prediction.type | String | Type of the clothing. Options are: Leggings, Jodhpurs, Capris, Shorts, Jeans, Joggers, Skirt, Gauchos, Culottes, Sweatshorts, Trunks, Cutoffs, Sarong, Sweatpants, Chinos, Halter, Hoodie, Henley, Parka, Cardigan, Tank, Bomber, Peacoat, Top, Poncho, Button-Down, Anorak, Sweater, Blouse, Turtleneck, Blazer, Jacket, Jersey, Tee, Flannel, Jeggings | |
| prediction.bbox | Object | Bounding box of the clothing item | |
| prediction.bbox.x0 | Integer | Left offset of the top-left corner | * Check BBox. |
| prediction.bbox.y0 | Integer | Top offset of the top-left corner | * Check BBox. |
| prediction.bbox.x1 | Integer | Left offset of the bottom-right corner | * Check BBox. |
| prediction.bbox.y1 | Integer | Top offset of the bottom-right corner | * Check BBox. |
| prediction.preview | Object | | * Included only if the preview config is used. |
| prediction.preview.base64 | String | Base64 link of segment cut-out. | |
| prediction.preview.jpeg | String | JPEG link of segment cut-out. | |
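A sketch that tallies detected clothing types per frame, again assuming the payload key matches the model name DETECT_CLOTHING:

```js
// Tally clothing types seen in each processed frame.
vision.on('data', (data) => {
    const clothing = data.DETECT_CLOTHING;
    if (!clothing) return;

    const counts = {};
    for (const item of clothing.predictions) {
        counts[item.type] = (counts[item.type] || 0) + 1;
    }
    console.log(`frame ${clothing.frame} clothing:`, counts);
});
```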
## RTSP

RTSP processing differs from processing videos with a fixed duration. Several things need to be kept in mind when using it (a configuration sketch follows this list):

- Response data frame and timestamp use the start of processing as the reference point, both starting from 0.
- By default, frames are dropped whenever processing lags behind video playback for any reason, whether because of a low processing FPS or network issues.
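A minimal sketch of processing an RTSP stream under the options documented above; it assumes an RTSP URL can be passed to the constructor like the link and file examples earlier, and the URL itself is a placeholder:

```js
// The RTSP URL is a placeholder. skipFrames keeps processing in sync with playback
// (the default), and ffmpeg points at the binary on the system PATH.
let vision = new Vision('rtsp://camera.example.com/live', {
    models: ['DETECT_PEOPLE'],
    skipFrames: true,
    ffmpeg: 'ffmpeg'
});

vision.on('metadata', (metadata) => {
    console.log(metadata.duration); // Infinity for RTSP streams
});
```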
