test code against a certain rate of production traffic
Loops a task function, for a given duration, across multiple threads.
A test is deemed successful if it ends without creating a cycle backlog.
Example: benchmark a recursive Fibonacci function across 4 threads:
```js
// benchmark.js
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  // <benchmarked-code>
  function fibonacci(n) {
    return n < 1 ? 0
      : n <= 2 ? 1 : fibonacci(n - 1) + fibonacci(n - 2)
  }

  fibonacci(35)
  // </benchmarked-code>
}, {
  // test parameters
  parameters: { cyclesPerSecond: 100, threads: 4, durationMs: 5 * 1000 },

  // log live stats
  onTick: list => {
    console.clear()
    console.table(list().primary().pick('count'))
    console.table(list().threads().pick('mean'))
  }
})
```
Run it:

```bash
node benchmark.js
```
logs:

```
cycle stats

┌─────────┬────────┬───────────┬─────────┐
│ uptime  │ issued │ completed │ backlog │
├─────────┼────────┼───────────┼─────────┤
│ 4       │ 100    │ 95        │ 5       │
└─────────┴────────┴───────────┴─────────┘

average timings/durations, in ms

┌─────────┬───────────┬────────┐
│ thread  │ evt_loop  │ cycle  │
├─────────┼───────────┼────────┤
│ '46781' │ 10.47     │ 10.42  │
│ '46782' │ 10.51     │ 10.30  │
│ '46783' │ 10.68     │ 10.55  │
│ '46784' │ 10.47     │ 10.32  │
└─────────┴───────────┴────────┘
```
```bash
npm i @nicholaswmin/dyno
```

```bash
npx init
```

creates a preconfigured sample `benchmark.js`.

Run it:

```bash
node benchmark.js
```
```js
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  // add benchmarked task
  // code in this block runs in its own thread
}, {
  parameters: {
    // add test parameters
  },

  onTick: list => {
    // build logging from the provided measurements
  }
})
```
name | type | default | description |
---|---|---|---|
cyclesPerSecond | Number | 50 | global cycle issue rate |
durationMs | Number | 5000 | how long the test should run |
threads | Number | auto | number of spawned threads |

`auto` means it detects the available cores, but it can be overridden.

These parameters are user-configurable on test startup.
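For example, a benchmark that overrides all three might look like this (a sketch; the values are arbitrary and `onTick` is omitted on the assumption that it is optional):

```js
// a sketch: overriding the default test parameters (values are arbitrary)
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  // benchmarked code ...
}, {
  parameters: {
    cyclesPerSecond: 200,  // issue 200 cycles per second, globally
    durationMs: 10 * 1000, // run for 10 seconds
    threads: 8             // override the auto-detected core count
  }
})
```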
The primary spawns the benchmarked code as task threads.

Then it starts issuing `cycle` commands to each one, in round-robin, at a set rate, for a set duration.

The task threads must execute their tasks faster than the time it takes for their next `cycle` command to arrive, otherwise the test starts accumulating a cycle backlog.

When that happens, the test stops; the configured cycle rate is deemed the current breaking point of the benchmarked code.
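A rough sketch of that scheduling model, for illustration only; this is not dyno's actual implementation, and `runTaskOn` is just a stand-in for sending a `cycle` command to a task thread:

```js
// illustration only: round-robin cycle issuance and backlog tracking
const cyclesPerSecond = 4
const threads = [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }]

let issued = 0
let completed = 0
let next = 0

setInterval(() => {
  const thread = threads[next++ % threads.length] // round-robin

  issued++
  runTaskOn(thread).then(() => completed++)

  // a backlog that keeps growing means the configured rate is
  // past the breaking point of the benchmarked code
  console.log({ issued, completed, backlog: issued - completed })
}, 1000 / cyclesPerSecond)

// stand-in: pretend a task thread takes ~200ms to run its task
function runTaskOn(thread) {
  return new Promise(resolve => setTimeout(resolve, 200))
}
```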
An example: a benchmark configured with `threads: 4` and `cyclesPerSecond: 4`.

Each task thread must execute its own code in < 1 second, since that is the rate at which it receives `cycle` commands.
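In code, the arithmetic of that example looks like this (illustration only):

```js
// illustration of the example above
const parameters = { threads: 4, cyclesPerSecond: 4 }

// cycles are issued round-robin, so each thread receives:
const perThreadRate = parameters.cyclesPerSecond / parameters.threads // = 1 cycle/second

// ...which gives each task thread a per-cycle time budget of:
const budgetMs = 1000 / perThreadRate // = 1000 ms
```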
term | description |
---|---|
primary | the main process; orchestrates the test and the spawned task threads |
task thread | the benchmarked code, running in its own separate process; receives cycle commands from the primary, executes its code and records its timings |
task | the benchmarked code |
cycle | a command that signals a task thread to execute its code |
cycle rate | the rate at which the primary sends cycle commands to the task threads |
cycle timing | the amount of time it takes a task thread to execute its own code |
backlog | count of cycle commands that have been issued but not yet executed |
This is how the process model would look, if sketched out:

```
// assume `fib()` is the benchmarked code

Primary 0: cycles issued: 100, finished: 93, backlog: 7
│
│
├── Thread 1
│   └── function fib(n) {
│       ├── return n < 1 ? 0
│       └──   : n <= 2 ? 1 : fib(n - 1) + fib(n - 2) }
│
├── Thread 2
│   └── function fib(n) {
│       ├── return n < 1 ? 0
│       └──   : n <= 2 ? 1 : fib(n - 1) + fib(n - 2) }
│
└── Thread 3
    └── function fib(n) {
        ├── return n < 1 ? 0
        └──   : n <= 2 ? 1 : fib(n - 1) + fib(n - 2) }
```
The benchmarker comes with a statistical measurement system that can be optionally used to diagnose bottlenecks.
Some metrics are recorded by default; others can be recorded by the user within a task thread.
Every recorded value is tracked as a `Metric`, represented as a histogram with the following properties:
name | description |
---|---|
count | number of values/samples |
min | minimum value |
mean | mean/average of values |
max | maximum value |
stddev | standard deviation between values |
last | last value |
snapshots | last 50 states |
Timing metrics are collected in milliseconds.
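For illustration, a single metric could look roughly like this (a sketch; the values are made up):

```js
// hypothetical shape of a single Metric histogram
const metric = {
  count: 120,   // number of recorded values
  min: 4.2,     // minimum recorded value, in ms
  mean: 6.8,    // mean/average of recorded values, in ms
  max: 11.3,    // maximum recorded value, in ms
  stddev: 1.9,  // standard deviation between recorded values
  last: 7.1,    // last recorded value, in ms
  snapshots: [] // up to 50 past states of this histogram
}
```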
Metrics can be queried from the `list` argument of the `onTick` callback:
```js
// ...
onTick: list => {
  // primary metrics
  console.log(list().primary())

  // task thread metrics
  console.log(list().threads())
}
```
get all primary/main metrics:

```js
// log all primary metrics
console.log(list().primary())
```
get all metrics, for each task thread:

```js
// log all metrics of every task thread
console.log(list().threads())
```
reduce all metrics to a single histogram property:

```js
list().threads().pick('min')

// from this: { cycle: [{ min: 4, max: 5 }, evt_loop: { min: 2, max: 8 } ...
// to this  : { cycle: 4, evt_loop: 2 ...
```
available: `min`, `mean`, `max`, `stddev`, `snapshots`, `count`, `last`

- `stddev`: standard deviation between recorded values
- `last`: last recorded value
- `count`: number of recorded values
reduce all metrics that have been `pick`-ed to an array of histograms, to an array of single histogram values:

```js
list().primary().pick('snapshots').of('max')

// from this: [{ cycle: [{ ... max: 5 }, { ... max: 3 }, { ... max: 2 } ] } ...
// to this  : [{ cycle: [5, 3, 2 ....] } ...
```

note: only makes sense if it comes after `.pick('snapshots')`
get specific metric(s) instead of all of them:

```js
const loopMetrics = list().threads().metrics('evt_loop', 'fibonacci')

// only the `evt_loop` and `fibonacci` metrics
```
sort by a specific metric:

```js
const sorted = list().threads().pick('min').sort('cycle', 'desc')

// sort by descending min 'cycle' durations
```

available: `desc`, `asc`
get the result as an `Object`, like `Object.groupBy`, with the metric name used as the key:

```js
const obj = list().threads().pick('snapshots').of('mean').group()
```
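For illustration, the grouped result could look roughly like this (a sketch; the keys depend on the recorded metrics and the values are made up):

```js
// hypothetical shape of the grouped result
const grouped = {
  cycle:    [5.1, 4.8, 5.3], // one mean per snapshot
  evt_loop: [2.2, 2.4, 2.1]
}
```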
The following metrics are collected by default.

On the primary:

name | description |
---|---|
issued | count of issued cycles |
completed | count of completed cycles |
backlog | size of the cycle backlog |
uptime | seconds since test start |
On each task thread:

name | description |
---|---|
cycles | cycle timings |
evt_loop | event loop timings |

Any custom metrics will appear here as well.
Custom metrics can be recorded with either:

- `performance.timerify`
- `performance.measure`

Both of them are native extensions of the User Timing APIs.

The metrics collector records their timings and attaches the tracked `Metric` histogram to its corresponding task thread.
example: instrumenting a function using `performance.timerify`:
```js
// performance.timerify example
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  performance.timerify(function fibonacci(n) {
    return n < 1 ? 0
      : n <= 2 ? 1
      : fibonacci(n - 1) + fibonacci(n - 2)
  })(30)
}, {
  parameters: { cyclesPerSecond: 20 },

  onTick: list => {
    console.log(list().threads().metrics().pick('mean'))
  }
})

// logs
// ┌─────────┬───────────┐
// │ cycle   │ fibonacci │
// ├─────────┼───────────┤
// │ 7       │ 7         │
// │ 11      │ 5         │
// │ 11      │ 5         │
// └─────────┴───────────┘
```
note: the stats collector uses the function name as the metric name, so named `function`s should be preferred over anonymous arrow functions.
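For completeness, here is a `performance.measure` sketch; it assumes the collector picks up the measure name, `sleep` in this case, as the metric name:

```js
// performance.measure sketch (the 'sleep' metric name is an assumption)
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  performance.mark('start')

  await new Promise(resolve => setTimeout(resolve, Math.random() * 10))

  performance.mark('end')
  performance.measure('sleep', 'start', 'end')
}, {
  parameters: { cyclesPerSecond: 20 },

  onTick: list => {
    // a 'sleep' metric should appear alongside the defaults
    console.log(list().threads().metrics().pick('mean'))
  }
})
```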
Each metric contains up to 50 snapshots of its past states.

This allows plotting them as a timeline, using the `console.plot` module.

The following example benchmarks 2 `sleep` functions and plots their timings as an ASCII chart:
```js
// Requires:
// `npm i @nicholaswmin/console-plot --no-save`

import { dyno } from '@nicholaswmin/dyno'
import console from '@nicholaswmin/console-plot'

await dyno(async function cycle() {
  await performance.timerify(function sleepRandom1(ms) {
    return new Promise(r => setTimeout(r, Math.random() * ms))
  })(Math.random() * 20)

  await performance.timerify(function sleepRandom2(ms) {
    return new Promise(r => setTimeout(r, Math.random() * ms))
  })(Math.random() * 20)
}, {
  parameters: { cyclesPerSecond: 15, durationMs: 20 * 1000 },

  onTick: list => {
    console.clear()
    console.plot(list().threads().pick('snapshots').of('mean').group(), {
      title: 'Plot',
      subtitle: 'mean durations (ms)'
    })
  }
})
```
which logs:

```
Plot

-- sleepRandom1 -- cycle -- sleepRandom2 -- evt_loop

11.75 ┤╭╮
11.28 ┼─────────────────────────────────────────────────────────────────────╮
10.82 ┤│╰───╮ ╭╯ ╰╮ │╰╮ ╭─────────╯╰──────────╮ ╭─────────────────╯ ╰───────────╮╭─╮ ╭──────────
10.35 ┼╯ ╰╮╭╮╭╯ ╰───╯ ╰──╯ ╰─╯ ╰╯ ╰────╯
 9.88 ┤ ╰╯╰╯
 9.42 ┤
 8.95 ┤
 8.49 ┤
 8.02 ┤
 7.55 ┤
 7.09 ┤╭╮
 6.62 ┼╯╰───╮ ╭─────────╮ ╭──╮
 6.16 ┤ ╰╮╭──╯ ╰───╯ ╰───────────────────────╮ ╭─────────────────────╮╭───╮ ╭─────────
 5.69 ┤╭╮ ╰╯ ╭───────────╮ ╭╮╭──────╮ ╰╯ ╰──╭╮╭─╮╭─────
 5.22 ┤│╰╮╭─╮ ╭──╮ ╭───╮╭─╮ ╭────────────────────╯ ╰──╯╰╯ ╰────────────────╯╰╯ ╰╯
 4.76 ┤│ ╰╯ ╰───╯ ╰─────╯ ╰╯ ╰─╯
 4.29 ┼╯

mean durations (ms)

- last: 100 items
```
Using lambdas/arrow functions means the metrics collector has no function name to use for the metric; by definition, they are anonymous.
Change this:

```js
const foo = () => {
  // test code
}

performance.timerify(foo)()
```

to this:

```js
function foo() {
  // test code
}

performance.timerify(foo)()
```
The benchmark file forks itself. 👀

This means that any code that exists outside the `dyno` block will also run in multiple threads.

This is a design tradeoff, made to allow creating simple, single-file benchmarks, but it can create issues if you intend to run code after `dyno()` resolves/ends, or when running this as part of an automated test suite.
In this example, `'done'` is logged 3 times instead of 1:

```js
import { dyno } from '@nicholaswmin/dyno'

const result = await dyno(async function cycle() {
  // task code, expected to run 3 times ...
}, { threads: 3 })

console.log('done')
// 'done'
// 'done'
// 'done'
```
To work around this, the `before`/`after` hooks can be used for setup and teardown, like so:
```js
await dyno(async function cycle() {
  console.log('task')
}, {
  parameters: { durationMs: 5 * 1000 },

  before: async parameters => {
    console.log('before')
  },

  after: async parameters => {
    console.log('after')
  }
})

// "before"
// ...
// "task"
// "task"
// "task"
// "task"
// ...
// "after"
```
Alternatively, the task function can be extracted to its own file:
```js
// task.js
import { task } from '@nicholaswmin/dyno'

task(async function task(parameters) {
  // task code ...

  // `benchmark.js` test parameters are
  // available here.
})
```
then referenced as a path in `benchmark.js`:
```js
// benchmark.js
import { join } from 'node:path'
import { dyno } from '@nicholaswmin/dyno'

const result = await dyno(join(import.meta.dirname, './task.js'), {
  threads: 5
})

console.log('done')
// 'done'
```
This should be the preferred method when running this as part of a test suite.
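For example, a hypothetical `node:test` suite could wrap it like this (a sketch; the file name and the assertion are assumptions):

```js
// benchmark.test.js (hypothetical)
import test from 'node:test'
import assert from 'node:assert'
import { join } from 'node:path'
import { dyno } from '@nicholaswmin/dyno'

test('task keeps up with the configured cycle rate', async () => {
  // `NODE_ENV=test` also suppresses the parameter prompt
  const result = await dyno(join(import.meta.dirname, './task.js'), {
    threads: 2
  })

  // the exact shape of `result` isn't documented here;
  // adjust the assertion to whatever it actually returns
  assert.ok(result)
})
```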
This is not a stress-testing tool.

Stress tests are far more complex and require a near-perfect replication of an actual production environment.

This is a prototyping tool that helps test whether a prototype idea is worth proceeding with, or whether it has unworkable scalability issues.

Its multi-threaded model is meant to mimic the execution model of horizontally-scalable, share-nothing services.

Its original purpose was benchmarking a module prototype that heavily interacts with a data store over a network.

It's not meant for side-by-side benchmarking of synchronous code; Google's Tachometer is a much better fit for that.
install deps:

```bash
npm ci
```

unit & integration tests:

```bash
npm test
```

test coverage:

```bash
npm run test:coverage
```

note: the parameter prompt is suppressed when `NODE_ENV=test`

meta checks:

```bash
npm run checks
```

generate a sample benchmark:

```bash
npx init
```

generate a Heroku-deployable benchmark:

```bash
npx init-cloud
```

Todos are available here.

update `README.md` code snippets:

```bash
npm run examples:update
```

source examples are located in `/bin/example`.