@cazoo/telemetry

0.16.5 • Public • Published

Cazoo Telemetry

A wrapper around OpenTelemetry for getting traces and telemetry into your life.

Basic concepts

OpenTelemetry (https://opentelemetry.io/) is a standard for observability.

Instead of logger.info('request sent to aws'), you'll have something more like const trace = parent.startChild('awsRequest'), followed at some point by trace.end().

A standard logger records when an event happened. A trace records when it happened, how long it took, and where it sits in the hierarchy of operations within the trace.
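
To make the difference from a log line concrete, here is a minimal sketch of what a span records. MiniSpan and FinishedSpan are hypothetical names used for illustration only; this is not how @cazoo/telemetry implements its Trace object. A fake clock is injected so the durations are deterministic.

```typescript
// Hypothetical sketch: MiniSpan is NOT the @cazoo/telemetry Trace class.
interface FinishedSpan {
  spanName: string
  id: number
  parentId?: number
  startedAt: number
  duration: number
}

class MiniSpan {
  private static nextId = 1
  readonly id = MiniSpan.nextId++
  private readonly startedAt: number

  constructor(
    private readonly spanName: string,
    private readonly sink: FinishedSpan[],
    private readonly clock: () => number,
    private readonly parentId?: number
  ) {
    // The start time is captured when the span is created
    this.startedAt = clock()
  }

  // A child remembers its parent's id, which is what builds the hierarchy
  startChild(childName: string): MiniSpan {
    return new MiniSpan(childName, this.sink, this.clock, this.id)
  }

  // Nothing is recorded until the span is ended
  end(): void {
    this.sink.push({
      spanName: this.spanName,
      id: this.id,
      parentId: this.parentId,
      startedAt: this.startedAt,
      duration: this.clock() - this.startedAt,
    })
  }
}

// Usage with a fake clock
let fakeNow = 0
const finished: FinishedSpan[] = []
const rootSpan = new MiniSpan('root', finished, () => fakeNow)
fakeNow = 5
const awsSpan = rootSpan.startChild('awsRequest')
fakeNow = 12
awsSpan.end() // duration 7, parentId is rootSpan.id
fakeNow = 20
rootSpan.end() // duration 20, no parentId
```

A single logger.info call would only capture one timestamp; the two records above also carry a duration and a parent link.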

Our telemetry data is sent to Honeycomb (https://www.honeycomb.io/), where it can be viewed and analysed.

Basic usage

NB: All examples can be found in the examples directory of this repository. Follow directions in the README over there.

yarn add @cazoo/telemetry
# or
npm install --save @cazoo/telemetry

The entry point is Telemetry. Call Telemetry.start(name) to get a Trace object. You must end the trace for its telemetry to be logged.

// yarn example:basic
import { Telemetry } from '@cazoo/telemetry'

const trace = Telemetry.start('basic')
trace.end()

/*
{
  "traceId":"38d55155fb57b62757f509288b14ea4f",
  "name":"basic",
  "id":"421f5158d03eac4a",
  "kind":0,
  "timestamp":1658228429531994,
  "duration":0,
  "attributes":{},
  "status": {
    "code":0
  },
  "events":[]
}
*/

Including AWS Context

The syntax for this is Telemetry.startWithContext(name, event, context, options).

The startWithContext method pulls relevant information out of your AWS event and context.

// yarn example:awsContext
// event and context from the unit test data used in @cazoo/telemetry
import { event, context } from '../tests/data/awsgateway'
import { APIGatewayProxyEvent, Context } from 'aws-lambda'
import { Telemetry } from '@cazoo/telemetry'

function handle(event: APIGatewayProxyEvent, context: Context): void {
  const trace = Telemetry.startWithContext('handler', event, context)
  trace.end()
}

handle(event, context)

/*
{
  "traceId": "4d110d20fdcc8516d23df6c594833f76",
  "name": "handler",
  "id": "e6ba7f033be6f7af",
  "kind": 0,
  "timestamp": 1658228683032974,
  "duration": 1,
  "attributes": {
    "request_id": "request-id",
    "account_id": "12345678912",
    "function.name": "my-function",
    "function.version": "v1.0.1",
    "function.service": "log-stream",
    "http.path": "/hello/world",
    "http.method": "POST",
    "http.stage": "testStage",
    "http.query": "{\"name\":[\"me\"],\"multivalueName\":[\"you\",\"me\"]}"
  },
  "status": {
    "code": 0
  },
  "events": []
}
*/
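
As a rough illustration of where some of those attributes could come from, the sketch below maps a few API Gateway event and Lambda context fields onto the attribute names seen in the output above. The mapping is an assumption inferred from that output, not the library's actual code; GatewayEventLike, LambdaContextLike and attributesFromAws are hypothetical names, and only a subset of the attributes is covered.

```typescript
// Minimal local shapes; the real types live in the aws-lambda typings.
interface GatewayEventLike {
  path: string
  httpMethod: string
  requestContext: { stage: string }
  multiValueQueryStringParameters: Record<string, string[]> | null
}

interface LambdaContextLike {
  awsRequestId: string
  functionName: string
  functionVersion: string
}

// Assumed mapping, inferred from the example output (not taken from the library source)
function attributesFromAws(
  gatewayEvent: GatewayEventLike,
  lambdaContext: LambdaContextLike
): Record<string, string> {
  return {
    request_id: lambdaContext.awsRequestId,
    'function.name': lambdaContext.functionName,
    'function.version': lambdaContext.functionVersion,
    'http.path': gatewayEvent.path,
    'http.method': gatewayEvent.httpMethod,
    'http.stage': gatewayEvent.requestContext.stage,
    // The query is serialised to a JSON string, as in the output above
    'http.query': JSON.stringify(gatewayEvent.multiValueQueryStringParameters ?? {}),
  }
}

const awsAttributes = attributesFromAws(
  {
    path: '/hello/world',
    httpMethod: 'POST',
    requestContext: { stage: 'testStage' },
    multiValueQueryStringParameters: { name: ['me'], multivalueName: ['you', 'me'] },
  },
  { awsRequestId: 'request-id', functionName: 'my-function', functionVersion: 'v1.0.1' }
)
```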

Child Traces

One of the things OpenTelemetry offers is a hierarchy of execution.

This is achieved by taking your root trace and creating children from it.

// yarn example:children
import { Telemetry } from '@cazoo/telemetry'

const queryDynamo = (): void => {
  // dummy function
}

const trace = Telemetry.start('root')

const child = trace.startChild('queryingDynamo')
queryDynamo()
child.end()
trace.end()

/* This generated two traces. This one is a root trace and has no parentId

{
  "traceId": "5d38dc29c40d55c6bb781c44112728dc",
  "name": "root",
  "id": "338340157a29db83",
  "kind": 0,
  "timestamp": 1658228710320277,
  "duration": 2,
  "attributes": {},
  "status": {
    "code": 0
  },
  "events": []
}
*/
/* This trace is a child of the root and is linked by the parent id
{
  "traceId": "5d38dc29c40d55c6bb781c44112728dc",
  "parentId": "338340157a29db83",
  "name": "queryingDynamo",
  "id": "63b7ef930d8e14fe",
  "kind": 0,
  "timestamp": 1658228710320763,
  "duration": 0,
  "attributes": {},
  "status": {
    "code": 0
  },
  "events": []
}
*/
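
The parentId link is enough to rebuild the hierarchy from the flat output. The following sketch shows one way to do that; ExportedSpan, SpanNode and buildSpanTree are hypothetical helpers for illustration and are not part of @cazoo/telemetry.

```typescript
// Only the fields needed here; names match the JSON output shown above.
interface ExportedSpan {
  id: string
  parentId?: string
  name: string
}

interface SpanNode {
  name: string
  children: SpanNode[]
}

function buildSpanTree(spans: ExportedSpan[]): SpanNode[] {
  // First pass: create one node per span, indexed by span id
  const nodesById: Record<string, SpanNode> = {}
  for (const span of spans) {
    nodesById[span.id] = { name: span.name, children: [] }
  }

  // Second pass: attach each node to its parent, or treat it as a root
  const roots: SpanNode[] = []
  for (const span of spans) {
    const node = nodesById[span.id]
    const parentNode = span.parentId === undefined ? undefined : nodesById[span.parentId]
    if (node === undefined) continue
    if (parentNode === undefined) {
      roots.push(node) // no parent in this batch
    } else {
      parentNode.children.push(node)
    }
  }
  return roots
}

// The two spans from the example above, in the order they were emitted
const spanTree = buildSpanTree([
  { id: '63b7ef930d8e14fe', parentId: '338340157a29db83', name: 'queryingDynamo' },
  { id: '338340157a29db83', name: 'root' },
])
```

Two passes are used because a child span ends, and is therefore emitted, before its parent.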

Adding supplementary context

If you need to include additional information in a trace, you can do it using appendContext.

// yarn example:appendContext
import { Telemetry } from '@cazoo/telemetry'

const queryDynamo = (): string => {
  // dummy function
  return 'some result'
}

const trace = Telemetry.start('root')
const result = queryDynamo()
trace.appendContext({ result })
trace.end()

/*
{
  "traceId": "8f59474987c2840e8c7b97091099e036",
  "name": "root",
  "id": "2431dbc40ffda91d",
  "kind": 0,
  "timestamp": 1614180894315418,
  "duration": 1,
  "attributes": {
    "result": "some result"
  },
  "status": {
    "code": 0
  },
  "events": []
}
*/

Propagating supplementary context

If you create a child after appending context, the appended information is propagated to that child.

// yarn example:propagate
import { Telemetry } from '@cazoo/telemetry'

const queryDynamo = (): string => {
  // dummy function
  return 'some result'
}

const trace = Telemetry.start('root')

const child = trace.startChild('queryingDynamo')
const withoutContext = child.startChild('subChildWithoutContext')
withoutContext.end()
const result = queryDynamo()
child.appendContext({ result })
const withContext = child.startChild('subChildWithContext')
withContext.end()
child.end()
trace.end()

/* This time, we're producing 4 traces. Any context appended is *not* propagated to parents
{
  "traceId": "9e964e47b1d0ecd166056c9242391d2b",
  "name": "root",
  "id": "fafddc9e47a060a7",
  "kind": 0,
  "timestamp": 1614180944224978,
  "duration": 3,
  "attributes": {},
  "status": {
    "code": 0
  },
  "events": []
}
*/
/* The context appended is included within the `attributes` property
{
  "traceId": "65f96121416465d10344824fb24379e0",
  "parentId": "3936b85acc29c85d",
  "name": "queryingDynamo",
  "id": "6e05a7bebfa166c7",
  "kind": 0,
  "timestamp": 1658228790300710,
  "duration": 1,
  "attributes": {
    "result": "some result"
  },
  "status": {
    "code": 0
  },
  "events": []
}
*/
/* context is not propagated to a child that has already been created
{
  "traceId": "9e964e47b1d0ecd166056c9242391d2b",
  "parentId": "594302faad4cbb6c",
  "name": "subChildWithoutContext",
  "id": "f673b204193f3884",
  "kind": 0,
  "timestamp": 1614180944225634,
  "duration": 0,
  "attributes": {},
  "status": {
    "code": 0
  },
  "events": []
}
*/
/* context is propagated to any children created afterwards
{
  "traceId": "65f96121416465d10344824fb24379e0",
  "parentId": "6e05a7bebfa166c7",
  "name": "subChildWithContext",
  "id": "29f947c2334e8a59",
  "kind": 0,
  "timestamp": 1658228790302075,
  "duration": 0,
  "attributes": {
    "result": "some result"
  },
  "status": {
    "code": 0
  },
  "events": []
}
*/
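
The behaviour above is consistent with a child copying its parent's attributes at the moment it is created. The sketch below models that idea; ContextSpan is a hypothetical class written for illustration, and the library's real mechanism may differ.

```typescript
type ContextValues = Record<string, unknown>

class ContextSpan {
  private attributes: ContextValues

  constructor(readonly label: string, inherited: ContextValues = {}) {
    // Copy at creation time: later changes on the parent do not reach this span
    this.attributes = { ...inherited }
  }

  appendContext(extra: ContextValues): void {
    this.attributes = { ...this.attributes, ...extra }
  }

  // Children start from a copy of whatever the parent holds right now
  startChild(label: string): ContextSpan {
    return new ContextSpan(label, this.attributes)
  }

  snapshot(): ContextValues {
    return { ...this.attributes }
  }
}

const rootCtx = new ContextSpan('root')
const childCtx = rootCtx.startChild('queryingDynamo')
const earlyChild = childCtx.startChild('subChildWithoutContext')
childCtx.appendContext({ result: 'some result' })
const lateChild = childCtx.startChild('subChildWithContext')
// earlyChild and rootCtx hold no attributes; lateChild holds { result: 'some result' }
```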

Matching the schema in honeycomb

Honeycomb looks for a set of specific fields. The schema utility method helps you set them.

// yarn example:schema
import { Telemetry } from '@cazoo/telemetry'

const trace = Telemetry.start('root')

trace.schema({
  error: 'error',
  httpStatusCode: 404,
  route: '/search',
})

trace.end()

/* We've added the error, the httpStatusCode and the route to the attributes. These will then be accessible to Honeycomb.
{
  "traceId": "689d4d4a725ba260b961b347af9298c9",
  "name": "root",
  "id": "e71955b42a8c6031",
  "kind": 0,
  "timestamp": 1614181146057262,
  "duration": 1,
  "attributes": {
    "error": "error",
    "isError": true,
    "httpStatusCode": 404,
    "route": "/search"
  },
  "status": {
    "code": 0
  },
  "events": []
}
*/

Handling errors

Errors can be added through the schema({}) method, but it is usually more convenient to use endWithError(error), which records the error and also ends the trace.

// yarn example:error
import { Telemetry } from '@cazoo/telemetry'

const trace = Telemetry.start('root')
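try {
  throw new Error('oops!')
  trace.end() // never reached in this example, because the line above always throws
} catch (error) {
  trace.endWithError(error) // records the error and ends the trace
}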

/*
{
  "traceId": "3d59dc56bfa4deb5b1f2da66dcb889a5",
  "name": "root",
  "id": "ce3749a2d8510d8f",
  "kind": 0,
  "timestamp": 1614181187125848,
  "duration": 47,
  "attributes": {
    "error": "oops!",
    "errorStackTrace": "Error: oops!\n    at Object.<anonymous> (/Users/jason.luong/Documents/projects/telemetry/examples/error.ts:5:9)\n    at Module._compile (internal/modules/cjs/loader.js:1063:30)\n    at Module.m._compile (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/index.ts:1043:23)\n    at Module._extensions..js (internal/modules/cjs/loader.js:1092:10)\n    at Object.require.extensions.<computed> [as .ts] (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/index.ts:1046:12)\n    at Module.load (internal/modules/cjs/loader.js:928:32)\n    at Function.Module._load (internal/modules/cjs/loader.js:769:14)\n    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)\n    at main (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/bin.ts:225:14)\n    at Object.<anonymous> (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/bin.ts:512:3)",
    "isError": true
  },
  "status": {
    "code": 0
  },
  "events": []
}
*/
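
If the same try/catch shape is repeated in many places, it can be wrapped in a small helper. The sketch below is one possible way to do so for synchronous work. EndableTrace and traced are hypothetical names, and the parameter type of endWithError is an assumption, since the library's signature is not shown here. A fake trace is used so the example does not depend on the package.

```typescript
// Only the two methods used here; a real Trace object has more.
interface EndableTrace {
  end(): void
  endWithError(error: unknown): void // parameter type is an assumption
}

// Runs synchronous work and guarantees the trace is ended exactly once
function traced<T>(trace: EndableTrace, work: () => T): T {
  let result: T
  try {
    result = work()
  } catch (error) {
    trace.endWithError(error)
    throw error // let the caller handle the failure as usual
  }
  trace.end()
  return result
}

// Fake trace that records which method was called
const calls: string[] = []
const fakeTrace: EndableTrace = {
  end: () => {
    calls.push('end')
  },
  endWithError: (error) => {
    calls.push('endWithError: ' + String(error))
  },
}

const okValue = traced(fakeTrace, () => 42)

let rethrown = false
try {
  traced(fakeTrace, () => {
    throw new Error('oops!')
  })
} catch {
  rethrown = true
}
// calls is now ['end', 'endWithError: Error: oops!']
```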

Masking sensitive information

Currently this library provides two exporters designed to remove sensitive details from the attributes.

The masked exporter allows the user to specify a set of attributes to allow into the telemetry backend.

The sanitised exporter instead allows the user to specify a set of regular expressions and redacts every string or substring that matches any of them.

Combining exporters: because the exporters follow the decorator pattern, they can be combined by passing one exporter to the constructor of another.
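
The following sketch illustrates the decorator idea on its own, without the library. BagExporter, CollectingExporter, AllowListDecorator and PatternDecorator are hypothetical stand-ins; the only detail borrowed from the package examples is that the wrapped exporter is passed as the first constructor argument.

```typescript
type AttributeBag = Record<string, string>

interface BagExporter {
  exportBag(bag: AttributeBag): void
}

// Innermost exporter: collects what it receives (stands in for a real backend)
class CollectingExporter implements BagExporter {
  readonly received: AttributeBag[] = []
  exportBag(bag: AttributeBag): void {
    this.received.push(bag)
  }
}

// Decorator 1: keep only allow-listed keys, redact the rest
class AllowListDecorator implements BagExporter {
  constructor(private readonly inner: BagExporter, private readonly allowedKeys: string[]) {}
  exportBag(bag: AttributeBag): void {
    const out: AttributeBag = {}
    for (const key of Object.keys(bag)) {
      out[key] = this.allowedKeys.indexOf(key) !== -1 ? bag[key] : '[REDACTED]'
    }
    this.inner.exportBag(out)
  }
}

// Decorator 2: replace a pattern inside every value
class PatternDecorator implements BagExporter {
  constructor(private readonly inner: BagExporter, private readonly pattern: RegExp) {}
  exportBag(bag: AttributeBag): void {
    const out: AttributeBag = {}
    for (const key of Object.keys(bag)) {
      out[key] = bag[key].replace(this.pattern, '*****')
    }
    this.inner.exportBag(out)
  }
}

// The outer decorator runs first, then hands its result to the inner one
const backend = new CollectingExporter()
const combined = new PatternDecorator(new AllowListDecorator(backend, ['note']), /secret/g)
combined.exportBag({ note: 'my secret plan', token: 'abc' })
// backend.received[0] is { note: 'my ***** plan', token: '[REDACTED]' }
```

Because each decorator exposes the same interface as the exporter it wraps, the order of nesting decides the order in which the transformations are applied.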

The masked exporter

If you need to mask sensitive information in a trace, you can do it using the MaskedExporterDecorator exporter. N.B. This will mask all attributes by default.

If you need to unmask information you can supply an array of allowedFieldPaths to the masker.

// yarn example:masker
import {
  Telemetry,
  StdOutExporter,
  MaskedExporterDecorator,
} from '@cazoo/telemetry'

const stdOutExporter = new StdOutExporter()
const allowedFieldPaths = ['data.id']
const maskedExporterDecorator = new MaskedExporterDecorator(
  stdOutExporter,
  allowedFieldPaths
)
const trace = Telemetry.start('root', { exporter: maskedExporterDecorator })
trace.appendContext({ data: { email: 'test@email.com', id: '1234-5678-9101' } })
trace.end()

/*
{
  "traceId": "0f32f5dd3465771a31cf5c155ae20cbe",
  "name": "root",
  "id": "57ff6ebdb662267d",
  "kind": 0,
  "timestamp": 1617699714136696,
  "duration": 1,
  "attributes": {
    "data.email": "[REDACTED]",
    "data.id": "1234-5678-9101"
  },
  "status": {
      "code": 0
  },
  "events": []
}
*/
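
The output above suggests that nested attributes are flattened into dotted paths before the allow-list is applied. The sketch below shows that idea under this assumption. flattenAttributes and maskAttributes are hypothetical helpers, not the internals of MaskedExporterDecorator, and arrays are left as single values for brevity.

```typescript
type FlatAttributes = Record<string, unknown>

// Flatten nested objects into dotted paths, e.g. { data: { id: 1 } } becomes { 'data.id': 1 }
function flattenAttributes(value: Record<string, unknown>, prefix = ''): FlatAttributes {
  const flat: FlatAttributes = {}
  for (const key of Object.keys(value)) {
    const inner = value[key]
    const path = prefix === '' ? key : `${prefix}.${key}`
    if (typeof inner === 'object' && inner !== null && !Array.isArray(inner)) {
      const nested = flattenAttributes(inner as Record<string, unknown>, path)
      for (const nestedPath of Object.keys(nested)) {
        flat[nestedPath] = nested[nestedPath]
      }
    } else {
      flat[path] = inner
    }
  }
  return flat
}

// Everything is redacted unless its dotted path is explicitly allowed
function maskAttributes(
  attrs: Record<string, unknown>,
  allowedFieldPaths: string[]
): FlatAttributes {
  const flat = flattenAttributes(attrs)
  const masked: FlatAttributes = {}
  for (const path of Object.keys(flat)) {
    masked[path] = allowedFieldPaths.indexOf(path) !== -1 ? flat[path] : '[REDACTED]'
  }
  return masked
}

const maskedExample = maskAttributes(
  { data: { email: 'test@email.com', id: '1234-5678-9101' } },
  ['data.id']
)
// maskedExample is { 'data.email': '[REDACTED]', 'data.id': '1234-5678-9101' }
```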

The sanitised exporter

If you want to redact sensitive values, such as email addresses or phone numbers, from the trace attributes regardless of their location, you can use the sanitised exporter.

Specify the patterns to mask as a list of RegExp objects and the exporter will replace every match with [REDACTED] or a custom placeholder of your choice.

The module CommonSensitiveInfoPatterns, located at src/utils, provides a set of regular expressions for values that are usually considered sensitive information.

// yarn example:sanitised
import {
    Telemetry,
    StdOutExporter,
    SanitisedExporterDecorator,
    CommonSensitiveInfoPatterns,
} from '@cazoo/telemetry'

const stdOutExporter = new StdOutExporter()
const sanitisedExporterDecorator = new SanitisedExporterDecorator(
    stdOutExporter,
    [CommonSensitiveInfoPatterns.EMAIL, CommonSensitiveInfoPatterns.PHONE_NUMBER, /bar/],
    '*****'
)
const trace = Telemetry.start('root', { exporter: sanitisedExporterDecorator })
trace.appendContext({
    contacts: {
        email: 'some.mail@provider.com',
        mobile: '+44 8087339090',
        random_list: ['foobar', 'barbaz']
    },
    id: '1234-5678-9101'
})
trace.end()

/*
{
  "traceId": "d4ace7437c073568f07628b1742b45f0",
  "name": "root",
  "id": "69e33bc21cf9f4b4",
  "kind": 0,
  "timestamp": 1634727242512792,
  "duration": 1,
  "attributes": {
    "contacts.email": "*****",
    "contacts.mobile": "*****",
    "contacts.random_list": {
      "0": "foo*****",
      "1": "*****baz"
    },
    "id": "1234-5678-9101"
  },
  "status": {
    "code": 0
  },
  "events": []
}
*/
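
Pattern-based redaction of this kind can be sketched as a recursive walk over the attribute values. The code below is an illustration only: redactString and redactValue are hypothetical helpers, EMAIL_LIKE is a deliberately simple pattern that is not the library's CommonSensitiveInfoPatterns.EMAIL, and arrays are kept as arrays rather than flattened to indexed keys as in the output above.

```typescript
// Hypothetical pattern for illustration; NOT the library's definition.
const EMAIL_LIKE = /[^\s@]+@[^\s@]+\.[^\s@]+/

function redactString(text: string, redactPatterns: RegExp[], placeholder: string): string {
  let result = text
  for (const pattern of redactPatterns) {
    // Rebuild the pattern with the global flag so every occurrence is replaced
    const flags = 'g' + (pattern.ignoreCase ? 'i' : '') + (pattern.multiline ? 'm' : '')
    result = result.replace(new RegExp(pattern.source, flags), placeholder)
  }
  return result
}

// Walks strings, arrays and plain objects; other values pass through unchanged
function redactValue(
  value: unknown,
  redactPatterns: RegExp[],
  placeholder = '[REDACTED]'
): unknown {
  if (typeof value === 'string') {
    return redactString(value, redactPatterns, placeholder)
  }
  if (Array.isArray(value)) {
    return value.map((item) => redactValue(item, redactPatterns, placeholder))
  }
  if (typeof value === 'object' && value !== null) {
    const source = value as Record<string, unknown>
    const copy: Record<string, unknown> = {}
    for (const key of Object.keys(source)) {
      copy[key] = redactValue(source[key], redactPatterns, placeholder)
    }
    return copy
  }
  return value
}

const redacted = redactValue(
  {
    contacts: { email: 'some.mail@provider.com', random_list: ['foobar', 'barbaz'] },
    id: '1234-5678-9101',
  },
  [EMAIL_LIKE, /bar/],
  '*****'
) as { contacts: { email: string; random_list: string[] }; id: string }
// redacted.contacts.email is '*****', random_list is ['foo*****', '*****baz'], id is unchanged
```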

Timeout logging

Telemetry closes all open traces just before a Lambda timeout; otherwise every open trace, including the root trace, would be lost.

Because of the way Lambda works, the traces have to be logged before the actual timeout happens. We call the time between closing the traces and the timeout the buffer. The default buffer is 10ms and can be overridden with the environment variable CAZOO_LOGGER_TIMEOUT_BUFFER_MS.

The traces are closed with an error attribute of type lambda.timeout, indicating that a timeout happened. This also counts as an error in Honeycomb.
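
The arithmetic behind the buffer can be sketched as follows. resolveBufferMs and flushDelayMs are hypothetical helpers, not library functions. The remaining time would typically come from the Lambda context's getRemainingTimeInMillis(), and the fallback to the default for a missing or invalid value is an assumption about how such a setting might be handled.

```typescript
const DEFAULT_TIMEOUT_BUFFER_MS = 10

// envValue stands in for process.env.CAZOO_LOGGER_TIMEOUT_BUFFER_MS
function resolveBufferMs(envValue: string | undefined): number {
  if (envValue === undefined || envValue.trim() === '') {
    return DEFAULT_TIMEOUT_BUFFER_MS
  }
  const parsed = Number(envValue)
  // Assumption: invalid or negative values fall back to the default
  return isFinite(parsed) && parsed >= 0 ? parsed : DEFAULT_TIMEOUT_BUFFER_MS
}

// remainingMs stands in for the Lambda context's getRemainingTimeInMillis()
function flushDelayMs(remainingMs: number, envValue: string | undefined): number {
  return Math.max(0, remainingMs - resolveBufferMs(envValue))
}

const defaultDelay = flushDelayMs(3000, undefined) // 2990
const customDelay = flushDelayMs(3000, '250') // 2750
const invalidDelay = flushDelayMs(3000, 'abc') // 2990, falls back to the default
const tinyBudget = flushDelayMs(5, undefined) // 0, never negative
```

A timer scheduled with the resulting delay would fire shortly before the Lambda runtime stops the invocation, leaving the buffer for the traces to be written.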

Telemetry Debug mode

Debug logging for the Telemetry library can be enabled by setting the environment variable TELEMETRY_DEBUG=1; any truthy value will work. This logs the creation and destruction of spans and other Telemetry behaviours.

Cross services tracing

Let's say you have a front end server-side lambda that uses the library to trace the time it takes to talk to an API endpoint. That API endpoint also uses this package and starts its trace with startWithContext, passing in the AWS Proxy event.

You can link those two traces and get a single trace for the whole request across front end and backend.

To do this you first need to change your request to your API to pass headers generated from the frontend trace.

const serviceARootTrace = Telemetry.startWithContext('serviceA', someEvent);

// ...

// We create a child trace to track the call to our API 
const apiCallTrace = serviceARootTrace.startChild("serviceA_queryingServiceB");

try {
    const response = await fetch(
      `/serviceB/api/endpoint`,
      {
        headers: {
          ...apiCallTrace.asHttpHeaders()
        }
      }
    );
} finally {
  apiCallTrace.end();
}

You then need to update your API to use continueFromContext to signal you wish to try and continue the incoming trace:

const serviceBTrace = Telemetry.continueFromContext('serviceB', apiGatewayEvent);

Once this is set up, the API endpoint trace will automatically be a child of the front end one.

This is what the output of the trace will look like:

[
  {
    "traceId": "c210bdfbdc8b731f93d6111ab162bdfc",
    "name": "serviceA",
    "id": "ae805c11916bae76",
    "kind": 0,
    "timestamp": 1658229123052375,
    "duration": 2,
    "attributes": {},
    "status": {
      "code": 0
    },
    "events": []
  },
  {
    "traceId": "5cf3221f8e1904b21aad2190644acfe1",
    "parentId": "d8ea76a78b9ab874",
    "name": "serviceA_queryingServiceB",
    "id": "066b40f1e319a702",
    "kind": 0,
    "timestamp": 1618231896652018,
    "duration": 2,
    "attributes": {},
    "status": {
      "code": 0
    },
    "events": []
  },
  {
    "traceId": "5cf3221f8e1904b21aad2190644acfe1",
    "parentId": "066b40f1e319a702",
    "name": "serviceB",
    "id": "043ff5c29516f2e4",
    "kind": 0,
    "timestamp": 1618231896652238,
    "duration": 0,
    "attributes": {
      "account_id": "missing"
    },
    "status": {
      "code": 0
    },
    "events": []
  }
]
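
Linking works because the caller's trace id and span id travel with the request, so the callee can reuse the trace id and set the caller's span id as its parentId. The exact headers produced by asHttpHeaders are not documented here, so the sketch below uses the W3C Trace Context traceparent format purely as an assumption to illustrate the idea. TraceLink, toTraceparent and fromTraceparent are hypothetical names.

```typescript
interface TraceLink {
  traceId: string // 32 lowercase hex characters
  spanId: string // 16 lowercase hex characters
}

// W3C traceparent layout: version-traceId-parentSpanId-flags
function toTraceparent(link: TraceLink): string {
  return `00-${link.traceId}-${link.spanId}-01`
}

// Returns undefined when the header is missing or malformed
function fromTraceparent(header: string): TraceLink | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}$/.exec(header)
  if (match === null) return undefined
  return { traceId: match[1], spanId: match[2] }
}

// Caller side: ids of the serviceA_queryingServiceB span from the output above
const outgoingHeader = toTraceparent({
  traceId: '5cf3221f8e1904b21aad2190644acfe1',
  spanId: '066b40f1e319a702',
})

// Callee side: the parsed ids become traceId and parentId of the serviceB span
const incomingLink = fromTraceparent(outgoingHeader)
const malformedLink = fromTraceparent('not-a-trace-header')
```

This sketch does not reject all-zero ids, which the W3C specification treats as invalid.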

This is what it looks like in Honeycomb:

myAccount.fetchOrders was created by the account-app-main-account My Account SSR Lambda, and getCustomerOrders was created by the order-service-getCustomerOrders API.

Contributing

CI package version check

The GitHub Actions workflows used for continuous integration and deployment are configured to automatically test and release new versions of the Cazoo-uk/telemetry package on the NPM registry at https://www.npmjs.com/package/@cazoo/telemetry.

The process includes a verification of the package version set in package.json.

If the version has not been updated, the CI "test" workflow fails, preventing the pull request from being merged.

To skip this check on a workflow run, insert skip-release in the commit message. Doing this on the merge commit will skip the release of a new package version.
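
The decision described above can be summarised in a few lines. The real workflow files are not shown here, so this is only a sketch of the logic under that caveat; VersionCheckInput and checkVersionBump are hypothetical names, and "updated" is simplified to "different from the published version".

```typescript
interface VersionCheckInput {
  localVersion: string // "version" from package.json on the branch
  publishedVersion: string // latest version on the registry
  commitMessage: string
}

type VersionCheckResult = 'skipped' | 'ok' | 'fail'

function checkVersionBump(input: VersionCheckInput): VersionCheckResult {
  // skip-release in the commit message bypasses the check
  if (input.commitMessage.indexOf('skip-release') !== -1) return 'skipped'
  // Simplification: any change of version counts as an update
  return input.localVersion !== input.publishedVersion ? 'ok' : 'fail'
}

const bumped = checkVersionBump({
  localVersion: '0.16.6',
  publishedVersion: '0.16.5',
  commitMessage: 'add new exporter',
}) // 'ok'

const notBumped = checkVersionBump({
  localVersion: '0.16.5',
  publishedVersion: '0.16.5',
  commitMessage: 'fix typo in docs',
}) // 'fail'

const skippedCheck = checkVersionBump({
  localVersion: '0.16.5',
  publishedVersion: '0.16.5',
  commitMessage: 'fix typo in docs skip-release',
}) // 'skipped'
```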

License

ISC
