@cubejs-backend/cubestore
1.1.7 • Public • Published

Cube.js

Cube Store

Cube.js pre-aggregation storage layer.

Motivation

Over the past year, we've accumulated feedback on various use cases for pre-aggregations and on how to store them. We've learned that there is a set of problems for which relational databases, used as a storage layer, have significant performance and functionality issues.

These problems include:

  • Performance issues with high-cardinality rollups (1B and more)
  • Lack of HyperLogLog support
  • Degraded performance for big UNION ALL queries
  • Poor JOIN performance across rolled-up tables
  • Table/schema name length issues across different database types
  • SQL type differences between source and external databases

Over time, we realized that if we tried to fix these issues within existing database engines, we would end up modifying their codebases in one way or another.

We decided to take another approach and write our own materialized OLAP cache store, designed solely to store and serve rollup tables at scale.

Approach

To optimize performance as much as possible, we took a native approach and are developing Cube Store in Rust, building on technologies such as RocksDB, Apache Parquet, and Arrow that have proven effective at solving data access problems.

Cube Store is fully open-sourced and released under the Apache 2.0 license.

Plans

We intend to start distributing Cube Store with Cube.js, and eventually make Cube Store the default pre-aggregation storage layer for Cube.js. Support for MySQL and Postgres as external databases will continue, but at a lower priority.

We'll also update all documentation regarding pre-aggregations and include usage and deployment instructions for Cube Store.

Supported architectures and platforms

If your platform/architecture is not supported, you can launch Cube Store using Docker.

          linux-gnu   linux-musl   darwin   win32
x86       N/A         N/A          N/A      N/A
x86_64
arm64                              ✅[1]

[1] It can be launched using Rosetta 2 via the x86_64-apple binary.
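
To find the row and column of the table that apply to a given machine, the identifiers reported by Node.js can be mapped to the labels used above. The following is a minimal sketch: tableLabels is a hypothetical helper, not part of this package, and it assumes the caller already knows which libc a Linux system uses.

```typescript
// Hypothetical helper (not part of @cubejs-backend/cubestore): maps Node.js
// platform/arch identifiers to the row and column labels of the support table.
type Libc = "gnu" | "musl";

function tableLabels(
  platform: string, // e.g. the value of process.platform
  arch: string, // e.g. the value of process.arch
  libc: Libc = "gnu" // only used on Linux; detecting it is left to the caller
): { row: string; column: string } | null {
  // Node.js architecture names and the matching table rows.
  const rows: Record<string, string> = {
    ia32: "x86",
    x64: "x86_64",
    arm64: "arm64",
  };
  const row: string | undefined = rows[arch];

  let column: string | undefined;
  if (platform === "linux") {
    column = "linux-" + libc;
  } else if (platform === "darwin" || platform === "win32") {
    column = platform;
  }

  // Anything outside the table yields null, which is the Docker case.
  if (row === undefined || column === undefined) {
    return null;
  }
  return { row, column };
}
```

Calling tableLabels(process.platform, process.arch) in a Node.js script returns the cell to look up. A null result means the combination does not appear in the table at all, in which case Docker is the way to go.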

Usage

With Cube.js

Starting with v0.26.48, Cube.js ships with Cube Store enabled when CUBEJS_DEV_MODE=true. You don't need to set up any CUBEJS_EXT_DB_* environment variables or externalDriverFactory inside your cube.js configuration file.

For versions prior to v0.26.48, you should upgrade your project to the latest version and install the Cube Store driver:

yarn add @cubejs-backend/cubestore-driver

After starting up, Cube.js will print a message:

🔥 Cube Store (0.26.64) is assigned to 3030 port.
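
A script that waits for Cube Store to come up may want to read the version and port from this message. The sketch below assumes the message keeps exactly the shape shown above, which may not hold across versions. parseStartupLine is a hypothetical helper, not a Cube.js API.

```typescript
// Hypothetical helper: extracts the version and port from a startup line of
// the form "Cube Store (<version>) is assigned to <port> port."
interface StartupInfo {
  version: string;
  port: number;
}

function parseStartupLine(line: string): StartupInfo | null {
  const match = /Cube Store \(([^)]+)\) is assigned to (\d+) port/.exec(line);
  if (match === null) {
    return null; // not a startup message, or the format has changed
  }
  const [, version, port] = match;
  if (version === undefined || port === undefined) {
    return null;
  }
  return { version, port: Number(port) };
}

// Usage with the message shown above:
const info = parseStartupLine("🔥 Cube Store (0.26.64) is assigned to 3030 port.");
```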

With Docker

Start Cube Store in a Docker container and publish its port 3030 on the host, so that it is reachable at 127.0.0.1:3030:

docker run -d -p 3030:3030 cubejs/cubestore:edge

Configure Cube.js to use the above connection for an external database via the .env file:

CUBEJS_EXT_DB_TYPE=cubestore
CUBEJS_EXT_DB_HOST=127.0.0.1
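
To sanity-check a .env file before starting Cube.js, the two variables above can be read and validated with a few lines of code. This is a minimal sketch with a deliberately simplified parser (no quoting, no export prefixes). It is not how Cube.js itself loads the file, and parseDotEnv and externalDbFrom are hypothetical names.

```typescript
// Simplified .env parser: KEY=value lines only; blank lines and lines
// starting with "#" are skipped. Quoting and escapes are not handled.
function parseDotEnv(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") {
      continue;
    }
    const eq = line.indexOf("=");
    if (eq <= 0) {
      continue; // no key before "=", or no "=" at all
    }
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Hypothetical check: confirms that the external database is Cube Store
// and that a host is set, then returns both values.
function externalDbFrom(vars: Record<string, string>): { type: string; host: string } {
  const type: string | undefined = vars["CUBEJS_EXT_DB_TYPE"];
  const host: string | undefined = vars["CUBEJS_EXT_DB_HOST"];
  if (type !== "cubestore") {
    throw new Error("CUBEJS_EXT_DB_TYPE must be 'cubestore'");
  }
  if (host === undefined || host === "") {
    throw new Error("CUBEJS_EXT_DB_HOST is not set");
  }
  return { type, host };
}

// Usage with the .env content shown above:
const extDb = externalDbFrom(
  parseDotEnv("CUBEJS_EXT_DB_TYPE=cubestore\nCUBEJS_EXT_DB_HOST=127.0.0.1\n")
);
```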

With Docker Compose

Create a docker-compose.yml file with the following content:

version: '2.2'
services:
  cubestore:
    image: cubejs/cubestore:edge

  cube:
    image: cubejs/cube:latest
    ports:
      - 4000:4000  # Cube.js API and Developer Playground
      - 3000:3000  # Dashboard app, if created
    env_file: .env
    depends_on:
      - cubestore
    links:
      - cubestore
    volumes:
      - ./schema:/cube/conf/schema

Configure Cube.js to use the above connection for an external database via the .env file. Inside the Compose network, the host is the cubestore service name:

CUBEJS_EXT_DB_TYPE=cubestore
CUBEJS_EXT_DB_HOST=cubestore

Build

docker build -t cubejs/cubestore:latest .
docker run --rm cubejs/cubestore:latest

Development

Debian prerequisites (incomplete): apt-get install lld libssl-dev pkg-config cmake

When changing Datafusion or Arrow:

Check out https://github.com/cube-js/arrow-rs/tree/cube and https://github.com/cube-js/arrow-datafusion/tree/cube and add the following to the current directory's Cargo.toml. (But remember to exclude this from your PR!)


[patch.'https://github.com/cube-js/arrow-rs']
parquet = { path = "../../../arrow-rs/parquet" }
arrow = { path = "../../../arrow-rs/arrow" }

[patch.'https://github.com/cube-js/arrow-datafusion']
datafusion = { path = "../../../arrow-datafusion/datafusion" }

Of course, you can use absolute paths or adjust the paths to your chosen checkout location.

Uncommenting the path line in arrow-datafusion's .cargo/config.toml may work for you too, but it might not if you are also making changes in arrow-rs.

License

Cube Store is Apache 2.0 licensed.

