ESSE
Essential Source of Schemas and Examples (ESSE) contains data formats and associated examples specifically designed for digital materials science (see refs. 1, 2 below).
Installation
ESSE can be used as a Node.js or Python package on the server side.
Python
ESSE is compatible with Python 3.6+. It can be installed as a Python package either from PyPI or from a local copy of the repository, as shown below.
PyPI
pip install esse
Repository
virtualenv .venv
source .venv/bin/activate
pip install -e PATH_TO_ESSE_REPOSITORY
Node
ESSE can be installed as a Node.js package either from NPM or from a local copy of the repository, as shown below.
NPM
npm install @exabyte-io/esse.js
Repository
Add "esse-js": "file:PATH_TO_ESSE_REPOSITORY" to package.json.
Usage
ESSE contains separate but equivalent interfaces for Python and JavaScript.
The package provides an ESSE class that can be initialized and used as below.
Python
from esse import ESSE
es = ESSE()
schema = es.get_schema_by_id("material")
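To make the idea of a lookup by ID concrete, the sketch below indexes JSON files under a directory by an identifier key. It is not the ESSE implementation: the `SchemaIndex` class, the `schemaId` key name, and the temporary directory layout are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


class SchemaIndex:
    """Hypothetical stand-in for an ID-based schema lookup (not the ESSE class)."""

    def __init__(self, schema_dir, id_key="schemaId"):
        # id_key is an assumed name for the field that holds the schema ID.
        self.schemas = {}
        for path in sorted(Path(schema_dir).rglob("*.json")):
            schema = json.loads(path.read_text())
            if id_key in schema:
                self.schemas[schema[id_key]] = schema

    def get_schema_by_id(self, schema_id):
        # Returns None when no schema carries the requested ID.
        return self.schemas.get(schema_id)


# Build a throwaway schema directory so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "material.json").write_text(
        json.dumps({"schemaId": "material", "type": "object"})
    )
    index = SchemaIndex(tmp)

material_schema = index.get_schema_by_id("material")
missing_schema = index.get_schema_by_id("no-such-id")
```

Loading everything into a dictionary up front keeps each lookup cheap, at the cost of reading every file once at start-up.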
Node
import {ESSE} from "esse-js";
const es = new ESSE();
const schema = es.getSchemaById("material");
Structure
ESSE contains three main directories, schema, example, and src, outlined below.
Schema
The schema directory contains the schemas specifying the rules for structuring data. A set of core schemas, outlined below, is defined to facilitate schema modularity.
Primitive
The primitive directory contains a set of custom primitives that extend the default primitive types allowed by the schema, such as String and Number. Custom primitives are defined solely in terms of the default primitives and cannot be constructed from each other.
Abstract
The abstract directory contains unit-less schemas that are constructed from default and custom primitives.
Reusable
The reusable directory contains schemas that are widely used in other schemas, kept in one place to avoid duplication. They are constructed from the abstract and primitive schemas.
Reference
The reference directory contains the schemas defining the rules for structuring references to data sources.
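The layering described above (custom primitives, then abstract schemas, then reusable schemas) can be illustrated with a small reference resolver. This is a sketch under stated assumptions: the schema names, the `REGISTRY` dictionary, the use of `$ref` for composition, and the `resolve` helper are hypothetical and do not reproduce files from the schema directory.

```python
import copy

# Hypothetical schemas, keyed by path-like names chosen for this sketch.
REGISTRY = {
    # Custom primitive: built only from default primitive types.
    "primitive/array_of_3_numbers": {
        "type": "array",
        "items": {"type": "number"},
        "minItems": 3,
        "maxItems": 3,
    },
    # Abstract schema: unit-less, built from primitives.
    "abstract/vector": {"$ref": "primitive/array_of_3_numbers"},
    # Reusable schema: composed from the abstract schema and a default
    # primitive (the "units" field is an assumption of this sketch).
    "reusable/position": {
        "type": "object",
        "properties": {
            "value": {"$ref": "abstract/vector"},
            "units": {"type": "string"},
        },
        "required": ["value", "units"],
    },
}


def resolve(node, registry):
    """Return a copy of node with every {"$ref": name} replaced by its target."""
    if isinstance(node, dict):
        if "$ref" in node:
            return resolve(copy.deepcopy(registry[node["$ref"]]), registry)
        return {key: resolve(value, registry) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, registry) for item in node]
    return node


resolved_position = resolve(REGISTRY["reusable/position"], REGISTRY)
```

Note that this resolver would recurse without end on circular references, which is one practical reason to keep schemas free of them.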
Example
This directory contains examples formed according to the schemas and mirrors the directory structure of the schema directory.
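Because the two directories share a structure, an example can be matched to a schema by its relative path. The sketch below shows one way to do this; it assumes, beyond what is stated here, that file names also match, and the function name and file names are hypothetical.

```python
import tempfile
from pathlib import Path


def pair_examples_with_schemas(schema_root, example_root):
    """Map each example file to the schema file at the same relative path."""
    schema_root, example_root = Path(schema_root), Path(example_root)
    pairs = {}
    for example in sorted(example_root.rglob("*.json")):
        relative = example.relative_to(example_root)
        schema = schema_root / relative
        # None marks an example that has no schema at the mirrored location.
        pairs[relative.as_posix()] = schema if schema.exists() else None
    return pairs


# Throwaway layout with hypothetical file names, so the sketch is runnable.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in (
        "schema/primitive/a.json",
        "example/primitive/a.json",
        "example/primitive/orphan.json",
    ):
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("{}")
    pairs = pair_examples_with_schemas(root / "schema", root / "example")
    matched = sorted(key for key, value in pairs.items() if value is not None)
    unmatched = sorted(key for key, value in pairs.items() if value is None)
```

Reporting unmatched examples separately makes it easy to spot a file that was added to one directory but not the other.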
src
This directory contains the Python and JavaScript interfaces implementing the functionality to access and validate schemas and examples.
Generative vs Non-generative keys
Generative keys are the fields which allow for user input prior to calculation of the final property values. A flag is included in the schema comments on the fields in property schemas: isGenerative:true marks which fields to use as subschemas in the generation of a user input schema.
- On properties allowing user inputs, additional fields may be tagged, as in the file_content property.
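The selection of generative fields can be sketched as a filter over a property schema. The code below assumes the flag is stored as the text `isGenerative:true` inside a `$comment` string on each field; the function name and the field names are hypothetical and are not taken from an actual ESSE property schema.

```python
def generative_subschemas(property_schema, flag="isGenerative:true"):
    """Collect the fields whose "$comment" carries the generative flag."""
    selected = {}
    for name, subschema in property_schema.get("properties", {}).items():
        if flag in subschema.get("$comment", ""):
            selected[name] = subschema
    return selected


# Hypothetical property schema with two tagged fields and one untagged field.
property_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "basename": {"type": "string", "$comment": "isGenerative:true"},
        "filetype": {"type": "string", "$comment": "isGenerative:true"},
    },
}

# The tagged fields become the subschemas of the user input schema.
user_input_schema = {
    "type": "object",
    "properties": generative_subschemas(property_schema),
}
```

Keeping the flag in a comment leaves validation unaffected, since comments carry no validation meaning, while still letting tooling find the fields.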
Tests
Execute the following command from the root directory of this repository to run the tests. The script runs both the JavaScript and Python tests, in which examples are validated against the corresponding schemas.
bash run-tests.sh
The script has been tested with Node.js v12.16.3 and v8.17.0, as well as with Python 2.7 (ESSE versions up to 2.3.0) and Python 3.6+ (ESSE version 2020.10.19 and later).
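To give a feel for what validating an example against a schema means, here is a deliberately small checker written with the standard library only. It is a stand-in, not what run-tests.sh executes: it handles only the `type`, `required`, `properties`, and `items` keywords, and the schema and examples are hypothetical. A real run would rely on a full JSON Schema validator.

```python
TYPE_MAP = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means the instance passed."""
    errors = []
    expected = schema.get("type")
    if expected in TYPE_MAP:
        ok = isinstance(instance, TYPE_MAP[expected])
        # bool is a subclass of int in Python, so exclude it from "number".
        if expected == "number" and isinstance(instance, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required key '{key}'")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], subschema, f"{path}.{key}"))
    if isinstance(instance, list) and isinstance(schema.get("items"), dict):
        for i, item in enumerate(instance):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


# Hypothetical schema and examples used only to exercise the checker.
schema = {
    "type": "object",
    "required": ["name", "values"],
    "properties": {
        "name": {"type": "string"},
        "values": {"type": "array", "items": {"type": "number"}},
    },
}
good_errors = validate({"name": "example", "values": [1, 2.5]}, schema)
bad_errors = validate({"values": [1, "two"]}, schema)
```

Returning a list of messages with paths, rather than stopping at the first failure, makes it easier to fix an example in one pass.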
Contribution
This repository is an open-source work in progress, and we welcome contributions. We suggest forking this repository and introducing the adjustments there; the changes in the fork can then be considered for merging into this repository, as is commonly done on GitHub (see 3 below).
Best Practices
- Use unique IDs for schemas. One can run sh refactor.sh to automatically set the IDs and reformat examples.
- Do not use circular references in the schemas; instead, leave the type as object and add an explanation to the description.
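Both practices can be checked mechanically. The sketch below is independent of refactor.sh and makes several assumptions: schemas are plain dictionaries, IDs live under a `schemaId` key, references use `$ref` with the target's name, and all function names are hypothetical.

```python
from collections import Counter


def duplicate_ids(schemas, id_key="schemaId"):
    """Return the sorted IDs that appear on more than one schema."""
    counts = Counter(s[id_key] for s in schemas if id_key in s)
    return sorted(i for i, n in counts.items() if n > 1)


def collect_refs(node):
    """Yield every "$ref" target found anywhere inside a schema."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from collect_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_refs(item)


def find_cycle(registry):
    """Return one reference cycle as a list of names, or None if there is none."""
    visiting, done = [], set()

    def visit(name):
        if name in done:
            return None
        if name in visiting:
            # The cycle runs from the first visit of name back to name.
            return visiting[visiting.index(name):] + [name]
        visiting.append(name)
        for target in collect_refs(registry.get(name, {})):
            cycle = visit(target)
            if cycle:
                return cycle
        visiting.pop()
        done.add(name)
        return None

    for name in registry:
        cycle = visit(name)
        if cycle:
            return cycle
    return None


# Hypothetical inputs: one repeated ID, one cyclic and one acyclic registry.
repeated = duplicate_ids([{"schemaId": "a"}, {"schemaId": "b"}, {"schemaId": "a"}])
cyclic = {"a": {"properties": {"child": {"$ref": "b"}}}, "b": {"items": {"$ref": "a"}}}
acyclic = {
    "a": {"properties": {"child": {"$ref": "b"}}},
    "b": {"type": "object", "description": "Same structure as a."},
}
cycle_found = find_cycle(cyclic)
no_cycle = find_cycle(acyclic)
```

The acyclic registry follows the practice above: where `b` would otherwise point back to `a`, its type is left as object and the relationship is explained in the description.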
Links
1: Data-centric online ecosystem for digital materials science
2: CateCom: A Practical Data-Centric Approach to Categorization of Computational Models