This library was created to simplify working with TensorFlow.js in Node. Currently, it provides helpers for the binary classification task. The long-term plan is to implement helpers for other tasks (for example, regression and multiclass classification), as well as to cover different machine learning approaches.
Before you start using the helpers in your project, you need to install the @ronas-it/tfjs-node-helpers package:
npm install @ronas-it/tfjs-node-helpers --save
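Once the package is installed, you can import the helpers used throughout this guide. A minimal sketch, assuming the classes below are exported from the package root:
// Assumption: these helpers are exported from the package root; adjust the
// import path if your version of the package exposes them differently.
import {
  BinaryClassificationTrainer,
  BinaryClassifier,
  Feature,
  FeatureExtractor,
  Metric,
  MetricCalculator,
  makeChunkedDataset
} from '@ronas-it/tfjs-node-helpers';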
Before training any model, you need to extract the valuable information from your dataset. This information is usually called features. This library provides a few helpers to streamline the process of feature extraction.
First, you need to define the feature extractors. In the example below we extract the gender feature from the dataset item. For that we create a GenderFeatureExtractor class extending the FeatureExtractor base class provided by the library. Please note that feature extractors also encode the extracted information as a number in the range between 0 and 1, so that it can be consumed when training the model.
type DatasetItem = {
  id: number;
  gender: string;
  age: number;
  annual_salary: number;
  owns_the_car: number;
};
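// FeatureType is assumed here to be an application-defined enum that
// enumerates the features of this dataset (the names below are hypothetical):
enum FeatureType {
  AGE,
  ANNUAL_SALARY,
  GENDER,
  OWNS_THE_CAR
}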
class GenderFeatureExtractor extends FeatureExtractor<DatasetItem, FeatureType> {
  public featureType = FeatureType.GENDER;

  public extract(item: DatasetItem): Feature<FeatureType> {
    return new Feature({
      type: this.featureType,
      label: item.gender,
      value: (item.gender === 'Male') ? 1 : 0
    });
  }
}
That's it! Now we can use the defined feature extractor to extract valuable information from our dataset.
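For example, a quick sketch of calling the extractor directly (the dataset item below is made up):
const item: DatasetItem = {
  id: 1,
  gender: 'Female',
  age: 38,
  annual_salary: 52000,
  owns_the_car: 1
};

// Yields a Feature with value 0, because the gender is not 'Male'.
const genderFeature = new GenderFeatureExtractor().extract(item);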
After your model has been trained, it's important to evaluate it. One way to do this is by analyzing metrics. The library helps you measure model performance: pass a list of metric calculators to the model trainer.
We have a list of built-in metric calculators for popular metrics:
- AccuracyMetricCalculator
- PrecisionMetricCalculator
- RecallMetricCalculator
- SpecificityMetricCalculator
- FNRMetricCalculator
- FPRMetricCalculator
- NPVMetricCalculator
- MCCMetricCalculator
- FBetaScoreMetricCalculator
- ROCAUCMetricCalculator
- PRAUCMetricCalculator
- BrierLossMetricCalculator
- BinaryCrossentropyMetricCalculator
- CohenKappaMetricCalculator
You can implement your own MetricCalculator. In the example below, we define a metric calculator for precision. For that we create a PrecisionMetricCalculator class extending the MetricCalculator base class provided by the library and implementing the calculate method.
export class PrecisionMetricCalculator extends MetricCalculator {
  public calculate({ trueValues, predictedValues }: TestingResult): Metric {
    const { tp, fp } = new ConfusionMatrix(trueValues, predictedValues);

    return new Metric({
      title: 'Precision',
      value: tp / (tp + fp)
    });
  }
}
This library provides two classes to train and evaluate binary classification models:
- BinaryClassificationTrainer – used for training and testing.
- BinaryClassifier – used for evaluation.
Before training the model, you need to create an instance of the BinaryClassificationTrainer class and provide a few parameters:
- batchSize – the number of training samples in each batch.
- epochs – the maximum number of epochs to train the model for.
- patience – the number of epochs without improvement after which the trainer will stop early.
- hiddenLayers – a list of hidden layers. You can also provide a custom model by using the optional model parameter instead.
- inputFeatureExtractors – a list of feature extractors used to extract the information that should be fed into the model as inputs.
- outputFeatureExtractor – the feature extractor used to extract the information that we want to predict.
- metricCalculators – a list of metric calculators that will be used during the testing stage.
An example can be found below:
const trainer = new BinaryClassificationTrainer({
  batchSize: BATCH_SIZE,
  epochs: EPOCHS,
  patience: PATIENCE,
  hiddenLayers: [
    layers.dense({ units: 128, activation: 'mish' }),
    layers.dense({ units: 128, activation: 'mish' })
  ],
  inputFeatureExtractors: [
    new AgeFeatureExtractor(),
    new AnnualSalaryFeatureExtractor(),
    new GenderFeatureExtractor()
  ],
  outputFeatureExtractor: new OwnsTheCarFeatureExtractor(),
  metricCalculators: [
    new AccuracyMetricCalculator(),
    new PrecisionMetricCalculator(),
    new SpecificityMetricCalculator(),
    new FPRMetricCalculator()
  ]
});
To train the model, you need to call the trainAndTest method of the instantiated BinaryClassificationTrainer.
You can pass the data parameter, in which case the trainer will extract features from the provided dataset first. If you want something more customized, you can create the datasets for training, validation and testing manually and pass them as the trainingDataset, validationDataset and testingDataset parameters.
You can also print the testing results by setting printTestingResults to true.
An example can be found below:
await trainer.trainAndTest({
  data,
  printTestingResults: true
});
When working with a large dataset, you might find that the whole dataset can't fit in memory. In this situation you might want to load the data in chunks. To do this, you can define asynchronous generators for the trainingDataset, validationDataset and testingDataset parameters.
This library provides the makeChunkedDataset helper to make it easier to create chunked datasets, where chunks are controlled with the skip and take parameters. The makeChunkedDataset helper accepts the following parameters:
- loadChunk – an asynchronous function accepting the numeric skip and take parameters and returning an array of samples.
- chunkSize – the number of samples loaded per chunk.
- batchSize – the number of samples in each batch.
const loadTrainingSamplesChunk = async (skip: number, take: number): Promise<Array<Sample>> => {
  // Your samples chunk loading logic goes here. For example, you may want to
  // load samples from a database or from a remote data source.
};

const makeTrainingDataset = (): data.Dataset<TensorContainer> => makeChunkedDataset({
  loadChunk: loadTrainingSamplesChunk,
  chunkSize: 32,
  batchSize: 32
});

// You should also define similar functions for the validation and testing
// datasets. We omit them here for the sake of brevity.
const trainingDataset = makeTrainingDataset();
const validationDataset = makeValidationDataset();
const testingDataset = makeTestingDataset();
await trainer.trainAndTest({
  trainingDataset,
  validationDataset,
  testingDataset,
  printTestingResults: true
});
To save the trained model, you need to call the save method of the instantiated BinaryClassificationTrainer and pass the path where the model should be saved:
await trainer.save(join(__dirname, './trained_model'));
Before evaluating the model, you need to create an instance of the BinaryClassifier class:
const classifier = new BinaryClassifier();
To load the trained model, you need to call the load method of the instantiated BinaryClassifier class and pass the path where the model.json file is located:
await classifier.load(join(__dirname, './trained_model/model.json'));
To evaluate the trained model, you need to load it first, and then call the predict method of the instantiated BinaryClassifier class, passing an array of encoded inputs which will be fed into the model:
const ownsTheCar = await classifier.predict([0.2, 0.76, 0]);
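The order and encoding of these inputs should match the input feature extractors the model was trained with. A minimal sketch of producing them from a raw dataset item, reusing the extractors from the training example above (this wiring is illustrative, not a documented API):
// Encode a raw item with the same extractors, in the same order, as the
// inputFeatureExtractors passed to the trainer (illustrative wiring).
const item: DatasetItem = {
  id: 2,
  gender: 'Male',
  age: 41,
  annual_salary: 87000,
  owns_the_car: 0
};

const inputs = [
  new AgeFeatureExtractor(),
  new AnnualSalaryFeatureExtractor(),
  new GenderFeatureExtractor()
].map((extractor) => extractor.extract(item).value);

const prediction = await classifier.predict(inputs);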
- Binary classification (#1)
- Asynchronously loaded datasets (#14)
- Feature normalization (#5)
- Custom metrics (#18)
- Add more metrics (#17)
- Allow using GPU (#29)
- Task-oriented architecture (#26)
- Categorical features (#19)
- Multiclass classification (#3)
- Image classification (#4)
- Regression (#2)
- Object detection (#27)
- Uncertainty (#15)
- Handle class imbalance problem (#10)
- Automated tests (#6)
- Continuous Integration (#11)
- Add an example of queued feature extraction and evaluation (#12)
- Add an example of storing the extracted features (#13)
- Add more examples (#8)
- API reference (#9)
- Dashboard to visualize metrics over time (#7)
Thank you for considering contributing to the tfjs-node-helpers library! Please see the Contributing guide for details on how to contribute.
tfjs-node-helpers is licensed under the MIT license.