This library contains an improved tSNE implementation that runs in the browser.
You can use tfjs-tsne via a script tag or via NPM.
To use tfjs-tsne via script tag you need to load tfjs first. The following tags can be put into the head section of your html page to load the library.
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-tsne"></script>
This library will create a tsne variable on the global scope.
You can then do the following:
// Create some data
const data = tf.randomUniform([2000, 10]);
// Get a tsne optimizer
const tsneOpt = tsne.tsne(data);
// Compute a T-SNE embedding, returns a promise.
// Runs for 1000 iterations by default.
tsneOpt.compute().then(() => {
  // tsneOpt.coordinates() returns a *tensor* with x, y coordinates of
  // the embedded data.
  const coordinates = tsneOpt.coordinates();
  coordinates.print();
});
To use tfjs-tsne via NPM, install it with
yarn add @tensorflow/tfjs-tsne
or
npm install @tensorflow/tfjs-tsne
Then import it and use it as follows:
// tf is needed to create the input tensor.
import * as tf from '@tensorflow/tfjs';
import * as tsne from '@tensorflow/tfjs-tsne';
// Create some data
const data = tf.randomUniform([2000, 10]);
// Initialize the tsne optimizer
const tsneOpt = tsne.tsne(data);
// Compute a T-SNE embedding, returns a promise.
// Runs for 1000 iterations by default.
tsneOpt.compute().then(() => {
  // tsneOpt.coordinates() returns a *tensor* with x, y coordinates of
  // the embedded data.
  const coordinates = tsneOpt.coordinates();
  coordinates.print();
});
tsne.tsne(data, config) creates and returns a TSNE optimizer.
- data must be a Rank 2 tensor. Shape is [numPoints, dataPointDimensions].
- config is an optional object with the following params (all are optional):
  - perplexity: number, defaults to 18. Max value is defined by hardware limitations.
  - verbose: boolean, defaults to false
  - exaggeration: number, defaults to 4
  - exaggerationIter: number, defaults to 300
  - exaggerationDecayIter: number, defaults to 200
  - momentum: number, defaults to 0.8
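As a sketch of how the options above fit together, the snippet below merges caller overrides with the documented defaults. The helper name buildTsneConfig is hypothetical and not part of the library; only the option names and default values come from the list above.

```javascript
// Documented defaults, copied from the option list above.
const DEFAULT_TSNE_CONFIG = Object.freeze({
  perplexity: 18,
  verbose: false,
  exaggeration: 4,
  exaggerationIter: 300,
  exaggerationDecayIter: 200,
  momentum: 0.8,
});

// Hypothetical helper (not part of tfjs-tsne): merges overrides into the
// documented defaults and rejects option names that are not in the list.
function buildTsneConfig(overrides = {}) {
  for (const key of Object.keys(overrides)) {
    if (!(key in DEFAULT_TSNE_CONFIG)) {
      throw new Error(`Unknown tsne option: ${key}`);
    }
  }
  return { ...DEFAULT_TSNE_CONFIG, ...overrides };
}

const config = buildTsneConfig({ perplexity: 10, verbose: true });
console.log(config);
```

In a page where tfjs and tfjs-tsne are loaded, the resulting object would be passed as the second argument, as in tsne.tsne(data, config). Catching misspelled option names early is a design choice of this sketch, not documented library behavior.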
compute is the most direct way to get a tsne projection. It automatically runs the knn preprocessing and the tsne optimization, and returns a promise that resolves when the computation is done.
- iterations: the number of iterations to run the tsne optimization for. (The number of knn steps is calculated automatically.)
iterateKnn is used when running tsne iteratively (see section below). It runs the knn preprocessing for the specified number of iterations.
iterate is used when running tsne iteratively (see section below). It runs the tsne step for the specified number of iterations.
Gets the current x, y coordinates of the projected data as a tensor. By default the coordinates are normalized to the range 0-1.
Gets the current x, y coordinates of the projected data as a JavaScript array. By default the coordinates are normalized to the range 0-1. This function is async and returns a promise.
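Because the coordinates are normalized to the range 0-1, they usually need to be scaled before drawing. The sketch below shows one way to do that, assuming the array form is a list of [x, y] pairs (an assumption about the layout, not something stated above). The function name toPixels and the padding parameter are hypothetical.

```javascript
// Maps normalized [x, y] pairs (each in the range 0-1) to pixel positions
// inside a width x height drawing area, leaving `padding` pixels at each edge.
function toPixels(points, width, height, padding = 0) {
  const innerW = width - 2 * padding;
  const innerH = height - 2 * padding;
  return points.map(([x, y]) => [
    padding + x * innerW,
    // Canvas y grows downward, so flip the axis to keep y = 1 at the top.
    padding + (1 - y) * innerH,
  ]);
}

// Stand-in data; in a real page this would be the resolved coordinate array.
const sample = [[0, 0], [0.5, 0.5], [1, 1]];
const pixels = toPixels(sample, 400, 200, 10);
console.log(pixels);
```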
While the .compute method provides the most direct way to get an embedding, you can also compute the embedding iteratively and have more control over the process.
The first step is computing the KNN graph using iterateKnn.
Then you can compute the tSNE iteratively and examine the result as it evolves.
The code below shows what that would look like:
const data = tf.randomUniform([2000, 10]);
// Named tsneOpt so it does not shadow the global tsne library object.
const tsneOpt = tsne.tsne(data);
async function iterativeTsne() {
  // Get the suggested number of iterations to perform.
  const knnIterations = tsneOpt.knnIterations();
  // Do the KNN computation. This needs to complete before we run tsne
  for (let i = 0; i < knnIterations; ++i) {
    await tsneOpt.iterateKnn();
    // You can update knn progress in your ui here.
  }
  const tsneIterations = 1000;
  for (let i = 0; i < tsneIterations; ++i) {
    await tsneOpt.iterate();
    // Draw the embedding here...
    const coordinates = tsneOpt.coordinates();
    coordinates.print();
  }
}
iterativeTsne();
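The two-phase loop above can also be wrapped in a reusable function that reports progress to the UI. The following is a minimal sketch: runWithProgress, onProgress and makeStubOptimizer are hypothetical names, and the stub only counts calls so that the sketch runs without WebGL. The method names knnIterations, iterateKnn and iterate are the ones used in the example above.

```javascript
// Runs the KNN phase and then the tsne phase, reporting after every step.
// `optimizer` is any object exposing knnIterations(), iterateKnn() and iterate().
async function runWithProgress(optimizer, tsneIterations, onProgress) {
  const knnIterations = optimizer.knnIterations();
  // The KNN graph has to be complete before the tsne steps start.
  for (let i = 0; i < knnIterations; ++i) {
    await optimizer.iterateKnn();
    onProgress({ phase: 'knn', done: i + 1, total: knnIterations });
  }
  for (let i = 0; i < tsneIterations; ++i) {
    await optimizer.iterate();
    onProgress({ phase: 'tsne', done: i + 1, total: tsneIterations });
  }
}

// Stand-in for a real optimizer: it performs no computation, only counts calls.
function makeStubOptimizer(knnSteps) {
  const calls = { knn: 0, tsne: 0 };
  return {
    calls,
    knnIterations: () => knnSteps,
    iterateKnn: async () => { calls.knn += 1; },
    iterate: async () => { calls.tsne += 1; },
  };
}

const stub = makeStubOptimizer(3);
const events = [];
const finished = runWithProgress(stub, 5, (e) => events.push(e));
finished.then(() => console.log(`${events.length} progress events`));
```

With a real optimizer, the callback is where a progress bar or a redraw of the embedding would go.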
We also have an example of using this library to perform TSNE on the MNIST dataset.
This library requires WebGL 2 support and thus will not work on certain devices, mobile devices especially. Currently it works best on desktop devices.
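A page can check for WebGL 2 before attempting an embedding. The sketch below uses the standard canvas getContext('webgl2') call; the function name supportsWebGL2 is hypothetical. It only confirms that a WebGL 2 context can be created, which may not cover every capability the library relies on.

```javascript
// Returns true when a WebGL 2 context can be created. `doc` defaults to the
// page's document; passing it explicitly keeps the function usable in tests
// and in environments without a DOM.
function supportsWebGL2(doc = globalThis.document) {
  if (!doc || typeof doc.createElement !== 'function') {
    return false; // Not running in a browser-like environment.
  }
  const canvas = doc.createElement('canvas');
  if (typeof canvas.getContext !== 'function') {
    return false;
  }
  // getContext returns null when the requested context type is unavailable.
  return Boolean(canvas.getContext('webgl2'));
}

console.log('WebGL 2 available:', supportsWebGL2());
```

When the check fails, the page can show a message instead of starting the computation.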
From our current experiments we suggest limiting the data passed to this implementation to a shape of [10000,100], i.e. up to 10000 points with 100 dimensions each. Larger inputs are possible but may run slowly.
Above a certain number of data points the computation of the similarities becomes a bottleneck, a problem that we plan to address in the future.
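To make the suggested limit concrete, the sketch below checks a [numPoints, dimensions] shape against it before any computation starts. The helper name sizeWarnings is hypothetical; the bounds are the [10000,100] figures from the text above, and exceeding them is treated as a warning rather than an error.

```javascript
// Suggested upper bounds taken from the text above: [10000, 100].
const SUGGESTED_MAX_POINTS = 10000;
const SUGGESTED_MAX_DIMS = 100;

// Hypothetical helper: returns a list of warnings for a [numPoints, dims]
// shape. An empty list means the shape is within the suggested limits.
function sizeWarnings(shape) {
  const [numPoints, dims] = shape;
  const warnings = [];
  if (numPoints > SUGGESTED_MAX_POINTS) {
    warnings.push(`${numPoints} points exceeds the suggested ${SUGGESTED_MAX_POINTS}`);
  }
  if (dims > SUGGESTED_MAX_DIMS) {
    warnings.push(`${dims} dimensions exceeds the suggested ${SUGGESTED_MAX_DIMS}`);
  }
  return warnings;
}

console.log(sizeWarnings([2000, 10]));
console.log(sizeWarnings([20000, 300]));
```

For a tfjs tensor, the shape array is available as data.shape.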
This work uses linear tSNE optimization to optimize the embedding and an optimized brute-force computation of the kNN graph on the GPU.
Reference to cite if you use this implementation in a research paper:
@article{TFjs:tSNE,
author = {Nicola Pezzotti and Alexander Mordvintsev and Thomas Hollt and Boudewijn P. F. Lelieveldt and Elmar Eisemann and Anna Vilanova},
title = {Linear tSNE Optimization for the Web},
year = {2018},
journal={arXiv preprint arXiv:1805.10817},
}