scotta451 edited this page Jul 5, 2016 · 29 revisions

Welcome to JetBrains Xodus. These pages provide a brief introduction to Xodus concepts and features.

Xodus is licensed under the Apache License, Version 2.0.

Overview
Snapshot Isolation
Garbage Collector
Performance
Getting Started

Overview

Xodus is a transactional schema-less embedded high-performance database that is written in Java. It is currently used in several server-side products at JetBrains, including YouTrack.

  • Xodus is written in pure Java and will run on any platform that is able to run a Java virtual machine.
  • Xodus transactions have a full set of properties that guarantee reliability: atomicity, consistency, isolation, and durability. As such, Xodus is a general-purpose database that can be used in traditional database applications with high requirements for consistency and isolation.
  • On the other hand, Xodus is schema-less. This characteristic makes it different from traditional database applications that require a schema. Xodus helps agile development teams avoid hassles such as migrations and schema refactorings. This makes it easier for developers when applications are required to be compatible with different versions of the database.
  • An embedded database runs inside your application. Its main features are zero deployment and zero administration. It does not require a dedicated server to store and access data. Applications that use Xodus do not require overhead to, for example, establish connections with a database server and parse SQL.

Snapshot Isolation

Xodus supports only one isolation level: snapshot isolation. It does not offer the read-uncommitted (dirty reads), read-committed, repeatable-read, or serializable isolation levels. Within a transaction, snapshot isolation guarantees that all reads see a consistent snapshot of the whole database.

Snapshot isolation follows from the log-structured design of Xodus. In log-structured databases, all changes are written sequentially to a log. In Xodus, this log is an infinite sequence of .xd files. Any data that is stored in the log is never modified. Each change is appended to the log, thereby creating a new version of the data. A committed transaction creates a new snapshot (version) of the database, and any new transaction created right after a commit holds (references) this snapshot. Thus, any Xodus database can be represented as a persistent functional data structure which naturally provides lock-free multi-version concurrency control (MVCC).
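The idea that a reader holds a reference to an immutable version can be sketched in plain Java. This is a toy model, not Xodus code: the class name SnapshotStoreSketch and its methods are hypothetical, and it copies the whole map on every commit, whereas a real persistent data structure would share unchanged parts between versions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Toy model of snapshot isolation (hypothetical, not part of Xodus). */
final class SnapshotStoreSketch {

    // The latest committed version; every version is an immutable map.
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    /** A read "transaction" just keeps the version that was current when it began. */
    Map<String, String> beginRead() {
        return current.get();
    }

    /** A commit never modifies an existing version; it publishes a new one. */
    synchronized void commit(String key, String value) {
        Map<String, String> next = new HashMap<>(current.get());
        next.put(key, value);
        current.set(Collections.unmodifiableMap(next));
    }

    public static void main(String[] args) {
        SnapshotStoreSketch db = new SnapshotStoreSketch();
        db.commit("issue-1", "open");

        Map<String, String> snapshot = db.beginRead(); // reader starts here
        db.commit("issue-1", "fixed");                 // later commit

        System.out.println(snapshot.get("issue-1"));       // prints "open"
        System.out.println(db.beginRead().get("issue-1")); // prints "fixed"
    }
}
```

The reader that started before the second commit keeps seeing its own version without taking any lock, which is the property the paragraph above describes.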

Garbage Collector

With append-only modifications, modified data records become outdated and are no longer used. Because such records are nothing but garbage, the database must compact itself to keep its physical size reasonable. In Xodus, you don't need to worry about that, since it collects garbage in the background. In most cases, garbage collection (GC) works seamlessly with the default settings in a single background thread. GC tries to balance the need to minimize database size against the need to affect user transactions as little as possible. In some cases, GC can be tuned with a set of additional properties.
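To illustrate why an append-only log accumulates garbage and what compaction does about it, here is a minimal in-memory sketch. It is an assumption-laden model and not how Xodus implements GC: the class AppendOnlyLogSketch, its in-memory list standing in for the .xd files, and the compact() method are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only log with compaction (hypothetical, not part of Xodus). */
final class AppendOnlyLogSketch {

    /** One immutable log record. */
    static final class Record {
        final String key;
        final String value;

        Record(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private List<Record> log = new ArrayList<>();
    // Maps each key to the log position of its newest (live) record.
    private Map<String, Integer> index = new HashMap<>();

    /** Appends a new record; any older record for the same key becomes garbage. */
    void put(String key, String value) {
        log.add(new Record(key, value));
        index.put(key, log.size() - 1);
    }

    String get(String key) {
        Integer position = index.get(key);
        return position == null ? null : log.get(position).value;
    }

    int logSize() {
        return log.size();
    }

    /** Copies only live records into a fresh log and drops the outdated ones. */
    void compact() {
        List<Record> newLog = new ArrayList<>();
        Map<String, Integer> newIndex = new HashMap<>();
        for (int position = 0; position < log.size(); position++) {
            Record record = log.get(position);
            if (index.get(record.key) == position) { // still the newest version?
                newLog.add(record);
                newIndex.put(record.key, newLog.size() - 1);
            }
        }
        log = newLog;
        index = newIndex;
    }

    public static void main(String[] args) {
        AppendOnlyLogSketch db = new AppendOnlyLogSketch();
        db.put("a", "1");
        db.put("a", "2"); // the first record for "a" is now garbage
        db.put("b", "3");
        System.out.println(db.logSize()); // prints 3
        db.compact();
        System.out.println(db.logSize()); // prints 2
        System.out.println(db.get("a"));  // prints 2
    }
}
```

In this sketch compaction runs on demand and rewrites the entire log; a background collector would instead work incrementally so that user transactions are disturbed as little as possible.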

Performance

The main production instance of JetBrains YouTrack contains a database of issues that spans more than 10 years. The total number of issues exceeds one million, and the physical database size exceeds 100 GB. YouTrack runs on a moderate 8-CPU server with a Java heap of 24 GB. Xodus provides outstanding performance due to very compact data storage, lock-free reads, lock-free optimistic writes, and intelligent lock-free caching. Xodus is a highly concurrent database: read operations incur no contention, even when write operations run in parallel.
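The general technique behind optimistic, lock-free writes can be sketched with a compare-and-set loop: a writer prepares a new version from the snapshot it read, and publishes it only if no other writer committed in the meantime, retrying otherwise. The sketch below is an illustration of that technique under stated assumptions, not the Xodus implementation; the class OptimisticCommitSketch and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Toy optimistic commit loop (hypothetical, not part of Xodus). */
final class OptimisticCommitSketch {

    // The latest committed version; versions are immutable.
    private final AtomicReference<Map<String, Integer>> head =
            new AtomicReference<>(Collections.<String, Integer>emptyMap());

    /** Readers never block: they just take the current version. */
    Map<String, Integer> snapshot() {
        return head.get();
    }

    /**
     * Applies the change to a private copy of the current version and tries
     * to publish it. If another writer committed first, the change is
     * re-applied to the newer version. Returns the number of attempts.
     */
    int update(UnaryOperator<Map<String, Integer>> change) {
        int attempts = 0;
        while (true) {
            attempts++;
            Map<String, Integer> base = head.get();
            Map<String, Integer> next =
                    Collections.unmodifiableMap(change.apply(new HashMap<>(base)));
            if (head.compareAndSet(base, next)) {
                return attempts;
            }
        }
    }

    /** Runs concurrent increments of one counter and returns its final value. */
    static int incrementConcurrently(int threadCount, int incrementsPerThread) {
        OptimisticCommitSketch db = new OptimisticCommitSketch();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    db.update(m -> {
                        m.merge("counter", 1, Integer::sum);
                        return m;
                    });
                }
            });
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return db.snapshot().getOrDefault("counter", 0);
    }

    public static void main(String[] args) {
        // No increment is lost, although no writer ever holds a lock.
        System.out.println(incrementConcurrently(4, 1000)); // prints 4000
    }
}
```

One consequence of this design is that the code describing a change may run more than once, so it should not have side effects outside the data it modifies.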

Getting Started

Xodus offers three different ways to work with data, each exposed as its own API layer:

  1. Environments provide transactional key-value storage.
  2. Entity Stores describe a data model as a set of typed entities with named properties (attributes) and named entity links (relations).
  3. Virtual File Systems (VFS) deal transactionally with files and their contents.

Before you start coding, choose the API layer that is most suitable for your project needs. The choice determines which set of artifacts your project depends on. Whichever API you choose, you have to create an instance of Environment. Entity Store and VFS both work on top of Environment.

Consider the simplest sample using the Environments layer. Start by creating an instance of Environment:

final Environment env = Environments.newInstance("/Users/me/.myAppData");

All Environment data is physically stored in the directory /Users/me/.myAppData. Create a named Store to store your data:

final Store store = env.computeInTransaction(new TransactionalComputable<Store>() {
    @Override
    public Store compute(@NotNull final Transaction txn) {
        return env.openStore("MyStore", StoreConfig.WITHOUT_DUPLICATES, txn);
    }
});

Here, a transactional closure is used as the simplest way to manage transactions and updates within a transaction. Once you have a Store object, you can put values into it by key and get values from it by key. On the Environment layer, all data is binary and untyped, and it is represented by ByteIterable instances. A ByteIterable is a kind of byte array or Iterable<Byte>. Prepare the data and proceed with a closure to put it into the store:

final ByteIterable key = StringBinding.stringToEntry("myKey");
final ByteIterable value = StringBinding.stringToEntry("myValue");

env.executeInTransaction(new TransactionalExecutable() {
    @Override
    public void execute(@NotNull final Transaction txn) {
        store.put(txn, key, value);
    }
});

Note that here we used the TransactionalExecutable closure instead of TransactionalComputable. Unlike TransactionalComputable, TransactionalExecutable doesn't return a value. You can then get a value by key:

env.computeInReadonlyTransaction(new TransactionalComputable<ByteIterable>() {
    @Override
    public ByteIterable compute(@NotNull final Transaction txn) {
        return store.get(txn, key);
    }
});

After you stop using Environment, invoke env.close().

Read more about Environments, Entity Stores and Virtual File Systems. See how to manage dependencies of your project.