Elastically, Elastica-based framework

Opinionated Elastica-based framework to bootstrap PHP and Elasticsearch implementations.

Main features:

  • DTOs are first-class citizens: you send PHP objects as documents and get objects back from search results, like an ODM;
  • All indexes are versioned and aliased automatically;
  • Mappings are defined via YAML files, PHP, or a custom MappingProviderInterface implementation;
  • Analysis is separated from mappings to ease reuse;
  • 100% compatibility with ruflin/elastica;
  • Mapping migration capabilities with ReIndex;
  • Symfony HttpClient compatible transport (optional);
  • Symfony support (optional):
    • See dedicated chapter;
    • Tested with Symfony 5.4 to 7;
    • Symfony Messenger Handler support (with or without spool);

Important

Requires PHP 8.0+ and Elasticsearch 8+.

It also works with Elasticsearch 7, but that version is not officially supported by Elastica 8. Use with caution.

Version 2+ does not work with OpenSearch anymore due to restrictions added by Elastic on their client.

You can check the changelog and the upgrade documents.

Installation

composer require jolicode/elastically

Demo

Tip

If you are using Symfony, you can skip to the "Usage in Symfony" chapter.

Quick example of what the library does on top of Elastica:

// Your own DTO, or one generated by Jane (see below)
class Beer
{
    public string $foo;
    public string $bar;
}

use JoliCode\Elastically\Factory;
use JoliCode\Elastically\Model\Document;

// Factory object with Elastica options + new Elastically options in the same array
$factory = new Factory([
    // Where to find the mappings
    Factory::CONFIG_MAPPINGS_DIRECTORY => __DIR__.'/mappings',
    // What objects to find in each index
    Factory::CONFIG_INDEX_CLASS_MAPPING => [
        'beers' => Beer::class,
    ],
]);

// Class to perform request, same as the Elastica Client
$client = $factory->buildClient();

// Class to build Indexes
$indexBuilder = $factory->buildIndexBuilder();

// Create the Index in Elasticsearch
$index = $indexBuilder->createIndex('beers');

// Set the proper aliases
$indexBuilder->markAsLive($index, 'beers');

// Class to index DTO(s) in an Index
$indexer = $factory->buildIndexer();

$dto = new Beer();
$dto->bar = 'American Pale Ale';
$dto->foo = 'Hops from Alsace, France';

// Add a document to the queue
$indexer->scheduleIndex('beers', new Document('123', $dto));
$indexer->flush();

// Set parameters on the Bulk
$indexer->setBulkRequestParams([
    'pipeline' => 'covfefe',
    'refresh' => 'wait_for'
]);

// Force index refresh if needed
$indexer->refresh('beers');

// Get the Document (new!)
$results = $client->getIndex('beers')->getDocument('123');

// Get the DTO (new!)
$results = $client->getIndex('beers')->getModel('123');

// Perform a search
$results = $client->getIndex('beers')->search('alsace');

// Get the Elastic Document
$results->getDocuments()[0];

// Get the Elastica compatible Result
$results->getResults()[0];

// Get the DTO 🎉 (new!)
$results->getResults()[0]->getModel();

// Create a new version of the Index "beers"
$index = $indexBuilder->createIndex('beers');

// Slow down the Refresh Interval of the new Index to speed up indexation
$indexBuilder->slowDownRefresh($index);
// ... index your documents here, then restore a faster Refresh Interval
$indexBuilder->speedUpRefresh($index);

// Set proper aliases
$indexBuilder->markAsLive($index, 'beers');

// Clean the old indices (close the previous one and delete the older)
$indexBuilder->purgeOldIndices('beers');

// Mapping change? Just call migrate and enjoy a full reindex (use the Task API internally to avoid timeout)
$newIndex = $indexBuilder->migrate($index);
$indexBuilder->speedUpRefresh($newIndex);
$indexBuilder->markAsLive($newIndex, 'beers');

Note

Here, scheduleIndex is called with the "beers" index name because that index was already created. If you are creating a new index and want to index documents into it, pass the Index object directly.

mappings/beers_mapping.yaml

# Anything you want, no validation
settings:
    number_of_replicas: 1
    number_of_shards: 1
    refresh_interval: 60s
mappings:
    dynamic: false
    properties:
        foo:
            type: text
            analyzer: english
            fields:
                keyword:
                    type: keyword

Configuration

This library adds custom configuration options on top of Elastica's:

Factory::CONFIG_MAPPINGS_DIRECTORY (required with default configuration)

The directory where Elastically looks for YAML mapping files.

When creating a foobar index, a foobar_mapping.yaml file is expected.

If an analyzers.yaml file is present, its content is applied to all the indices.

Factory::CONFIG_INDEX_CLASS_MAPPING (required)

An array mapping each index name to the FQCN of its class.

[
  'indexName' => My\AwesomeDTO::class,
]

Factory::CONFIG_MAPPINGS_PROVIDER

An instance of MappingProviderInterface.

If this option is not defined, the factory will fall back to YamlProvider and will use Factory::CONFIG_MAPPINGS_DIRECTORY option.

There are two providers available in Elastically: YamlProvider and PhpProvider.

Factory::CONFIG_SERIALIZER (optional)

A SerializerInterface compatible object that will be used on indexation.

Defaults to the Symfony Serializer with the Object Normalizer.

A faster alternative is to use Jane to generate plain PHP Normalizers (see below). We also recommend customizing the serializer to handle things like dates.

Factory::CONFIG_DENORMALIZER (optional)

A DenormalizerInterface compatible object that will be used on search results to build your objects back.

If this option is not defined, the factory will fall back to Factory::CONFIG_SERIALIZER option.

Factory::CONFIG_SERIALIZER_CONTEXT_BUILDER (optional)

An instance of ContextBuilderInterface that builds a serializer context from a class name.

If it is not defined, Elastically will use a StaticContextBuilder with the configuration from Factory::CONFIG_SERIALIZER_CONTEXT_PER_CLASS.

Factory::CONFIG_SERIALIZER_CONTEXT_PER_CLASS (optional)

Allows you to specify the Serializer context used for normalization and denormalization, per class.

[
    Beer::class => ['attributes' => ['title']],
];

Defaults to [].
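To make the per-class idea concrete, here is a minimal, self-contained sketch of a lookup that returns the configured context for a class and falls back to an empty array. The `PerClassContextLookup` class and its `buildContext` method are hypothetical stand-ins written for this illustration. They are not the library's StaticContextBuilder, whose exact behavior may differ.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for illustration only, NOT Elastically's StaticContextBuilder.
final class PerClassContextLookup
{
    /** @param array<string, array<string, mixed>> $contextPerClass class name => serializer context */
    public function __construct(private array $contextPerClass = [])
    {
    }

    /** Returns the context configured for the class, or [] when none is configured. */
    public function buildContext(string $class): array
    {
        return $this->contextPerClass[$class] ?? [];
    }
}

// Example DTO used only by this sketch.
final class Beer
{
    public string $title = '';
}

$lookup = new PerClassContextLookup([
    Beer::class => ['attributes' => ['title']],
]);

// Context found for a configured class.
$beerContext = $lookup->buildContext(Beer::class);

// Classes without an entry get an empty context.
$otherContext = $lookup->buildContext(\stdClass::class);
```

With a context such as `['attributes' => ['title']]`, the Symfony Serializer only normalizes the listed attributes, which is one way to keep unwanted properties out of the indexed document.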

Factory::CONFIG_BULK_SIZE (optional)

When indexing a large number of documents, this setting lets you fine-tune how many documents are sent in each bulk request.

Defaults to 100.
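The effect of a bulk size can be pictured with a small, self-contained sketch. It assumes that queued documents are sent as soon as the queue reaches the threshold and that a final flush sends the remainder. The `ThresholdQueue` class is hypothetical and written for this illustration; it is not Elastically's Indexer and sends nothing to Elasticsearch.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration of a bulk threshold, NOT Elastically's Indexer.
final class ThresholdQueue
{
    /** @var list<string> Document ids waiting to be sent */
    private array $pending = [];

    /** @var list<list<string>> Each entry represents one bulk request */
    public array $sentBulks = [];

    public function __construct(private int $bulkSize = 100)
    {
    }

    /** Queues a document and sends a bulk once the threshold is reached. */
    public function schedule(string $documentId): void
    {
        $this->pending[] = $documentId;

        if (\count($this->pending) >= $this->bulkSize) {
            $this->flush();
        }
    }

    /** Sends whatever is still queued, if anything. */
    public function flush(): void
    {
        if ($this->pending === []) {
            return;
        }

        $this->sentBulks[] = $this->pending;
        $this->pending = [];
    }
}

$queue = new ThresholdQueue(3);

foreach (['1', '2', '3', '4', '5', '6', '7'] as $id) {
    $queue->schedule($id);
}

// Two full bulks were sent automatically, one document is still queued.
$bulksBeforeFlush = \count($queue->sentBulks);

// The explicit flush sends the remaining document.
$queue->flush();
$bulksAfterFlush = \count($queue->sentBulks);
```

In this sketch, a smaller bulk size means more, lighter requests, and a larger one means fewer, heavier requests. The explicit `flush()` at the end mirrors the `$indexer->flush()` call in the demo.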

Factory::CONFIG_INDEX_PREFIX (optional)

Add a prefix to all indexes and aliases created via Elastically.

Defaults to null.

Usage in Symfony

Configuration

You'll need to add the bundle in bundles.php:

// config/bundles.php
return [
    // ...
    JoliCode\Elastically\Bridge\Symfony\ElasticallyBundle::class => ['all' => true],
];

Then configure the bundle:

# config/packages/elastically.yaml
elastically:
    connections:
        # You can create multiple clients
        default:
            # Any Elastica option works here
            client:
                hosts:
                    - '127.0.0.1:9200'

            # Path to the mapping directory (in YAML)
            mapping_directory:       '%kernel.project_dir%/config/elasticsearch'

            # Size of the bulk sent to Elasticsearch (default to 100)
            bulk_size:               100

            # Mapping between an index name and a FQCN
            index_class_mapping:
                my-foobar-index:     App\Dto\Foobar

            # Configuration for the serializer
            serializer:
                # Fill a static context
                context_mapping:
                    foo:                 bar

            # If you want to add a prefix for your index in elasticsearch (you can still call it by its base name everywhere!)
            # prefix: '%kernel.environment%'

            # Use HttpClient component
            transport_config:
                http_client: 'Psr\Http\Client\ClientInterface'

Finally, inject one of these (autowirable) services in your code where you need it:

JoliCode\Elastically\Client (elastically.default.client)
JoliCode\Elastically\IndexBuilder (elastically.default.index_builder)
JoliCode\Elastically\Indexer (elastically.default.indexer)

Advanced Configuration

Multiple Connections and Autowiring

If you define multiple connections, you can define a default one. This will be useful for autowiring:

elastically:
    default_connection: default
    connections:
        default: # ...
        another: # ...

To use the classes of another connection, you can use autowirable types. To discover them, run:

bin/console debug:autowiring elastically

Use a Custom Serializer Context Builder

elastically:
    default_connection: default
    connections:
        default:
            serializer:
                context_builder_service: App\Elastically\Serializer\ContextBuilder
                # Do not define "context_mapping" option anymore

Use a Custom Mapping provider

elastically:
    default_connection: default
    connections:
        default:
            mapping_provider_service: App\Elastically\MappingProvider
            # Do not define "index_class_mapping" option anymore

Using HttpClient as Transport

You can also use the Symfony HttpClient for all Elastica communications:

JoliCode\Elastically\Transport\HttpClientTransport: ~

JoliCode\Elastically\Client:
    arguments:
        $config:
            hosts:
                - '127.0.0.1:9200'
            transport_config:
                http_client: 'Psr\Http\Client\ClientInterface'

See the official documentation on how to get a PSR-18 client.

Reference

You can run the following command to get the default configuration reference:

bin/console config:dump elastically

Using Messenger for async indexing

Elastically ships with a default Message and Handler for Symfony Messenger.

Register the message in your configuration:

framework:
    messenger:
        transports:
            async: "%env(MESSENGER_TRANSPORT_DSN)%"

        routing:
            # async is whatever name you gave your transport above
            'JoliCode\Elastically\Messenger\IndexationRequest':  async

services:
    JoliCode\Elastically\Messenger\IndexationRequestHandler: ~

The IndexationRequestHandler service depends on an implementation of JoliCode\Elastically\Messenger\DocumentExchangerInterface, which isn't provided by this library. You must provide a service that implements this interface, so you can plug your database or any other source of truth.

Then from your code you have to call:

use JoliCode\Elastically\Messenger\IndexationRequest;
use JoliCode\Elastically\Messenger\IndexationRequestHandler;

$bus->dispatch(new IndexationRequest(Product::class, '1234567890'));

// Third argument is the operation, so for a "delete" add this argument:
// new IndexationRequest(Product::class, 'ref9999', IndexationRequestHandler::OP_DELETE);

And then consume the messages:

php bin/console messenger:consume async

Grouping IndexationRequest in a spool

Sending multiple IndexationRequest messages during the same Symfony Request is not always appropriate, because it triggers multiple Bulk operations. Elastically provides a Kernel listener that groups all the IndexationRequest messages into a single MultipleIndexationRequest message.

To use this mechanism, route IndexationRequest to an in-memory transport, where the messages are collected and grouped before being sent to a truly asynchronous transport:

messenger:
    transports:
        async: "%env(MESSENGER_TRANSPORT_DSN)%"
        queuing: 'in-memory:///'

    routing:
        'JoliCode\Elastically\Messenger\MultipleIndexationRequest': async
        'JoliCode\Elastically\Messenger\IndexationRequest': queuing

You also need to register the subscriber:

services:
    JoliCode\Elastically\Messenger\IndexationRequestSpoolSubscriber:
        arguments:
            - '@messenger.transport.queuing' # should be the name of the memory transport
            - '@messenger.default_bus'
        tags:
            - { name: kernel.event_subscriber }
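
The grouping described in this section can be pictured with the following self-contained sketch: requests are queued in memory while the HTTP request is handled, then drained into one grouped message at the end. The `SketchIndexationRequest`, `SketchGroupedRequest` and `SketchSpool` classes are hypothetical stand-ins written for this illustration. They are not the library's message or subscriber classes and do not use Symfony Messenger.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration only, NOT the classes shipped by Elastically.
final class SketchIndexationRequest
{
    public function __construct(
        public string $className,
        public string $id,
    ) {
    }
}

final class SketchGroupedRequest
{
    /** @param list<SketchIndexationRequest> $operations */
    public function __construct(public array $operations)
    {
    }
}

final class SketchSpool
{
    /** @var list<SketchIndexationRequest> */
    private array $queued = [];

    /** Called each time the application asks for an indexation. */
    public function queue(SketchIndexationRequest $request): void
    {
        $this->queued[] = $request;
    }

    /** Called once, when the request ends: one grouped message, or null if nothing was queued. */
    public function drain(): ?SketchGroupedRequest
    {
        if ($this->queued === []) {
            return null;
        }

        $grouped = new SketchGroupedRequest($this->queued);
        $this->queued = [];

        return $grouped;
    }
}

$spool = new SketchSpool();
$spool->queue(new SketchIndexationRequest('App\\Dto\\Product', '1'));
$spool->queue(new SketchIndexationRequest('App\\Dto\\Product', '2'));
$spool->queue(new SketchIndexationRequest('App\\Dto\\Product', '3'));

// Three queued requests become a single grouped message.
$grouped = $spool->drain();

// Nothing is left to send afterwards.
$second = $spool->drain();
```

The point of the design is that the asynchronous transport receives one message per Symfony Request instead of one per document, so the handler can process them in a single Bulk operation.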

Using Jane to build PHP DTO and fast Normalizers

Install the JanePHP json-schema tools to build your own DTOs and Normalizers. All you have to do is set the Jane-completed Serializer on the Factory:

$factory = new Factory([
    Factory::CONFIG_SERIALIZER => $serializer,
]);

Caution

Elastically is not compatible with Jane < 6.

To be done

  • some "todo" in the code
  • optional Doctrine connector
  • extra commands to monitor, update mapping, reindex... Commonly implemented tasks
  • optional Symfony integration:
    • web debug toolbar!
  • scripts / commands for common tasks:
    • auto-reindex when the mapping changes, handle the aliases and everything
    • micro monitoring for cluster / indexes
    • health-check method

Sponsors

JoliCode

Open Source time sponsored by JoliCode.