Replace Solr helm chart with Solr Operator and SolrCloud deployments #71

@jeanetteclark

It's looking increasingly likely that the best option for replacing the Bitnami Solr chart is going to be the Apache Solr Operator.

Tasks:

  • Deploy operator on dev (Matt or Matthew)
  • Update application-context.yaml in k8s-cluster/authorization to add whatever rules we need for the SolrCloud CRD
  • Create a SolrCloud
  • Determine a strategy to migrate data from existing collections run with the bitnami chart to the SolrCloud
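For the "Create a SolrCloud" task, a starting manifest might look like the sketch below. It assumes the operator's `solr.apache.org/v1beta1` API; the function name and its three arguments (resource name, replica count, storage size) are placeholders for illustration, not values from our cluster config. With no `zookeeperRef` set, my understanding is that the operator tries to provision ZooKeeper itself, which would need the Zookeeper Operator installed as well.

```shell
# Print a minimal SolrCloud manifest to stdout.
# Usage: solrcloud_manifest NAME REPLICAS STORAGE_SIZE
# All three arguments are caller-chosen placeholders (assumptions).
solrcloud_manifest() {
  cat <<EOF
apiVersion: solr.apache.org/v1beta1
kind: SolrCloud
metadata:
  name: $1
spec:
  replicas: $2
  dataStorage:
    persistent:
      pvcTemplate:
        spec:
          resources:
            requests:
              storage: $3
EOF
}
```

Something like `solrcloud_manifest <name> 3 10Gi | kubectl apply -f -` would then submit it once the operator and its CRDs are installed on dev.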

The migration step is the one I'm least sure about; I can't find much documentation on it. For some applications you could regenerate the entire collection by reindexing, but I don't think that is possible with metadig, since most of the information in Solr is calculated and indexed at runtime and stored nowhere else. Since I couldn't find anything, I asked Gemini what it would do, and this is what it said:

  1. Find the Persistent Volume Claim (PVC)

First, you need to locate the Persistent Volume Claim (PVC) that stores your data from the Bitnami Solr Helm chart. The Bitnami chart typically stores data in the /bitnami/solr/data/ directory.
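One way to find the claim is to ask Kubernetes which PVCs the running Solr pod mounts. This is a small sketch; `pod_pvcs` is a hypothetical helper name, and the pod name and namespace are whatever the Bitnami release uses on our cluster.

```shell
# Print the names of the PVCs mounted by a pod.
# Usage: pod_pvcs POD [NAMESPACE]   (namespace defaults to "default")
pod_pvcs() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'
}
```

For example, `pod_pvcs <solr-pod-name> <namespace>` should print the claim name(s), which can then be inspected with `kubectl describe pvc`.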

  2. Back up the Solr Core

Once you've identified the data's location, you'll need to back up your Solr core using the Solr Replication Handler's Backup command.

This creates a snapshot of your index files. You can execute this command on the Solr pod using kubectl exec:

kubectl exec -it <solr-pod-name> -- curl "http://localhost:8983/solr/<core_name>/replication?command=backup&name=<backup_name>"

The backup command is asynchronous and saves the snapshot to the data directory of your core, which is on your PVC.

You can check the status of the backup by inspecting the pod's logs or by calling the replication handler's details command (command=details).
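Since the backup is asynchronous, it may help to wrap the replication calls so that starting a backup and checking on it use the same helper. This is a sketch only: `replication_url` and `solr_replication` are hypothetical names, and it assumes `curl` is available in the Solr container, as the command above already does.

```shell
# Build a replication-handler URL for a core on the pod-local Solr port.
# Usage: replication_url CORE COMMAND [BACKUP_NAME]
replication_url() {
  url="http://localhost:8983/solr/$1/replication?command=$2"
  if [ -n "${3:-}" ]; then
    url="$url&name=$3"
  fi
  printf '%s\n' "$url"
}

# Run the request inside the Solr pod with kubectl exec.
# Usage: solr_replication POD CORE COMMAND [BACKUP_NAME]
solr_replication() {
  kubectl exec "$1" -- curl -s "$(replication_url "$2" "$3" "${4:-}")"
}
```

`solr_replication <solr-pod-name> <core_name> backup <backup_name>` would start the snapshot, and `solr_replication <solr-pod-name> <core_name> details` should report on it in the backup section of the response.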

  3. Copy the Backup to a Shared Location

The next step is to get the backup files from your Bitnami Solr's PVC to a location accessible by your new SolrCloud cluster. This can be a new, shared PVC, an S3 bucket, or another cloud storage solution.

You'll need to mount the existing Bitnami PVC to a temporary pod to copy the files out.

Once mounted, you can run kubectl cp from your workstation against the temporary pod, or run rsync from inside it, to move the files to your shared location.
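The temporary pod could be as simple as the sketch below. The helper name, the `busybox` image, and the `/data` mount path are assumptions for illustration. If the PVC is ReadWriteOnce, the pod will likely need to land on the same node as the Solr pod, or Solr will need to be scaled down first.

```shell
# Print a manifest for a throwaway pod that mounts an existing PVC read-only.
# Usage: pvc_reader_manifest POD_NAME PVC_NAME
pvc_reader_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  restartPolicy: Never
  containers:
    - name: reader
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: solr-data
          mountPath: /data
          readOnly: true
  volumes:
    - name: solr-data
      persistentVolumeClaim:
        claimName: $2
        readOnly: true
EOF
}
```

After `pvc_reader_manifest pvc-reader <pvc-name> | kubectl apply -f -`, the snapshot directory can be pulled out with `kubectl cp pvc-reader:/data/<path-to-snapshot> ./<backup_name>` and the pod deleted afterwards.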

  4. Create a New SolrCloud Collection

Now, deploy your new SolrCloud cluster using the Apache Solr Operator. Once it's up and running, you'll need to create a new collection that will receive the data. This collection must have the same schema and configuration as your original Solr core.

# Example using the Solr Collections API (CREATE action)
kubectl exec <solrcloud-pod-name> -- curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=my-new-collection&numShards=1&replicationFactor=2&collection.configName=_default"

Important: SolrCloud collections use configsets, which are stored in ZooKeeper.

You will need to upload your solrconfig.xml and schema file (schema.xml or managed-schema) from the old Solr core to ZooKeeper as a configset so the new collection can use them.
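One way to get the old core's config into ZooKeeper without talking to ZooKeeper directly is the Configsets API UPLOAD action, which takes a zip of the conf directory. The sketch below assumes Solr is reachable on localhost:8983 (for example through `kubectl port-forward`) and that `zip` and `curl` are installed locally; both helper names are hypothetical.

```shell
# Build the Configsets API upload URL for a configset name.
# Usage: configset_upload_url NAME
configset_upload_url() {
  printf 'http://localhost:8983/solr/admin/configs?action=UPLOAD&name=%s\n' "$1"
}

# Zip the contents of a conf directory (files at the zip root) and upload it.
# Usage: upload_configset CONF_DIR NAME
upload_configset() {
  zipfile="${TMPDIR:-/tmp}/configset-$$.zip"
  (cd "$1" && zip -r -q "$zipfile" .) || return 1
  curl -s -X POST -H 'Content-Type: application/octet-stream' \
    --data-binary "@$zipfile" "$(configset_upload_url "$2")"
  status=$?
  rm -f "$zipfile"
  return "$status"
}
```

After `upload_configset <path-to-old-core>/conf <configset_name>`, the collection would be created with `collection.configName=<configset_name>` rather than `_default`.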

  5. Restore the Data to the New Collection

Finally, use the Solr Collections API Restore command to restore the backup you created earlier.

This command will re-create the index from the snapshot files in your new collection.

# Example using the Solr Collections API RESTORE command
kubectl exec -it <solrcloud-pod-name> -- curl "http://localhost:8983/solr/admin/collections?action=RESTORE&collection=<new_collection_name>&name=<backup_name>&location=<shared_location>"

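The location parameter must point to the path where you copied the backup files in the previous step, and that path has to be readable from every Solr node. RESTORE expects the layout written by the Collections API BACKUP action, so a snapshot taken with the replication handler may not be accepted as-is.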

This process ensures data integrity and properly migrates your index from a single-core instance to a distributed SolrCloud collection.
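A cheap sanity check after the restore would be to compare document counts between the old core and the new collection. The helpers below are a sketch with hypothetical names; `num_found` just scrapes the `numFound` value out of a JSON select response so it works without `jq`.

```shell
# Extract numFound from a Solr JSON select response read on stdin.
num_found() {
  sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only when both counts are non-empty and equal.
# Usage: counts_match OLD_COUNT NEW_COUNT
counts_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}
```

For example, piping `kubectl exec <solr-pod-name> -- curl -s "http://localhost:8983/solr/<core_name>/select?q=*:*&rows=0"` into `num_found` gives the old count, and the same query against `<new_collection_name>` on a SolrCloud pod gives the new one. Matching counts do not prove the documents are identical, but a mismatch would flag a problem early.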
