Description
It's looking increasingly likely that the best option for replacing the Bitnami Solr chart is going to be the Apache Solr operator.
Tasks:
- Deploy operator on dev (Matt or Matthew)
- Update `application-context.yaml` in `k8s-cluster/authorization` to add whatever rules we need for the SolrCloud CRD
- Create a SolrCloud (sketch below)
- Determine a strategy to migrate data from existing collections run with the bitnami chart to the SolrCloud
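For the "Create a SolrCloud" step, the operator provides a `SolrCloud` custom resource. A minimal sketch might look like the following (the resource name, replica counts, and image tag are placeholders, not our actual values):

```bash
# Assumes the Solr operator and its CRDs are already installed on the cluster.
# The name, replica counts, and image tag below are illustrative placeholders.
kubectl apply -f - <<'EOF'
apiVersion: solr.apache.org/v1beta1
kind: SolrCloud
metadata:
  name: metadig-solr
spec:
  replicas: 3
  solrImage:
    tag: "9.6"
  zookeeperRef:
    provided:
      replicas: 3
EOF
```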
I'm not sure exactly how that last data-migration step will go... I can't find much documentation on migration. For some applications you might be able to regenerate the entire collection by doing a reindex, but I don't think that is possible with metadig (most of the information in Solr is calculated and indexed at runtime, and stored nowhere else). Since I couldn't find anything, I asked Gemini what it would do, and this is what it said:
- Find the Persistent Volume Claim (PVC)
First, you need to locate the Persistent Volume Claim (PVC) that stores your data from the Bitnami Solr Helm chart. The Bitnami chart typically stores data in the `/bitnami/solr/data/` directory.
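A quick way to find it might be something like this (the label selector is an assumption based on the Bitnami chart's usual labels; adjust for the actual release):

```bash
# List PVCs created by the Bitnami Solr chart; the label is an assumption.
kubectl get pvc -n <namespace> -l app.kubernetes.io/name=solr

# Confirm the data path inside a running Solr pod.
kubectl exec -it <solr-pod-name> -n <namespace> -- ls /bitnami/solr/data/
```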
- Back up the Solr Core
Once you've identified the data's location, you'll need to back up your Solr core using the Solr Replication Handler's Backup command.
This creates a snapshot of your index files. You can execute this command on the Solr pod using kubectl exec:
```bash
kubectl exec -it <solr-pod-name> -- curl "http://localhost:8983/solr/<core_name>/replication?command=backup&name=<backup_name>"
```

The backup command is asynchronous and saves the snapshot to the data directory of your core, which is on your PVC.
You can check the status of the backup by inspecting the pod's logs or by using the `details` command.
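For example, the replication handler's `details` command reports the status of the most recent backup:

```bash
# Poll the replication handler; the response includes a "backup" section
# with the status of the most recent snapshot.
kubectl exec -it <solr-pod-name> -- curl "http://localhost:8983/solr/<core_name>/replication?command=details"
```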
- Copy the Backup to a Shared Location
The next step is to get the backup files from your Bitnami Solr's PVC to a location accessible by your new SolrCloud cluster. This can be a new, shared PVC, an S3 bucket, or another cloud storage solution.
You'll need to mount the existing Bitnami PVC to a temporary pod to copy the files out.
Once mounted, you can use `kubectl cp` or a simple `rsync` command from within the temporary pod to move the files to your shared location.
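A sketch of that temporary-pod approach, assuming Solr wrote the backup as a `snapshot.<backup_name>` directory inside the data directory (the pod spec, PVC name, and paths are placeholders):

```bash
# Launch a throwaway pod with the Bitnami Solr PVC mounted read-only.
# The PVC name and snapshot path are placeholders for our actual values.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: solr-data-copy
spec:
  containers:
    - name: shell
      image: alpine:3.20
      command: ["sleep", "86400"]
      volumeMounts:
        - name: solr-data
          mountPath: /data
          readOnly: true
  volumes:
    - name: solr-data
      persistentVolumeClaim:
        claimName: <bitnami-solr-pvc-name>
EOF

# Copy the snapshot directory out locally (or on to shared storage).
kubectl cp solr-data-copy:/data/snapshot.<backup_name> ./snapshot.<backup_name>
```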
- Create a New SolrCloud Collection
Now, deploy your new SolrCloud cluster using the non-Bitnami operator. Once it's up and running, you'll need to create a new collection that will receive the data. This collection must have the same schema and configuration as your original Solr core.
```bash
# Example using the Solr Collections API (v1)
kubectl exec -it <solrcloud-pod-name> -- curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=my-new-collection&numShards=1&replicationFactor=2&collection.configName=_default"
```

Important: SolrCloud collections use configsets, which are stored in ZooKeeper.
You will need to upload your `solrconfig.xml` and `schema.xml` files from the old Solr core to ZooKeeper so the new collection can use them.
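One way to do that is with the `solr zk upconfig` tool shipped in the Solr image (the configset name and paths here are placeholders):

```bash
# Copy the old core's conf/ directory into the SolrCloud pod, then upload it
# to ZooKeeper as a named configset. Names and paths are placeholders.
kubectl cp ./conf <solrcloud-pod-name>:/tmp/metadig-conf
kubectl exec -it <solrcloud-pod-name> -- \
  solr zk upconfig -n metadig-conf -d /tmp/metadig-conf -z <zk-host>:2181
```

The new collection can then reference the uploaded configset via `collection.configName` instead of `_default`.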
- Restore the Data to the New Collection
Finally, use the Solr Collections API Restore command to restore the backup you created earlier.
This command will re-create the index from the snapshot files in your new collection.
```bash
# Example using the Solr Collections API RESTORE command
kubectl exec -it <solrcloud-pod-name> -- curl "http://localhost:8983/solr/admin/collections?action=RESTORE&collection=<new_collection_name>&name=<backup_name>&location=<shared_location>"
```

The `location` parameter must point to the path where you copied the backup files in the previous step.
This process ensures data integrity and properly migrates your index from a single-core instance to a distributed SolrCloud collection.
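If we go this route, a reasonable sanity check afterwards would be comparing document counts between the old core and the new collection (names are placeholders):

```bash
# Document count in the old Bitnami core.
kubectl exec -it <solr-pod-name> -- curl "http://localhost:8983/solr/<core_name>/select?q=*:*&rows=0"

# Document count in the new SolrCloud collection; numFound should match.
kubectl exec -it <solrcloud-pod-name> -- curl "http://localhost:8983/solr/<new_collection_name>/select?q=*:*&rows=0"
```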