Dataflow job that reads from Datastore, converts entities to JSON, and writes the newline-separated JSON to a GCS folder.
Installing the JDK and Gradle is not covered here; follow the installation instructions for your specific platform.
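Before building, it can help to confirm that both tools are on your PATH. The following is a minimal sketch, not part of this project; the function name `check_tools` is made up for illustration.

```shell
# Report whether each named tool is available on PATH.
# Returns 0 if every tool is found, 1 if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check the two tools this project expects.
check_tools java gradle || echo "Install the missing tools before building."
```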
Take a look at the script. The following flags need to be set:
- project, your gcp project id
- stagingLocation, gcs path to a place dataflow can use to store staging files
- tempLocation, gcs path to a scratch disk for the dataflow job
- backupGCSPrefix, where your backups should be placed
- datastoreEntityKind, the Datastore Entity Kind
If you'd like the java program to block until the dataflow job is complete, add
the --isBlocking flag.
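The flags above can be assembled as in the sketch below. It only prints the command so you can review it before running it. The helper name `backup_command` and the bucket layout (`staging/`, `temp/`, `datastore/` under one bucket) are assumptions for illustration; adjust them to your own setup.

```shell
# Print a gcsbackup command built from the flags described above.
#   $1 = gcp project id
#   $2 = bucket, e.g. gs://superman-backups (assumed layout: one bucket for everything)
#   $3 = Datastore entity kind
#   $4 = "block" to add --isBlocking (optional)
backup_command() {
  cmd="java -jar build/libs/*.jar gcsbackup"
  cmd="$cmd --project=$1"
  cmd="$cmd --stagingLocation=$2/staging/"
  cmd="$cmd --tempLocation=$2/temp/"
  cmd="$cmd --backupGCSPrefix=$2/datastore/"
  cmd="$cmd --datastoreEntityKind=$3"
  if [ "${4:-}" = "block" ]; then
    cmd="$cmd --isBlocking"
  fi
  # Quoted so the shell does not expand the *.jar glob while printing.
  printf '%s\n' "$cmd"
}

backup_command superman gs://superman-backups ComicBooks block
```

Printing first rather than executing makes it easy to spot a wrong bucket or entity kind before a job is launched.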
Once the job is done, you'll find newline-separated JSON sharded across a few files located under the specified backupGCSPrefix.
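Because each line holds one entity, a quick sanity check on a backup is to count lines across all shards. The sketch below assumes you have already copied the shard files into a local directory (for example with `gsutil cp -r`); the function name `count_entities` is hypothetical.

```shell
# Count entities in locally downloaded backup shards.
# Each non-empty line is one JSON-encoded entity, so the total is the
# number of non-empty lines across every file in the directory.
#   $1 = local directory containing the shard files
count_entities() {
  cat "$1"/* | grep -c .
}
```

Calling `count_entities ./my-backup` prints a single number that you can compare against the entity count you expect for that kind.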
For example, say you had the following command:
java -jar build/libs/*.jar gcsbackup \
--stagingLocation=gs://superman-backups/staging/ \
--tempLocation=gs://superman-backups/temp/ \
--backupGCSPrefix=gs://superman-backups/datastore/ \
--datastoreEntityKind=ComicBooks
And you ran this command on 11/29/2016 at 11:10:55 am, your backups would be in the
And that directory would have files with a naming scheme of something like:
Take a look at the script. The following flags need to be set:
- project, your gcp project id
- stagingLocation, gcs path to a place dataflow can use to store staging files
- tempLocation, gcs path to a scratch disk for the dataflow job
- backupGCSPrefix, where your backups are stored
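Since three of these flags expect GCS paths, a small check before launching a restore can catch a local path passed by mistake. This is a sketch under that assumption only; `is_gcs_path` is a hypothetical helper, not something the job provides.

```shell
# Succeed only if the argument looks like a GCS path: gs:// followed by
# at least one more character.
is_gcs_path() {
  case "$1" in
    gs://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example values taken from the commands in this document.
for path in "gs://superman-backups/staging/" "gs://superman-backups/temp/"; do
  if is_gcs_path "$path"; then
    echo "ok: $path"
  else
    echo "not a GCS path: $path" >&2
  fi
done
```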
If you'd like the java program to block until the dataflow job is complete, add
the --isBlocking flag.
For example, say you wanted to restore a backup run on 11/29/2016 at 11:10:55 am for the entity kind
of ComicBooks, and you specified a backupGCSPrefix of
Your restore command would look like:
java -jar build/libs/*.jar gcsrestore \
--project=superman \
--stagingLocation=gs://superman-backups/staging/ \
--tempLocation=gs://superman-backups/temp/ \