- Command Line Options
- Version History
- Protocol
- Metrics
- Serialisation
- Testing
- Build
- Clients
- Acknowledgements
A TCP service for routing all requests to MongoDB via a centralised service. A few common features made available by this service are:
- All documents are versioned in a version history collection.
- All operations are timed and the metrics stored in a metrics collection.
Note: A C++17 compiler with coroutines-ts, or a C++20 compiler with coroutines support is required to build this project.
The service can be configured via the following command line options:
- `port` - Specify the port the service is to bind to via the `-p` or `--port` option. Default is `2020`.
- `threads` - The number of Boost ASIO IO Context threads to use via the `-n` or `--threads` option. Default is the value returned by `std::thread::hardware_concurrency`.
- `mongoUri` - The full MongoDB connection URI to use. This should include the user credentials as well. Specify via the `-m` or `--mongo-uri` option. Mandatory option to start the service.
- `versionHistoryDatabase` - The database to use to store version history documents. Specify via the `-d` or `--version-history-database` option. Default `versionHistory`.
- `versionHistoryCollection` - The collection to store version history documents in. Specify via the `-c` or `--version-history-collection` option. Default `entities`.
- `metricsDatabase` - The database in which to store request processing metrics. Specify via the `-s` or `--metric-database` option. Default `versionHistory`.
- `metricsCollection` - The collection in which to store the metric documents. Specify via the `-t` or `--metric-collection` option. Default `metrics`.
- `metricBatchSize` - The number of metrics to accumulate before saving to the desired store. Specify via the `-w` or `--metric-batch-size` option. Default `100`.
- `ilpServer` - The host name for the time series database that supports the ILP. Specify via the `-i` or `--ilp-server` option.
- `ilpPort` - The port for the time series database that supports the ILP. Specify via the `-x` or `--ilp-port` option.
- `ilpMeasurement` - The series/measurement name for metrics. Specify via the `-y` or `--ilp-series-name` option. Default `metrics`.
- `logLevel` - The logging level for the service. Specify via the `-l` or `--log-level` option. Default `info`. Allowed values: `debug`, `info`, `warn`, `critical`.
- `logAsync` - Use the asynchronous logger for the service (non-guaranteed and may lose some logs). Specify via the `-z` or `--log-async` option. Default `true`. Allowed values: `true`, `false`.
- `console` - Whether log messages are to be echoed to the `console` as well. Specify `true` via the `-c` or `--console` option. Default `false`.
- `dir` - Specify the output directory under which log files are to be stored via the `-o` or `--dir` option. Note that a trailing slash `/` is mandatory. Default `logs/`.
All documents stored in the database will automatically be versioned on save. Deleting a document will move the current document into the version history database collection. This makes it possible to retrieve previous versions of a document as needed, as well as restore a document (regardless of whether it has been deleted or not).
All interactions are via BSON documents sent to the service. Each request must conform to the following document model:
- `action (string)` - The type of database action being performed. One of `create|retrieve|update|delete|count|distinct|index|dropCollection|dropIndex|bulk|pipeline|transaction|createTimeseries|createCollection|renameCollection`.
- `database (string)` - The Mongo database the action is to be performed against.
  - Not needed for the `transaction` action.
- `collection (string)` - The Mongo collection the action is to be performed against.
  - Not needed for the `transaction` action.
- `document (document)` - The document payload to associate with the database operation. For `create` and `update` this is assumed to be the document that is being saved. For `retrieve` or `count` this is the query to execute. For `delete` this is a simple `document` with an `_id` field.
- `options (document)` - The options to associate with the Mongo request. These correspond to the appropriate options as used by the Mongo driver.
- `metadata (document)` - Optional metadata to attach to the version history document that is created (not relevant for `retrieve`). This typically includes information about the user performing the action, along with any other information relevant to the system that uses this service.
- `application` - Optional name of the application accessing the service. Helps to retrieve database metrics for a specific application.
- `correlationId (string)` - Optional correlation id to associate with the metric record created by this action. This is useful for tracing logs originating from a single operation/request within the overall system. This value is stored as a string value in the metric document to allow for sensible data types to be used as the correlation id.
- `skipVersion (bool)` - Optional `bool` value to indicate not to create a version history document for this `action`. Useful when creating non-critical data such as logs.
- `skipMetric (bool)` - Optional `bool` value to indicate not to create a metric document for this `action`. Useful when calls are made as part of a monitoring framework and the volume of metrics generated would overwhelm storage.
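The request envelope described above can be sketched as a small client-side helper. This is only an illustration: `build_request` is a hypothetical name, the result is a plain Python dictionary, and it would still need to be encoded as BSON (with a BSON library of your choice) before being sent to the service.

```python
import json

# Action names copied from the request document model.
ACTIONS = {
    "create", "retrieve", "update", "delete", "count", "distinct", "index",
    "dropCollection", "dropIndex", "bulk", "pipeline", "transaction",
    "createTimeseries", "createCollection", "renameCollection",
}

# Optional top-level fields from the request document model.
OPTIONAL_FIELDS = {
    "options", "metadata", "application", "correlationId",
    "skipVersion", "skipMetric",
}


def build_request(action, database=None, collection=None, document=None, **optional):
    """Assemble a request envelope (hypothetical client-side helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # database and collection may be omitted only for transactions.
    if action != "transaction" and (database is None or collection is None):
        raise ValueError("database and collection are required")
    request = {"action": action}
    if database is not None:
        request["database"] = database
    if collection is not None:
        request["collection"] = collection
    request["document"] = {} if document is None else document
    request.update(optional)
    return request


if __name__ == "__main__":
    print(json.dumps(build_request("count", "itest", "test", application="demo")))
```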
The document payload contains all the information necessary to execute the specified `action` in the request.
The `document` as specified will be inserted into the specified `database` and `collection`. The BSON ObjectId property/field (`_id`) must be included in the document. The `_id` is needed within the version history document associated with the create action. If using unacknowledged writes, the auto-generated `_id` is not available, and hence we would need to create a new insert document with the `_id` set within the service. This involves a wasteful copy of data, and hence we enforce the requirement on a client-specified `_id` value.
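Since the client must supply the `_id`, a create payload could be assembled roughly as follows. This is a sketch using only the standard library: `new_object_id_hex` and `create_request` are hypothetical helpers, and the id follows the general ObjectId layout (4-byte timestamp, 5 random bytes, 3-byte counter). In practice the ObjectId type from a BSON library would normally be used instead.

```python
import itertools
import os
import time

# Per-process random value and counter, mirroring the ObjectId layout.
_PROCESS_RANDOM = os.urandom(5)
_COUNTER = itertools.count(int.from_bytes(os.urandom(3), "big"))


def new_object_id_hex():
    """Return a 24 character hex string shaped like a BSON ObjectId."""
    timestamp = int(time.time()).to_bytes(4, "big")
    counter = (next(_COUNTER) % 0x1000000).to_bytes(3, "big")
    return (timestamp + _PROCESS_RANDOM + counter).hex()


def create_request(database, collection, fields):
    """Build a create request, adding an _id when the caller did not set one."""
    document = dict(fields)
    document.setdefault("_id", {"$oid": new_object_id_hex()})
    return {
        "action": "create",
        "database": database,
        "collection": collection,
        "document": document,
    }


if __name__ == "__main__":
    print(create_request("itest", "test", {"stringValue": "abc123"}))
```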
Sample request payload (see create.hpp):
{
"action": "create",
"database": "<database name>",
"collection": "<collection name>",
"document": {
"_id": {
"$oid": "5f35e5e1e799c52186039122"
},
"intValue": 123,
"floatValue": 123.0,
"boolValue": true,
"stringValue": "abc123",
"nested": {"key": "value"}
}
}
Sample response payload when version history document is created (default option) (see `Create` struct in create.hpp):
{
"_id": {
"$oid": "5f35e5e19e48c37186539141"
},
"database": "versionHistory",
"collection": "entities",
"entity": {
"$oid": "5f35e5e1e799c52186039122"
}
}
Note: The `_id` in the response is the object id of the version history document that was created.
Sample response payload when version history document is not created:
{
"_id": {
"$oid": "5f35e5e19e48c37186539141"
},
"skipVersion": true
}
Note: The `_id` in the response is the object id for the document as specified in the input payload.
The following options are supported for the `create` action (see `Create` struct in insert.hpp):
- `bypassValidation` - boolean. Whether or not to bypass document validation.
- `ordered` - boolean. Whether or not the `insert_many` will be ordered.
- `writeConcern` - document. The write concern for the operation.
  - `journal` - boolean. If `true`, confirms that the database has written the data to the on-disk journal before reporting that a write operation was successful. This ensures that data is not lost if the database shuts down unexpectedly.
  - `nodes` - integer. Sets the number of nodes that are required to acknowledge the write before the operation is considered successful. Write operations will block until they have been replicated to the specified number of servers in a replica set.
  - `acknowledgeLevel` - integer. Sets the acknowledgement level for the write operation.
    - `0` - Represents the implicit default write concern.
    - `1` - Represents write concern with `w: "majority"`.
    - `2` - Represents write concern with `w: <custom write concern name>`.
    - `3` - Represents write concern for un-acknowledged writes.
    - `4` - Represents write concern for acknowledged writes.
  - `majority` - integer. The amount of time (milliseconds) to wait before the write operation times out if it cannot reach the majority of nodes in the replica set. If the value is zero, then no timeout is set.
  - `tag` - string. Sets the name representing the server-side `getLastErrorMode` entry containing the list of nodes that must acknowledge a write operation before it is considered a success.
  - `timeout` - integer. Sets an upper bound on the time (milliseconds) a write concern can take to be satisfied. If the write concern cannot be satisfied within the timeout, the operation is considered a failure.
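As an illustration of how the `writeConcern` sub-document listed above might be assembled on the client, the sketch below builds an `options` document for a create request. The helper name `write_concern` and its keyword arguments are hypothetical; only the resulting field names come from the option list.

```python
def write_concern(journal=None, nodes=None, acknowledge_level=None,
                  majority=None, tag=None, timeout=None):
    """Build a writeConcern sub-document, omitting any option left unset."""
    # The option list documents acknowledgement levels 0 through 4.
    if acknowledge_level is not None and acknowledge_level not in range(5):
        raise ValueError("acknowledgeLevel must be between 0 and 4")
    candidates = {
        "journal": journal,
        "nodes": nodes,
        "acknowledgeLevel": acknowledge_level,
        "majority": majority,
        "tag": tag,
        "timeout": timeout,
    }
    return {key: value for key, value in candidates.items() if value is not None}


# Options for an ordered insert that waits (up to 500 ms) for the journal.
options = {
    "ordered": True,
    "writeConcern": write_concern(journal=True, nodes=2, timeout=500),
}

if __name__ == "__main__":
    print(options)
```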
Retrieve obviously does not have any interaction with the version history system (unless you are retrieving versions). We provide this since one of the other purposes behind this service is to route/proxy all datastore interactions via this service.
Sample retrieve payload (see retrieve.hpp):
{
"action": "retrieve",
"database": "itest",
"collection": "test",
"document": {
"_id": {
"$oid": "5f35e6d8c7e3a976365b3751"
}
}
}
Sample response (see `Retrieve` struct in retrieve.hpp):
{
"result": {
"_id": {
"$oid": "5f35e5e1e799c52186039122"
},
"intValue": 123,
"floatValue": 123.0,
"boolValue": true,
"stringValue": "abc123",
"nested": {"key": "value"}
}
}
The following options are supported for the `retrieve` action (see `Find` struct in find.hpp):
- `partialResults` - boolean. Whether to allow partial results from the database if some shards are down (instead of throwing an error).
- `batchSize` - integer. The number of documents to return per batch.
- `collation` - document. Sets the collation for this operation.
- `comment` - string. Attaches a comment to the query.
- `commentOption` - document. Sets the value of the comment option.
- `hint` - document. Sets the index to use for this operation.
- `let` - document. Sets the value of the let option.
- `limit` - integer. The maximum number of documents to return.
- `max` - document. Sets the exclusive upper bound for a specific index.
- `maxTime` - integer. The maximum amount of time for this operation to run (server-side) in milliseconds.
- `min` - document. Sets the inclusive lower bound for a specific index.
- `projection` - document. Sets a projection which limits the returned fields for all matching documents.
- `readPreference` - document. The read preference for the operation.
  - `tags` - document. Sets the tag set list.
  - `hedge` - document. Sets the hedge document to be used for the read preference. Sharded clusters running MongoDB 4.4 or later can dispatch read operations in parallel, returning the result from the fastest host and cancelling the unfinished operations.
  - `maxStaleness` - integer. Sets the max staleness (seconds) setting. Secondary servers with an estimated lag greater than this value will be excluded from selection under modes that allow secondaries.
  - `mode` - integer. Sets the read preference mode. Valid values are:
    - `0` - Only read from a primary node.
    - `1` - Prefer to read from a primary node.
    - `2` - Only read from secondary nodes.
    - `3` - Prefer to read from secondary nodes.
    - `4` - Read from the node with the lowest latency irrespective of state.
- `returnKey` - boolean. Whether to return the index keys associated with the query results, instead of the actual query results themselves.
- `showRecordId` - boolean. Whether to include the record identifier for each document in the query results.
- `skip` - integer. The number of documents to skip before returning results.
- `sort` - document. The order in which to return matching documents.
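A retrieve request that combines several of these options might look like the sketch below. The `ReadMode` enumeration and the `retrieve_request` helper are illustrative names only; the integer values are the ones given for `mode` in the option list, and the query and field names are made up for the example.

```python
import json
from enum import IntEnum


class ReadMode(IntEnum):
    """Readable labels (hypothetical) for the documented `mode` integers."""
    PRIMARY = 0
    PRIMARY_PREFERRED = 1
    SECONDARY = 2
    SECONDARY_PREFERRED = 3
    NEAREST = 4


def retrieve_request(database, collection, query, **options):
    """Build a retrieve request; options are passed through unchanged."""
    request = {
        "action": "retrieve",
        "database": database,
        "collection": collection,
        "document": query,
    }
    if options:
        request["options"] = options
    return request


request = retrieve_request(
    "itest", "test", {"deleted": {"$ne": True}},
    limit=20,
    sort={"_id": -1},
    projection={"stringValue": 1},
    readPreference={"mode": ReadMode.SECONDARY_PREFERRED, "maxStaleness": 120},
)

if __name__ == "__main__":
    # IntEnum members serialise as plain integers.
    print(json.dumps(request, indent=2))
```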
Count the number of documents matching the specified query document. The query document can be empty to get the count of all documents in the specified collection.
Sample count payload (see `Count` struct in count.hpp):
{
"action": "count",
"database": "itest",
"collection": "test",
"document": {}
}
Sample response (see `Count` struct in count.hpp):
{ "count" : 11350 }
The following options are supported for the `count` action (see `Count` struct in count.hpp):
- `collation` - document. Sets the collation for this operation.
- `hint` - document. Sets the index to use for this operation.
- `limit` - integer. The maximum number of documents to count.
- `maxTime` - integer. The maximum amount of time for this operation to run (server-side) in milliseconds.
- `skip` - integer. The number of documents to skip before counting documents.
- `readPreference` - document. The read preference for the operation.
  - `tags` - document. Sets the tag set list.
  - `hedge` - document. Sets the hedge document to be used for the read preference. Sharded clusters running MongoDB 4.4 or later can dispatch read operations in parallel, returning the result from the fastest host and cancelling the unfinished operations.
  - `maxStaleness` - integer. Sets the max staleness (seconds) setting. Secondary servers with an estimated lag greater than this value will be excluded from selection under modes that allow secondaries.
  - `mode` - integer. Sets the read preference mode. Valid values are:
    - `0` - Only read from a primary node.
    - `1` - Prefer to read from a primary node.
    - `2` - Only read from secondary nodes.
    - `3` - Prefer to read from secondary nodes.
    - `4` - Read from the node with the lowest latency irrespective of state.
Retrieve distinct values for the specified field in documents in a collection. The payload document must contain the `field` for which distinct values are to be retrieved. An optional `filter` field can be used to specify the filter query to use when retrieving distinct values.
Sample distinct payload (see `Distinct` struct in distinct.hpp):
{
"action": "distinct",
"database": "itest",
"collection": "test",
"document": {
"field": "myProp",
"filter": {
"deleted": {"$ne": true}
}
}
}
Sample response (see `Distinct` struct in distinct.hpp):
{ "results" : [ { "values" : [ "value", "value1", "value2" ], "ok" : 1.0 } ] }
Note: An array of `results` is returned since the mongo C++ API implementation for the `distinct` command returns a cursor. The documentation indicates only a single document is returned, which implies we could return a `result` with just the document. We chose not to make that assumption in case the interface changes down the road (again because of the cursor returned by the driver).
An empty `values` array is returned if the specified `field` does not exist in the documents in the specified collection.
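Because the response wraps the values in a `results` array, a client typically flattens it. The following is a minimal sketch, assuming the response has already been decoded into a Python dictionary; `distinct_values` is a hypothetical helper name.

```python
def distinct_values(response):
    """Collect the `values` from every entry in the `results` array."""
    values = []
    for result in response.get("results", []):
        values.extend(result.get("values", []))
    return values


# Decoded form of the sample response shown above.
sample = {"results": [{"values": ["value", "value1", "value2"], "ok": 1.0}]}

if __name__ == "__main__":
    print(distinct_values(sample))
```

Iterating over all entries, instead of reading only the first, keeps the client in step with the design choice described in the note.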
The following options are supported for the `distinct` action (see `Distinct` struct in distinct.hpp):
- `collation` - document. Sets the collation for this operation.
- `maxTime` - integer. The maximum amount of time for this operation to run (server-side) in milliseconds.
- `readPreference` - document. The read preference for the operation.
  - `tags` - document. Sets the tag set list.
  - `hedge` - document. Sets the hedge document to be used for the read preference. Sharded clusters running MongoDB 4.4 or later can dispatch read operations in parallel, returning the result from the fastest host and cancelling the unfinished operations.
  - `maxStaleness` - integer. Sets the max staleness (seconds) setting. Secondary servers with an estimated lag greater than this value will be excluded from selection under modes that allow secondaries.
  - `mode` - integer. Sets the read preference mode. Valid values are:
    - `0` - Only read from a primary node.
    - `1` - Prefer to read from a primary node.
    - `2` - Only read from secondary nodes.
    - `3` - Prefer to read from secondary nodes.
    - `4` - Read from the node with the lowest latency irrespective of state.
Update is the most complex scenario. The service supports the two main update modes supported by Mongo:
- update - The data specified in the `document` sub-document is merged into the existing document(s).
- replace - The data specified in the `document` is used to replace an existing document.
Updates are possible either by an explicit `_id` field in the input `document`, or via a `filter` sub-document that expresses the query used to identify the candidate document(s) to update.
If both `_id` and `filter` are omitted, an error document is returned.
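To make the selection rules concrete, here is a client-side sketch that classifies an update `document` before it is sent. It is not the service's implementation: the function name, the returned labels, the order of the checks, and the decision to reject a `filter` that has neither a `replace` nor an `update` sub-document are all assumptions made for illustration.

```python
def update_mode(document):
    """Classify an update payload (hypothetical helper, labels are made up)."""
    if "_id" in document:
        # Remaining properties are merged into the stored document.
        return "merge-by-id"
    if "filter" in document:
        if "replace" in document:
            return "replace"
        if "update" in document:
            return "update"
        # Assumption: a filter alone does not describe a change.
        raise ValueError("filter requires a replace or update sub-document")
    # Mirrors the documented error when both _id and filter are omitted.
    raise ValueError("either _id or filter must be specified")


if __name__ == "__main__":
    print(update_mode({"_id": {"$oid": "5f35e887bb516401e02b4701"}, "key1": "value1"}))
    print(update_mode({"filter": {"key": "value"}, "update": {"$set": {"key1": "value1"}}}))
```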
The returned BSON document depends on whether a single-document or multi-document update request was made:
- Single-Document - For single document updates, the full updated stored document (`document`) and basic information about the associated version history document (`history`) are returned.
- Multi-Document - For multi-document updates, an array of BSON object ids for successful updates (`success`), failed updates (`failure`), and the basic information about the version history documents (`history`) are returned.
The simple and direct update use case. If the `document` has an `_id` property, the remaining properties are merged into the stored document. A version history document with the resulting stored document is also created.
Sample update request by `_id` (see `MergeForId` struct in update.hpp):
{
"action": "update",
"database": "itest",
"collection": "test",
"document": {
"key1": "value1",
"_id": {
"$oid": "5f35e887bb516401e02b4701"
}
}
}
Sample response (see `Update` struct in update.hpp):
{
"document": {
"_id": {
"$oid": "5f35e887bb516401e02b4701"
},
"key": "value",
"key1": "value1"
},
"history": {
"_id": {
"$oid": "5f35e887e799c5218603915b"
},
"database": "itest",
"collection": "test",
"entity": {
"$oid": "5f35e887bb516401e02b4701"
}
}
}
Sample request payload with `skipVersion` set:
{
"action": "update",
"database": "itest",
"collection": "test",
"skipVersion": true,
"document": {
"key1": "value1",
"_id": {
"$oid": "5f35e932d3698352cb3bd2d1"
}
}
}
In this case, only a placeholder response is returned as follows:
{ "skipVersion" : true }
If the `document` has a `replace` sub-document, then the existing document as specified by the `filter` query will be replaced. MongoDB will return an error if an attempt is made to replace multiple documents (the query filter must return a single document). A version history document is created with the replaced document.
The following sample shows an example of performing a `replace` action (see struct `Replace` in update.hpp):
{
"action": "update",
"database": "itest",
"collection": "test",
"document": {
"filter": {
"_id": { "$oid": "5f3bc9e2502422053e08f9f1" }
},
"replace": {
"_id": { "$oid": "5f3bc9e2502422053e08f9f1" },
"key": "value",
"key1": "value1"
}
},
"options": { "upsert": true },
"application": "version-history-api",
"metadata": {
"revertedFrom": { "$oid": "5f3bc9e29ba4f45f810edf29" }
}
}
Response data is identical to that for an update by `_id`.
If the `document` has an `update` sub-document, then existing document(s) are updated with the information contained in it. This is a merge operation where only the fields specified in the `update` are set on the candidate document(s). A version history document is created for each updated document.
If the input `filter` sub-document has an `_id` property, and it is of type BSON ObjectId, then a single document update is made.
As a simplification, it is possible to omit the `$set` operator if used in conjunction with a `$unset` operator. All top level properties other than `_id` and `$unset` are implicitly added to a `$set` document in the actual update document sent to MongoDB.
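The implicit `$set` rule can be pictured with the following sketch. It only illustrates the transformation described above and is not the service's code: `normalise_update` is a hypothetical name, a top level `_id` is simply left out of the result (an assumption), and the rule is applied whenever `$set` is absent, which may be broader than what the service does.

```python
def normalise_update(update):
    """Move implicit top-level properties into a $set document."""
    if "$set" in update:
        # An explicit $set is passed through unchanged.
        return dict(update)
    normalised = {}
    implicit = {}
    for key, value in update.items():
        if key == "$unset":
            normalised[key] = value
        elif key == "_id":
            # Assumption: the identifier is never part of $set.
            continue
        else:
            implicit[key] = value
    if implicit:
        normalised["$set"] = implicit
    return normalised


if __name__ == "__main__":
    print(normalise_update({
        "$unset": {"obsoleteProperty": 1},
        "metadata.user.username": "user",
    }))
```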
Sample request payload with explicit `$set` (see `Update` struct in update.hpp):
{
"action": "update",
"database": "itest",
"collection": "test",
"document": {
"filter": {
"_id": {"$oid": "6435a62316d2310e800e4bf2"}
},
"update": {
"$unset": {"obsoleteProperty": 1},
"$set": {
"metadata.modified": {"$date": 1681237539583},
"metadata.user._id": {"$oid": "5f70ee572fc09200086c8f24"},
"metadata.user.username": "mqtt"
}
}
}
}
Sample request payload without `$set`:
{
"action": "update",
"database": "itest",
"collection": "test",
"document": {
"filter": {
"_id": {"$oid": "6435a62316d2310e800e4bf2"}
},
"update": {
"$unset": {"obsoleteProperty": 1},
"metadata.modified": {"$date": 1681237539583},
"metadata.user._id": {"$oid": "5f70ee572fc09200086c8f24"},
"metadata.user.username": "user"
}
}
}
Response data has the following structure (see `UpdateMany` struct in update.hpp):
{
"success": [{"$oid": "6435a62316d2310e800e4bf2"}],
"failure": [{"$oid": "6435a62316d2310e800e4bf2"}],
"history": [
{
"_id": {"$oid": "5f35e887e799c5218603915b"},
"database": "itest",
"collection": "test",
"entity": {"$oid": "6435a62316d2310e800e4bf2"}
}
]
}
The following options are supported for the `update` action (see `Update` struct in update.hpp):
- `bypassValidation` - boolean. Whether to bypass document validation.
- `collation` - document. Sets the collation for this operation.
- `upsert` - boolean. By default, if no document matches the filter, the update operation does nothing. However, by specifying upsert as `true`, this operation either updates matching documents or inserts a new document using the update specification if no matching document exists.
- `writeConcern` - document. The write concern for the operation.
  - `journal` - boolean. If `true`, confirms that the database has written the data to the on-disk journal before reporting that a write operation was successful. This ensures that data is not lost if the database shuts down unexpectedly.
  - `nodes` - integer. Sets the number of nodes that are required to acknowledge the write before the operation is considered successful. Write operations will block until they have been replicated to the specified number of servers in a replica set.
  - `acknowledgeLevel` - integer. Sets the acknowledgement level for the write operation.
    - `0` - Represents the implicit default write concern.
    - `1` - Represents write concern with `w: "majority"`.
    - `2` - Represents write concern with `w: <custom write concern name>`.
    - `3` - Represents write concern for un-acknowledged writes.
    - `4` - Represents write concern for acknowledged writes.
  - `majority` - integer. The amount of time (milliseconds) to wait before the write operation times out if it cannot reach the majority of nodes in the replica set. If the value is zero, then no timeout is set.
  - `tag` - string. Sets the name representing the server-side `getLastErrorMode` entry containing the list of nodes that must acknowledge a write operation before it is considered a success.
  - `timeout` - integer. Sets an upper bound on the time (milliseconds) a write concern can take to be satisfied. If the write concern cannot be satisfied within the timeout, the operation is considered a failure.
- `arrayFilters` - array. Array representing filters determining which array elements to modify.
The `document` represents the query to execute to find the candidate documents to delete from the `database`:`collection`. The query is executed to retrieve the candidate documents, and the documents are removed from the specified `database`:`collection`. The retrieved documents are then written to the version history database.
Sample delete request (see `Delete` struct in delete.hpp):
{
"action": "delete",
"database": "itest",
"collection": "test",
"document": {
"_id": { "$oid": "5f35ea61aa4ef01184492d71" }
}
}
Sample delete response (see `Delete` struct in delete.hpp):
{
"success": [{
"$oid": "5f35ea61aa4ef01184492d71"
}],
"failure": [],
"history": [{
"_id": {
"$oid": "5f35ea61e799c521860391a9"
},
"database": "versionHistory",
"collection": "entities",
"entity": {
"$oid": "5f35ea61aa4ef01184492d71"
}
}]
}
The following options are supported for the `delete` action (see `Delete` struct in delete.hpp):
- `collation` - document. Sets the collation for this operation.
- `writeConcern` - document. The write concern for the operation.
  - `journal` - boolean. If `true`, confirms that the database has written the data to the on-disk journal before reporting that a write operation was successful. This ensures that data is not lost if the database shuts down unexpectedly.
  - `nodes` - integer. Sets the number of nodes that are required to acknowledge the write before the operation is considered successful. Write operations will block until they have been replicated to the specified number of servers in a replica set.
  - `acknowledgeLevel` - integer. Sets the acknowledgement level for the write operation.
    - `0` - Represents the implicit default write concern.
    - `1` - Represents write concern with `w: "majority"`.
    - `2` - Represents write concern with `w: <custom write concern name>`.
    - `3` - Represents write concern for un-acknowledged writes.
    - `4` - Represents write concern for acknowledged writes.
  - `majority` - integer. The amount of time (milliseconds) to wait before the write operation times out if it cannot reach the majority of nodes in the replica set. If the value is zero, then no timeout is set.
  - `tag` - string. Sets the name representing the server-side `getLastErrorMode` entry containing the list of nodes that must acknowledge a write operation before it is considered a success.
  - `timeout` - integer. Sets an upper bound on the time (milliseconds) a write concern can take to be satisfied. If the write concern cannot be satisfied within the timeout, the operation is considered a failure.
- `hint` - document. Sets the index to use for this operation.
- `let` - document. Sets the value of the let option.
Drop the specified collection and all the documents it contains. Specify an empty `document` in the payload to satisfy payload requirements. If you wish to also remove all version history documents for the dropped collection, specify `clearVersionHistory` as `true` in the `document` (revision history documents will be removed asynchronously). Specify the write concern settings in the optional `options` sub-document.
Sample drop payload specifying removal of all associated revision history documents (see `DropCollection` struct in dropcollection.hpp):
{
"action": "dropCollection",
"database": "itest",
"collection": "test",
"document": {"clearVersionHistory": true}
}
Sample response (see `DropCollection` struct in collection.hpp):
{ "dropCollection" : true }
The following options are supported for the `dropCollection` action (see `DropCollection` struct in dropcollection.hpp):
- `writeConcern` - document. The write concern for the operation.
  - `journal` - boolean. If `true`, confirms that the database has written the data to the on-disk journal before reporting that a write operation was successful. This ensures that data is not lost if the database shuts down unexpectedly.
  - `nodes` - integer. Sets the number of nodes that are required to acknowledge the write before the operation is considered successful. Write operations will block until they have been replicated to the specified number of servers in a replica set.
  - `acknowledgeLevel` - integer. Sets the acknowledgement level for the write operation.
    - `0` - Represents the implicit default write concern.
    - `1` - Represents write concern with `w: "majority"`.
    - `2` - Represents write concern with `w: <custom write concern name>`.
    - `3` - Represents write concern for un-acknowledged writes.
    - `4` - Represents write concern for acknowledged writes.
  - `majority` - integer. The amount of time (milliseconds) to wait before the write operation times out if it cannot reach the majority of nodes in the replica set. If the value is zero, then no timeout is set.
  - `tag` - string. Sets the name representing the server-side `getLastErrorMode` entry containing the list of nodes that must acknowledge a write operation before it is considered a success.
  - `timeout` - integer. Sets an upper bound on the time (milliseconds) a write concern can take to be satisfied. If the write concern cannot be satisfied within the timeout, the operation is considered a failure.
The `document` represents the specification for the index to be created. Additional options for the index (such as unique) can be specified via the optional `options` sub-document.
See `Index` struct in index.hpp.
Sample response (see `Index` struct in index.hpp):
{"name": "unused_1"}
The following options are supported for the `index` action (see `Index` struct in index.hpp):
- `collation` - document. Sets the collation for this operation.
- `background` - boolean. Whether or not to build the index in the background so that building the index does not block other database activities. The default is to build indexes in the foreground.
- `unique` - boolean. Whether or not to create a unique index so that the collection will not accept insertion of documents where the index key or keys match an existing value in the index.
- `hidden` - boolean. Whether or not the index is hidden from the query planner. A hidden index is not evaluated as part of query plan selection.
- `name` - string. The name of the index.
- `sparse` - boolean. Whether or not to create a sparse index. Sparse indexes only reference documents with the indexed fields.
- `expireAfterSeconds` - integer. Set a value, in seconds, as a TTL to control how long MongoDB retains documents in the collection.
- `version` - integer. Sets the index version.
- `weights` - document. For text indexes, sets the weight document. The weight document contains field and weight pairs.
- `defaultLanguage` - string. For text indexes, the language that determines the list of stop words and the rules for the stemmer and tokenizer.
- `languageOverride` - string. For text indexes, the name of the field, in the collection’s documents, that contains the override language for the document.
- `partialFilterExpression` - document. Sets the document for the partial filter expression for partial indexes.
- `twodSphereVersion` - integer. For 2dsphere indexes, the 2dsphere index version number. Version can be either 1 or 2.
- `twodBitsPrecision` - integer (0-255). For 2d indexes, the precision of the stored geohash value of the location data.
- `twodLocationMin` - double. For 2d indexes, the lower inclusive boundary for the longitude and latitude values.
- `twodLocationMax` - double. For 2d indexes, the upper inclusive boundary for the longitude and latitude values.
The `document` represents the index specification for the `dropIndexes` command. Additional options for the index (such as write concern) can be specified via the optional `options` sub-document.
One of the following properties must be specified in the `document`:
- `name` - The `name` of the index to drop. Should be a `string` value.
- `specification` - The full document specification of the index that was created.
See `DropIndex` struct in dropindex.hpp.
Sample response (see `DropIndex` struct in index.hpp):
{"dropIndex": true}
Same as index
Bulk insert/delete documents. Corresponding version history documents for inserted and/or deleted documents are created unless `skipVersion` is specified.
The documents to insert or delete in bulk must be specified as BSON array properties in the `document` part of the payload. Multiple arrays may be specified as appropriate.
- `insert` - Array of documents which are to be inserted. All documents must have a BSON ObjectId `_id` property.
- `remove` - Array of document specifications which represent the deletes. Deletes are slow since the query specifications are used to retrieve the documents being deleted and create the corresponding version history documents. Retrieving the documents in a loop adds significant processing time. For example, the bulk delete test (deleting 10000 documents) takes about 15 seconds to run.
Sample bulk create payload (see `Bulk` struct in bulk.hpp):
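A bulk payload with the two arrays described above could be put together along these lines. This is a sketch only: `bulk_request` is a hypothetical helper, and the `_id` check is a client-side convenience that looks for the `{"$oid": ...}` form used in the samples, not a statement about how the service validates input.

```python
def bulk_request(database, collection, insert=(), remove=()):
    """Build a bulk request from documents to insert and delete queries."""
    insert = list(insert)
    remove = list(remove)
    for doc in insert:
        oid = doc.get("_id")
        # Every inserted document must carry an ObjectId _id.
        if not (isinstance(oid, dict) and "$oid" in oid):
            raise ValueError("each inserted document needs an ObjectId _id")
    document = {}
    if insert:
        document["insert"] = insert
    if remove:
        document["remove"] = remove
    return {
        "action": "bulk",
        "database": database,
        "collection": collection,
        "document": document,
    }


if __name__ == "__main__":
    print(bulk_request(
        "itest", "test",
        insert=[{"_id": {"$oid": "5f6ba5f9de326c57bd64efb1"}, "key": "value1"}],
        remove=[{"_id": {"$oid": "5f6ba5f9de326c57bd64efb1"}}],
    ))
```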
{
"action": "bulk",
"database": "itest",
"collection": "test",
"document": {
"insert": [{
"_id": {
"$oid": "5f6ba5f9de326c57bd64efb1"
},
"key": "value1"
},
{
"_id": {
"$oid": "5f6ba5f9de326c57bd64efb2"
},
"key": "value2"
}],
"remove": [{
"_id": {
"$oid": "5f6ba5f9de326c57bd64efb1"
}
}]
}
}
Sample response for the above payload (see `Bulk` struct in bulk.hpp):
{ "create" : 2, "history": 3, "remove" : 1 }
Basic support for using aggregation pipeline features. This feature will be expanded as use cases expand over a period of time.
The `document` in the payload must include a `specification` array of documents which correspond to the `match`, `lookup` ... specifications for the aggregation pipeline operation (stage). The matching documents will be returned in a `results` array in the response.
The following operators have been tested:
- `$match`
- `$lookup`
- `$unwind`
- `$group`
- `$sort`
- `$limit`
- `$project`
- `$addFields`
- `$facet`
- `$search` - Note the requirement for `$search` to be the first stage in a pipeline.
- `$unionWith`
Sample request payload (see `Pipeline` struct in pipeline.hpp):
{
"action": "pipeline",
"database": "itest",
"collection": "test",
"document": {
"specification": [
{"$match": {"_id": {"$oid": "5f861c8452c8ca000d60b783"}}},
{"$sort": {"_id": -1 }},
{"$limit": 20},
{"$lookup": {
"localField": "user._id",
"from": "user",
"foreignField": "_id",
"as": "users"
}},
{"$lookup": {
"localField": "group._id",
"from": "group",
"foreignField": "_id",
"as": "groups"
}}
]
}
}
Response structure is the same as for the `retrieve` command.
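A pipeline request can be assembled with a small helper such as the one sketched here. The helper name `pipeline_request` is hypothetical, and the checks (one operator per stage, `$search` only as the first stage) are client-side guards based on the notes above, not a description of the service's validation.

```python
def pipeline_request(database, collection, stages):
    """Build a pipeline request from a list of aggregation stages."""
    stages = list(stages)
    for index, stage in enumerate(stages):
        if len(stage) != 1:
            raise ValueError("each stage must contain exactly one operator")
        if "$search" in stage and index != 0:
            raise ValueError("$search must be the first stage")
    return {
        "action": "pipeline",
        "database": database,
        "collection": collection,
        "document": {"specification": stages},
    }


if __name__ == "__main__":
    print(pipeline_request("itest", "test", [
        {"$match": {"deleted": {"$ne": True}}},
        {"$sort": {"_id": -1}},
        {"$limit": 20},
    ]))
```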
Execute a sequence of actions in a transaction. Nest the individual actions
that are to be performed in the transaction within the document sub-document.
The document in the payload must include an items array of documents.
Each document in the array represents the full specification for the action in
the transaction. The document specification is the same as the
document specification for using the service.
The specification for the action document in the items array is (see TransactionBuilder struct
in transaction.hpp):
- action (string) - The type of action to perform. Should be one of create|update|delete.
- database (string) - The database in which the step is to be performed.
- collection (string) - The collection in which the step is to be performed.
- document (document) - The BSON specification for executing the action.
- skipVersion (bool) - Do not create version history document for this action.
The response to a transaction request has the following structure (see Transaction struct in transaction.hpp):
- created (int) - The number of documents that were created in this transaction.
- updated (int) - The number of documents that were updated in this transaction.
- deleted (int) - The number of documents that were deleted in this transaction.
- history (document) - Metadata about version history documents that were created.
  - database (string) - The database used to store version history data.
  - collection (string) - The collection used to store version history data.
  - created (array<oid>) - History document object ids for new documents created in the transaction.
  - updated (array<oid>) - History document object ids for documents updated in the transaction.
  - deleted (array<oid>) - History document object ids for documents deleted in the transaction.
See samples for sample request/response payloads from the integration test suite.
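The item specification above can be sketched as a plain struct that is serialised into the items array. This is only an illustration: `TransactionItem`, `toJson` and `itemsDocument` are hypothetical names that are not part of the service API (the real builder is the TransactionBuilder struct in transaction.hpp), and the sketch assumes the string fields contain no characters that need JSON escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mirror of one entry in the `items` array. Field names follow
// the specification above; the type itself is not part of the service API.
struct TransactionItem
{
  std::string action;       // one of create|update|delete
  std::string database;
  std::string collection;
  std::string documentJson; // pre-serialised JSON for the `document` field
  bool skipVersion{ false };
};

// Serialise a single item. Assumes the string fields need no JSON escaping.
std::string toJson( const TransactionItem& item )
{
  assert( item.action == "create" || item.action == "update" || item.action == "delete" );
  std::string out{ "{" };
  out += "\"action\":\"" + item.action + "\",";
  out += "\"database\":\"" + item.database + "\",";
  out += "\"collection\":\"" + item.collection + "\",";
  out += "\"document\":" + item.documentJson;
  if ( item.skipVersion ) out += ",\"skipVersion\":true";
  out += "}";
  return out;
}

// Wrap the items in the `document` sub-document of a transaction payload.
std::string itemsDocument( const std::vector<TransactionItem>& items )
{
  std::string out{ "{\"items\":[" };
  for ( std::size_t i = 0; i < items.size(); ++i )
  {
    if ( i > 0 ) out += ",";
    out += toJson( items[i] );
  }
  out += "]}";
  return out;
}
```

The string returned by `itemsDocument` is meant to be used as the value of the document property in the request payload. Production code would build the payload with a BSON or JSON library rather than string concatenation.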
Update is at present only partially supported. A strong assumption is made that
the document being updated is a full replacement. In other words, it is assumed
that the document includes the _id property, and that the
intention is to replace the existing document.
The document as specified will be inserted into the specified database and
timeseries collection. The BSON ObjectId property/field (_id) may be omitted
in the document. The response will include the server generated _id for the inserted document
if using acknowledged writes. No version history is created for timeseries data.
Sample request payload (see CreateTimeseries struct in createtimeseries.hpp):
{
"action": "createTimeseries",
"database": "<database name>",
"collection": "<collection name>",
"document": {
"value": 123.456,
"tags": {
"property1": "string",
"property2": false
},
"timestamp": "2024-11-21T17:36:28Z"
}
}
Sample response payload when document is created (see Create struct in create.hpp):
{
"_id": {
"$oid": "5f35e5e19e48c37186539141"
},
"database": "versionHistory",
"collection": "entities"
}
The following options are supported for the createTimeseries action (subset of Insert struct in insert.hpp):
- writeConcern - document. The write concern for the operation.
  - journal - boolean. If true, confirms that the database has written the data to the on-disk journal before reporting that a write operation was successful. This ensures that data is not lost if the database shuts down unexpectedly.
  - nodes - integer. Sets the number of nodes that are required to acknowledge the write before the operation is considered successful. Write operations will block until they have been replicated to the specified number of servers in a replica set.
  - acknowledgeLevel - integer. Sets the acknowledgement level for the write operation.
    - 0 - Represent the implicit default write concern.
    - 1 - Represent write concern with w: "majority".
    - 2 - Represent write concern with w: <custom write concern name>.
    - 3 - Represent write concern for un-acknowledged writes.
    - 4 - Represent write concern for acknowledged writes.
  - majority - integer. The amount of time (milliseconds) to wait before the write operation times out if it cannot reach the majority of nodes in the replica set. If the value is zero, then no timeout is set.
  - tag - string. Sets the name representing the server-side getLastErrorMode entry containing the list of nodes that must acknowledge a write operation before it is considered a success.
  - timeout - integer. Sets an upper bound on the time (milliseconds) a write concern can take to be satisfied. If the write concern cannot be satisfied within the timeout, the operation is considered a failure.
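The integer acknowledgeLevel values can be given names on the client side to avoid magic numbers. The following is a minimal sketch; `AcknowledgeLevel`, `toAcknowledgeLevel`, `describe` and `isAcknowledged` are hypothetical names for illustration and are not taken from the service sources.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical enum mirroring the integer acknowledgeLevel values listed above.
enum class AcknowledgeLevel : int
{
  Default = 0,
  Majority = 1,
  Tag = 2,
  Unacknowledged = 3,
  Acknowledged = 4
};

// Convert a raw integer into a level; values outside 0-4 are rejected.
std::optional<AcknowledgeLevel> toAcknowledgeLevel( int value )
{
  if ( value < 0 || value > 4 ) return std::nullopt;
  return static_cast<AcknowledgeLevel>( value );
}

// Human readable description for each level, following the list above.
constexpr std::string_view describe( AcknowledgeLevel level )
{
  switch ( level )
  {
  case AcknowledgeLevel::Default: return "implicit default write concern";
  case AcknowledgeLevel::Majority: return "w: \"majority\"";
  case AcknowledgeLevel::Tag: return "w: <custom write concern name>";
  case AcknowledgeLevel::Unacknowledged: return "un-acknowledged writes";
  case AcknowledgeLevel::Acknowledged: return "acknowledged writes";
  }
  return "unknown";
}

// Only explicitly un-acknowledged writes are treated as not acknowledged here.
constexpr bool isAcknowledged( AcknowledgeLevel level )
{
  return level != AcknowledgeLevel::Unacknowledged;
}
```

A client could use `isAcknowledged` to decide whether to expect the server generated _id in the response, since that is only returned when using acknowledged writes.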
Create a collection in the specified database. If a collection already exists in the
database with the same name, an error is returned. This is primarily useful when clients
wish to specify additional options when creating a collection (e.g. create a timeseries collection).
Sample create collection payload (see CreateCollection struct in createcollection.hpp):
{
"action": "createCollection",
"database": "itest",
"collection": "timeseries",
"document": {
"timeseries": {
"timeField" : "date",
"metaField": "tags",
"granularity": "minutes"
}
}
}
Sample response payload (see CreateCollection struct in collection.hpp):
{
"database": "itest",
"collection": "timeseries"
}
Options are not needed for the createCollection action. Instead, the normal document payload
is used as the options when creating the collection. Refer to the MongoDB
documentation for the supported options.
Rename a collection in the specified database. If a collection already exists with the
document.target name, an error is returned. The option to automatically drop a pre-existing
collection as supported by MongoDB is not supported. For such cases, use the dropCollection
action prior to invoking this action. Specify the write concern settings in the
optional options sub-document.
This is a potentially heavy-weight operation. All version history documents for the specified database::collection combination are also updated. Version history document update is performed asynchronously. The operation enqueues an update operation to the version history documents, and returns. This can lead to queries against version history returning stale information for a short period of time.
Note: Renaming the collection in all associated version history documents may be the wrong choice.
In chronological terms, those documents were associated with the previous collection. Only future revisions
are associated with the renamed target. However, keeping the previous name can create issues in terms of
retrieval, or if iterating over records for some other purpose, or if a new collection with the previous
name is created in future.
Sample rename payload (see RenameCollection struct in renamecollection.hpp):
{
"action": "renameCollection",
"database": "itest",
"collection": "test",
"document": {"target": "test-renamed"}
}
Sample response payload (see CreateCollection struct in collection.hpp):
{
"database": "itest",
"collection": "test-renamed"
}
Same write options as for the dropCollection action.
Create, update and delete actions only return some meta information about the action that was performed. The assumption is that the caller already has all the document information needed, and there is no need for the service to return that information.
The retrieve action returns the results of executing the database
query encapsulated in the document property of the request payload. The
following document model is returned as the response:
- error - A string value in case an error was encountered as a result of executing the query. Callers should always check for the existence of this property.
- result - A BSON document that is returned if the query included an _id property. In such a case it is assumed that the query is a lookup for a single document.
- results - A BSON array with document(s) that were retrieved from the database for the query.
See transaction for sample request/response payloads for transaction requests.
At present only documents with a BSON ObjectId _id are supported. Streaming responses
(cursor interface) are not supported.
Metrics are collected for all requests to the service (unless the client specifies skipMetric). Metrics may be
stored in MongoDB itself, or a service that supports the ILP.
Metrics are collected in the specified database and collection combination
(or their defaults). No TTL index is created on this collection, since retention is left to the user's
requirements. A date property is stored so that a TTL index can be created as required.
The schema for a metric is as follows:
{
"_id": {"$oid": "5fd4b7e55f1ba96a695d1446"},
"action": "retrieve",
"database": "wpreading2",
"collection": "databaseVersion",
"size": 88,
"time": 414306,
"timestamp": 437909021088978,
"date": {"$date": 437909021},
"application": "bootstrap"
}
- action - The database action performed by the client.
- database - The database against which the action was performed.
- collection - The collection against which the action was performed.
- size - The total size of the response document.
- time - The time in nanoseconds for the action (includes any interaction with version history).
- timestamp - The time since UNIX epoch in nanoseconds for use when exporting to other timeseries databases.
- date - The BSON date at which the metric was created. Use to define a TTL index as appropriate.
- application - The application that invoked the service if specified in the request payload.
Metrics may be stored in a time series database of choice that supports the ILP. We have only tested
storing metrics in QuestDB. All the fields (except the _id) are stored in
the TSDB. The duration, and size values are stored as field sets and the other values stored as tag sets.
The name for the series (measurement) can be specified via the command line argument, or will default to
the name of the metrics collection.
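The mapping of a metric onto an ILP (InfluxDB Line Protocol) line can be sketched as follows. A line has the form `measurement,tag=value,... field=value,... timestamp`, with integer fields carrying an `i` suffix. This is an illustration only: the `Metric` struct and the `escape` and `toLine` functions are hypothetical, the choice of `action`, `database`, `collection` and `application` as tags is inferred from the description above, and the actual implementation in the service may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical in-memory form of a metric; names follow the schema above.
struct Metric
{
  std::string action;
  std::string database;
  std::string collection;
  std::string application;     // optional, empty when not specified
  std::int64_t size{ 0 };      // size of the response document
  std::int64_t duration{ 0 };  // nanoseconds taken by the action
  std::int64_t timestamp{ 0 }; // nanoseconds since UNIX epoch
};

// Prefix each character listed in `special` with a backslash.
std::string escape( std::string_view in, std::string_view special )
{
  std::string out;
  out.reserve( in.size() );
  for ( const char c : in )
  {
    if ( special.find( c ) != std::string_view::npos ) out.push_back( '\\' );
    out.push_back( c );
  }
  return out;
}

// Format one metric as a single ILP line (without the trailing newline).
std::string toLine( std::string_view measurement, const Metric& m )
{
  assert( !measurement.empty() );
  constexpr std::string_view tagChars{ ",= " }; // significant in tag keys and values
  std::string line = escape( measurement, ", " );
  line += ",action=" + escape( m.action, tagChars );
  line += ",database=" + escape( m.database, tagChars );
  line += ",collection=" + escape( m.collection, tagChars );
  if ( !m.application.empty() ) line += ",application=" + escape( m.application, tagChars );
  line += " duration=" + std::to_string( m.duration ) + "i";
  line += ",size=" + std::to_string( m.size ) + "i";
  line += " " + std::to_string( m.timestamp );
  return line;
}
```

The timestamp is passed through unchanged since it is already in nanoseconds, which is the default precision for ILP timestamps.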
A simple serialisation framework is also provided. It uses the visit_struct library to automatically serialise and deserialise visitable classes/structs.
The BSON framework serialises and deserialises visitable classes/structs to and from BSON. See the test suite for sample use of the framework.
The framework provides the following primary functions to handle (de)serialisation:
- marshall<Model>( const Type& ) - to marshall the specified object to a BSON document.
- unmarshall<Model>( bsoncxx::document::view view ) - unmarshall the BSON document into a default constructed object.
- unmarshall<Model>( Model& m, bsoncxx::document::view view ) - unmarshall the BSON document into the specified model instance.
The framework handles non-visitable members within a visitable root object. Custom implementations can be provided.
- For non-visitable classes/structs, implement the following functions as appropriate:
  - bsoncxx::types::bson_value::value bson( const <Class/Struct Type>& model ) that will produce a BSON document as a value variant for the data encapsulated in the object.
  - void set( <Class/Struct Type>& field, bsoncxx::types::bson_value::view value ) that will populate the model instance from the BSON value variant.
- For partially visitable classes/structs, implement the following populate callback functions as appropriate:
  - void populate( const <Class/Struct Type>& model, bsoncxx::builder::stream::document& doc ) to add the non-visitable fields to the BSON stream builder.
  - void populate( <Class/Struct Type>& model, bsoncxx::document::view view ) to populate the non-visitable fields in the object from the BSON document.
The JSON framework serialises and deserialises visitable classes/structs to and from JSON. See the test suite for sample use of the framework.
Similar to the BSON framework, the JSON framework also handles non-visitable members within a visitable root object. Custom implementations can be provided.
- For non-visitable classes/structs, implement the following functions as appropriate:
  - boost::json::value json( const <Class/Struct Type>& model ) that will produce a JSON value for the data encapsulated in the object.
  - void set( const char* name, <Class/Struct Type>& field, simdjson::ondemand::value& value ) that will populate the model instance from the JSON value.
- For partially visitable classes/structs, implement the following populate callback functions as appropriate:
  - void populate( const <Class/Struct Type>& model, boost::json::object& object ) to add non-visitable fields to the JSON object.
  - void populate( const <Class/Struct Type>& model, simdjson::ondemand::object& object ) to populate non-visitable fields from the JSON object.
JSON input mostly comes via HTTP from untrusted sources. Consequently, there is a need for validating the JSON
input. Basic support for input validation is provided via a validate function.
A validate( const char*, M& ) function is defined. This is used for validating the JSON input being parsed. A
default specialisation is provided for std::string fields. This rejects strings with more than 40% (configurable)
special characters. Users are advised to provide implementations specific to their domain requirements.
Users may also use environment variables to influence the default implementation.
- SPT_JSON_PARSE_VALIDATION_IGNORE - Environment variable that expects a comma or space separated list of field names that should be ignored by the validator. Default values are password, version. Example: export SPT_JSON_PARSE_VALIDATION_IGNORE='password, file, version, firmware, identifier'
- SPT_JSON_PARSE_VALIDATION_RATIO - Environment variable (double) that sets the maximum allowed ratio of special characters in the input string. Default is 0.4. Example: export SPT_JSON_PARSE_VALIDATION_RATIO='0.35'
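The ratio check described above can be sketched as follows. This is not the framework's implementation: the function names are hypothetical, the signature differs from the validate( const char*, M& ) function, and the definition of a "special" character (any ASCII character that is neither alphanumeric nor whitespace) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <set>
#include <string>
#include <string_view>
#include <utility>

// Ratio of special characters in the value. Assumption: a character is special
// if it is ASCII and neither alphanumeric nor whitespace. Bytes >= 0x80 (UTF-8
// multi-byte sequences) are not counted as special.
double specialRatio( std::string_view value )
{
  if ( value.empty() ) return 0.0;
  std::size_t special = 0;
  for ( const char c : value )
  {
    const auto u = static_cast<unsigned char>( c );
    if ( u >= 0x80 ) continue;
    if ( !std::isalnum( u ) && !std::isspace( u ) ) ++special;
  }
  return static_cast<double>( special ) / static_cast<double>( value.size() );
}

// Maximum allowed ratio from the environment, falling back to the default 0.4.
double maxRatio()
{
  if ( const char* v = std::getenv( "SPT_JSON_PARSE_VALIDATION_RATIO" ) )
  {
    char* end = nullptr;
    const double parsed = std::strtod( v, &end );
    if ( end != v && parsed > 0.0 && parsed <= 1.0 ) return parsed;
  }
  return 0.4;
}

// Split a comma or space separated list of field names.
std::set<std::string> parseIgnored( std::string_view list )
{
  std::set<std::string> names;
  std::string current;
  for ( const char c : list )
  {
    if ( c == ',' || c == ' ' )
    {
      if ( !current.empty() ) names.insert( std::move( current ) );
      current.clear();
    }
    else current.push_back( c );
  }
  if ( !current.empty() ) names.insert( std::move( current ) );
  return names;
}

// Accept the value if the field is ignored or the ratio is within the limit.
bool validate( std::string_view name, std::string_view value,
    const std::set<std::string>& ignored, double ratio )
{
  assert( ratio > 0.0 && ratio <= 1.0 );
  if ( ignored.count( std::string{ name } ) > 0 ) return true;
  return specialRatio( value ) <= ratio;
}
```

With the default ratio, a value such as `ab<>` (half of its characters are special) is rejected, while the same value in a field named `password` is accepted when that field is in the ignored set.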
Integration tests for the service will be developed in a few different languages
to ensure full interoperability. The test suites will be available under the
test directory or under the client directory. The following suites are currently available:
- C++
  - Integration - Integration test suite under the test/integration directory.
  - Performance - Performance test suite under the test/performance directory.
- Python - See features for the test suite.
- Julia - See test for the test suite.
- go - Simple test program under the test/go directory.
(cd mongo-service/test/go; go build -o /tmp/gomongo; /tmp/gomongo)
A simple connection pool implementation is provided in the api, along with its associated test suite. The implementation is based on a factory function that can create valid connections as needed.
The pool is managed using a std::deque. Connections returned to the pool are
added to the back (least idle), while acquiring a connection pops it from the
front (most idle) of the deque.
Configuration is via a simple structure for common options such as initial size,
max pool size, max connection size, and maximum idle time for a connection. It
supports a simple validity check (via a mandatory bool valid() function)
of the connection before adding it back to the pool.
A maxIdleTime property is used to close connections that have been idling more
than the specified time. The initialSize property is also used as a minimum
pool size configuration. The minimum size is always maintained regardless of
idle time.
Acquiring a connection from the pool returns a std::optional<Proxy> instance.
If the maximum number of connections has been reached, std::nullopt is returned.
The Proxy implements RAII by returning the connection to the pool when the
instance is destroyed.
A wrapper is used to associate a last used timestamp to the connection. This is used to enforce maximum idle time policy on the underlying connection.
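The design described above can be sketched as follows. This is a simplified illustration and not the implementation provided in the api: the class and member names are hypothetical, the configuration is reduced to a maximum size and a maximum idle time, the minimum pool size (initialSize) is omitted, and idle connections are only checked when a connection is acquired.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical connection type used only for illustration.
struct FakeConnection
{
  bool ok{ true };
  bool valid() const { return ok; }
};

template <typename Connection>
class Pool
{
  using Clock = std::chrono::steady_clock;
  // Wrapper that associates a last used timestamp with the connection.
  struct Entry
  {
    std::unique_ptr<Connection> connection;
    Clock::time_point lastUsed;
  };

public:
  using Factory = std::function<std::unique_ptr<Connection>()>;

  // RAII handle that returns the connection to the pool on destruction.
  // The pool must outlive every Proxy it hands out.
  class Proxy
  {
  public:
    Proxy( Pool* p, std::unique_ptr<Connection> c ) : owner{ p }, connection{ std::move( c ) } {}
    Proxy( Proxy&& ) = default;
    ~Proxy() { if ( connection ) owner->release( std::move( connection ) ); }
    Connection& operator*() { return *connection; }
    Connection* operator->() { return connection.get(); }

  private:
    Pool* owner;
    std::unique_ptr<Connection> connection;
  };

  Pool( Factory f, std::size_t max, std::chrono::seconds idleTime ) :
    factory{ std::move( f ) }, maxSize{ max }, maxIdle{ idleTime }
  {
    assert( factory && maxSize > 0 );
  }

  std::optional<Proxy> acquire()
  {
    std::lock_guard<std::mutex> lock{ mutex };
    const auto now = Clock::now();
    while ( !idle.empty() )
    {
      Entry entry = std::move( idle.front() ); // front holds the most idle connection
      idle.pop_front();
      if ( now - entry.lastUsed <= maxIdle && entry.connection->valid() )
      {
        return Proxy{ this, std::move( entry.connection ) };
      }
      --total; // idled too long or no longer valid, discard it
    }
    if ( total >= maxSize ) return std::nullopt;
    auto connection = factory();
    if ( !connection ) return std::nullopt;
    ++total;
    return Proxy{ this, std::move( connection ) };
  }

  std::size_t idleCount()
  {
    std::lock_guard<std::mutex> lock{ mutex };
    return idle.size();
  }

private:
  // Only valid connections are added back (to the back of the deque).
  void release( std::unique_ptr<Connection> connection )
  {
    std::lock_guard<std::mutex> lock{ mutex };
    if ( connection->valid() ) idle.push_back( Entry{ std::move( connection ), Clock::now() } );
    else --total;
  }

  Factory factory;
  std::size_t maxSize;
  std::chrono::seconds maxIdle;
  std::size_t total{ 0 }; // idle plus checked out connections
  std::deque<Entry> idle;
  std::mutex mutex;
};
```

A caller checks the returned optional before use. When the `Proxy` goes out of scope the connection is returned to the pool without any explicit call.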
A sample async client using coroutines with connection pool implementation is available under the client directory.
The performance test suite performs a simple CRUD operation using the service. Each test creates a document, retrieves the document, updates the document and finally deletes the document. All operations other than retrieve will create an associated version history document in the database. All operations also create a corresponding metric document in the database. Thus a CRUD operation involves approximately 12 database operations internally.
The tests are set up to run each set of CRUD operations 10 times (iterations),
and a run is repeated a second time to get better average and variability numbers.
Separate runs are set up with 10, 50, 100, 500 and 1000 concurrent threads.
All tests are against the simple docker stack
running on the same machine. A key goal of the test is to ensure that there are
no errors while running the test.
The following numbers were recorded on my laptop during normal use (plenty of other applications and processes running) and with the Docker daemon restricted to using half the available CPU cores.
[==========] Running 5 benchmarks.
[ RUN ] SocketClientFixture.crud(int concurrency = 10) (2 runs, 10 iterations per run)
[ DONE ] SocketClientFixture.crud(int concurrency = 10) (1356.777305 ms)
[ RUNS ] Average time: 678388.652 us (~303172.494 us)
Fastest time: 464013.326 us (-214375.326 us / -31.601 %)
Slowest time: 892763.979 us (+214375.327 us / +31.601 %)
Median time: 678388.652 us (1st quartile: 464013.326 us | 3rd quartile: 892763.979 us)
Average performance: 1.47408 runs/s
Best performance: 2.15511 runs/s (+0.68103 runs/s / +46.20025 %)
Worst performance: 1.12012 runs/s (-0.35396 runs/s / -24.01254 %)
Median performance: 1.47408 runs/s (1st quartile: 2.15511 | 3rd quartile: 1.12012)
[ITERATIONS] Average time: 67838.865 us (~30317.249 us)
Fastest time: 46401.333 us (-21437.533 us / -31.601 %)
Slowest time: 89276.398 us (+21437.533 us / +31.601 %)
Median time: 67838.865 us (1st quartile: 46401.333 us | 3rd quartile: 89276.398 us)
Average performance: 14.74081 iterations/s
Best performance: 21.55111 iterations/s (+6.81029 iterations/s / +46.20025 %)
Worst performance: 11.20117 iterations/s (-3.53964 iterations/s / -24.01254 %)
Median performance: 14.74081 iterations/s (1st quartile: 21.55111 | 3rd quartile: 11.20117)
[ RUN ] SocketClientFixture.crud(int concurrency = 50) (2 runs, 10 iterations per run)
[ DONE ] SocketClientFixture.crud(int concurrency = 50) (3786.378171 ms)
[ RUNS ] Average time: 1893189.086 us (~131220.814 us)
Fastest time: 1800401.958 us (-92787.127 us / -4.901 %)
Slowest time: 1985976.213 us (+92787.127 us / +4.901 %)
Median time: 1893189.086 us (1st quartile: 1800401.958 us | 3rd quartile: 1985976.213 us)
Average performance: 0.52821 runs/s
Best performance: 0.55543 runs/s (+0.02722 runs/s / +5.15369 %)
Worst performance: 0.50353 runs/s (-0.02468 runs/s / -4.67212 %)
Median performance: 0.52821 runs/s (1st quartile: 0.55543 | 3rd quartile: 0.50353)
[ITERATIONS] Average time: 189318.909 us (~13122.081 us)
Fastest time: 180040.196 us (-9278.713 us / -4.901 %)
Slowest time: 198597.621 us (+9278.713 us / +4.901 %)
Median time: 189318.909 us (1st quartile: 180040.196 us | 3rd quartile: 198597.621 us)
Average performance: 5.28209 iterations/s
Best performance: 5.55432 iterations/s (+0.27222 iterations/s / +5.15369 %)
Worst performance: 5.03531 iterations/s (-0.24679 iterations/s / -4.67212 %)
Median performance: 5.28209 iterations/s (1st quartile: 5.55432 | 3rd quartile: 5.03531)
[ RUN ] SocketClientFixture.crud(int concurrency = 100) (2 runs, 10 iterations per run)
[ DONE ] SocketClientFixture.crud(int concurrency = 100) (6902.353423 ms)
[ RUNS ] Average time: 3451176.712 us (~85190.281 us)
Fastest time: 3390938.086 us (-60238.626 us / -1.745 %)
Slowest time: 3511415.337 us (+60238.625 us / +1.745 %)
Median time: 3451176.712 us (1st quartile: 3390938.086 us | 3rd quartile: 3511415.337 us)
Average performance: 0.28976 runs/s
Best performance: 0.29490 runs/s (+0.00515 runs/s / +1.77646 %)
Worst performance: 0.28479 runs/s (-0.00497 runs/s / -1.71551 %)
Median performance: 0.28976 runs/s (1st quartile: 0.29490 | 3rd quartile: 0.28479)
[ITERATIONS] Average time: 345117.671 us (~8519.028 us)
Fastest time: 339093.809 us (-6023.863 us / -1.745 %)
Slowest time: 351141.534 us (+6023.863 us / +1.745 %)
Median time: 345117.671 us (1st quartile: 339093.809 us | 3rd quartile: 351141.534 us)
Average performance: 2.89756 iterations/s
Best performance: 2.94904 iterations/s (+0.05147 iterations/s / +1.77646 %)
Worst performance: 2.84785 iterations/s (-0.04971 iterations/s / -1.71551 %)
Median performance: 2.89756 iterations/s (1st quartile: 2.94904 | 3rd quartile: 2.84785)
[ RUN ] SocketClientFixture.crud(int concurrency = 500) (2 runs, 10 iterations per run)
[ DONE ] SocketClientFixture.crud(int concurrency = 500) (37825.690178 ms)
[ RUNS ] Average time: 18912845.089 us (~77556.025 us)
Fastest time: 18858004.698 us (-54840.391 us / -0.290 %)
Slowest time: 18967685.480 us (+54840.391 us / +0.290 %)
Median time: 18912845.089 us (1st quartile: 18858004.698 us | 3rd quartile: 18967685.480 us)
Average performance: 0.05287 runs/s
Best performance: 0.05303 runs/s (+0.00015 runs/s / +0.29081 %)
Worst performance: 0.05272 runs/s (-0.00015 runs/s / -0.28913 %)
Median performance: 0.05287 runs/s (1st quartile: 0.05303 | 3rd quartile: 0.05272)
[ITERATIONS] Average time: 1891284.509 us (~7755.602 us)
Fastest time: 1885800.470 us (-5484.039 us / -0.290 %)
Slowest time: 1896768.548 us (+5484.039 us / +0.290 %)
Median time: 1891284.509 us (1st quartile: 1885800.470 us | 3rd quartile: 1896768.548 us)
Average performance: 0.52874 iterations/s
Best performance: 0.53028 iterations/s (+0.00154 iterations/s / +0.29081 %)
Worst performance: 0.52721 iterations/s (-0.00153 iterations/s / -0.28913 %)
Median performance: 0.52874 iterations/s (1st quartile: 0.53028 | 3rd quartile: 0.52721)
[ RUN ] SocketClientFixture.crud(int concurrency = 1000) (2 runs, 10 iterations per run)
[ DONE ] SocketClientFixture.crud(int concurrency = 1000) (70436.871301 ms)
[ RUNS ] Average time: 35218435.650 us (~1036096.530 us)
Fastest time: 34485804.768 us (-732630.883 us / -2.080 %)
Slowest time: 35951066.533 us (+732630.883 us / +2.080 %)
Median time: 35218435.650 us (1st quartile: 34485804.768 us | 3rd quartile: 35951066.533 us)
Average performance: 0.02839 runs/s
Best performance: 0.02900 runs/s (+0.00060 runs/s / +2.12444 %)
Worst performance: 0.02782 runs/s (-0.00058 runs/s / -2.03786 %)
Median performance: 0.02839 runs/s (1st quartile: 0.02900 | 3rd quartile: 0.02782)
[ITERATIONS] Average time: 3521843.565 us (~103609.653 us)
Fastest time: 3448580.477 us (-73263.088 us / -2.080 %)
Slowest time: 3595106.653 us (+73263.088 us / +2.080 %)
Median time: 3521843.565 us (1st quartile: 3448580.477 us | 3rd quartile: 3595106.653 us)
Average performance: 0.28394 iterations/s
Best performance: 0.28997 iterations/s (+0.00603 iterations/s / +2.12444 %)
Worst performance: 0.27816 iterations/s (-0.00579 iterations/s / -2.03786 %)
Median performance: 0.28394 iterations/s (1st quartile: 0.28997 | 3rd quartile: 0.27816)
[==========] Ran 5 benchmarks.
Check out the sources and use cmake to build and install the project locally.
Install Boost
BOOST_VERSION=1.83.0
INSTALL_DIR=/usr/local/boost
cd /tmp
ver=`echo "${BOOST_VERSION}" | awk -F'.' '{printf("%d_%d_%d",$1,$2,$3)}'`
curl -OL https://boostorg.jfrog.io/artifactory/main/release/${BOOST_VERSION}/source/boost_${ver}.tar.bz2
tar xfj boost_${ver}.tar.bz2
sudo rm -rf $INSTALL_DIR
cd boost_${ver} \
&& ./bootstrap.sh \
&& sudo ./b2 -j8 cxxflags=-std=c++20 install link=static threading=multi runtime-link=static --prefix=$INSTALL_DIR --without-python --without-mpi
Install mongocxx driver
PREFIX=/usr/local/mongo
MONGOC_VERSION=1.24.2
MONGOCXX_VERSION=3.8.0
sudo rm -rf $PREFIX
cd /tmp
sudo rm -rf mongo-c-driver*
curl -L -O https://github.com/mongodb/mongo-c-driver/releases/download/${MONGOC_VERSION}/mongo-c-driver-${MONGOC_VERSION}.tar.gz
tar xzf mongo-c-driver-${MONGOC_VERSION}.tar.gz
cd /tmp/mongo-c-driver-${MONGOC_VERSION}
mkdir cmake-build && cd cmake-build
cmake \
-DENABLE_AUTOMATIC_INIT_AND_CLEANUP=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_INSTALL_LIBDIR=lib \
-DBUILD_SHARED_LIBS=OFF \
-DENABLE_SASL=OFF \
-DENABLE_TESTS=OFF \
-DENABLE_EXAMPLES=OFF \
..
make -j8
sudo make install
cd /tmp
sudo rm -rf mongo-cxx-driver
curl -OL https://github.com/mongodb/mongo-cxx-driver/releases/download/r${MONGOCXX_VERSION}/mongo-cxx-driver-r${MONGOCXX_VERSION}.tar.gz
tar -xzf mongo-cxx-driver-r${MONGOCXX_VERSION}.tar.gz
cd mongo-cxx-driver-r${MONGOCXX_VERSION}/build
cmake \
-DCMAKE_CXX_STANDARD=20 \
-DCMAKE_CXX_STANDARD_REQUIRED=ON \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_PREFIX_PATH=$PREFIX \
-DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_INSTALL_LIBDIR=lib \
-DBUILD_SHARED_LIBS=OFF \
-DCMAKE_TESTING_ENABLED=OFF \
-DENABLE_TESTS=OFF \
-DBSONCXX_POLY_USE_STD=ON \
..
make -j8
sudo make install
Check out, build and install the project.
cd /var/tmp
git clone https://github.com/sptrakesh/mongo-service.git
cd mongo-service
cmake -DCMAKE_BUILD_TYPE=Release \
-DCMAKE_PREFIX_PATH="/usr/local/boost;/usr/local/mongo" \
-DCMAKE_INSTALL_PREFIX=/usr/local/spt \
-DBUILD_TESTING=OFF -S . -B build
cmake --build build -j12
sudo cmake --install build
Install dependencies to build the project. The following instructions at times reference arm or arm64
architecture. Modify those values as appropriate for your hardware. These instructions are based on steps
I followed to set up the project on a Windows 11 virtual machine running via Parallels Desktop on an M2 Mac.
Install Boost
- Download and extract Boost 1.82 (or above) to a temporary location (e.g. \opt\src).
- Launch the Visual Studio Command utility and cd to the temporary location.
cd \opt\src
curl -OL https://boostorg.jfrog.io/artifactory/main/release/1.82.0/source/boost_1_82_0.tar.gz
tar -xzf boost_1_82_0.tar.gz
cd boost_1_82_0
.\bootstrap.bat
.\b2 -j8 install threading=multi address-model=64 architecture=arm asynch-exceptions=on --prefix=\opt\local --without-python --without-mpi
cd ..
del /s /q boost_1_82_0
rmdir /s /q boost_1_82_0
Install mongocxx driver
- Download Mongo C Driver (1.24.2 or above) and extract sources to a temporary location.
- Launch the Visual Studio Command utility and cd to the temporary location.
cd \opt\src
curl -OL https://github.com/mongodb/mongo-c-driver/releases/download/1.24.2/mongo-c-driver-1.24.2.tar.gz
tar -xzf mongo-c-driver-1.24.2.tar.gz
cd mongo-c-driver-1.24.2
cmake -DENABLE_AUTOMATIC_INIT_AND_CLEANUP=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=c:\opt\local -DBUILD_SHARED_LIBS=OFF -DENABLE_SASL=OFF -DENABLE_TESTS=OFF -DENABLE_EXAMPLES=OFF -S . -B cmake-build
cmake --build cmake-build --target install --parallel 8
cd ..
del /s /q mongo-c-driver-1.24.2
rmdir /s /q mongo-c-driver-1.24.2
- Download Mongo CXX Driver (3.8 or above) and extract sources to a temporary location.
- Launch the Visual Studio Command utility and cd to the temporary location.
cd \opt\src
curl -OL https://github.com/mongodb/mongo-cxx-driver/releases/download/r3.8.0/mongo-cxx-driver-r3.8.0.tar.gz
tar -xvf mongo-cxx-driver-r3.8.0.tar.gz
cd mongo-cxx-driver-r3.8.0
cmake -DCMAKE_CXX_STANDARD=20 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=c:\opt\local -DCMAKE_INSTALL_PREFIX=c:\opt\local -DCMAKE_INSTALL_LIBDIR=lib -DBUILD_SHARED_LIBS=OFF -DCMAKE_TESTING_ENABLED=OFF -DENABLE_TESTS=OFF -DBSONCXX_POLY_USE_STD=ON -S . -B build
cmake --build build --target install --parallel 8
cd ..
del /s /q mongo-cxx-driver-r3.8.0
rmdir /s /q mongo-cxx-driver-r3.8.0
Install fmt library
Launch the Visual Studio Command utility and cd to a temporary location.
cd \opt\src
git clone https://github.com/fmtlib/fmt.git --branch 9.1.0
cd fmt
cmake -DCMAKE_CXX_STANDARD=20 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=\opt\local -DCMAKE_INSTALL_LIBDIR=lib -DFMT_TEST=OFF -DFMT_MODULE=ON -S . -B build
cmake --build build --target install -j8
Install range-v3 library
Launch the Visual Studio Command utility and cd to a temporary location.
git clone https://github.com/ericniebler/range-v3.git --branch 0.12.0
cd range-v3
cmake -DCMAKE_CXX_STANDARD=20 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=\opt\local -DCMAKE_INSTALL_LIBDIR=lib -DRANGE_V3_DOCS=OFF -DRANGE_V3_EXAMPLES=OFF -DRANGE_V3_PERF=OFF -DRANGE_V3_TESTS=OFF -DRANGE_V3_INSTALL=ON -B build -S .
cmake --build build --target install -j8
Install vcpkg manager
Launch the Visual Studio Command utility.
cd \opt\src
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
.\bootstrap-vcpkg.bat -disableMetrics
.\vcpkg integrate install --vcpkg-root \opt\src\vcpkg
.\vcpkg install curl:arm64-windows
.\vcpkg install cpr:arm64-windows
Check out, build and install the project.
Launch the Visual Studio Command utility.
cd %homepath%\source\repos
git clone https://github.com/sptrakesh/mongo-service.git
cd mongo-service
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=\opt\local -DCMAKE_INSTALL_PREFIX=\opt\spt -DBUILD_TESTING=ON -DCMAKE_TOOLCHAIN_FILE="C:/opt/src/vcpkg/scripts/buildsystems/vcpkg.cmake" -S . -B build
cmake --build build -j8
cmake --build build --target install
Run the service by specifying options similar to the following:
cd %homepath%\source\repos\mongo-service
build\src\service\Debug\mongo-service.exe -e true -o %temp%\ -p 2020 -m "mongodb://test:<password>@<host>/admin?authSource=admin&compressors=snappy&w=1" -l debug --metric-batch-size 2
Run other targets such as unitTest, integration, client as appropriate directly from the IDE.
The following limitations have been encountered when running the test suites. A few tests have been disabled when running on Windows to avoid running into these issues.
- Cannot create an index with a specific name. For some reason, this leads to the index name being stored with non-UTF-8 characters, which then causes further issues down the road.
Sample clients in other languages that use the service.
- Julia - Sample client package for Julia is available under the julia directory.
- Python - Sample client package for Python is available under the python directory.
The API can be used to communicate with the TCP service. Initialise the library
(init function) before using the other api functions. A higher level abstraction is also provided
via the repository.hpp interface.
Client code bases can use cmake to link against the library.
# In your CMakeLists.txt file
find_package(MongoService REQUIRED COMPONENTS api)
if (APPLE)
include_directories(/usr/local/spt/include)
else()
include_directories(/opt/spt/include)
endif (APPLE)
target_link_libraries(${Target_Name} PRIVATE mongo-service::api ...)
# Run cmake
if [ `uname` = "Darwin" ]
then
cmake -DCMAKE_PREFIX_PATH="/usr/local/boost;/usr/local/mongo;/usr/local/spt" -S . -B build
else
cmake -DCMAKE_PREFIX_PATH="/opt/local;/opt/spt" -S . -B build
fi
cmake --build build -j12
A simple command line utility is available for generating BSON ObjectId values. This utility is installed as
genoid under the destination bin directory.
- Run without any arguments to generate an ObjectId at the current time.
- Run with --at <ISO Format date-time> to generate an ObjectId at the specified time. Example: /usr/local/spt/bin/genoid --at 2024-10-26T07:28:57Z
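Generating an ObjectId for a given time works because the first four bytes of an ObjectId hold the creation time as big-endian seconds since the UNIX epoch. The following sketch illustrates that layout only and is not the genoid implementation: `objectIdAt` is a hypothetical function, and the remaining eight bytes are filled with random data, whereas a driver uses a five byte per-process random value followed by a three byte incrementing counter.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Hex representation of a 12 byte ObjectId whose timestamp is the given time.
std::string objectIdAt( std::chrono::system_clock::time_point at )
{
  const auto seconds = std::chrono::duration_cast<std::chrono::seconds>(
      at.time_since_epoch() ).count();
  assert( seconds >= 0 && seconds <= 0xFFFFFFFFLL ); // must fit in 4 bytes

  std::array<std::uint8_t, 12> bytes{};
  // Bytes 0-3: big-endian seconds since the UNIX epoch.
  for ( int i = 0; i < 4; ++i )
  {
    bytes[i] = static_cast<std::uint8_t>( ( seconds >> ( 8 * ( 3 - i ) ) ) & 0xFF );
  }

  // Bytes 4-11: random filler for illustration (see the note above).
  std::random_device rd;
  std::uniform_int_distribution<int> dist{ 0, 255 };
  for ( std::size_t i = 4; i < bytes.size(); ++i )
  {
    bytes[i] = static_cast<std::uint8_t>( dist( rd ) );
  }

  static constexpr char digits[] = "0123456789abcdef";
  std::string hex;
  hex.reserve( 2 * bytes.size() );
  for ( const auto b : bytes )
  {
    hex.push_back( digits[b >> 4] );
    hex.push_back( digits[b & 0x0F] );
  }
  return hex;
}
```

For the date-time in the example above (2024-10-26T07:28:57Z, which is 1729927737 seconds since the epoch) the generated value starts with `671c9a39`.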
The following components are used to build this project:
- Boost:Asio - We use Asio for the TCP socket server implementation.
- MongoCXX - MongoDB C++ driver.
- magic_enum - Static reflection for enums.
- visit_struct - Struct visitor library used for the serialisation utility functions.
- concurrentqueue - Lock free concurrent queue implementation for metrics.
- NanoLog - Logging framework used for the server. I modified the implementation for daily rolling log files.
- Catch2 - Unit testing framework.
- Clara - Command line options parser.
- hayai - Performance testing framework.