Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mongodb backend uses wrong pipeline order, sums aggregate field name instead of measure (physical) field name #336

Closed
miraculixx opened this issue Jan 20, 2016 · 4 comments

Comments

@miraculixx
Copy link

Based on the tutorial I converted the tutorial dataset to MongoDB, using the cube with the mongo backend. After fixing the mongo backend it almost works, except that the MongoBrowser creates a pipeline that returns wrong results:

from cubes import Workspace
workspace = Workspace(config="/home/patrick/projects/cubestest/slicer-mongo.ini")
result = browser.aggregate(drilldown=["year"])
for record in result:
    print record
=>
{u'record_count': 31, u'amount_sum': 0, u'year': 2009}
{u'record_count': 31, u'amount_sum': 0, u'year': 2010}

As you can see the amount_sum fields are 0, which is wrong. Looking at the pipeline generated reveals why:

[{"$match": {"year": {"$exists": 1}}}, 
 {"$project":        <<< wrong order, project pipeline goes after group
        {"record_count": "$record_count", 
        "amount_sum": "$amount_sum", "year": 1}
 }, 
 {"$group": {"record_count": {"$sum": 1}, 
         "_id": {"year": {"$year": "$year"}}, 
         "amount_sum": {"$sum": "$amount_sum"}}},    <<< should be "$sum:" : "$amount"
  {"$sort": {"_id.year": 1}}
]

when the two problems are fixed manually and the aggregation directly run against mongodb the results are correct:

db.ibrd.aggregate([{ "$match": { "year": { "$exists": 1 } } }, 
{ "$group": { "record_count": { "$sum": 1 }, "_id": { "year": { "$year": "$year" } }, "amount": { "$sum": "$amount"} } }, 
{ "$project": { "record_count": "$record_count", "amount": "$amount", "year": 1 } }, 
{ "$sort": { "_id.year": 1 } }])
=>
/* 1 */
{
    "_id" : {
        "year" : 2009
    },
    "record_count" : 31,
    "amount" : 550840
},
/* 2 */
{
    "_id" : {
        "year" : 2010
    },
    "record_count" : 31,
    "amount" : 566020
}

The problem probably occurs within the mapping code. Any hints?

Data ready to import into mongodb: https://gist.github.com/miraculixx/5ffc89241478e8b09883
MongoDB: 3.2
cubes: release-1.0.1 with fix for mongodb applied as per above

#slices.ini
[store]
type: mongo
url: mongodb://localhost:27017/
database: cubes
collection: ibrd
[models]
main: tutorial_model.json
@miraculixx
Copy link
Author

found a way to get it right, see PR #337. Not sure this is as intended, as it uses .measure directly without otherwise resolving the mapping.

@voldesh
Copy link

voldesh commented May 31, 2016

Are there any changes in model.json when mongodb is used instead of sqlite ?

@voldesh
Copy link

voldesh commented May 31, 2016

I am getting error "Unknown stores extension mongo". Help please.

@Stiivi
Copy link
Member

Stiivi commented Jun 3, 2016

Moved to DataBrewery/cubes-mongo#2 as the backends were separated.

@Stiivi Stiivi closed this as completed Jun 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants