Queries take tremendous amount of time (4-15 seconds) average 6 seconds #19

Open
danielbenzvi opened this issue Jan 31, 2012 · 10 comments

@danielbenzvi

Hello,

We have implemented echoprint and ingested 600,000 songs (full length) into the database.
The Tokyo database size is 65GB and the SOLR database size is 20GB.
Unfortunately queries take a tremendous amount of time. We tried to optimize the solr database but the query time didn't improve.

Example query:

INFO: [fp] webapp=/solr path=/select params={echoParams=none&fl=track_id,score&q=909794+913+1003402+913+303386+913+584877+913+232554+913+956476+913+431834+950+679300+950+240931+950+955331+950+995773+950+357692+950+593061+976+70763+976+782224+976+782173+976+622726+976+601533+976+1011732+1027+796909+1027+763780+1027+566013+1027+753588+1027+312100+1027+193929+1051+316240+1051+332598+1051+19627+1051+692188+1051+430606+1051+281612+1164+845652+1164+397749+1164+200843+1164+468276+1164+764632+1164+527099+1181+731221+1181+394872+1181+947331+1181+160401+1181+800128+1181+1018565+1203+470445+1203+242926+1203+616082+1203+339907+1203+566977+1203+5230+1258+280182+1258+156696+1258+48596+1258+673917+1258+210986+1258+957346+1296+859097+1296+32268+1296+153048+1296+447624+1296+425653+1296+328097+1347+297986+1347+888504+1347+658455+1347+998338+1347+755650+1347+127671+1425+539061+1425+152124+1425+1932+1425+490734+1425+259543+1425+777147+1489+143020+1489+532966+1489+332011+1489+610997+1489+294746+1489+531412+1566+705565+1566+891476+1566+994477+1566+398781+1566+896153+1566+868163+1630+773449+1630+836516+1630+142472+1630+1010289+1630+163754+1630+684193+1682+803149+1682+657239+1682+732748+1682+732748+1682+732748+1682+94257+1770+732748+1770+732748+1770+732748+1770+732748+1770+732748+1770+327016+898+323107+898+1036138+898+636940+898+395567+898+275946+898+317417+926+376363+926+890914+926+705682+926+636726+926+202564+926+845159+950+728869+950+828839+950+94394+950+423738+950+701545+950+887760+1038+308181+1038+263679+1038+914857+1038+354865+1038+624759+1038+316618+1116+751629+1116+130128+1116+896612+1116+848012+1116+926584+1116+553706+1181+919921+1181+782385+1181+1037873+1181+800983+1181+920557+1181+528646+1258+712675+1258+37742+1258+204631+1258+708609+1258+803397+1258+601702+1348+821948+1348+279844+1348+46205+1348+933870+1348+770909+1348+959027+1412+857557+1412+948365+1412+325558+1412+411016+1412+269648+1412+442965+1578+949268+1578+1018985+1578+471072+1578+303080+1578+23059+1578+735935+1682+10453
6+1682+889609+1682+288667+1682+765904+1682+1047370+1682+387943+1770+36761+1770+269789+1770+823578+1770+528495+1770+274497+1770+398917+1849+720708+1849+444872+1849+24326+1849+347110+1849+554892+1849+883660+1989+300268+1989+112239+1989+1047381+1989+171935+1989+525106+1989+150300+2014+775632+2014+1039872+2014+66747+2014+66747+2014+66747+2014+1044341+2079+66747+2079+66747+2079+66747+2079+66747+2079+66747+2079+98556+898+746599+898+819520+898+625029+898+739693+898+514987+898+292281+950+923614+950+1016085+950+799882+950+8204+950+48660+950+317089+1027+933171+1027+496108+1027+83259+1027+426712+1027+8204+1027+727312+1103+76705+1103+565244+1103+229512+1103+276642+1103+621608+1103+402455+1131+993209+1131+142593+1131+664342+1131+591259+1131+826795+1131+294290+1181+474509+1181+2220+1181+276642+1181+115336+1181+312690+1181+841782+1258+617667+1258+163360+1258+790518+1258+833952+1258+217401+1258+420326+1411+581324+1411+933171+1411+973988+1411+678469+1411+142734+1411+689071+1437+901208+1437+163616+1437+401241+1437+529431+1437+919267+1437+801471+1488+235059+1488+614337+1488+670384+1488+446479+1488+685010+1488+493673+1566+947773+1566+533214+1566+252420+1566+742507+1566+625488+1566+581324+1604+457499+1604+617667+1604+177010+1604+628783+1604+115336+1604+819520+1630+260405+1630+384528+1630+986218+1630+811808+1630+183164+1630+292281+1758+83259+1758+8204+1758+747297+1758+790518+1758+158959+1758+933171+1835+83259+1835+590701+1835+747297+1835+237128+1835+504687+1835+623755+1912+1015745+1912+187178+1912+907455+1912+907455+1912+907455+1912+933171+1989+907455+1989+907455+1989+907455+1989+907455+1989+907455+1989+644707+898+355989+898+262809+898+901066+898+197301+898+846082+898+97489+950+809516+950+547829+950+206587+950+923296+950+361389+950+84536+1026+192110+1026+241444+1026+356019+1026+547829+1026+979165+1026+465926+1103+46194+1103+367139+1103+273667+1103+537071+1103+874329+1103+46673+1180+175512+1180+46194+1180+1036550+1180+315117+1180+803334+1180+839228+1202+349948+1202+88536+1202+262368+1202+
668559+1202+36704+1202+648336+1258+308864+1258+10297+1258+536353+1258+918294+1258+54691+1258+874204+1335+272900+1335+97489+1335+171709+1335+356019+1335+547829+1335+646927+1378+351318+1378+1016527+1378+688694+1378+679940+1378+930878+1378+84536+1412+868276+1412+433288+1412+948093+1412+180168+1412+16059+1412+299279+1488+2843+1488+904122+1488+91558+1488+213518+1488+991955+1488+334613+1566+827953+1566+730944+1566+5278+1566+36280+1566+11265+1566+84536+1682+597540+1682+115352+1682+940566+1682+929096+1682+678580+1682+392847+1758+335563+1758+423247+1758+209740+1758+373538+1758+727531+1758+325153+1835+254323+1835+704166+1835+1026159+1835+1026159+1835+1026159+1835+931697+1875+1026159+1875+1026159+1875+1026159+1875+1026159+1875+1026159+1875+413450+899+827430+899+377816+899+342246+899+1001758+899+602740+899+546034+950+190166+950+1026816+950+997142+950+525328+950+684191+950+499217+1026+910321+1026+42130+1026+451494+1026+749446+1026+946414+1026+326065+1056+55811+1056+56244+1056+950756+1056+911180+1056+587607+1056+231252+1163+809979+1163+936140+1163+544586+1163+912950+1163+944283+1163+912546+1180+559888+1180+751143+1180+1026461+1180+52526+1180+676102+1180+599902+1258+264487+1258+129208+1258+338714+1258+206006+1258+351953+1258+536701+1283+98059+1283+31802+1283+925981+1283+418261+1283+224128+1283+25289+1349+870120+1349+495037+1349+172361+1349+492381+1349+425795+1349+744587+1413+61147+1413+777664+1413+1025932+1413+780273+1413+477990+1413+648544+1482+226590+1482+524066+1482+926283+1482+1017760+1482+932879+1482+373392+1566+45268+1566+650641+1566+30935+1566+577205+1566+274517+1566+345040+1605+200688+1605+440760+1605+49828+1605+189296+1605+605412+1605+633776+1631+658728+1631+988114+1631+609538+1631+191745+1631+969433+1631+958651+1758+379088+1758+656887+1758+704445+1758+381911+1758+736590+1758+134776+1791+643598+1791+284679+1791+1040459+1791+240343+1791+397235+1791+50740+1849+86300+1849+778393+1849+393888+1849+521884+1849+589014+1849+633526+1989+479631+1989+428713+1989+707586+1989+707586+1
989+707586+1989+950298+2066+707586+2066+707586+2066+707586+2066+707586+2066+707586+2066+567370+898+856052+898+417830+898+607505+898+575141+898+1023879+898+198147+949+906970+949+852357+949+410162+949+519528+949+360064+949+840947+1026+86416+1026+852357+1026+410162+1026+519528+1026+821038+1026+937532+1103+580606+1103+419146+1103+417963+1103+1004729+1103+1046870+1103+937532+1180+994180+1180+788786+1180+417963+1180+1004729+1180+838045+1180+855512+1258+994180+1258+319803+1258+417963+1258+519528+1258+360064+1258+198147+1335+86416+1335+852357+1335+397752+1335+22433+1335+931971+1335+840947+1411+413342+1411+635013+1411+1021675+1411+11367+1411+862073+1411+402613+1488+233303+1488+21159+1488+520725+1488+145256+1488+913842+1488+887192+1566+998048+1566+328388+1566+589346+1566+526930+1566+182214+1566+74333+1629+423390+1629+417830+1629+475237+1629+502441+1629+56572+1629+198147+1682+410162+1682+519528+1682+555287+1682+45885+1682+519824+1682+86416+1758+410162+1758+1046870+1758+555287+1758+956708+1758+212352+1758+788786+1835+1004729+1835+821038+1835+755198+1835+755198+1835+755198+1835+840947+1989+755198+1989+755198+1989+755198+1989+755198+1989+755198+1989+522420+912+782348+912+743398+912+659944+912+532858+912+645434+912+418126+950+244147+950+480019+950+793161+950+485121+950+1019099+950+58179+1026+147904+1026+261231+1026+1004888+1026+212699+1026+55080+1026+55623+1050+46122+1050+157427+1050+736076+1050+531490+1050+272849+1050+217998+1103+352202+1103+817059+1103+466481+1103+648026+1103+599913+1103+285160+1142+769579+1142+98482+1142+274099+1142+592962+1142+915113+1142+817059+1180+310085+1180+993131+1180+316164+1180+653951+1180+48965+1180+899550+1258+627427+1258+447399+1258+834624+1258+484882+1258+727121+1258+189081+1335+94272+1335+481689+1335+345585+1335+1045346+1335+726652+1335+88655+1437+989447+1437+614883+1437+30805+1437+234430+1437+643247+1437+249221+1489+605851+1489+19438+1489+160094+1489+352315+1489+829851+1489+798933+1566+866675+1566+158715+1566+529925+1566+801105+1566+803628+1566+5
37603+1682+87298+1682+360185+1682+107280+1682+911687+1682+29681+1682+602499+1771+76392+1771+779655+1771+85794+1771+961288+1771+643938+1771+810763+1835+566764+1835+668780+1835+412470+1835+412470+1835+412470+1835+501111+1989+412470+1989+412470+1989+412470+1989+412470+1989+412470+1989+225288+898+110450+898+34699+898+377652+898+297841+898+727573+898+882726+950+607266+950+801232+950+84196+950+1004925+950+874779+950+233656+1026+607266+1026+292406+1026+84196+1026+126734+1026+401335+1026+581373+1103+946649+1103+292406+1103+932103+1103+386298+1103+685425+1103+581373+1181+406763+1181+986394+1181+713214+1181+172253+1181+211470+1181+866441+1258+778577+1258+781727+1258+713214+1258+721558+1258+874779+1258+1027650+1335+496259+1335+801232+1335+469418+1335+964605+1335+620965+1335+820042+1358+136096+1358+891277+1358+640040+1358+564206+1358+809061+1358+117152+1488+362914+1488+43642+1488+334647+1488+828162+1488+580511+1488+422394+1566+845017+1566+265278+1566+257711+1566+609593+1566+738605+1566+687628+1622+365286+1622+1017500+1622+767586+1622+178714+1622+377700+1622+310182+1682+233656+1682+634147+1682+607266+1682+781727+1682+960386+1682+921346+1758+1027650+1758+677987+1758+496259+1758+946649+1758+292406+1758+518035+1782+836578+1782+820042+1782+436089+1782+772516+1782+577434+1782+233656+1835+607266+1835+157632+1835+84196+1835+172253+1835+401335+1835+677987+1912+778577+1912+292406+1912+1038773+1912+718850+1912+643934+1912+233656+1989+696335+1989+8780+1989+809256+1989+809256+1989+809256+1989+409865+2066+809256+2066+809256+2066+809256+2066+809256+2066+809256+2066&qt=/hashq&wt=standard&rows=30&version=2.2} hits=5923078 status=0 QTime=6849
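For comparing many requests like the one above, a small helper can pull the hit count, the QTime, and the number of query tokens out of each log line. This is only a sketch: the function name `summarize_solr_log_line` is made up for illustration, and it assumes the log line has the same `params={...} hits=... QTime=...` shape as the example.

```python
import re
from urllib.parse import parse_qs


def summarize_solr_log_line(line):
    """Extract hits, QTime and query-token counts from one Solr request log line.

    Hypothetical helper: assumes the 'params={...} hits=N status=S QTime=T'
    layout seen in the example log line.
    """
    params_match = re.search(r"params=\{(.*)\}", line, re.DOTALL)
    hits_match = re.search(r"hits=(\d+)", line)
    qtime_match = re.search(r"QTime=(\d+)", line)
    if not (params_match and hits_match and qtime_match):
        raise ValueError("not a recognizable Solr request log line")

    # parse_qs decodes '+' as a space, so the q parameter splits into tokens.
    params = parse_qs(params_match.group(1))
    tokens = params.get("q", [""])[0].split()
    return {
        "tokens": len(tokens),
        "distinct_tokens": len(set(tokens)),
        "hits": int(hits_match.group(1)),
        "qtime_ms": int(qtime_match.group(1)),
    }
```

Running this over a whole log file would show whether QTime tracks the number of query tokens or the number of hits.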

Are we doing anything wrong here?

Sincerely,
Daniel.

@ghost ghost assigned alnesbit Feb 8, 2012
@alnesbit
Contributor

Hello Daniel,

Apologies for the delay in getting back to you regarding this. Are you still having this problem?

Did you split the fingerprint codes into overlapping segments upon ingestion or are you ingesting full codes? In other words, how did you run the ingestion, and was split=True or split=False set when calling fp.py:ingest?

What happens when you make rapid, repeated queries of the Solr index? Do all the queries take a long time or

Andrew

@danielbenzvi
Author

Hello Andrew,
We are still experiencing this problem, and it has persisted through upgrades.

Repeated queries for exactly the same result set become faster, but not significantly. The fastest we have seen is 1.9 seconds. The slowest was 28 seconds, and it was the only query running on the system at the time.

We ingested full length codes and we used split=True (as defined in fp.py).

All the queries take a long time.
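To make the difference between the first query and repeated queries easy to reproduce, the timings could be collected with something like the sketch below. The names `time_repeated_queries` and `summarize_timings` are made up for illustration, and `run_query` stands for whatever callable sends one fixed query to Solr (for example, a wrapper around the lookup in fp.py); none of this is part of echoprint itself.

```python
import time


def time_repeated_queries(run_query, repeats=5):
    """Call run_query() `repeats` times and return the elapsed seconds per call.

    run_query is a hypothetical zero-argument callable that issues one
    fixed query against the Solr index.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def summarize_timings(timings):
    """Separate the first (possibly cold-cache) call from the repeated ones."""
    repeats = timings[1:]
    return {
        "first": timings[0],
        "best_repeat": min(repeats) if repeats else None,
    }
```

If `best_repeat` stays in the range of seconds, caching is probably not the main factor and the cost is in the search itself.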

@alnesbit alnesbit reopened this Feb 22, 2012
@Tomtomgo

I am experiencing exactly the same issue. Did you find a solution, @danielbenzvi?

@alnesbit
Contributor

We have recently been investigating this issue in detail and have found that when the Solr database becomes very large, query performance can indeed suffer in this way if the entire index is deployed on a single Solr core on one server.

We've found various solutions that have helped tremendously in reducing the time required to perform a query (improvements of about an order of magnitude). One of these solutions involves sharding, which requires a more complicated Solr setup. Another solution involves changes to the way the fingerprints are actually indexed and queried. We have had great success in running these improvements on our servers that are behind the song/identify method on our API.

We will push out source code when it is ready for GitHub, for example, to make a more sophisticated Solr configuration easier to deploy out-of-the-box (no ETA yet). But this will most likely involve large changes to the back end rather than tweaking the current setup.
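Until that code is available, here is a rough sketch of what the sharding idea might look like from the client side. It is not the setup described above. It assumes tracks are assigned to shards by a stable hash of `track_id` and that queries go through stock Solr distributed search via the `shards` request parameter. The host names are hypothetical, and whether the custom `/hashq` handler works with distributed search is not confirmed in this thread.

```python
import zlib
from urllib.parse import urlencode


def shard_for_track(track_id, num_shards):
    """Pick a shard index for a track using a stable hash (assumed scheme)."""
    return zlib.crc32(track_id.encode("utf-8")) % num_shards


def distributed_query_url(base_url, shard_hosts, query_codes, rows=30):
    """Build a select URL that asks Solr to fan the query out to all shards.

    shard_hosts are hypothetical 'host:port/path' strings; 'shards' is the
    standard Solr distributed-search parameter.
    """
    params = {
        "q": " ".join(query_codes),
        "fl": "track_id,score",
        "rows": rows,
        "shards": ",".join(shard_hosts),
    }
    return base_url + "?" + urlencode(params)
```

At ingestion time, `shard_for_track` would decide which Solr instance receives a track, and at query time the URL lets one instance merge the results from all of them.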

@ranger123

Andrew, could you provide a little more detail on how you've adjusted the indexing and queries to improve the Solr query times? I'm struggling to get an acceptable response time for a large collection and am interested in any direction you may be able to provide. Thanks.

@zemariamm

Same problem here, guys: Solr is taking too long to answer. I get response times of around 5 seconds per query. I used the patches suggested by Justin Haygood (https://groups.google.com/forum/#!topic/echoprint/J7MQftCfpCM), which improved the recognition significantly. Any ideas?

@alnesbit
Contributor

Increasing the density of hash codes will improve the OTA recognition rate, but this will also make the Solr part of the search significantly slower.

The overall ideas for improving the scalability of the index are the following:

  • reducing the number of hash codes by omitting uninformative hash codes from the index and queries
  • improving the efficiency of the search algorithm in Solr (Solr 4.x already has a patch for this, but to use it we need to upgrade from Solr 1.4 to Solr 4)
  • changing the architecture of the index itself

We've already tried the first approach. It improves the results but it is a hack, and the other approaches are better.
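As an illustration of the first idea only, the sketch below treats a hash code as "uninformative" when it occurs in more than a given fraction of all tracks, much like a stop word in text search. That definition is an assumption and has not been confirmed as the criterion used here. The function names and the `max_fraction` threshold are likewise made up for the example.

```python
from collections import Counter


def document_frequencies(tracks):
    """Count in how many tracks each hash code occurs.

    tracks: dict mapping track_id -> iterable of hash codes.
    """
    df = Counter()
    for codes in tracks.values():
        df.update(set(codes))  # count each code once per track
    return df


def drop_common_codes(codes, df, num_tracks, max_fraction=0.1):
    """Keep only codes that occur in at most max_fraction of all tracks.

    Assumed definition of 'uninformative': very common codes carry little
    information about which track a query belongs to.
    """
    limit = max_fraction * num_tracks
    return [c for c in codes if df.get(c, 0) <= limit]
```

The same filter would have to be applied both at ingestion and to every query, otherwise the query would still carry codes that no longer exist in the index.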

@zemariamm

Thanks for the fast answer Andrew! I actually ran a few tests that surprised me (with Justin Haygood's patch):

  • the OTA recognition quality is on par with some proprietary software that I tried in the past
  • every time I run a query (it doesn't matter whether I have 1 song, 100, or 2000 in the DB), it takes around 4 or 5 seconds; if I run the same query again, it takes around 20 ms on my local machine. How can this be happening? Shouldn't it be blazing fast with a basically empty database?

So replacing Solr with the newest version should speed it up, right? I'll give it a try :)

Thanks for the help!

@ranger123

Hi Andrew, Thanks for the response. I wasn't able to reach the C experimental repo either. I'd be interested in taking a look.

I did take a look at migrating to Solr 4.x, but it looks like there are a few functions that have been deprecated that prevent the hashr from compiling. I did try using a version from another user that utilizes Maven to compile other versions, but it would only compile to 3.x.

When you mention an uninformative hash, could you help me understand what type of hash value would be uninformative?
Thanks.

@danicuki

I am having the same issue here. Does anyone have a solution?
