Periodic and seemingly random errors returned from TokyoTyrant on misc getlist commands #27

Open
bgood-clip opened this issue Aug 21, 2012 · 6 comments

Comments

@bgood-clip

At random points in time we are getting errors back from Tokyo Tyrant on an echoprint query. The error codes include 49, 54, 32, 57, 53, 51, and others. Here is the stack trace that accompanies the errors:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 239, in process
    return self.handle()
  File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 420, in _delegate
    return handle_class(cls)
  File "/usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg/web/application.py", line 396, in handle_class
    return tocall(*args)
  File "/usr/local/echoprint/API/api.py", line 83, in GET
    response = fp.best_match_for_query(stuff.fp_code)
  File "/usr/local/echoprint/API/fp.py", line 194, in best_match_for_query
    tcodes = get_tyrant().multi_get(trackids)
  File "/usr/local/echoprint/API/pytyrant.py", line 292, in multi_get
    rval = self.t.misc("getlist", opts, keys)
  File "/usr/local/echoprint/API/pytyrant.py", line 540, in misc
    return list(self._misc(func, opts, args))
  File "/usr/local/echoprint/API/pytyrant.py", line 524, in _misc
    socksuccess(self.sock)
  File "/usr/local/echoprint/API/pytyrant.py", line 172, in socksuccess
    raise TyrantError(fail_code)
TyrantError: 57

Restarting the api.py process temporarily makes the errors go away. Does anyone have any insight into what is going on here?
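
One observation about the codes, offered as a hint rather than a confirmed diagnosis: Tokyo Tyrant's binary protocol answers each command with a one-byte status where 0 means success, and `socksuccess` raises on anything else. Every code quoted explicitly in this thread (49, 54, 32, 57, 53, 51, plus 56 in a later traceback) is the ASCII value of a digit or a space. Assuming the stored echoprint code strings are space-separated digits, that would be consistent with the client reading its "status byte" from the middle of an earlier response, i.e. the socket stream being out of sync. The short sketch below only does the decoding; the helper name `as_ascii` is made up for illustration.

```python
# Error codes quoted in this thread (from the report and the tracebacks).
reported_codes = [49, 54, 32, 57, 53, 51, 56]


def as_ascii(code):
    """Return the printable ASCII character for a byte value, or None."""
    return chr(code) if 32 <= code < 127 else None


# Pair each reported code with the character it would be as a payload byte.
decoded = [(code, as_ascii(code)) for code in reported_codes]

for code, char in decoded:
    print("%d -> %r" % (code, char))
```

If the codes were genuine Tokyo Tyrant status values, one would not expect them all to fall in the range of digits and the space character.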

@bgood-clip
Author

I can consistently reproduce the issue using:

ab -n 1000 http://amazon-ec2-host-here:8080/query?fp_code=eJylllGOazkIRLdkDDawHMBm_0uYys9kFOl5Pp6iPpHcUS5QVThjDKbxgOgLe79g84V4osYL9wXSfGBMekH2C-u8YPXCs1-SeuCvarZ44dAL3Q8Q1QPwzwtTX2B74d3R21d3v0D-wnvOn78_463-x7d_huQD6PqFv-nIzwu1X3jP-X-88Xb7cyO9vVH2wrNm0vPAX7n93ZHKC0-NiPOFx8457HpCbLfKLddcUtvzRKqT581jSiFqHVMZn7tyb9u6rOK668jW4hSX6tl0eTftWIub8uABXTDvjOnUJvN09N4yg5N1TvUlQqtkXSfuqKMVca7lXKbrQqy1blipjuPlp1Aq7bVGVPq5uWlNzXmlvph-lvycfVESxhdtiZbeaTKmy-VOYkI5TTxm5XJlPGdGmlltW5Ycnul8iWVh6LrixhWTi-uzhPSS3B2bdpFG6djBRMP3tXa1KXyIufXUCnROQzK_kL4ZP2dfjErcLB23IMPwiUFDh9Jz9uw2PdACVc6tLhw7MOuoBaObzvC9l81EkTZ7bYznTrppzYM5a6wyl5RgvyLe4ZXX2eZKLo81qciNMLtxjmV_Qd57_Zx9sQ_HsSLiWDH8JmmPZVBQ3VAoX3YbtPCOaMCIPqRbPlXVMFwsyt7MJqrb2bPhTbE17_kod_0qT0gHCbeMla2488vX8KLQsZQsN4zVAWd_AZHafs6-kOKxq4nSBlSmceIaRB2TP2Mks3npxKRW45aI64t4hUXCIJDBEJHPqz8jbvUWm3QSVlHCxwRp8DmhNgV0w6LywHDXPq7Y_9H4uYQdE1VjfoEvbP05-2LvQAp49ZCBoJgixA5BV-e2YX0oGPlqRGnjjWoOvls0CJYfG7XIMBs54IkND5x5OwsN3UFz7c0b7RQ1w7LFB-Yx34Gc4uHVKavq5NC0BQv_Cz25--fsC4n8jK9uz31o1kT8eSFKkNZ028aTpHNsP4FvDjLMcO4B9TaMhFedxSdrnrWyuvksvXxkwWxCLggZhqsI-SFEJ2zRxRqbo6juhdC4ij2x1SIRVWwFdB0bWaDEtB0qeWxJ3l_khPo_Z__577oYlWGPwJyrjsHKGYl40YETD_Wn1mhZQTCySetdKZ9VBcn9nIk9XTCXroEdgRlxJWP2h-h8boctyJTIZ-XmlBgHX9frzjBsAtd517XDpfDDRMYxFL80IhC8AUuccTDv5dO-aAzr_px98Q8_-QGr

The fp_code in the query returns a successful result. I ran ab on a c1.medium, and the echoprint server is also running on a c1.medium; both are running Ubuntu 12.04.

@bwhitman
Contributor

Are you using web.py straight up (using its built-in webserver)? If so, can you try it under a real webserver via WSGI?


@bgood-clip
Author

I'm running this using web.py.

Is there a sample Apache config for running echoprint under mod_wsgi?

@bgood-clip
Author

I switched to Apache and the errors are still occurring.

For reference my .htaccess looks like:

RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} !^/icons
RewriteCond %{REQUEST_URI} !^/favicon.ico$
RewriteCond %{REQUEST_URI} !^(/.*)+api.py/
RewriteRule ^(.*)$ api.py/$1 [PT,QSA]

Apache site config looks like:
<VirtualHost *:8080>
    ServerAdmin [email protected]

    DocumentRoot /usr/local/echoprint/API

    <Directory /usr/local/echoprint/API>
        SetHandler wsgi-script
        Options ExecCGI FollowSymLinks
    </Directory>

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog /var/log/apache2/access.log combined
    ServerSignature Off

@iscra

iscra commented Sep 14, 2012

I am also getting those errors on the current version of the server, just running api.py.
The errors seem to happen randomly under heavier load, when more queries are being executed. For me it happens when a lot of /ingest queries come in.

The problem is critical: soon after the first error, the server blocks and stops responding to any more queries.
Is it possible that there is some issue with threads? I'm not sure how they are handled in the Python server.

When Tokyo Tyrant is configured with more threads, the server runs a bit longer without blocking. Eventually, under high load, it always blocks, sometimes without even showing any Tokyo Tyrant errors, so maybe the real problem is in the Python server.
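
On the threading question: in the tracebacks, `get_tyrant()` returns a client that reads and writes one socket (`self.sock`). If that client is shared between server threads (an assumption, since fp.py is not shown in this thread), two requests could interleave their reads and writes on the same connection. One way to test that theory is to give each thread its own connection. The following is only a minimal sketch of the pattern: `get_connection`, `FakeConnection`, and `demo` are hypothetical names, and `FakeConnection` stands in for a real Tokyo Tyrant client so the example runs without a server.

```python
import threading

# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_connection(connect):
    """Return the calling thread's connection, creating it on first use.

    `connect` is a hypothetical zero-argument factory that opens a new
    client connection (for example, a new Tokyo Tyrant client).
    """
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = connect()
        _local.conn = conn
    return conn


class FakeConnection(object):
    """Stand-in for a real client so the sketch runs without a server."""


def demo(num_threads=4):
    """Ask for a connection twice in each thread and collect the results."""
    seen = []
    lock = threading.Lock()

    def worker():
        first = get_connection(FakeConnection)
        second = get_connection(FakeConnection)
        with lock:
            seen.append((first, second))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Within one thread, repeated calls return the same object; different threads never share one. If the errors disappear with this arrangement, that would point at shared-socket access rather than at Tokyo Tyrant itself.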

Some example traces (soon before server blocks):

10.228.66.255:52295 - - [13/Sep/2012 13:19:27] "HTTP/1.1 POST /ingest" - 200 OK
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 239, in process
    return self.handle()
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 420, in _delegate
    return handle_class(cls)
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 396, in handle_class
    return tocall(*args)
  File "/home/ubuntu/echoprint-server/API/api.py", line 55, in POST
    fp.ingest(data, do_commit=True, local=False)
  File "/home/ubuntu/echoprint-server/API/fp.py", line 586, in ingest
    get_tyrant().multi_set(codes)
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 307, in multi_set
    self.t.misc("putlist", opts, lst)
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 540, in misc
    return list(self._misc(func, opts, args))
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 524, in _misc
    socksuccess(self.sock)
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 172, in socksuccess
    raise TyrantError(fail_code)
TyrantError: 56

10.228.66.255:52297 - - [13/Sep/2012 13:19:28] "HTTP/1.1 POST /ingest" - 500 Internal Server Error
10.228.66.255:52300 - - [13/Sep/2012 13:19:29] "HTTP/1.1 POST /ingest" - 200 OK

10.32.7.67:34282 - - [13/Sep/2012 15:12:18] "HTTP/1.1 POST /ingest" - 200 OK
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 239, in process
    return self.handle()
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 230, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 420, in _delegate
    return handle_class(cls)
  File "/usr/local/lib/python2.7/dist-packages/web/application.py", line 396, in handle_class
    return tocall(*args)
  File "/home/ubuntu/echoprint-server/API/api.py", line 55, in POST
    fp.ingest(data, do_commit=True, local=False)
  File "/home/ubuntu/echoprint-server/API/fp.py", line 586, in ingest
    get_tyrant().multi_set(codes)
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 307, in multi_set
    self.t.misc("putlist", opts, lst)
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 540, in misc
    return list(self._misc(func, opts, args))
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 524, in _misc
    socksuccess(self.sock)
  File "/home/ubuntu/echoprint-server/API/pytyrant.py", line 172, in socksuccess
    raise TyrantError(fail_code)
TyrantError: 51

10.32.7.67:34283 - - [13/Sep/2012 15:12:20] "HTTP/1.1 POST /ingest" - 500 Internal Server Error

@bgood-clip
Author

@iscra Turning off Keep-Alives made my situation a lot better. That said, I still hit the problem about 2-3 times per day at our current load.

Now when this problem crops up, the Apache worker has eaten a ton of memory (700 MB to 1.5 GB). Once I restart Apache, a log statement comes out for a request that happened hours earlier, as if a thread had hung.
