Text (source code) search engine with indexer and a front end web interface to search. Uses Python v2/3.

keyboardWitch/text-sherlock

 
 

Text Sherlock (or Sherlock or TS)

Provides a fast search engine for text, mostly source code, that is easy to install and use. OpenGrok requires too much time to install (though it may be worth it for some). Sherlock gives you a much easier setup, a text indexer, and a web app interface for searching.

Basic Setup

Instructions:

  1. Download sherlock source from GitHub.
  2. Extract/place the sherlock source code in the desired (install) directory. This will be where sherlock lives.
  3. Run sh setup/virtualenv-setup.sh to set up an isolated environment and download core packages.
  4. Configure settings. The defaults in settings.py provide documentation for each setting.
    • Copy example.local_settings.yml to local_settings.yml.
    • Override/copy any setting from settings.py to local_settings.yml (change the values as needed). All YAML keys/options must be lowercase.
  5. Run source sherlock_env/bin/activate to enter the virtual environment.
  6. Run python main.py --index update or --index rebuild to index the path specified in the settings. Watch indexing output.
  7. Run python main.py --runserver to start the web server.
  8. Go to http://localhost:7777 to access the web interface. The UI is built with Twitter Bootstrap.
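
The override rule in step 4 can be pictured with a short Python sketch. This is only an illustration of "local values replace defaults, and keys must be lowercase"; apply_overrides is a hypothetical helper, not Sherlock's actual settings loader, and the sample keys are borrowed from the settings named later in this README.

```python
def apply_overrides(defaults, overrides):
    """Return defaults with overrides layered on top.

    Hypothetical helper for illustration only; it does not read
    settings.py or local_settings.yml.
    """
    # Reject keys that are not entirely lowercase, mirroring the README rule.
    bad = [key for key in overrides if key != key.lower()]
    if bad:
        raise ValueError("keys must be lowercase: " + ", ".join(sorted(bad)))
    merged = dict(defaults)   # copy so the defaults stay untouched
    merged.update(overrides)  # local values win over defaults
    return merged


defaults = {"server_type": "default", "default_indexer": "whoosh"}
local = {"server_type": "cherrypy"}
settings = apply_overrides(defaults, local)
```

Here settings keeps default_indexer from the defaults and takes server_type from the local overrides.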

On Ubuntu, you may need to install some packages before the installation will run without error.

  • Install curl: sudo apt-get install curl
  • Install uuid libs: sudo apt-get install uuid-dev
  • Install python dev: sudo apt-get install python-dev

Includes:

  • Settings/Configuration
  • Setup script (read contents of script for more information)
  • Main controller script
    • Run main.py -h for more information.
  • End-to-end interface
    • Indexing and searching text (source code). Built-in support for whoosh (fast searching) or xapian (much faster searching).
      • Easily extend indexing or searching via custom backends.
    • Front end web app served using werkzeug or cherrypy.
      • werkzeug is suited to development and low-traffic use.
      • cherrypy is a high-speed, production-ready, thread-pooled, generic HTTP server.
    • Settings and configuration using Python.

Web Interface

Features:

Append the following to a document URL:

  • To highlight lines, append to URL: &hl=3,7,12-14,21
  • To jump to a line, append to end of URL: #line-3
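
As a rough sketch, the two suffixes above can be generated from a list of line numbers. The helper names and the example document URL below are hypothetical; only the &hl= and #line- formats come from this README.

```python
def format_highlight(lines):
    """Collapse line numbers into the hl form, e.g. 3,7,12-14,21."""
    nums = sorted(set(lines))
    parts = []
    i = 0
    while i < len(nums):
        start = end = nums[i]
        # Extend the current run while the next number is consecutive.
        while i + 1 < len(nums) and nums[i + 1] == end + 1:
            i += 1
            end = nums[i]
        parts.append(str(start) if start == end else "%d-%d" % (start, end))
        i += 1
    return ",".join(parts)


def document_link(url, highlight=(), jump_to=None):
    """Append &hl=... and #line-N to an existing document URL."""
    if highlight:
        url += "&hl=" + format_highlight(highlight)
    if jump_to is not None:
        url += "#line-%d" % jump_to  # the fragment goes at the very end
    return url


# Placeholder URL: the real document URL is whatever the web interface shows.
base = "http://localhost:7777/document?path=main.py"
link = document_link(base, highlight=[3, 7, 12, 13, 14, 21], jump_to=3)
```

The highlight parameter is appended before the line anchor because a URL fragment must come last.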

Using other backends

In settings.py:

  • Change the default_indexer and default_searcher values to match the name given to the backend.
    • Possible values:
      • whoosh: the default; no extra work needed.
      • xapian: must be installed separately using the included setup/install-xapian.sh script.

Using other web servers

Text Sherlock has built-in support for two WSGI-compliant servers, werkzeug and cherrypy.

In settings.py:

  • Change the server_type value to one of the available server types.
    • Possible values:
      • default: the werkzeug web server (default).
      • cherrypy: a production-ready web server.
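
A small check like the following can catch a mistyped backend or server name before starting Sherlock. It is a sketch that covers only the built-in values listed in this README; check_builtin_choices is a hypothetical helper, and a custom backend name would legitimately appear in its output.

```python
# Built-in names as documented in this README.
BUILTIN_BACKENDS = ("whoosh", "xapian")
BUILTIN_SERVERS = ("default", "cherrypy")


def check_builtin_choices(settings):
    """Return a list of messages for values that are not built-in names.

    Hypothetical helper: it inspects a plain dict and does not load
    Sherlock's settings itself. Keys that are absent are skipped.
    """
    expected = {
        "default_indexer": BUILTIN_BACKENDS,
        "default_searcher": BUILTIN_BACKENDS,
        "server_type": BUILTIN_SERVERS,
    }
    problems = []
    for key in sorted(expected):
        if key in settings and settings[key] not in expected[key]:
            problems.append("%s=%r is not one of %s" % (
                key, settings[key], ", ".join(expected[key])))
    return problems


ok = check_builtin_choices({
    "default_indexer": "xapian",
    "default_searcher": "xapian",
    "server_type": "cherrypy",
})
typo = check_builtin_choices({"server_type": "cherypy"})
```

An empty list means every value present matches a built-in name.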

Core packages

Requires Python 2.6/3+

Other References

Project Goals

  1. Provide a text search engine solution that is easy to set up, fast, and adequate.
  2. Be a respectable alternative to OpenGrok.
  3. Influence the authors of OpenGrok to provide a simpler setup process.
    • I successfully set up two OpenGrok installations, on CentOS and Ubuntu 11.x, and each took more than two hours. TS setup takes less than 10 minutes (excluding package download time).
