suryanshagnihotri/wiki-search-engine

wiki-search-engine

Search engine built on a 75 GB wiki dump. This project generates a sorted inverted index for the specified dump, optimized with compression techniques. Given a dump, it creates the inverted index file in the Index/ folder, a tree of index files in the Split/ folder for the inverted index, and a tree in the Title/ folder for the title-docID mapping file. The inverted index and the title mapping file can be found in the Index/ folder.
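
The README does not say which compression techniques are used. As one common possibility, the sketch below shows gap (delta) encoding of a sorted posting list, which stores differences between consecutive document IDs instead of the IDs themselves. The function names to_gaps and from_gaps are hypothetical and are not taken from this repository.

```python
def to_gaps(doc_ids):
    """Replace sorted doc IDs with the difference from the previous ID."""
    return [cur - prev for prev, cur in zip([0] + doc_ids, doc_ids)]


def from_gaps(gaps):
    """Rebuild the original doc IDs by taking a running sum of the gaps."""
    doc_ids, total = [], 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


postings = [1000, 1003, 1010]
gaps = to_gaps(postings)
print(gaps)             # [1000, 3, 7]
print(from_gaps(gaps))  # [1000, 1003, 1010]
```

Small gaps need fewer digits or bytes than full IDs, which is why this encoding tends to shrink long posting lists.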

run start.sh
Indexer: build the index
time python indexer.py ./wiki-search-small.xml (the XML file is the dump to index)
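
To illustrate what an indexing step of this kind produces, here is a minimal sketch of building a sorted inverted index in memory. It is not the code in indexer.py: the build_inverted_index function, the docs dictionary and the simple tokenizer are assumptions made for the example, and a real indexer would parse documents out of the XML dump and write partial indexes to disk.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of (doc_id, term_frequency) pairs.

    `docs` is a hypothetical dict of doc_id -> text.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    # Sort terms and postings so the index can later be merged and searched.
    return {term: sorted(postings.items())
            for term, postings in sorted(index.items())}


docs = {1: "Wiki search engine", 2: "Search the wiki dump, search fast"}
index = build_inverted_index(docs)
print(index["search"])  # [(1, 1), (2, 2)]
```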

Run Kwaymerge.py: merge the small index files
time python Kwaymerge.py
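
The idea behind a k-way merge can be sketched as follows, assuming Python 3 and assuming each small file holds (term, postings) pairs already sorted by term. This is not the code in Kwaymerge.py: the kway_merge function and the in-memory lists standing in for files are hypothetical.

```python
import heapq
from itertools import groupby


def kway_merge(runs):
    """Merge sorted runs of (term, postings) pairs.

    Postings for a term that appears in several runs are combined.
    """
    merged = heapq.merge(*runs, key=lambda pair: pair[0])
    for term, group in groupby(merged, key=lambda pair: pair[0]):
        postings = sorted(doc_id for _, plist in group for doc_id in plist)
        yield term, postings


# Two small sorted runs standing in for partial index files.
run_a = [("engine", [1]), ("wiki", [1, 3])]
run_b = [("search", [2]), ("wiki", [2])]
result = list(kway_merge([run_a, run_b]))
print(result)  # [('engine', [1]), ('search', [2]), ('wiki', [1, 2, 3])]
```

heapq.merge reads its inputs lazily, so with file iterators in place of the lists only one entry per run needs to be in memory at a time.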

Copy the merged file from output_files to the index folder, and create a split folder inside the index folder.
Run create_index.py: create the multilevel index
python create_index.py
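
A multilevel index lets a lookup touch a small top-level table first and then only one block of the full index. The sketch below shows a single sparse level over a sorted term list; more levels would repeat the same step on the sparse keys. It is not the code in create_index.py: build_sparse_level, lookup and the fanout parameter are hypothetical names.

```python
from bisect import bisect_right


def build_sparse_level(keys, fanout):
    """Keep every `fanout`-th key together with its position in `keys`."""
    return [(keys[i], i) for i in range(0, len(keys), fanout)]


def lookup(term, terms, sparse, fanout):
    """Find the position of `term` in sorted `terms`, or None if absent."""
    sparse_keys = [key for key, _ in sparse]
    block = bisect_right(sparse_keys, term) - 1
    if block < 0:
        return None  # term sorts before the first indexed key
    start = sparse[block][1]
    # Scan only the one block that can contain the term.
    for pos in range(start, min(start + fanout, len(terms))):
        if terms[pos] == term:
            return pos
    return None


terms = ["dump", "engine", "index", "merge", "query", "search", "wiki"]
sparse = build_sparse_level(terms, fanout=3)
print(sparse)                                    # [('dump', 0), ('merge', 3), ('wiki', 6)]
print(lookup("query", terms, sparse, fanout=3))  # 4
```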

Finally, run query.py: answer queries
python query.py
