Thousands of patients and visitors pass through a hospital every day. Each of them presents a case that must be handled by a particular department, and the hospital management system must store separate records for every one of those cases. Handling these records manually is laborious. Hospitals therefore need to keep their documents organized so that physicians and researchers can readily retrieve the particular type of document they require.
This project proposes the automatic classification of large collections of medical documents. The aim is to improve the accuracy and consistency of classification while keeping the error rate to a minimum.
In this project, a dataset of medical files from five medical domains (Gastroenterology, Neurology, Orthopedic, Radiology and Urology) was preprocessed. Using this dataset, classifiers were built to assign each document to its domain. The models were evaluated on test data, and the Support Vector Machine was observed to give the best performance.
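The pipeline summarized above (preprocessing, training a classifier, evaluating on test data) can be illustrated with a minimal sketch. It is not the project's implementation: the abstract does not state the features, the SVM variant or the training procedure. The sketch assumes normalized bag-of-words features and a one-vs-rest linear SVM without a bias term, fitted by subgradient descent on the hinge loss. The stopword list, the toy documents and all function names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list and toy documents, invented for illustration only.
STOPWORDS = {"the", "and", "with", "a", "of", "in"}

TRAIN = [
    ("Endoscopy showed gastric ulcer and colon inflammation", "Gastroenterology"),
    ("Liver biopsy and colon polyp removal during colonoscopy", "Gastroenterology"),
    ("Seizure activity with migraine and nerve tremor", "Neurology"),
    ("Stroke affected brain and nerve function", "Neurology"),
    ("Knee fracture repaired with bone graft", "Orthopedic"),
    ("Hip joint replacement after bone fracture", "Orthopedic"),
    ("Chest xray and contrast scan reviewed", "Radiology"),
    ("MRI scan imaging with contrast", "Radiology"),
    ("Kidney stone blocked the bladder", "Urology"),
    ("Prostate exam and bladder catheter", "Urology"),
]

TEST = [
    ("Colonoscopy found a gastric polyp", "Gastroenterology"),
    ("Migraine and tremor with seizure", "Neurology"),
    ("Bone fracture in the knee joint", "Orthopedic"),
    ("Contrast MRI imaging", "Radiology"),
    ("Bladder and kidney stone", "Urology"),
]


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def vectorize(text):
    """Return an L2-normalized bag-of-words vector as a sparse dict."""
    counts = Counter(preprocess(text))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(tok, 0.0) * val for tok, val in x.items())


def train_binary_svm(vectors, targets, epochs=100, lr=0.1, lam=0.01):
    """Fit a linear SVM (no bias) by subgradient descent on the
    L2-regularized hinge loss. Targets are +1.0 or -1.0."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, targets):
            violated = y * dot(w, x) < 1.0
            # Gradient step for the regularization term: shrink all weights.
            for tok in w:
                w[tok] *= 1.0 - lr * lam
            # Hinge-loss subgradient step, taken only if the margin is violated.
            if violated:
                for tok, val in x.items():
                    w[tok] = w.get(tok, 0.0) + lr * y * val
    return w


def train_one_vs_rest(docs):
    """Train one binary SVM per domain (that domain vs. all others)."""
    vectors = [vectorize(text) for text, _ in docs]
    labels = sorted({label for _, label in docs})
    return {
        label: train_binary_svm(
            vectors, [1.0 if lab == label else -1.0 for _, lab in docs]
        )
        for label in labels
    }


def predict(models, text):
    """Pick the domain whose classifier gives the highest score."""
    x = vectorize(text)
    return max(models, key=lambda label: dot(models[label], x))


def accuracy(models, docs):
    """Fraction of documents assigned to their labeled domain."""
    return sum(predict(models, text) == label for text, label in docs) / len(docs)


models = train_one_vs_rest(TRAIN)
print("test accuracy:", accuracy(models, TEST))
```

In practice a library implementation of the SVM and richer features such as TF-IDF would probably replace the hand-written training loop. The toy data here is far too small to say anything about real performance.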