A web application that takes a user's question, surveys open access research articles about a given policy, and returns an answer to the user.
Every year, millions of research articles on different policies and their consequences are published. Each is a rich source of information that can help policy advisors determine which policies to implement. This application carries out a number of functions:
- Provide a chatbot that the user interacts with.
- Based on the user's answers, generate a question about a given policy consequence.
- Process the question to obtain keywords and then use these keywords to fetch open access research articles about the policy.
- Perform claim detection on the abstract of each article to determine whether the article found evidence for or against a given policy consequence.
- Count the total number of articles that are for or against a given policy consequence.
- Display this data in a visualization and also provide short summaries of the policy and its consequence.
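The counting step above is simple enough to sketch. The function and label names below are illustrative assumptions, not the project's actual code:

```python
from collections import Counter

def tally_claims(claim_labels):
    """Count abstracts for and against a policy consequence.

    claim_labels: one hypothetical label per article, either "for"
    or "against", as produced by the claim detection step.
    """
    counts = Counter(claim_labels)
    return {"for": counts.get("for", 0), "against": counts.get("against", 0)}
```

A result such as {"for": 3, "against": 2} would then feed the visualization directly.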
Make sure that you have Redis, Python 3.6+, pip, and virtualenv installed on your computer.
Clone the git repository and change into the top directory.
git clone https://github.com/Prosper21/Policy-Question-Answering-System
cd Policy-Question-Answering-System
Create a virtual environment and activate it.
virtualenv venv
. venv/bin/activate
Install all the required packages by running the command:
pip install -r requirements.txt
Create a .env file to store your environment variables. In our case, this would have the following values.
SECRET_KEY = 'xxx' # Use your own secret key here
REDIS_URL = 'redis://localhost:6379'
We now have all the files and packages that we need to run the application locally. In one terminal, start the Redis server by running:
redis-server
In a second terminal, change into the Policy-Question-Answering-System directory, activate the virtual environment, and start the celery workers by running:
celery worker -A answer_policy_question.celery -O fair
In a third terminal, change into the Policy-Question-Answering-System directory, activate the virtual environment, and start your application by running:
python app.py
You should see the application running and you can access it on localhost:5000.
Open an account on Heroku at https://signup.heroku.com/.
Install the Heroku CLI on your computer and carry out any other required configurations. The details can be found here https://devcenter.heroku.com/articles/heroku-cli.
Clone the git repository and change into the top directory.
git clone https://github.com/Prosper21/Policy-Question-Answering-System
cd Policy-Question-Answering-System
Create an app on Heroku, which prepares Heroku to receive your source code.
heroku create
In this case, Heroku generates a random name for your app. You can also provide your own name by running:
heroku create <app-name>
Because this application uses Heroku add-ons, we need to create those first. In our case, we need the Redis add-on. We do this by running:
heroku addons:create heroku-redis:hobby-dev
This will automatically set the REDIS_URL configuration variable for our application, but we also need to set the SECRET_KEY configuration variable. We do this by running:
heroku config:set SECRET_KEY='xxx' # Use your own secret key here
We can now deploy to Heroku by running:
git push heroku master
Go to the Heroku dashboard and make sure that your app's web and worker dynos are both switched on.
You can now visit the app at the URL generated from its name, i.e. app-name.herokuapp.com, or simply open the app by running:
heroku open
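The web and worker dynos correspond to process types declared in the repository's Procfile. Judging from the local-run commands above, it would look something like this (inferred, not copied from the repo; a production setup would typically serve the web process through gunicorn instead of the development server):

```
web: python app.py
worker: celery worker -A answer_policy_question.celery -O fair
```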
- Python 3.6+
- JavaScript/HTML/CSS
- Heroku
If you make any changes, for example by installing new packages and running
pip freeze > requirements.txt
then you will have to add
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.1.0/en_core_web_sm-2.1.0.tar.gz#egg=en_core_web_sm
to your requirements.txt. This ensures that the required spaCy model is loaded. If this causes a 'Double requirement given' error upon deployment, look for
en-core-web-sm==x.x.x
in your requirements.txt file and delete it.
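After the fix, the relevant lines of requirements.txt would look like this (using the 2.1.0 version from the download URL above):

```
# direct download of the spaCy model -- keep this line:
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.1.0/en_core_web_sm-2.1.0.tar.gz#egg=en_core_web_sm
# a duplicate pin like the following must be deleted:
# en-core-web-sm==2.1.0
```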
At the moment, the chatbot assumes that the user provides very specific replies rather than full sentences. For example, when the bot asks 'What policy would you like to research today?', it expects a straight answer like 'cap and trade' rather than 'I would like to research cap and trade'. The same applies to the question 'And what effect of cap and trade on carbon emissions are you interested in?' where we expect an answer like 'carbon emissions' rather than 'I would like to know its effect on carbon emissions'. The reason for this is that the user's replies are being handled using JavaScript in the HTML files that render the page and thus there is no way to apply natural language processing techniques to automatically identify the policy or phenomenon of interest from full sentence replies. I have not yet figured out a way to do the processing of full sentence replies from the backend while maintaining a conversational flow but I believe this can be done.
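As a very rough first step in that direction, conversational lead-ins could be stripped with plain regular expressions before any NLP is applied. The patterns below are illustrative guesses, not part of the project:

```python
import re

# A few lead-in phrases a user might wrap around the actual answer
LEAD_INS = [
    r"i(?:'d| would) like to (?:research|know(?: its effect on)?)",
    r"i am interested in",
    r"tell me about",
]
LEAD_IN_RE = re.compile(r"^(?:{})\s*".format("|".join(LEAD_INS)), re.IGNORECASE)

def extract_topic(reply):
    """Reduce 'I would like to research cap and trade' to 'cap and trade'.

    Replies matching no pattern are returned unchanged (minus trailing
    punctuation), so short, direct answers still work.
    """
    return LEAD_IN_RE.sub("", reply.strip()).rstrip(".?!")
```

A real solution would need something sturdier, such as noun-phrase extraction with spaCy, but even this shows that reply handling can live in the backend while keeping the conversational flow.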
Another possible improvement is moving all the JavaScript code that is in the HTML files to the JavaScript folder in the static directory. The recommended practice is that JavaScript and HTML code should not be mixed.
Since we are fetching abstracts from three APIs, there is a chance of the same abstract appearing more than once in our results. It would be a good idea to remove these duplicates, but sometimes an extra character or space essentially makes the two versions of the abstract different. One could use regex to clean up the abstracts and then identify the duplicates.
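One way to sketch that idea (the helper names are hypothetical, and the normalization rules are a guess at what the clean-up would involve):

```python
import re

def normalize(abstract):
    """Lowercase, collapse whitespace, and drop punctuation so that
    near-identical copies of an abstract compare equal."""
    collapsed = re.sub(r"\s+", " ", abstract.lower())
    return re.sub(r"[^a-z0-9 ]", "", collapsed).strip()

def dedupe(abstracts):
    """Keep the first occurrence of each abstract, comparing normalized forms."""
    seen = set()
    unique = []
    for abstract in abstracts:
        key = normalize(abstract)
        if key not in seen:
            seen.add(key)
            unique.append(abstract)
    return unique
```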
At the moment, the claim detection algorithm is still a work in progress and could be improved further for better results.