Dockerfiles for Twitter sentiment analysis with Spark MLlib and visualization referenced by https://github.com/P7h/Spark-MLlib-Twitter-Sentiment-Analysis.
The image is available directly from https://index.docker.io.
Docker image to analyse and visualize the sentiment of tweets in real time on a world map using the Apache Spark ecosystem (Spark MLlib + Spark Streaming + Spark SQL).
For more details on this project and the code associated with it, please check the blog post.
Also, README of https://github.com/P7h/Spark-MLlib-Twitter-Sentiment-Analysis has details on how to execute this project.
I originally wrote this as a blog post on my personal website, but unfortunately I managed to corrupt my Octopress GitHub repo. 😧 😩 😡 So, until I salvage it, I am publishing it as a GitHub wiki.
There are 2 ways of getting this image:
- Build the image using Dockerfile
- Pull the image from Docker Hub
Copy the Dockerfile and the other 2 supporting files, bootstrap.sh and exec_spark_jobs.sh, to a folder on your local machine and then invoke the following command.
docker build -t p7hb/p7hb-docker-mllib-twitter-sentiment:1.6.2 .
This will build the docker image on your machine.
Please wait as this might take a bit of time depending on your internet speed.
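Since the build needs all three files in the same folder, a quick check before building can save a failed run. The following is only a minimal sketch: the helper name check_build_context is hypothetical and not part of this repo, and it assumes the three files are named exactly as listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): checks that the folder passed
# as $1 contains the three files the build expects.
check_build_context() {
  dir="$1"
  missing=0
  for f in Dockerfile bootstrap.sh exec_spark_jobs.sh; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Return 0 only when every file was found.
  [ "$missing" -eq 0 ]
}

# Build only when the current folder has everything it needs.
if check_build_context .; then
  docker build -t p7hb/p7hb-docker-mllib-twitter-sentiment:1.6.2 .
else
  echo "Copy the missing files into this folder and try again."
fi
```

If any file is missing, the script lists it and skips the build instead of starting one that cannot finish.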
With this approach, we are pulling the image hosted on Docker Hub instead of building it ourselves.
docker pull p7hb/p7hb-docker-mllib-twitter-sentiment:1.6.2
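After pulling (or building), it can be useful to confirm that the image is actually available locally. This is a rough sketch, not part of the project: the helper name image_present is hypothetical, and it relies on `docker images -q` printing the image ID only when the image exists on your machine.

```shell
#!/bin/sh
# Image reference used throughout this README.
IMAGE="p7hb/p7hb-docker-mllib-twitter-sentiment:1.6.2"

# Hypothetical helper: succeeds when `docker images -q` prints an image ID
# for the given reference, i.e. the image is present locally.
image_present() {
  [ -n "$(docker images -q "$1" 2>/dev/null)" ]
}

if command -v docker >/dev/null 2>&1; then
  if image_present "$IMAGE"; then
    echo "Image $IMAGE is available locally."
  else
    echo "Image $IMAGE not found; run: docker pull $IMAGE"
  fi
else
  echo "docker CLI not found on PATH."
fi
```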
Please check the README of the Spark-MLlib-Twitter-Sentiment-Analysis project for detailed instructions on executing this prototype.
I am currently hosting this web app on Amazon EC2: http://54.84.252.184:9999/. I will bring it down sometime next week.
Update on 19th September, 2016: After running the live app on EC2 for almost a month, I have shut down this instance today.
- Docker Image on Docker Hub Registry: https://hub.docker.com/r/p7hb/p7hb-docker-mllib-twitter-sentiment/.
- GitHub URL for source code of the project: https://github.com/P7h/Spark-MLlib-Twitter-Sentiment-Analysis.
- GitHub URL for blog post on code walkthrough: https://github.com/P7h/Spark-MLlib-Twitter-Sentiment-Analysis/wiki/.
- Dockerfile GitHub repo: https://github.com/P7h/p7hb-docker-mllib-twitter-sentiment.
If you find any issues or would like to discuss further, please ping me on my Twitter handle @P7h or drop me an email. Appreciate your help. Thanks!
Copyright © 2016 Prashanth Babu.
Licensed under the Apache License, Version 2.0.