This serves as the final project for the UC Berkeley Extension course COMPSCI X433.6: Introduction to Machine Learning Using Python.
The final project is open-ended. I chose to attempt to predict MPAA content ratings (G, PG, PG-13, R) given a complete movie script. The accuracy hovers around 70% for the two best classifiers I trained. With more judicious feature selection, this could probably be improved.
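To give a rough idea of what the task looks like in code, here is a minimal sketch of a script-to-rating text classifier. It is not the project's actual method: this README does not name the classifiers or features used, so a bag-of-words multinomial Naive Bayes model is swapped in purely for illustration. The class name `NaiveBayesRatingClassifier`, the `tokenize` helper, and the four toy "scripts" are all hypothetical and invented for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRatingClassifier:
    """Hypothetical multinomial Naive Bayes over word counts (add-one smoothing)."""

    def fit(self, scripts, ratings):
        self.word_counts = defaultdict(Counter)  # rating -> word -> count
        self.doc_counts = Counter(ratings)       # rating -> number of scripts
        self.vocab = set()
        for text, rating in zip(scripts, ratings):
            tokens = tokenize(text)
            self.word_counts[rating].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(ratings)
        return self

    def predict(self, text):
        # Words never seen during training carry no information, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_rating, best_score = None, -math.inf
        for rating in self.doc_counts:
            counts = self.word_counts[rating]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[rating] / self.total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_rating, best_score = rating, score
        return best_rating


# Toy stand-ins for full movie scripts (made up, one per rating).
toy_scripts = [
    "the friendly puppy sings a happy song with the children",
    "the children go on a magical adventure and face a scary dragon",
    "the hero fights the villain in an intense battle with some blood",
    "the killer shoots the victim and there is blood and brutal violence",
]
toy_ratings = ["G", "PG", "PG-13", "R"]

model = NaiveBayesRatingClassifier().fit(toy_scripts, toy_ratings)
prediction = model.predict("brutal violence and blood")
print(prediction)
```

A real run would replace the toy strings with the full script texts and hold out part of the data to measure accuracy. Raw word counts are a crude feature set, which is one place where more careful feature selection could plausibly help.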
See the PDF file in this repo for the complete explanation. All data and work are present in this repo. In case I decide to go back and improve this classifier, the final project submission will always carry the 'final_submission' git tag.