Home
renatotn7 edited this page Mar 23, 2016 · 5 revisions
This project has several subprojects for language processing.
Statistics of unknown words in captions
This script computes statistics on the unknown words found in captions and links each word to the WordNet dictionary. It produces output in CSV format.
Script
In the directory wordsFromCaptions there is the script:
- wordsStatisticsFromCaptions.py
Input:
For this script, the following files must exist in the same directory:
- known.csv (file with previous known words)
- captions.srt (file with the captions)
- captionsForStatistics.txt (file with other captions that will be the basis for the statistics)
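As a rough illustration of how caption text might be reduced to counts of unknown words, here is a minimal sketch. It is not the code of wordsStatisticsFromCaptions.py: the function names `caption_words` and `unknown_word_counts` are hypothetical, the tokenization rule is an assumption, and the sketch does not show how the script combines captions.srt with captionsForStatistics.txt.

```python
import re
from collections import Counter

# Matches SRT timing lines such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def caption_words(srt_text):
    """Yield lowercase words from SRT text, skipping cue numbers and timing lines.

    Assumption: a word is a run of ASCII letters and apostrophes.
    Note that a caption line made only of digits is treated as a cue number.
    """
    for line in srt_text.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMING.match(line):
            continue
        for word in re.findall(r"[a-z']+", line.lower()):
            yield word


def unknown_word_counts(srt_text, known_words):
    """Count caption words that are not in the known-word collection."""
    known = {w.lower() for w in known_words}
    return Counter(w for w in caption_words(srt_text) if w not in known)
```

In practice `known_words` would be read from known.csv and `srt_text` from captions.srt; the layout of known.csv is not described here, so loading it is left out.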
Output:
Number of word occurrences ; word ; WordNet dictionary
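The three-column layout above could be written with Python's csv module using a semicolon delimiter. The sketch below is only an illustration under assumptions: `write_rows` and its `lookup` parameter are hypothetical names, the sort order (most frequent first) is a guess, and whether the real output pads the separators with spaces is not specified.

```python
import csv
import io


def write_rows(counts, lookup, out):
    """Write one 'occurrences;word;dictionary text' row per word.

    counts: mapping of word -> number of occurrences.
    lookup: hypothetical callable returning the dictionary text for a word.
            With NLTK installed, one option (an assumption, not necessarily
            what the script does) is
            wordnet.synsets(word)[0].definition() from nltk.corpus.wordnet.
    out:    any writable text file object.
    """
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    # Assumed ordering: highest count first, ties broken alphabetically.
    for word, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([n, word, lookup(word)])


# Usage with a stand-in lookup, so the example runs without WordNet data.
buffer = io.StringIO()
write_rows({"cat": 2, "sat": 1}, lambda w: "definition of " + w, buffer)
```

Passing the lookup as a parameter keeps the row-writing step independent of how WordNet is accessed.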