Weekly Update 2
19 Sep 2014
This week, as a team, we continued our research. Luckily, this time it involved some "demo-ing" and programming with NLTK & Python and all its capabilities. Using NLTK will make sentiment analysis much easier than writing all of those functions from scratch. It gives us a strong starting platform.
What we Did
Tom: brushed up on Python skills & finally set up a GitHub account!
Kelly: wrote a simple pre-processing script for Tweets that pulls out "stop words" and puts the remaining words into a feature vector, and started to play around with the different classifier demos available in the NLTK Cookbook
Logan: started writing our report & read more papers about opinion mining and sentiment analysis
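To give a feel for the pre-processing step, here is a minimal sketch of the idea, not Kelly's actual script. The tiny stop-word list, the function names, and the choice to strip links and @mentions are all assumptions made for illustration; a fuller stop-word list would be used in practice.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "in",
              "on", "this", "i", "my", "for", "from"}

def preprocess_tweet(text):
    """Lowercase a tweet, drop links and @mentions, and remove stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # strip links (assumed step)
    text = re.sub(r"@\w+", " ", text)          # strip @mentions (assumed step)
    # Keep runs of letters and apostrophes; this also drops the '#' of hashtags.
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]

def to_feature_vector(tokens):
    """Map each remaining token to True, a dict shape NLTK classifiers accept."""
    return {token: True for token in tokens}

tokens = preprocess_tweet("I love the new phone from @Acme! http://example.com #happy")
features = to_feature_vector(tokens)
print(features)
```

The dictionary form is used because NLTK classifiers take feature sets as plain Python dicts.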
Word of the Week
Last week, on our list of tasks to do next, we mentioned picking a Classifier. Machine learning offers various methods for classifying data. Supervised classifiers learn from labelled training data, while unsupervised methods do not need labels. Fortunately for us, NLTK defines several classifier classes: ConditionalExponentialClassifier, DecisionTreeClassifier, MaxentClassifier, NaiveBayesClassifier, and WekaClassifier. (In NLTK, ConditionalExponentialClassifier is an alias for MaxentClassifier, and WekaClassifier is a wrapper around the external Weka toolkit.) This potentially adds another "research element" to our project: which classifier is the most accurate and efficient?
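As a rough sketch of what trying out one of these classifiers could look like, the snippet below trains NLTK's NaiveBayesClassifier on a made-up toy data set. The sentences, the labels "pos" and "neg", and the helper `word_features` are invented for illustration and are not our project data.

```python
import nltk

def word_features(text):
    """Hypothetical helper: turn a string into a bag-of-words feature dict."""
    return {word: True for word in text.lower().split()}

# Toy labelled examples, invented purely for illustration.
train_set = [
    (word_features("love great phone"), "pos"),
    (word_features("great day"), "pos"),
    (word_features("hate awful phone"), "neg"),
    (word_features("awful day"), "neg"),
]

# Train a Naive Bayes classifier on the labelled feature sets.
classifier = nltk.NaiveBayesClassifier.train(train_set)

# Classify a new feature set and measure accuracy on a tiny held-out set.
prediction = classifier.classify(word_features("great"))
test_set = [
    (word_features("great"), "pos"),
    (word_features("awful"), "neg"),
]
accuracy = nltk.classify.accuracy(classifier, test_set)
print(prediction, accuracy)
```

Swapping in another classifier class with the same `train` and `classify` interface would let us compare accuracy on the same labelled data.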
What to Do Next
Over the next couple of weeks, our tasks will be writing the scripts to implement these different classifiers using NLTK & Python, gathering the data we need from Twitter, and visualizing the results.
Dr. Joyce mentioned to us today that he would love to see the "Future Plans" from the last 5 or so years analyzed to see how things have changed. This is definitely on our to-do list and adds a new and relevant angle to our project!
As always, much to be done! - AttitudeAnalytiks