As you have worked through this course, you have heard about the challenges of Twitter data collection, some of the ethical questions surrounding using and publishing on the data, the need to look into your university’s IRB requirements for working with social media data, and so forth. You’ve gotten a lot of good information on social media data research as a whole and on the questions you need to ask yourself as you work on your research. Throughout this module, you’ll learn about the next steps for working with data collected from social media.

The process for Module 4.2 will look a little different from earlier modules, as you’ll complete your discussion assignment before you finish reviewing the materials. However, you should still begin by reviewing the assigned readings, which will introduce you to the process and outcomes of data visualization.

The Orange tutorials are aimed at examining your data after it has been collected and thinking about how you will categorize it to tease out some of the patterns. The tutorials focus specifically on the text of the tweet; we are not looking at emojis or any of the visual elements. Orange has a couple of different sentiment analysis tools, and these are the main focus of the tutorials. The tutorials in this module will familiarize you with the interface and walk you through creating a word cloud based on your own dataset. If you do not have your own dataset, use the sample dataset, Tweet-Profiled-ReadyForOrangeNEH, for all tutorials.

There are two PDF files in this section on Orange. The first, Intro to Orange part 1 and part 2, is a full tutorial that covers installing Orange, categorizing and cleaning data, and two workflow tutorials, one for today and one for tomorrow. Do not feel intimidated by the length of this file. It is meant as a long-term reference that you can return to in the future. Since you will be using the data I sent to you in its (mostly) raw form, you can just skim the section on pre-processing your data before importing it into Orange. I recommend reading the overall Orange tips before starting the first tutorial for today. Video instructions for Workflow 1 can be found here. You can also review the slides from the video (part 1 and part 2). After reviewing these materials, complete your assignment, Discussion 8: Orange Homework.

After completing Discussion 8, check out the next Orange tutorial (part 1 and part 2), which will teach you about three sentiment analysis options in Orange.

The software has a Tweet Profiler widget that you will use first. The model choices within this widget are described in the homework article by Niko Colnerič and Janez Demšar; today we will use only the POMS (Multi Class) model. You will then add the general Sentiment Analysis widget and examine its two model options (Liu Hu and Vader) to see how they differ. The steps in the tutorial begin with the workflow that you created yesterday. If you have not made it through yesterday’s tutorial, you can still complete today’s. Although the second tutorial’s directions and slides begin with yesterday’s workflow (saved to a new file), most of the widgets within the workflow are opened again to adjust their settings. Therefore, just add each widget to the workflow as it is discussed and then make the settings match the directions. This tutorial uses the sample file instead of your own data because the sample file has the categorized data that the sentiment analysis requires.
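
If you are curious about what a lexicon-based sentiment option does behind the scenes, the short Python sketch below may help. It is not Orange's code and you do not need it for the tutorials. It assumes a tiny, made-up word list (the `POSITIVE` and `NEGATIVE` sets and the `lexicon_score` function are hypothetical names chosen for illustration) and scores each tweet as the number of positive words minus the number of negative words, divided by the tweet's length and multiplied by 100.

```python
import re

# Hypothetical toy lexicons for illustration only; real sentiment
# lexicons contain thousands of words.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def lexicon_score(text):
    """Return (positive - negative) word count, normalized by length, times 100."""
    # Lowercase the text and keep only runs of letters and apostrophes,
    # so emojis, numbers, and punctuation are ignored.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    positive_count = sum(token in POSITIVE for token in tokens)
    negative_count = sum(token in NEGATIVE for token in tokens)
    return 100.0 * (positive_count - negative_count) / len(tokens)


if __name__ == "__main__":
    # Made-up example tweets, not taken from the course dataset.
    sample_tweets = [
        "I love this great day",
        "What an awful, sad commute",
        "Heading to class now",
    ]
    for tweet in sample_tweets:
        print(f"{lexicon_score(tweet):7.1f}  {tweet}")
```

A score above zero suggests more positive than negative words, a score below zero the reverse, and zero means the tweet is balanced or contains no words from the lexicon. Different models use different word lists and rules, which is one reason the options in the Sentiment Analysis widget can disagree on the same tweet.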