19 Lab Set 4
The preview of this template is rendered in HTML. However, all assignments must be rendered as PDF for submission on Canvas. The textstat_tools repository is already set up to do this for you. Be sure to follow the directions, including installing tinytex.
19.1 Cluster analysis
19.1.1 Task 1
Which linkage method gives the strongest clustering structure?
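One common way to compare linkage methods is the agglomerative coefficient, which R's `cluster::agnes` reports as `ac`. The sketch below is a minimal pure-Python illustration of that measure, not the lab's own code. It assumes the agglomerative coefficient is the criterion meant by "strongest clustering structure". The function name `agglomerative_coefficient` and the toy points are hypothetical.

```python
from itertools import combinations
from math import dist


def agglomerative_coefficient(points, linkage):
    """Naive agglomerative clustering; returns the agglomerative coefficient."""
    clusters = [[i] for i in range(len(points))]
    first_merge = {}  # observation index -> height of its first merge
    height = 0.0

    def cluster_distance(a, b):
        pairwise = [dist(points[i], points[j]) for i in a for j in b]
        if linkage == "single":
            return min(pairwise)
        if linkage == "complete":
            return max(pairwise)
        if linkage == "average":
            return sum(pairwise) / len(pairwise)
        raise ValueError(f"unknown linkage: {linkage}")

    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        height = cluster_distance(clusters[a], clusters[b])
        # Record the height at which each singleton first joins a cluster.
        for members in (clusters[a], clusters[b]):
            if len(members) == 1:
                first_merge[members[0]] = height
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]

    # Average of 1 - (first-merge height / final-merge height).
    return sum(1 - h / height for h in first_merge.values()) / len(points)


# Hypothetical toy data: two tight groups that are far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
results = {
    method: agglomerative_coefficient(points, method)
    for method in ("single", "complete", "average")
}
for method, ac in results.items():
    print(f"{method:>8}: {ac:.3f}")
```

Values closer to 1 indicate that observations join clusters at heights that are small relative to the final merge. The coefficient tends to grow with the number of observations, so it is best used to compare methods on the same data rather than across data sets.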
Your response
19.1.2 Task 2
Looking at both the biplot and the variable contributions, first describe the clustering patterns, then posit an explanation for them.
Your response
19.2 Time series
19.2.1 Task 1
Why might it be important to periodize data from “the ground up” (using a technique like VNC), rather than just splitting data into intervals of, say, 10, 25, or 50 years?
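The core idea of VNC can be sketched in a few lines: it is agglomerative clustering in which only temporally adjacent clusters are allowed to merge. The example below is a simplified illustration under stated assumptions, not the implementation used in the lab. It uses the population standard deviation of the pooled values as the merge criterion (one possible choice), and the function name `vnc` and the frequencies are hypothetical.

```python
from statistics import pstdev


def vnc(years, values):
    """Merge temporally adjacent clusters, most similar pair first."""
    clusters = [([y], [v]) for y, v in zip(years, values)]
    merges = []  # (first year, last year, merge score) for each step
    while len(clusters) > 1:
        # Only neighbours in time are candidates for merging.
        scores = [
            pstdev(clusters[i][1] + clusters[i + 1][1])
            for i in range(len(clusters) - 1)
        ]
        i = scores.index(min(scores))
        left, right = clusters[i], clusters[i + 1]
        merges.append((left[0][0], right[0][-1], scores[i]))
        clusters[i:i + 2] = [(left[0] + right[0], left[1] + right[1])]
    return merges


# Hypothetical per-decade relative frequencies with a shift after 1920.
years = [1900, 1910, 1920, 1930, 1940, 1950]
values = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1]

merges = vnc(years, values)
for start, end, score in merges:
    print(f"{start}-{end}: merged at {score:.3f}")
```

On this toy series the final merge, which joins 1900-1920 with 1930-1950, happens at a much larger score than any earlier merge, so the period boundary falls between 1920 and 1930. A fixed 20-year split starting in 1900 would instead place 1920 and 1930 in the same bin.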
Your response
Following the conventions described in Brezina (pp. 240-241), report the results produced by the “witch hunt” VNC.
Your response
19.3 Sentiment analysis
19.3.1 Task 1
From the last plot, it’s clear that most sentences return zero counts of sentiment (either positive or negative). Additionally, the data are noisy. It’s worth asking, then: Is sentiment a useful measure here? Sure, the plots are cool, at least insofar as they return something that broadly conforms to our human understanding of these novels (whether they’re a rags-to-riches story, a tragedy, etc.). But is this useful? Can you imagine a potential application that would engage with an interesting research question?
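To see why so many sentences score zero, it can help to look at how lexicon-based scoring works at the sentence level. The sketch below is a toy illustration, not the method used to produce the plots. The lexicon, the example text, and the function name `sentence_scores` are all hypothetical, and real sentiment lexicons contain thousands of entries.

```python
import re

# Hypothetical toy lexicon for illustration only.
POSITIVE = {"happy", "love", "hope"}
NEGATIVE = {"sad", "fear", "lost"}


def sentence_scores(text):
    """Return (positive count - negative count) for each sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    scores = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        positive = sum(t in POSITIVE for t in tokens)
        negative = sum(t in NEGATIVE for t in tokens)
        scores.append(positive - negative)
    return scores


text = (
    "She walked to the station. The train was late. "
    "She was happy to see him. They talked about the weather. "
    "He spoke of fear and of things lost. The road was long."
)

scores = sentence_scores(text)
zero_share = scores.count(0) / len(scores)
print(scores)
print(f"share of zero-score sentences: {zero_share:.2f}")
```

Any sentence without a lexicon match scores zero, which is one reason sentence-level scores are sparse. A net score of zero can also arise when positive and negative matches cancel within a sentence.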
Your response