19  Lab Set 4

Note

The preview of this template is rendered in HTML. However, all assignments must be rendered as PDF for submission on Canvas. The textstat_tools repository is already set up to do this for you. Be sure to follow the directions, including the installation of tinytex.

19.1 Cluster analysis

19.1.1 Task 1

Which linkage method gives the strongest clustering structure?

Your response

19.1.2 Task 2

Looking at both the biplot and the variable contributions, first describe the clustering patterns, then posit an explanation for them.

Your response

19.2 Time series

19.2.1 Task 1

Why might it be important to periodize data from “the ground up” (using a technique like VNC), rather than just splitting data into intervals of, say, 10, 25, or 50 years?
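As a point of reference for this question, here is a minimal sketch of the idea behind "ground up" periodization: agglomerative clustering in which only temporally adjacent clusters may merge. It is a simplification rather than the VNC procedure itself. The function name `vnc_periods`, the toy years and frequencies, and the use of the absolute difference in cluster means as the distance are all assumptions made for illustration.

```python
from statistics import mean


def vnc_periods(years, values, k):
    """Merge temporally adjacent clusters until k periods remain.

    Simplified, VNC-style sketch: the distance between two neighbouring
    clusters is the absolute difference of their mean values (an
    assumption; other dissimilarity measures are possible).
    """
    # Start with one cluster per time point: (years, values).
    clusters = [([y], [v]) for y, v in zip(years, values)]
    while len(clusters) > k:
        # Only neighbours in time are candidates for merging.
        dists = [
            abs(mean(clusters[i][1]) - mean(clusters[i + 1][1]))
            for i in range(len(clusters) - 1)
        ]
        i = dists.index(min(dists))
        left, right = clusters[i], clusters[i + 1]
        clusters[i:i + 2] = [(left[0] + right[0], left[1] + right[1])]
    # Report each period as (first year, last year).
    return [(c[0][0], c[0][-1]) for c in clusters]


# Hypothetical frequencies (e.g. per million words), invented for illustration.
years = [1900, 1910, 1920, 1930, 1940, 1950]
freqs = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1]

periods = vnc_periods(years, freqs, k=2)
print(periods)
```

With these toy values, the period boundary is placed where the data shift. A fixed 20-year split starting in 1900 would instead put 1920 and 1930 in the same bin, even though the values jump between them.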

Your response

19.2.2 Task 2

Following the conventions described in Brezina (pp. 240-241), report the results produced by the “witch hunt” VNC.

Your response

19.3 Sentiment analysis

19.3.1 Task 1

From the last plot, it’s clear that most sentences return zero counts of sentiment (either positive or negative). Additionally, the data are noisy. It’s worth asking then: Is sentiment a useful measure here? Sure, the plots are cool, at least insofar as we can return something that broadly conforms to our human understanding of these novels (whether they’re a rags-to-riches story, a tragedy, etc.). But is this useful? Can you imagine a potential application that would engage with an interesting research question?
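To make the sparsity point concrete, here is a minimal sketch of lexicon-based sentence scoring. It is not the pipeline used in the lab. The word lists, the sample text, and the function names `sentence_scores` and `zero_share` are hypothetical, and the sentence splitter is deliberately naive.

```python
import re

# Hypothetical toy lexicon; a real analysis would use a published one.
POSITIVE = {"happy", "love", "joy"}
NEGATIVE = {"sad", "fear", "grief"}


def sentence_scores(text):
    """Return one score per sentence: positive hits minus negative hits."""
    # Naive split on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    scores = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        pos = sum(t in POSITIVE for t in tokens)
        neg = sum(t in NEGATIVE for t in tokens)
        scores.append(pos - neg)
    return scores


def zero_share(scores):
    """Proportion of sentences whose score is exactly zero."""
    return sum(s == 0 for s in scores) / len(scores)


# Invented sample text, for illustration only.
sample = (
    "She opened the door. The room was empty. Grief filled her. "
    "He walked on. They felt joy at last."
)
scores = sentence_scores(sample)
print(scores, zero_share(scores))
```

Even in this tiny invented sample, most sentences contain no lexicon words at all, so any plotted trajectory rests on a small share of the text.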

Your response