biber

parse_functions.biber(tokens, normalize=True, force_ttr=False)

Extract Biber features from a parsed corpus.

Parameters

Name Type Description Default
tokens pl.DataFrame A polars DataFrame with the output of the spacy_parse function. required
normalize Optional[bool] Normalize counts per 1000 tokens. True
force_ttr Optional[bool] Force the calculation of type-token ratio (TTR) rather than moving average type-token ratio (MATTR). False
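
As a rough illustration of what normalizing "per 1000 tokens" means, the sketch below scales a raw count by the total token count of a text. The helper name `per_thousand` is hypothetical and is not part of the library; how `biber` computes this internally is not shown here.

```python
def per_thousand(count: int, total_tokens: int) -> float:
    """Scale a raw feature count to a rate per 1,000 tokens.

    Hypothetical helper for illustration only; not the library's code.
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    # Multiply first so that round counts give exact results.
    return count * 1000 / total_tokens


# A feature that occurs 12 times in a 480-token text:
rate = per_thousand(12, 480)
print(rate)
```

Rates computed this way are comparable across texts of different lengths, which raw counts are not.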

Returns

Name Type Description
pl.DataFrame A polars DataFrame with counts of feature frequencies.

Notes

MATTR is the default because it is less sensitive than TTR to variations in text length. However, the function automatically falls back to TTR if any text in the corpus is shorter than 200 words. Forcing TTR can therefore be necessary when you process multiple corpora and want the measure to be consistent across them.
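
To make the difference between the two measures concrete, here is a minimal sketch of TTR and MATTR in plain Python. The function names, the window size of 100, and the fallback for texts shorter than the window are assumptions chosen for illustration; they are not taken from the library's implementation.

```python
from typing import Sequence


def ttr(tokens: Sequence[str]) -> float:
    """Type-token ratio: distinct tokens divided by total tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def mattr(tokens: Sequence[str], window: int = 100) -> float:
    """Moving average TTR: the mean TTR over every full sliding window.

    The window size is an assumed value for this sketch.
    """
    if len(tokens) < window:
        # Too short for a single full window; fall back to plain TTR.
        return ttr(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Two synthetic "texts" that cycle through the same 50-word vocabulary.
short_text = [f"w{i % 50}" for i in range(200)]
long_text = [f"w{i % 50}" for i in range(2000)]

print(ttr(short_text), ttr(long_text))      # TTR shrinks as the text grows
print(mattr(short_text), mattr(long_text))  # MATTR stays the same
```

In this toy case both texts have identical vocabulary, yet TTR falls by a factor of ten for the longer text while MATTR is unchanged, which is the length sensitivity the note describes.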