biber
parse_functions.biber(tokens, normalize=True, force_ttr=False)

Extract Biber features from a parsed corpus.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| tokens | pl.DataFrame | A polars DataFrame with the output of the spacy_parse function. | required |
| normalize | Optional[bool] | Normalize counts per 1000 tokens. | True |
| force_ttr | Optional[bool] | Force the calculation of type-token ratio (TTR) rather than moving average type-token ratio (MATTR). | False |
Returns
| Name | Type | Description |
|---|---|---|
| | pl.DataFrame | A polars DataFrame with counts of feature frequencies. |
Notes
MATTR is the default because it is less sensitive than TTR to variations in text length. However, the function automatically falls back to TTR if any text in the corpus is shorter than 200 words. Forcing TTR can therefore be necessary when you process multiple corpora and want the measure to be consistent across them.
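
To illustrate why MATTR is less sensitive to text length, the following is a minimal sketch of the two measures in plain Python. It is not the library's implementation: the helper names `ttr` and `mattr`, the default window size, and the fallback to TTR for texts shorter than one window are all assumptions made for illustration.

```python
from typing import Sequence


def ttr(tokens: Sequence[str]) -> float:
    """Type-token ratio: distinct tokens divided by total tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def mattr(tokens: Sequence[str], window: int = 100) -> float:
    """Moving average TTR: mean TTR over every full window of `window` tokens.

    The window size is an illustrative assumption, not the library's setting.
    """
    if len(tokens) < window:
        # Too short for a single full window; fall back to plain TTR.
        return ttr(tokens)
    n_windows = len(tokens) - window + 1
    scores = [ttr(tokens[i:i + window]) for i in range(n_windows)]
    return sum(scores) / len(scores)


# A short text, and a longer one with the same vocabulary repeated.
short_text = ["the", "cat", "sat", "on", "the", "mat"]
long_text = short_text * 50

# TTR falls as the text grows, even though the vocabulary is unchanged.
print(ttr(short_text), ttr(long_text))

# MATTR stays about the same, because each window is scored separately.
print(mattr(short_text, window=6), mattr(long_text, window=6))
```

In this toy case the plain TTR of the longer text is far lower than that of the shorter one, while the windowed measure is essentially unchanged. Mixing the two measures across corpora would therefore make the values hard to compare, which is the situation `force_ttr=True` is meant to avoid.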