PybiberPipeline
pipeline.PybiberPipeline(
    nlp=None,
    model='en_core_web_sm',
    disable_ner=True,
    n_process=CONFIG.DEFAULT_N_PROCESS,
    batch_size=CONFIG.DEFAULT_BATCH_SIZE,
    show_progress=None,
)
End-to-end convenience wrapper for common pybiber workflows.
Parameters
| Name | Type | Description | Default | 
|---|---|---|---|
| nlp | Optional[Language] | Pre-loaded spaCy model. If `None`, the model named by `model` will be loaded lazily on first use. | None | 
| model | str | Name of the spaCy model to load when `nlp` is `None`. Defaults to "en_core_web_sm". | 'en_core_web_sm' | 
| disable_ner | bool | When True, disable spaCy’s NER component for speed and stability. Parser is still enabled and required. Defaults to True. | True | 
| n_process | int | Number of processes to use for spaCy’s pipe. | CONFIG.DEFAULT_N_PROCESS | 
| batch_size | int | Batch size for spaCy’s pipe. | CONFIG.DEFAULT_BATCH_SIZE | 
| show_progress | Optional[bool] | Whether to show internal progress indicators when processing. If None, it is determined based on corpus size. | None | 
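The two ways of supplying a spaCy model described in the table can be sketched as follows. This is a minimal illustration, not an official recipe: the import path `pybiber.pipeline` is inferred from the qualified name `pipeline.PybiberPipeline`, the helper name `make_pipeline` is hypothetical, and both pybiber and the `en_core_web_sm` model are assumed to be installed. The imports sit inside the function so that defining it has no side effects.

```python
def make_pipeline(preloaded=False):
    """Build a PybiberPipeline either from a model name or a pre-loaded nlp.

    Hypothetical helper for illustration only.
    """
    # Assumed import path, inferred from "pipeline.PybiberPipeline".
    from pybiber.pipeline import PybiberPipeline

    if preloaded:
        import spacy

        # Load the model ourselves; NER is excluded here to mirror the
        # effect that disable_ner=True is documented to have.
        nlp = spacy.load("en_core_web_sm", disable=["ner"])
        return PybiberPipeline(nlp=nlp)

    # Let the pipeline load the named model lazily on first use.
    return PybiberPipeline(model="en_core_web_sm", disable_ner=True)
```

Passing a pre-loaded `nlp` is mainly useful when the same spaCy model is shared with other code; otherwise the `model` name alone is enough.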
Methods
| Name | Description | 
|---|---|
| features | Compute Biber features from token-level parses. | 
| from_folder | Read .txt files from a folder into a corpus DataFrame. | 
| parse | Parse a corpus with spaCy using the configured settings. | 
| run | Parse and compute features from an in-memory corpus DataFrame. | 
| run_from_folder | Read, parse, and compute features from a folder of .txt files. | 
| to_analyzer | Create a BiberAnalyzer from a Biber feature matrix. | 
features
pipeline.PybiberPipeline.features(tokens, normalize=True, force_ttr=False)
Compute Biber features from token-level parses.
from_folder
pipeline.PybiberPipeline.from_folder(directory, recursive=False)
Read .txt files from a folder into a corpus DataFrame.
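To make the shape of the result concrete, here is a standard-library approximation of what reading a folder of .txt files involves. It is not pybiber's implementation: the function name `read_txt_folder`, the field names `doc_id` and `text`, the use of the file name as the identifier, and the UTF-8 encoding are all assumptions made for illustration, and it returns plain records rather than a DataFrame.

```python
from pathlib import Path


def read_txt_folder(directory, recursive=False):
    """Collect .txt files into a list of {"doc_id", "text"} records.

    Illustrative stand-in only; field names are assumed, not confirmed.
    """
    root = Path(directory)
    # "**/*.txt" descends into subfolders; "*.txt" stays at the top level.
    pattern = "**/*.txt" if recursive else "*.txt"
    records = []
    for path in sorted(root.glob(pattern)):
        if path.is_file():
            records.append(
                {"doc_id": path.name, "text": path.read_text(encoding="utf-8")}
            )
    return records
```

With `recursive=False`, files in subfolders are ignored, which matches the default shown in the signature.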
parse
pipeline.PybiberPipeline.parse(corpus)
Parse a corpus with spaCy using the configured settings.
run
pipeline.PybiberPipeline.run(
    corpus,
    return_tokens=False,
    normalize=True,
    force_ttr=False,
)
Parse and compute features from an in-memory corpus DataFrame.
run_from_folder
pipeline.PybiberPipeline.run_from_folder(
    directory,
    recursive=False,
    return_tokens=False,
    normalize=True,
    force_ttr=False,
)
Read, parse, and compute features from a folder of .txt files.
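A one-call workflow built on this method might look like the sketch below. The wrapper name `features_from_folder` is hypothetical, the import path is inferred from the qualified name used on this page, and the example assumes pybiber and the default spaCy model are installed. Only keyword arguments that appear in the signature are used.

```python
def features_from_folder(directory, recursive=False):
    """Return a Biber feature matrix for the .txt files in `directory`.

    Hypothetical wrapper for illustration only.
    """
    # Assumed import path, inferred from "pipeline.PybiberPipeline".
    from pybiber.pipeline import PybiberPipeline

    # Default constructor arguments: 'en_core_web_sm' with NER disabled.
    pipe = PybiberPipeline()

    # Read, parse, and compute features in a single step.
    return pipe.run_from_folder(
        directory,
        recursive=recursive,
        normalize=True,
    )
```

The result could then be passed to `to_analyzer` to create a BiberAnalyzer from the feature matrix.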
to_analyzer
pipeline.PybiberPipeline.to_analyzer(biber_df)
Create a BiberAnalyzer from a Biber feature matrix.