PybiberPipeline
pipeline.PybiberPipeline(
    nlp=None,
    model='en_core_web_sm',
    disable_ner=True,
    n_process=CONFIG.DEFAULT_N_PROCESS,
    batch_size=CONFIG.DEFAULT_BATCH_SIZE,
    show_progress=None,
)
End-to-end convenience wrapper for common pybiber workflows.
Parameters
Name | Type | Description | Default |
---|---|---|---|
nlp | Optional[Language] | Pre-loaded spaCy model. If None, the model named by model is loaded lazily on first use. | None |
model | str | Name of the spaCy model to load when nlp is None. | 'en_core_web_sm' |
disable_ner | bool | When True, disable spaCy’s NER component for speed and stability. Parser is still enabled and required. Defaults to True. | True |
n_process | int | Number of processes to use for spaCy’s pipe. | CONFIG.DEFAULT_N_PROCESS |
batch_size | int | Batch size for spaCy’s pipe. | CONFIG.DEFAULT_BATCH_SIZE |
show_progress | Optional[bool] | Whether to show internal progress indicators when processing. If None, it is determined based on corpus size. | None |
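To illustrate the parameters above, here is a minimal sketch of constructing the pipeline with explicit settings. The import path pybiber.pipeline is an assumption based on the module name shown in the signature, and the n_process and batch_size values are arbitrary examples rather than the library defaults.

```python
# Keyword arguments named after the documented constructor parameters.
# The numeric values are illustrative only, not the CONFIG defaults.
PIPELINE_SETTINGS = {
    "model": "en_core_web_sm",  # loaded lazily because no nlp object is passed
    "disable_ner": True,        # skip NER; the parser stays enabled
    "n_process": 1,             # single process (example value)
    "batch_size": 64,           # example value
    "show_progress": False,     # never show progress indicators
}


def build_pipeline(settings=PIPELINE_SETTINGS):
    """Create a PybiberPipeline; needs pybiber and the spaCy model installed."""
    # Assumed import path, inferred from "pipeline.PybiberPipeline" above.
    from pybiber.pipeline import PybiberPipeline

    return PybiberPipeline(**settings)
```

The import sits inside the function so that the settings can be inspected or reused without pybiber being importable at module load time.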
Methods
Name | Description |
---|---|
features | Compute Biber features from token-level parses. |
from_folder | Read .txt files from a folder into a corpus DataFrame. |
parse | Parse a corpus with spaCy using the configured settings. |
run | Parse and compute features from an in-memory corpus DataFrame. |
run_from_folder | Read, parse, and compute features from a folder of .txt files. |
to_analyzer | Create a BiberAnalyzer from a Biber feature matrix. |
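The methods listed above appear to form a chain: read, parse, compute features, analyze. The sketch below strings them together by hand, assuming that the output of parse is the token-level input that features expects and that the output of features is what to_analyzer accepts. The helper name features_step_by_step is hypothetical.

```python
def features_step_by_step(pipeline, directory, recursive=False):
    """Run each documented stage explicitly and return a BiberAnalyzer.

    `pipeline` is expected to be a PybiberPipeline instance. Feeding each
    stage's output into the next is an assumption drawn from the method
    descriptions, not from the library source.
    """
    # 1. Read .txt files into a corpus DataFrame.
    corpus = pipeline.from_folder(directory, recursive=recursive)
    # 2. Parse the corpus with spaCy using the pipeline's settings.
    tokens = pipeline.parse(corpus)
    # 3. Compute the Biber feature matrix from the token-level parses.
    biber_df = pipeline.features(tokens, normalize=True, force_ttr=False)
    # 4. Wrap the feature matrix in an analyzer.
    return pipeline.to_analyzer(biber_df)
```

Keeping the stages separate like this is useful when the intermediate corpus or token table needs to be inspected or cached; otherwise run or run_from_folder cover the same ground in one call.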
features
pipeline.PybiberPipeline.features(tokens, normalize=True, force_ttr=False)
Compute Biber features from token-level parses.
from_folder
pipeline.PybiberPipeline.from_folder(directory, recursive=False)
Read .txt files from a folder into a corpus DataFrame.
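To make the recursive flag concrete, here is a standard-library stand-in for the kind of reading from_folder describes. It is not the library's implementation: the function name read_txt_folder, the doc_id and text keys, the use of the file name as doc_id, and UTF-8 decoding are all assumptions made for illustration.

```python
from pathlib import Path


def read_txt_folder(directory, recursive=False):
    """Collect .txt files as a list of {"doc_id", "text"} records.

    Illustrative stand-in only; the real method returns a DataFrame.
    """
    root = Path(directory)
    # rglob descends into subfolders; glob looks at the top level only.
    paths = root.rglob("*.txt") if recursive else root.glob("*.txt")
    return [
        {"doc_id": path.name, "text": path.read_text(encoding="utf-8")}
        for path in sorted(paths)  # sort for a deterministic order
    ]
```

With recursive=False only files directly inside the folder are picked up; with recursive=True files in nested subfolders are included as well.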
parse
pipeline.PybiberPipeline.parse(corpus)
Parse a corpus with spaCy using the configured settings.
run
pipeline.PybiberPipeline.run(
    corpus,
    return_tokens=False,
    normalize=True,
    force_ttr=False,
)
Parse and compute features from an in-memory corpus DataFrame.
run_from_folder
pipeline.PybiberPipeline.run_from_folder(
    directory,
    recursive=False,
    return_tokens=False,
    normalize=True,
    force_ttr=False,
)
Read, parse, and compute features from a folder of .txt files.
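Since run and run_from_folder differ only in where the corpus comes from, a small wrapper can pick between them. The sketch below is a hypothetical helper (compute_features is not part of pybiber) that treats a string or Path as a folder and anything else as an in-memory corpus DataFrame; passing the folder as a plain string is a conservative assumption.

```python
from pathlib import Path


def compute_features(pipeline, source, normalize=True):
    """Call run_from_folder for a folder path, otherwise run.

    `pipeline` is expected to be a PybiberPipeline instance; only keyword
    arguments shown in the documented signatures are used.
    """
    if isinstance(source, (str, Path)):
        # Folder on disk: read, parse, and compute features in one call.
        return pipeline.run_from_folder(
            str(source), recursive=False, normalize=normalize
        )
    # Anything else is treated as an in-memory corpus DataFrame.
    return pipeline.run(source, normalize=normalize)
```

Both branches leave return_tokens and force_ttr at their documented defaults of False.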
to_analyzer
pipeline.PybiberPipeline.to_analyzer(biber_df)
Create a BiberAnalyzer from a Biber feature matrix.