SpaCySentimentAnalyzer

sentiment.SpaCySentimentAnalyzer(
    nlp=None,
    model='en_core_web_sm',
    positive_label='POSITIVE',
    negative_label='NEGATIVE',
    scorer=None,
    strip_whitespace=True,
    drop_empty=True,
)

Derive per-sentence sentiment values from a spaCy pipeline.

Parameters

Name	Type	Description	Default
nlp	Language	Pre-loaded spaCy pipeline. If provided, the `model` parameter is ignored.	`None`
model	str	Name of a spaCy model to load (e.g., `"en_core_web_sm"`) or path to a local model directory. Only used if `nlp` is not provided.	`'en_core_web_sm'`
positive_label	str	Category label in `doc.cats` that denotes positive sentiment.	`'POSITIVE'`
negative_label	str	Category label in `doc.cats` that denotes negative sentiment.	`'NEGATIVE'`
scorer	callable	Custom scoring function that takes a spaCy `Doc` and returns a float sentiment score. If provided, overrides default sentiment extraction.	`None`
strip_whitespace	bool	Remove leading/trailing whitespace from sentences.	`True`
drop_empty	bool	Omit empty sentences from processing.	`True`
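
As an illustration of the `scorer` parameter, the sketch below builds a callable that maps a document's `cats` dictionary to a single signed score. The factory name `make_margin_scorer` and the positive-minus-negative formula are assumptions made for this example; the analyzer's built-in extraction is not specified here. A `SimpleNamespace` stands in for a spaCy `Doc` so the snippet runs without a model.

```python
from types import SimpleNamespace


def make_margin_scorer(positive_label="POSITIVE", negative_label="NEGATIVE"):
    """Return a scorer computing cats[positive] - cats[negative].

    Hypothetical helper: the formula is an assumption for illustration,
    not necessarily the analyzer's default extraction.
    """

    def scorer(doc):
        # Missing labels count as 0.0 so the scorer never raises KeyError.
        cats = doc.cats
        positive = float(cats.get(positive_label, 0.0))
        negative = float(cats.get(negative_label, 0.0))
        return positive - negative

    return scorer


# Stand-in for a spaCy Doc: only the `cats` attribute is needed here.
fake_doc = SimpleNamespace(cats={"POSITIVE": 0.75, "NEGATIVE": 0.25})
margin = make_margin_scorer()
print(margin(fake_doc))  # 0.5
```

A callable of this shape could then be passed as `scorer=margin`, provided the pipeline in use actually populates `doc.cats`.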

Examples

>>> # Use a model name
>>> analyzer = SpaCySentimentAnalyzer(model="en_core_web_sm")
>>>
>>> # Use a local model path
>>> analyzer = SpaCySentimentAnalyzer(model="/path/to/my_model")
>>>
>>> # Use a pre-loaded pipeline
>>> import spacy
>>> nlp = spacy.load("en_core_web_lg")
>>> analyzer = SpaCySentimentAnalyzer(nlp=nlp)

Methods

Name	Description
sentence_scores	Score a list of already-split sentences using spaCy.
text_scores	Split raw text into sentences and score each one.

sentence_scores

sentiment.SpaCySentimentAnalyzer.sentence_scores(sentences)

Score a list of already-split sentences using spaCy.

Parameters

Name	Type	Description	Default
sentences	Sequence[str]	Sentences to feed through the spaCy pipeline.	required

Returns

Name	Type	Description
	list[float]	One sentiment score per sentence.
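
The constructor's `strip_whitespace` and `drop_empty` flags plausibly apply to the sentences before they are scored. The following is a minimal sketch of that preprocessing, assuming stripping happens first and that dropped sentences produce no score; `prepare_sentences` is a hypothetical helper, not part of the documented API.

```python
from typing import Sequence


def prepare_sentences(
    sentences: Sequence[str],
    strip_whitespace: bool = True,
    drop_empty: bool = True,
) -> list[str]:
    """Clean sentences before scoring (hypothetical helper)."""
    cleaned = []
    for sentence in sentences:
        if strip_whitespace:
            sentence = sentence.strip()
        # Assumption: "empty" means zero length after optional stripping.
        if drop_empty and not sentence:
            continue
        cleaned.append(sentence)
    return cleaned


print(prepare_sentences(["  Great food. ", "   ", "Slow service."]))
# ['Great food.', 'Slow service.']
```

Under these assumptions the result can be shorter than the input, so callers who need scores aligned with the original sentences may prefer to filter the input themselves.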

text_scores

sentiment.SpaCySentimentAnalyzer.text_scores(text)

Split raw text into sentences and score each one.

Parameters

Name	Type	Description	Default
text	str | Sequence[str]	Full document, or a sequence of text segments, to analyze.	required

Returns

Name	Type	Description
	list[float]	Sentiment score per detected sentence.
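
Because `text` accepts either a single string or a sequence of strings, any code handling both forms has to check for `str` first: a string is itself a sequence of one-character strings. The sketch below shows one way to normalize the two forms; `as_segments` is a hypothetical name, and how the analyzer handles this internally is not documented here.

```python
from typing import Sequence, Union


def as_segments(text: Union[str, Sequence[str]]) -> list[str]:
    """Normalize a document or a sequence of segments to a list (hypothetical)."""
    # Check str first: iterating a str would yield single characters.
    if isinstance(text, str):
        return [text]
    return list(text)


print(as_segments("One doc. Two sentences."))
print(as_segments(["Part one.", "Part two."]))
```

Each resulting segment could then be split into sentences and scored in turn.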