load_text_file

data.load_text_file(path, *, strict_ascii=True)

Read a single .txt file and return a cleaned {doc_id, text} record.

Parameters

Name          Type        Description                                                                            Default
path          str | Path  File to read.                                                                          required
strict_ascii  bool        Drop non-ASCII bytes when True, to sidestep spaCy model quirks on unusual characters.  True
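
Example

The page does not show how the function is implemented, so the following is only a rough sketch of the documented behaviour, not the library's code. The name load_text_file_sketch is hypothetical, and three details are assumptions: the file is decoded as UTF-8, "cleaned" means collapsing runs of whitespace, and doc_id is the file name without its extension.

```python
from pathlib import Path
from typing import Union


def load_text_file_sketch(path: Union[str, Path], *, strict_ascii: bool = True) -> dict:
    """Hypothetical stand-in for data.load_text_file (illustration only)."""
    path = Path(path)
    raw = path.read_bytes()

    if strict_ascii:
        # Drop every byte outside the 7-bit ASCII range (values 128-255).
        raw = bytes(b for b in raw if b < 128)

    # Assumption: files are UTF-8; undecodable sequences are replaced, not raised.
    text = raw.decode("utf-8", errors="replace")

    # Assumption: "cleaned" means collapsing whitespace runs and trimming the ends.
    text = " ".join(text.split())

    # Assumption: doc_id is the file name without its extension.
    return {"doc_id": path.stem, "text": text}
```

Under these assumptions, a file note.txt containing "café  au\nlait" would come back with text "caf au lait" by default, and with text "café au lait" when strict_ascii=False. Filtering at the byte level removes an accented character entirely rather than transliterating it, which is worth keeping in mind for non-English corpora.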