utils

Some utility functions.


segment

 segment (text:str, unit:str='paragraph', maxchars:int=2048)

Segments text into a list of paragraphs or sentences, depending on the value of unit (one of {'paragraph', 'sentence'}). The maxchars parameter is the maximum size, in characters, of any unit of text in the result.
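The behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: the name segment_sketch is hypothetical, and the splitting rules (blank lines for paragraphs, sentence-ending punctuation for sentences, hard cuts at maxchars for oversized units) are assumptions made for illustration.

```python
import re

def segment_sketch(text: str, unit: str = "paragraph", maxchars: int = 2048) -> list:
    """Hypothetical stand-in for segment(); the splitting rules are assumptions."""
    if unit not in {"paragraph", "sentence"}:
        raise ValueError("unit must be 'paragraph' or 'sentence'")
    if unit == "paragraph":
        # Assumption: paragraphs are separated by one or more blank lines.
        pieces = re.split(r"\n\s*\n", text)
    else:
        # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
        pieces = re.split(r"(?<=[.!?])\s+", text)
    units = []
    for piece in pieces:
        piece = piece.strip()
        if not piece:
            continue
        # Cut any unit longer than maxchars into consecutive slices.
        for start in range(0, len(piece), maxchars):
            units.append(piece[start:start + maxchars])
    return units

paragraphs = segment_sketch("One. Two!\n\nThree?")
sentences = segment_sketch("One. Two!\n\nThree?", unit="sentence")
```

A real sentence splitter would also need to handle abbreviations and decimal numbers, which this regular expression does not attempt.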



split_list

 split_list (input_list, chunk_size)
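The page gives no description for split_list. Judging only by its name and parameters, it presumably breaks input_list into consecutive chunks of at most chunk_size items. The sketch below shows that idea under that assumption; split_list_sketch is a hypothetical name, and whether the real function returns a list or a generator is not stated here.

```python
def split_list_sketch(input_list, chunk_size):
    """Hypothetical stand-in for split_list(); the chunking behavior is assumed."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    # Yield consecutive slices; the final chunk may be shorter than chunk_size.
    for start in range(0, len(input_list), chunk_size):
        yield input_list[start:start + chunk_size]

chunks = list(split_list_sketch([1, 2, 3, 4, 5], 2))
```

Yielding chunks lazily keeps memory use low when the input is large; wrapping the call in list() materializes all chunks at once.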


get_datadir

 get_datadir ()


download

 download (url, filename, verify=False)