utils

Some utility functions.

get_datadir

 get_datadir ()

download

 download (url, filename, verify=False)

df_to_md

 df_to_md (df, caption=None)

Converts a pd.DataFrame to Markdown.


html_to_df

 html_to_df (html_str:str)

Convert HTML to a DataFrame.


md_to_df

 md_to_df (md_str:str)

Convert Markdown to a DataFrame.


segment

 segment (text:str, unit:str='paragraph', maxchars:int=2048)

Segments text into a list of paragraphs or sentences, depending on the value of unit (one of {'paragraph', 'sentence'}). The maxchars parameter is the maximum size, in characters, of any unit of text.
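
The behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: the name segment_sketch, the blank-line rule for paragraphs, the punctuation rule for sentences, and the hard wrap at maxchars are all assumptions made for illustration.

```python
import re
from typing import List


def segment_sketch(text: str, unit: str = "paragraph", maxchars: int = 2048) -> List[str]:
    """Hypothetical stand-in for segment(); splitting rules are assumptions."""
    if unit not in {"paragraph", "sentence"}:
        raise ValueError("unit must be 'paragraph' or 'sentence'")
    if maxchars < 1:
        raise ValueError("maxchars must be a positive integer")

    if unit == "paragraph":
        # Assumption: paragraphs are separated by one or more blank lines.
        pieces = re.split(r"\n\s*\n", text)
    else:
        # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
        pieces = re.split(r"(?<=[.!?])\s+", text)

    units: List[str] = []
    for piece in pieces:
        piece = piece.strip()
        # Naive hard wrap so no unit exceeds maxchars (may cut mid-word).
        # Empty pieces produce no output because the range is empty.
        for start in range(0, len(piece), maxchars):
            units.append(piece[start:start + maxchars])
    return units


sample = "One. Two!\n\nThree?"
print(segment_sketch(sample, unit="paragraph"))
print(segment_sketch(sample, unit="sentence"))
```

A real sentence splitter would need to handle abbreviations and decimals, which this regular expression does not attempt.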


split_list

 split_list (input_list, chunk_size)

Split a list into chunks.
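
As a rough illustration of chunking, here is a minimal sketch. It assumes chunk_size is the maximum number of items per chunk and that the last chunk may be shorter; the name split_list_sketch is hypothetical, and whether the real function returns a list or a generator is not stated here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_list_sketch(input_list: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Hypothetical stand-in: yield successive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(input_list), chunk_size):
        # Slicing past the end is safe, so the final chunk is simply shorter.
        yield list(input_list[start:start + chunk_size])


print(list(split_list_sketch([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```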


get_template_vars

 get_template_vars (template_str:str)

Get template variables from a template string.
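
One way to extract variables from a template is sketched below. It assumes the template uses str.format-style braces such as {question}, which this page does not confirm; the name get_template_vars_sketch is hypothetical, and only the standard-library string.Formatter is used.

```python
from string import Formatter
from typing import List


def get_template_vars_sketch(template_str: str) -> List[str]:
    """Hypothetical stand-in: unique field names in order of first appearance."""
    seen: List[str] = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for _literal, field_name, _spec, _conversion in Formatter().parse(template_str):
        # field_name is None for trailing literal text and '' for a bare '{}'.
        if field_name and field_name not in seen:
            seen.append(field_name)
    return seen


template = "Answer {question} using {context}. Again: {question}"
print(get_template_vars_sketch(template))  # ['question', 'context']
```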


format_string

 format_string (string_to_format:str, **kwargs:str)

Format a string with kwargs.


SafeFormatter

 SafeFormatter (format_dict:Optional[Dict[str,str]]=None)

Safe string formatter that does not raise KeyError if a key is missing. Adapted from llama_index.
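
The idea of formatting without raising KeyError can be sketched as follows. This is an illustrative stand-in, not the class documented above: the name SafeFormatterSketch, its format method, and the choice to leave unknown placeholders untouched are assumptions.

```python
import re
from typing import Dict, Optional


class SafeFormatterSketch:
    """Hypothetical stand-in: substitute known keys, leave unknown ones as-is."""

    def __init__(self, format_dict: Optional[Dict[str, str]] = None):
        self.format_dict = format_dict or {}

    def format(self, format_string: str) -> str:
        # Replace each {key}; unknown keys fall back to the original placeholder.
        return re.sub(r"\{([^{}]+)\}", self._replace_match, format_string)

    def _replace_match(self, match: "re.Match[str]") -> str:
        key = match.group(1)
        return str(self.format_dict.get(key, match.group(0)))


formatter = SafeFormatterSketch({"name": "Ada"})
print(formatter.format("Hi {name}, see {topic}"))  # Hi Ada, see {topic}
```

By contrast, the built-in "Hi {name}, see {topic}".format(name="Ada") raises KeyError because topic is missing, which is the failure this kind of formatter avoids.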