Summarization

The pipelines module in OnPrem.LLM includes the Summarizer, which summarizes one or more documents with an LLM. This notebook shows a couple of examples.

Document Summarization

The Summarizer.summarize method runs multiple intermediate prompts and inferences, so we will set verbose=False and mute_stream=True. We will also set temperature=0 for more consistent outputs. Finally, we will use the Zephyr-7B-beta model with the appropriate prompt template obtained from here. You can experiment with different, newer models to improve results.

from onprem import LLM
from onprem.pipelines import Summarizer
llm = LLM(model_url='https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q4_K_M.gguf', 
          prompt_template= "<|system|>\n</s>\n<|user|>\n{prompt}</s>\n<|assistant|>",
          n_gpu_layers=-1, verbose=False, mute_stream=True, temperature=0) # set based on your system
summarizer = Summarizer(llm)

Next, let’s download the ktrain paper and summarize it.

!wget --user-agent="Mozilla" https://arxiv.org/pdf/2004.10703.pdf -O /tmp/ktrain.pdf -q
text = summarizer.summarize('/tmp/ktrain.pdf', max_chunks_to_use=5)
print(text['output_text'])

Ktrain is a low-code Python library that simplifies the machine learning process by providing a unified interface for building, training, inspecting, and applying models using various types of data such as text, vision, and tabular. It automates where possible but also allows users to make choices based on their unique requirements. Ktrain supports TensorFlow Keras models and includes out-of-the-box support for tasks like text classification, sequence tagging, image classification, node classification, and link prediction. It offers state-of-the-art models like BERT and fastText, learning rate finders, optimization techniques, explainable AI tools, and a simple prediction API for deployment. Ktrain is an open-source machine learning platform that automates various tasks beyond just model selection and architecture search in AutoML approaches. It provides text classification, regression, sequence tagging, topic modeling, document similarity, recommendation, summarization, and question answering for text data, as well as node classification and link prediction for graph data. Overall, ktrain aims to augment human engineers' strengths rather than replace them in the ML process.

Tip: For faster summarization, we set max_chunks_to_use=5 so that only the first five chunks of 1,000 characters are considered (chunk_size=1000 is the default). You can set max_chunks_to_use to None (or omit the parameter) to consider the entire document when generating the summary.
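For example, to consider the whole paper, you could call summarize without max_chunks_to_use (a minimal sketch; chunk_size is shown explicitly only to illustrate the default mentioned above and can be omitted):

# summarize the entire document instead of only the first five chunks
full_summary = summarizer.summarize('/tmp/ktrain.pdf', chunk_size=1000)
print(full_summary['output_text'])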

Concept-Focused Summarization

Concept-focused summarization allows you to summarize a long document with respect to a concept of interest. This can be accomplished by invoking the summarizer.summarize_by_concept method and supplying a concept_description.

In this example, we will use Ollama as the backend. You can install Ollama from here and download the model with: ollama pull llama3.1.
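If you are working in a notebook, the model can be pulled inline (assuming the ollama CLI is installed and on your PATH):

!ollama pull llama3.1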

Let’s summarize a National Defense Authorization Act (NDAA) report by the concept (or topic) of hypersonics.

Summarizing the NDAA by Concept of Hypersonics

from onprem import LLM, utils
from onprem.pipelines import Summarizer


# STEP 1: Load LLM and setup Summarizer
llm = LLM('ollama_chat/llama3.1', mute_stream=True, temperature=0)
summarizer = Summarizer(llm)

# STEP 2: download NDAA report
utils.download('https://www.congress.gov/118/crpt/hrpt125/CRPT-118hrpt125.pdf', '/tmp/ndaa/ndaa.pdf', verify=True)

# STEP 3: Summarize with respect to concept (e.g., hypersonics)
summary, sources = summarizer.summarize_by_concept('/tmp/ndaa/ndaa.pdf', concept_description="hypersonics")
print()
print(summary)
[██████████████████████████████████████████████████]
The context discusses "hypersonics" in several sections, highlighting the importance of advancing this technology for national defense. Here are some key points related to hypersonics:

1. **Education and Workforce Development**: The committee recommends strengthening partnerships with academic institutions to promote and educate students in hypersonic technology. It also suggests establishing a pilot program at select institutions to expand graduate and pre-doctoral degree programs.
2. **Funding for Advanced Hypersonics Facilities**: The committee recommends increasing funding for advanced hypersonics facilities for research and graduate-level education, allocating $543.9 million (an increase of $3.0 million) in PE 0601153N for hypersonic education efforts.
3. **Hypersonics Prototyping**: The committee mentions two programs related to hypersonics prototyping: HYPERSONICS PROTOTYPING ($150,340) and HYPERSONICS PROTOTYPING—HYPERSONIC ATTACK CRUISE MISSILE (HACM) ($381,528).
4. **Multi-Service Advanced Capability Hypersonics Test Bed (MACH-TB)**: The committee encourages the Department of Defense to fully fund the MACH-TB program in future budget requests to achieve full-scale flight test objectives and expansion of critical test infrastructure.
5. **Hypersonic Workforce Development**: The committee expresses concern about the Department's ability to sustain a highly skilled workforce for hypersonic technology development and recommends expanding and fully funding science, technology, engineering, and mathematics (STEM) programs, particularly in the field of hypersonics.

Overall, the context emphasizes the importance of advancing hypersonic technology for national defense and highlights the need for education, workforce development, and research investments to support this effort.
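The method also returns the source passages that informed the summary. As a rough sketch (assuming each entry behaves like a LangChain Document with page_content and metadata, which is not shown above), you can inspect them as follows:

# peek at the passages surfaced for the concept (structure assumed, not shown above)
for i, doc in enumerate(sources[:3]):
    print(f'--- source {i} ---')
    print(doc.page_content[:200])  # first 200 characters of the passage
    print(doc.metadata)            # e.g., originating file/page, if available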

Summarizing a Blog Post on LLMs with Respect to Prompting

from langchain.document_loaders import WebBaseLoader

# load the blog post and save its text to a local file for summarization
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()
with open('/tmp/blog.txt', 'w') as f:
    f.write(docs[0].page_content)

summary, sources = summarizer.summarize_by_concept('/tmp/blog.txt', concept_description="prompting")
print(summary)
The context discusses "prompting" as a technique used to interact with Large Language Models (LLMs) and influence their behavior, particularly in the context of autonomous agent systems.

In this context, prompting refers to providing specific instructions or questions to an LLM to elicit a particular response or action. The goal is to guide the model's thinking process and encourage it to generate more accurate, relevant, or useful outputs.

There are several types of prompts mentioned:

1. **Simple prompting**: Providing basic instructions, such as "Steps for XYZ.\n1." or "What are the subgoals for achieving XYZ?"
2. **Task-specific instructions**: Providing domain-specific guidance, like "Write a story outline" for writing a novel.
3. **Human inputs**: Incorporating human feedback or input into the prompting process.

The context also highlights various techniques that use prompting to enhance LLM performance on complex tasks, including:

1. **Chain of Thought (CoT)**: A technique that instructs the model to "think step by step" and decompose hard tasks into smaller steps.
2. **Tree of Thoughts**: An extension of CoT that explores multiple reasoning possibilities at each step.
3. **LLM+P**: A method that relies on an external classical planner to do long-horizon planning.

Overall, the context emphasizes the importance of prompting in guiding LLMs and enabling them to perform more effectively in complex tasks.

More Options

If needed, you can experiment with different parameters, as described in our documentation.
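For instance, with concept-focused summarization you can supply a more detailed concept_description to steer the summary toward a narrower topic (a sketch; the wording of the description is illustrative, not a required format):

# a more specific concept description narrows the focus of the summary
summary, sources = summarizer.summarize_by_concept(
    '/tmp/ndaa/ndaa.pdf',
    concept_description='hypersonic weapons testing infrastructure and workforce development')
print(summary)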