Talk to Your Documents

This OnPrem.LLM example demonstrates retrieval-augmented generation (RAG).

Set Up the LLM Instance

In this notebook, we will use a model called Zephyr-7B-beta, which performs well on RAG tasks. When selecting a model, it is important to inspect the model’s home page and identify the correct prompt format. The prompt format for this model is located here, and we will supply it directly to the LLM constructor along with the URL to the specific model file we want (i.e., zephyr-7b-beta.Q4_K_M.gguf). We will use the n_gpu_layers parameter to offload layers to our GPU(s) and speed up inference. (For more information on GPU acceleration, see here.) For the purposes of this notebook, we also supply temperature=0 to minimize variability in outputs. You can increase this value for more creativity in the outputs. Finally, we will choose a non-default location for our vector database using the vectordb_path parameter.

from onprem import LLM, utils as U
import tempfile
from textwrap import wrap
vectordb_path = tempfile.mkdtemp()

llm = LLM(model_url='https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q4_K_M.gguf', 
          prompt_template= "<|system|>\n</s>\n<|user|>\n{prompt}</s>\n<|assistant|>",
          n_gpu_layers=-1,
          temperature=0,
          store_type='dense',
          vectordb_path=vectordb_path,
          verbose=False)
llama_new_context_with_model: n_ctx_per_seq (3904) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
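
To make the role of the prompt format concrete, here is a minimal sketch of how a question might be substituted into the `{prompt}` placeholder of the template string used above. The `render_prompt` helper is hypothetical and is not part of OnPrem.LLM; how the library performs this substitution internally is not shown in this notebook.

```python
# Zephyr prompt template, copied from the LLM constructor call in this notebook.
PROMPT_TEMPLATE = "<|system|>\n</s>\n<|user|>\n{prompt}</s>\n<|assistant|>"


def render_prompt(question: str, template: str = PROMPT_TEMPLATE) -> str:
    """Hypothetical helper: fill the {prompt} placeholder with the question."""
    # str.replace is used instead of str.format so that any curly braces
    # appearing in the question itself are left untouched.
    return template.replace("{prompt}", question)


rendered = render_prompt("What is said about hypersonics?")
print(rendered)
```

Printing the rendered string is a quick way to check that the special tokens match the format documented on the model's home page before running any queries.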

Since OnPrem.LLM includes built-in support for Zephyr, an easier way to instantiate the LLM with Zephyr is as follows:

llm = LLM(default_model='zephyr', 
          n_gpu_layers=-1,
          temperature=0,
          store_type='dense',
          vectordb_path=vectordb_path)

Ingest Documents

Ingested documents can be stored in one of two ways:

1. a dense vector store: a conventional vector database like Chroma
2. a sparse vector store: a keyword-search engine

Sparse vector stores compute embeddings on-the-fly at inference time. As a result, they sacrifice a small amount of inference speed for significantly faster ingestion, which makes them better suited to larger document sets. Note that sparse vector stores carry the constraint that any passage considered as a source for an answer must have at least one word in common with the question being asked. You can specify the kind of vector store by supplying either store_type="dense" or store_type="sparse" when creating the LLM above. We use a dense vector store in this example, as shown above.
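
The word-overlap constraint on sparse stores can be illustrated with a toy filter in plain Python. This is only a sketch of the idea, not OnPrem.LLM's implementation: the `tokenize` and `keyword_candidates` names are made up for illustration, and a real keyword-search engine would typically also apply ranking and other text normalization.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_candidates(question: str, passages: list) -> list:
    """Keep only passages sharing at least one word with the question."""
    question_words = tokenize(question)
    return [p for p in passages if question_words & tokenize(p)]


passages = [
    "The committee notes advancements in hypersonic capabilities.",
    "Cloud computing curricula accelerate career development.",
]
candidates = keyword_candidates("What about hypersonic weapons?", passages)
print(candidates)
```

Only the first passage survives, because it shares the word "hypersonic" with the question. A dense store has no such requirement, since it compares embeddings rather than literal words.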

For this example, we will download the 2024 National Defense Authorization Act (NDAA) report and ingest it.

U.download('https://www.congress.gov/118/crpt/hrpt125/CRPT-118hrpt125.pdf', '/tmp/ndaa/ndaa.pdf', verify=True)
[██████████████████████████████████████████████████]
llm.ingest("/tmp/ndaa/")
Creating new vectorstore at /tmp/tmpmnt6g6l8/dense
Loading documents from /tmp/ndaa/
Loading new documents: 100%|██████████████████████| 1/1 [00:00<00:00,  1.62it/s]
Processing and chunking 672 new documents: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 10.22it/s]
Split into 5202 chunks of text (max. 500 chars each for text; max. 2000 chars for tables)
Creating embeddings. May take some minutes...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:17<00:00,  2.95s/it]
Ingestion complete! You can now query your documents using the LLM.ask or LLM.chat methods

Asking Questions to Your Documents

result = llm.ask("What is said about artificial intelligence training and education?")

The context provided discusses the implementation of an AI education strategy required by Section 256 of the National Defense Authorization Act for Fiscal Year 2020. The strategy aims to educate servicemembers in relevant occupational fields, with a focus on data literacy across a broader population within the Department of Defense. The committee encourages the Air Force and Space Force to leverage government-owned training platforms informed by private sector expertise to accelerate learning and career path development. Additionally, the committee suggests expanding existing mobile enabled platforms to train and develop the cyber workforce of the Air Force and Space Force. Overall, there is a recognition that AI continues to be central to warfighting and that proper implementation of these new technologies requires a focus on education and training.

The answer is stored in result['answer']. The documents retrieved from the vector store and used to generate the answer are stored in result['source_documents'].

print('ANSWER:')
print("\n".join(wrap(result['answer'])))
print()
print()
print('REFERENCES')
print()
for d in result['source_documents']:
    print(f"On Page {d.metadata['page']} in {d.metadata['source']}:")
    print(d.page_content)
    print('----------------------------------------')
    print()
ANSWER:
 The context provided discusses the implementation of an AI education
strategy required by Section 256 of the National Defense Authorization
Act for Fiscal Year 2020. The strategy aims to educate servicemembers
in relevant occupational fields, with a focus on data literacy across
a broader population within the Department of Defense. The committee
encourages the Air Force and Space Force to leverage government-owned
training platforms informed by private sector expertise to accelerate
learning and career path development. Additionally, the committee
suggests expanding existing mobile enabled platforms to train and
develop the cyber workforce of the Air Force and Space Force. Overall,
there is a recognition that AI continues to be central to warfighting
and that proper implementation of these new technologies requires a
focus on education and training.


REFERENCES

On Page 359 in /tmp/ndaa/ndaa.pdf:
‘‘servicemembers in relevant occupational fields on matters relating 
to artificial intelligence.’’ 
Given the continued centrality of AI to warfighting, the com-
mittee directs the Chief Digital and Artificial Intelligence Officer of 
the Department of Defense to provide a briefing to the House Com-
mittee on Armed Services not later than March 31, 2024, on the 
implementation status of the AI education strategy, with emphasis 
on current efforts underway, such as the AI Primer course within
----------------------------------------

On Page 359 in /tmp/ndaa/ndaa.pdf:
intelligence (AI) and machine learning capabilities available within 
the Department of Defense. To ensure the proper implementation 
of these new technologies, there must be a focus on data literacy 
across a broader population within the Department. Section 256 of 
the National Defense Authorization Act for Fiscal Year 2020 (Pub-
lic Law 116–92) required the Department of Defense to develop an 
AI education strategy, with the stated objective to educate
----------------------------------------

On Page 102 in /tmp/ndaa/ndaa.pdf:
tificial intelligence and machine learning (AI/ML), and cloud com-
puting. The committee encourages the Air Force and Space Force 
to leverage government owned training platforms with curricula in-
formed by private sector expertise to accelerate learning and career 
path development. 
To that end, the committee encourages the Secretary of the Air 
Force to expand existing mobile enabled platforms to train and de-
velop the cyber workforce of Air Force and Space Force. To better
----------------------------------------

On Page 109 in /tmp/ndaa/ndaa.pdf:
70 
role of senior official with principal responsibility for artificial intel-
ligence and machine learning. In February 2022, the Department 
stood up the Chief Digital and Artificial Intelligence Office to accel-
erate the Department’s adoption of AI. The committee encourages 
the Department to build upon this progress and sustain efforts to 
research, develop, test, and where appropriate, operationalize AI 
capabilities. 
Artificial intelligence capabilities of foreign adversaries
----------------------------------------
result = llm.ask("What is said about hypersonics?")

The context provided highlights the importance of expanding and fully funding programs related to hypersonic technology. The House Committee on Armed Services has directed the Secretary of Defense to submit a report by December 1, 2023, detailing efforts to ensure the development and sustainment of a future hypersonic workforce. The committee notes concerns about advancements in hypersonic capabilities made by peer and near-peer adversaries, emphasizing the need for investments to enhance the ability to develop, test, and field advanced hypersonic capabilities. The lack of research and development funding directed towards fielding a reusable hypersonic platform with aircraft-like operations and qualities is also raised as a concern. To address this issue, the committee directs the Under Secretary of Defense to develop graduate and pre-doctoral degree programs for the hypersonics workforce and increase funding for advanced hypersonics facilities for research and graduate-level education. Innovation organizations are also identified as important in this context. Overall, the provided context highlights the significance of hypersonic technology and the need for continued investment and development in this area.
print('ANSWER:')
print("\n".join(wrap(result['answer'])))
print()
print()
print('REFERENCES')
print()
for d in result['source_documents']:
    print(f"On Page {d.metadata['page']} in {d.metadata['source']}:")
    print(d.page_content)
    print('----------------------------------------')
    print()
ANSWER:
 The context provided highlights the importance of expanding and fully
funding programs related to hypersonic technology. The House Committee
on Armed Services has directed the Secretary of Defense to submit a
report by December 1, 2023, detailing efforts to ensure the
development and sustainment of a future hypersonic workforce. The
committee notes concerns about advancements in hypersonic capabilities
made by peer and near-peer adversaries, emphasizing the need for
investments to enhance the ability to develop, test, and field
advanced hypersonic capabilities. The lack of research and development
funding directed towards fielding a reusable hypersonic platform with
aircraft-like operations and qualities is also raised as a concern. To
address this issue, the committee directs the Under Secretary of
Defense to develop graduate and pre-doctoral degree programs for the
hypersonics workforce and increase funding for advanced hypersonics
facilities for research and graduate-level education. Innovation
organizations are also identified as important in this context.
Overall, the provided context highlights the significance of
hypersonic technology and the need for continued investment and
development in this area.


REFERENCES

On Page 120 in /tmp/ndaa/ndaa.pdf:
lieves those programs should be expanded and fully funded, par-
ticularly in the field of hypersonic technology. 
Therefore, the committee directs the Secretary of Defense to sub-
mit a report to the House Committee on Armed Services not later 
than December 1, 2023, on the Department’s efforts to ensure the 
development and sustainment of its future hypersonic workforce. 
The report shall include: 
(1) an overview of hypersonic workforce development objectives
----------------------------------------

On Page 81 in /tmp/ndaa/ndaa.pdf:
velopment of carbon-carbon high temperature composites for 
hypersonic weapons. 
Hypersonics test infrastructure 
The committee notes with concern the advancements in 
hypersonic capabilities made by peer and near-peer adversaries. To 
ensure the U.S. military can effectively deter and, if necessary, de-
feat these national security threats, the Department of Defense 
must make investments to enhance its ability to develop, test, and 
field advanced hypersonic capabilities.
----------------------------------------

On Page 127 in /tmp/ndaa/ndaa.pdf:
clusion areas in the Indo-Pacific theater of operations. Peer adver-
saries continue to advance in hypersonic technology, including re-
usable systems, that pose a threat to U.S. national security inter-
ests. 
However, the committee is concerned by the lack of research and 
development 
funding 
directed 
towards 
fielding 
a 
reusable 
hypersonic platform with aircraft-like operations and qualities. 
Therefore, the committee directs the Under Secretary of Defense
----------------------------------------

On Page 120 in /tmp/ndaa/ndaa.pdf:
hypersonics workforce through the development of graduate and 
pre-doctoral degree programs; and 
(4) plans to increase funding for advanced hypersonics facilities 
for research and graduate-level education. 
Additionally, the committee recommends $543.9 million, an in-
crease of $3.0 million, in PE 0601153N for hypersonic education ef-
forts. 
Identifying innovation organizations 
The committee notes that with the success of the Defense Inno-
----------------------------------------
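
Because the same display code is repeated for each question, it may be convenient to wrap it in a small helper. The sketch below assumes only what the cells above show: a result dictionary with an 'answer' string and a 'source_documents' list whose items expose page_content and metadata. The `format_result` name and the stand-in `demo` data are hypothetical; in the notebook you would pass the value returned by llm.ask instead.

```python
from textwrap import wrap
from types import SimpleNamespace


def format_result(result: dict, width: int = 70) -> str:
    """Build a printable report of the answer and its source passages."""
    lines = ["ANSWER:", "\n".join(wrap(result["answer"], width)), "", "", "REFERENCES", ""]
    for d in result["source_documents"]:
        lines.append(f"On Page {d.metadata['page']} in {d.metadata['source']}:")
        lines.append(d.page_content)
        lines.append("-" * 40)  # dashed separator between passages
        lines.append("")
    return "\n".join(lines)


# Stand-in data shaped like the result used above (not real model output).
demo = {
    "answer": "A short placeholder answer.",
    "source_documents": [
        SimpleNamespace(
            page_content="placeholder passage",
            metadata={"page": 1, "source": "/tmp/example.pdf"},
        )
    ],
}
report = format_result(demo)
print(report)
```

Returning a string rather than printing directly keeps the helper easy to reuse, for example to write the report to a file.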