Using OpenAI Models

Even when using on-premises language models that run locally on your machine, it can sometimes be useful to have easy access to cloud-based models (e.g., OpenAI) for experimentation, comparison baselines, synthetic data generation, and similar tasks. For these reasons, and in spite of the name, OnPrem.LLM now includes support for OpenAI chat models.
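Before constructing the LLM, an OpenAI API key must be available. A minimal setup sketch, assuming the underlying OpenAI client reads the standard OPENAI_API_KEY environment variable (replace the placeholder with a real key):

```python
import os

# Supply your OpenAI API key before constructing the LLM.
# (Assumption: the underlying client reads the standard
# OPENAI_API_KEY environment variable.)
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; use your real key
```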

from onprem import LLM
llm = LLM(model_url='openai://gpt-3.5-turbo', temperature=0, mute_stream=True)
/home/amaiya/projects/ghub/onprem/onprem/core.py:139: UserWarning: The model you supplied is gpt-3.5-turbo, an external service (i.e., not on-premises). Use with caution, as your data and prompts will be sent externally.
  warnings.warn(f'The model you supplied is {self.model_name}, an external service (i.e., not on-premises). '+\

General Prompting

res = llm.prompt('I am an accountant, and I have to write a short resignation letter to my supervisor. '
                 'Write a draft of this letter using at most 5 sentences.')
print(res)
Dear [Supervisor's Name],

I hope this letter finds you well. I am writing to inform you of my decision to resign from my position as an accountant at [Company Name], effective [last working day, typically two weeks from the date of the letter]. I have thoroughly enjoyed my time working here and appreciate the opportunities for professional growth that I have been given. However, after careful consideration, I have decided to pursue a new opportunity that aligns more closely with my long-term career goals. I am committed to ensuring a smooth transition and will be available to assist with any necessary handover tasks. Thank you for your understanding.

Sincerely,
[Your Name]

Summarizing a Paper

import os
os.makedirs('/tmp/somepaper', exist_ok=True)
!wget --user-agent="Mozilla" https://arxiv.org/pdf/2004.10703.pdf -O /tmp/somepaper/paper.pdf -q
from onprem import pipelines
summarizer = pipelines.Summarizer(llm)
text = summarizer.summarize('/tmp/somepaper/paper.pdf', max_chunks_to_use=5)
print(text['output_text'])
ktrain is a low-code Python library that serves as a wrapper to TensorFlow and other libraries, simplifying the machine learning workflow for both beginners and experienced practitioners. It supports various data types and tasks such as text, vision, graph, and tabular data analysis. The library automates and streamlines processes like model building, inspection, and application. It offers features like text classification, regression, sequence tagging, topic modeling, document similarity, recommendation, summarization, and question-answering. ktrain provides options for choosing different models or using custom models and includes explainable AI features. It is open-source and available on GitHub.
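Under the hood, the summarizer splits the document into chunks and sends at most max_chunks_to_use of them to the model. The chunking step can be sketched in plain Python; split_into_chunks below is a hypothetical helper for illustration, not the library's actual implementation:

```python
def split_into_chunks(text, max_chars=500):
    """Greedily pack whitespace-separated words into chunks of at most
    max_chars characters (a toy stand-in for the library's splitter)."""
    chunks, current = [], ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        chunks.append(current)
    return chunks

chunks = split_into_chunks("lorem " * 300, max_chars=100)
# With max_chunks_to_use=5, only the first chunks would reach the LLM.
selected = chunks[:5]
```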

Answer Questions About a Paper

llm.ingest('/tmp/somepaper')
Creating new vectorstore at /home/amaiya/onprem_data/vectordb
Loading documents from /tmp/somepaper
Loaded 9 new documents from /tmp/somepaper
Split into 57 chunks of text (max. 500 chars each)
Creating embeddings. May take some minutes...
Ingestion complete! You can now query your documents using the LLM.ask or LLM.chat methods
Loading new documents: 100%|██████████| 1/1 [00:00<00:00, 16.04it/s]
100%|██████████| 1/1 [00:01<00:00,  1.53s/it]
res = llm.ask('What is said about image classification?')
print(res['answer'])
The example provided demonstrates building an image classifier using a standard ResNet50 model pretrained on ImageNet. The steps for image classification are similar to the previous text classification example, despite the tasks being completely different.
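LLM.ask answers by retrieving the ingested chunks most relevant to the question and supplying them to the model as context. The retrieval idea can be illustrated with a toy keyword-overlap scorer standing in for the embedding-based similarity the vector store actually uses:

```python
def score(query, chunk):
    """Toy relevance score: fraction of query words present in the chunk.
    (A stand-in for embedding similarity in a real vector store.)"""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q)

chunks = [
    "ktrain supports image classification with pretrained ResNet50 models",
    "the library also offers text summarization and question answering",
    "installation instructions are available on GitHub",
]
query = "what is said about image classification"
# Rank chunks by score and keep the best match as context for the model.
best = max(chunks, key=lambda ch: score(query, ch))
```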