Advanced Use Cases
Use Case 1: Building a Question-Answering System with Existing Collections using LangChain
In this example, we'll demonstrate how to build a question-answering system that uses NeoAthena for document retrieval and LangChain for orchestration. This application will:
Take a user question as input
Retrieve relevant documents from a NeoAthena collection
Process those documents into appropriate context
Generate a comprehensive answer using an LLM
Step 1: Installation and Setup
First, install the required packages:
pip install langchain langchain-openai langchain-core neoathena
Step 2: Import Dependencies
from langchain_openai import ChatOpenAI
from neoathena import NeoAthenaClient
from langchain import hub
from typing import List
from langchain_core.documents import Document
This imports:
ChatOpenAI: LangChain's wrapper for OpenAI's chat models
NeoAthenaClient: The client library for interacting with NeoAthena's API
hub: LangChain's prompt hub for accessing community-built prompts
Step 3: Configure API Keys
import os

os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

llm = ChatOpenAI(model='gpt-4')
client = NeoAthenaClient(api_key="your-neoathena-api-key")
Replace the placeholder API keys with your actual keys. This step establishes connections to:
OpenAI's API for the language model
NeoAthena's API for document retrieval
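Hardcoding keys is fine for quick experiments, but in practice you would load them from the environment. Here is a minimal sketch; note that the NEOATHENA_API_KEY variable name is an assumption made for this example, not an official convention:

import os
from getpass import getpass
from neoathena import NeoAthenaClient

# Read keys from the environment, prompting only if they are missing.
# NEOATHENA_API_KEY is a hypothetical variable name chosen for this sketch.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")

neoathena_key = os.environ.get("NEOATHENA_API_KEY") or getpass("NeoAthena API key: ")
client = NeoAthenaClient(api_key=neoathena_key)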
Step 4: Define the Retrieval Function
# Note: This example uses a collection that contains documents about climate change
def retrieve_docs(question: str):
    try:
        results = client.retrieve_from_collection(
            collection_name="your-collection-name",
            query=question,
            top_k=4
        )
        return results
    except Exception as e:
        print(f"Retrieval failed: {e}")
        return []  # Return an empty list so downstream formatting still works
This function:
Takes a user question as input
Queries NeoAthena's API to find relevant documents
Returns the top 4 most relevant documents (adjust as needed)
Includes error handling for reliability
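A quick way to sanity-check the function before wiring it into the chain (this assumes your collection is populated and that results come back as LangChain Document objects, which the formatting step below also expects):

docs = retrieve_docs("What drives sea level rise?")
for doc in docs:
    # Print the first 80 characters of each hit to eyeball relevance
    print(doc.page_content[:80], "...")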
Step 5: Format Retrieved Documents
def documents_to_string(documents: List[Document]) -> str:
    """Joins the page_content of all retrieved Document objects into a single string."""
    return "\n\n".join(doc.page_content for doc in documents)
This helper function converts the retrieved documents into a format suitable for the LLM prompt, combining all document contents with clear separation.
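For example, two stand-in documents would be joined like this:

sample = [
    Document(page_content="Greenhouse gases trap heat in the atmosphere."),
    Document(page_content="Warming oceans contribute to sea level rise."),
]
print(documents_to_string(sample))
# Greenhouse gases trap heat in the atmosphere.
#
# Warming oceans contribute to sea level rise.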
Step 6: Create the LangChain Pipeline
# Pull a RAG prompt template from LangChain's hub
prompt = hub.pull("rlm/rag-prompt")

# Build the chain
chain = (
    {
        "context": lambda x: documents_to_string(retrieve_docs(x["question"])),
        "question": lambda x: x["question"]
    }
    | prompt
    | llm
)
This step:
Uses LangChain's hub to access a pre-built RAG prompt template
Creates a processing chain that:
Takes a question
Retrieves documents from NeoAthena
Formats them as context
Combines them with the prompt template
Sends everything to the LLM for response generation
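If you prefer not to depend on the hub (for example, in offline environments), a locally defined template slots into the same chain. The wording below is a rough sketch, not the exact text of rlm/rag-prompt:

from langchain_core.prompts import ChatPromptTemplate

# A minimal stand-in for the hub prompt (approximate wording, not the real template)
local_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using only the context below.\n\nContext:\n{context}"),
    ("human", "{question}"),
])

chain = (
    {
        "context": lambda x: documents_to_string(retrieve_docs(x["question"])),
        "question": lambda x: x["question"]
    }
    | local_prompt
    | llm
)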
Step 7: Run the Question-Answering System
# Execute the chain with a question
response = chain.invoke({"question": "What is climate change?"})
# Print the response
print(response.content)
The system outputs a comprehensive answer that combines information retrieved from your NeoAthena collection with the reasoning capabilities of the LLM:
Response:
Climate change refers to significant alterations in global climates, typically characterized by a rise in global temperatures. This phenomenon is driven by human activities such as burning fossil fuels, deforestation, and industrial emissions, which release greenhouse gases into the atmosphere, trapping heat. The effects of climate change include extreme weather events, rising sea levels, disruptions to ecosystems, threats to biodiversity, and substantial economic impacts.
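Because the chain ends in a chat model, you can also stream the answer token by token instead of waiting for the full response; a minimal sketch:

# Stream the answer as it is generated
for chunk in chain.stream({"question": "What is climate change?"}):
    print(chunk.content, end="", flush=True)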
Key Benefits of This Integration
Simplicity: Just a few lines of code to create a powerful RAG system
Flexibility: Easily change LLMs, prompt templates, or retrieval parameters
Scalability: Works with collections of any size in NeoAthena
Accuracy: Combines NeoAthena's precise retrieval with LLM reasoning
Use Case 2: Building a Document Summarization Bot with LangGraph & NeoAthena
Step 1: Installation and Setup
Install the required packages:
pip install langchain langchain-core langchain-openai langgraph neoathena
Step 2: Import Dependencies
Import all the modules needed for the application:
from langchain import hub
from langchain_core.documents import Document
from neoathena import NeoAthenaClient
from langgraph.graph import START, StateGraph
from typing_extensions import List, TypedDict
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
Step 3: Configure API Keys
import os

os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

llm = ChatOpenAI(model='gpt-4')
client = NeoAthenaClient(api_key="your-neoathena-api-key")
Replace the placeholder API keys with your actual keys.
Step 4: Define Summarization Prompt
prompt = ChatPromptTemplate.from_messages(
    [("system", "Write a concise summary of the following:\n\n{context}")]
)
This creates a simple instruction that tells the LLM to summarize whatever content is passed to it.
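You can check how the template renders before wiring it into the graph; a quick inspection with placeholder text:

# Render the prompt with stand-in content to see the final message
print(prompt.invoke({"context": "Example document text."}).to_messages())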
Step 5: Create State Structure
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str
This defines what information the system needs to track during processing: the user's question, the retrieved documents, and the final answer.
Step 6: Implement Retrieval Function
def retrieve(state: State):
    try:
        retrieved_docs = client.retrieve_from_collection(
            collection_name="your-collection-name",
            query=state["question"],
            top_k=4
        )
        return {"context": retrieved_docs}
    except Exception as e:
        print(f"Retrieval failed: {e}")
        return {"context": []}  # Keep the graph running with an empty context
This function searches for relevant documents in the NeoAthena collection based on the user's question, retrieving the top 4 matches.
Step 7: Implement Generation Function
def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}
This function combines all the retrieved documents, sends them to the AI model along with our prompt, and gets back a concise summary.
Step 8: Build the LangGraph Workflow
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()
Connect the functions into a workflow: first retrieve relevant documents, then generate a summary from them.
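To watch each node's partial state update as the graph runs (useful for debugging), the compiled graph can also be streamed; a minimal sketch using LangGraph's "updates" stream mode:

for step in graph.stream({"question": "What is climate change?"}, stream_mode="updates"):
    # Each step is a dict keyed by the node that just finished, e.g. {"retrieve": {...}}
    print(list(step.keys()))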
Step 9: Execute and Test
response = graph.invoke({"question": "What is climate change?"})
print(response["answer"])
Response:
Climate change, primarily caused by human activities such as burning fossil fuels, deforestation, and industrial emissions, is leading to rising temperatures, extreme weather events, and disruptions to ecosystems. This is putting biodiversity and human lives at risk, particularly in vulnerable communities. The economic impact on agriculture, tourism, and infrastructure is severe. If no immediate action is taken to reduce emissions, the situation will worsen. Solutions include investing in renewable energy, promoting energy efficiency, protecting forests, and advancing public awareness and education. Immediate global action and cooperation can combat climate change and ensure a sustainable future.
The summarization bot generates a concise overview by extracting key information from documents in your NeoAthena collection and using the LLM to synthesize this content into a coherent summary that captures the essential points of the original material.
Next Steps
By combining NeoAthena's effortless document management and retrieval capabilities with LangChain and LangGraph's flexible orchestration, you can quickly build sophisticated AI applications that leverage your organization's knowledge. Whether you need question-answering systems or document summarization tools, this integration provides a powerful foundation for creating intelligent document processing solutions tailored to your specific needs.