
Advanced Features

Power user features and advanced configuration options for ChronoScope.


LangChain Integration (Optional)

ChronoScope supports two methods for AI-powered event extraction from your documents. Understanding the differences can help you choose the right approach for your needs.

Overview

| Feature | Direct OpenAI (Default) | LangChain Integration (Optional) |
| --- | --- | --- |
| Extraction Quality | ✅ Excellent | ✅ Excellent (identical) |
| Setup Complexity | ✅ Simple | ⚠️ Requires extra dependencies |
| Cost | ✅ Standard OpenAI pricing | ✅ Same pricing |
| Speed | ✅ Fast | ✅ Fast (similar) |
| Error Handling | ✅ Basic retry logic | ✅✅ Advanced retry & fallback |
| Debugging Tools | ⚠️ Limited | ✅✅ Comprehensive logging |
| Prompt Management | ⚠️ String-based | ✅✅ Template-based |
| Multi-Provider Support | ❌ OpenAI only | ✅ Easy to swap AI providers |

Direct OpenAI Integration (Default)

How it works:

ChronoScope sends your document text directly to OpenAI's API with extraction instructions. The AI reads the text and returns structured timeline events (dates, titles, locations, etc.).
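As a sketch of what that structured output looks like (the field names and sample values here are illustrative, not ChronoScope's exact schema):

```python
import json

# Hypothetical shape of the events returned by the extraction call;
# ChronoScope's real schema may use different or additional fields.
raw_response = """[
  {"date": "2021-06", "title": "Software Engineer, Acme Corp", "location": "Berlin"},
  {"date": "2023-01", "title": "Senior Engineer, Acme Corp", "location": "Berlin"}
]"""

events = json.loads(raw_response)
for event in events:
    print(event["date"], "-", event["title"])
```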

Pros:

  • Simple setup - No extra dependencies needed
  • Lightweight - Fewer packages to install and maintain
  • Reliable - Direct API calls with straightforward error handling
  • Easy to understand - Minimal abstraction layers

Cons:

  • ⚠️ Limited debugging - Less visibility into prompt/response details
  • ⚠️ Basic error handling - Simple retry logic only
  • ⚠️ Harder to customize - Prompts embedded in code

Confidence Score: 0.80 (80%)

Best for:

  • Most users who just want to build timelines
  • Simple extraction workflows
  • Users who don't need advanced debugging

LangChain Integration (Optional)

How it works:

LangChain acts as a sophisticated middleware layer between ChronoScope and OpenAI. It provides structured prompt templates, automatic retry logic, enhanced logging, and easier provider switching.

Pros:

  • Structured prompts - Template-based prompt management with validation
  • Better error recovery - Automatic retries with exponential backoff
  • Enhanced debugging - Detailed logs of prompts, responses, and token usage
  • Easy customization - Modify prompts without touching core code
  • Provider flexibility - Can switch from OpenAI to Anthropic, local models, etc.
  • Advanced features - Chain multiple LLM calls, output parsing validation

Cons:

  • ⚠️ Extra dependencies - Requires langchain and langchain-openai packages
  • ⚠️ Slightly more complex - Additional abstraction layer
  • ⚠️ Larger installation - More packages to download and update

Confidence Score: 0.85 (85%) - slightly higher due to structured parsing

Best for:

  • Developers customizing extraction prompts
  • Users needing detailed debugging information
  • Teams wanting to track API usage and costs
  • Advanced users experiencing extraction reliability issues
  • Future-proofing for multi-provider support

Installation

Check your current setup:

If you see this message in ChronoScope:

ℹ️ Using direct OpenAI integration. LangChain not installed.

You're using the Direct OpenAI method (default).

To enable LangChain:

  1. Activate your virtual environment:

    source .venv/bin/activate  # macOS/Linux
    # or
    .venv\Scripts\activate  # Windows
    

  2. Install LangChain packages:

    pip install langchain langchain-openai
    

  3. Restart ChronoScope:

    streamlit run timeline-mvp-pipeline.py
    

The info message will disappear and ChronoScope will automatically use LangChain for all document processing.

To verify:

Check the "Advanced Settings" panel (sidebar) → "LLM Transparency" section. You should see:

  • ✅ Method: langchain (instead of direct_openai)
  • ✅ Confidence scores: 0.85 (instead of 0.80)
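ChronoScope's own detection code isn't shown in this guide, but the check it performs at startup can be sketched with the standard library (the function name below is illustrative, not ChronoScope's actual API):

```python
import importlib.util

def langchain_available() -> bool:
    """Return True if the optional LangChain packages are importable."""
    return (
        importlib.util.find_spec("langchain") is not None
        and importlib.util.find_spec("langchain_openai") is not None
    )

# Mirrors the method label shown in the LLM Transparency panel.
method = "langchain" if langchain_available() else "direct_openai"
print(f"Extraction method: {method}")
```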

Technical Differences

Prompt Management

Direct OpenAI:

# Prompts are formatted strings
prompt = f"""Extract timeline events from this resume:
{document_text}
Return JSON array with dates, titles, locations..."""

response = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)

LangChain:

# Prompts are reusable templates
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

template = PromptTemplate(
    input_variables=["document_text", "doc_type"],
    template="""Extract timeline events from this {doc_type}:
    {document_text}
    Return JSON array with dates, titles, locations..."""
)

chain = LLMChain(llm=langchain_llm, prompt=template)
response = chain.run(document_text=text, doc_type="resume")

Error Handling

Direct OpenAI:

try:
    response = openai_client.chat.completions.create(...)
except Exception as e:
    st.error(f"Extraction failed: {e}")
    # Fall back to rule-based extraction

LangChain:

# Built-in retry logic with exponential backoff
# Automatic fallback chains
# Detailed error logging with context
# Token usage tracking
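LangChain ships this retry behaviour internally; as a rough illustration of what exponential backoff means, here is a minimal hand-rolled version (not LangChain's actual implementation):

```python
import time

def call_with_backoff(fn, max_retries=3, base_delay=1.0):
    """Retry fn(), doubling the wait after each failure."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Example: a call that succeeds on the third attempt.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient API error")
    return "ok"

result = call_with_backoff(flaky, base_delay=0.01)
print(result)
```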

Output Parsing

Direct OpenAI:

  • Parse JSON response manually
  • Basic validation
  • String matching for fields

LangChain:

  • Structured output parsers
  • Schema validation
  • Type checking
  • Automatic correction for common LLM output issues
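The manual path on the Direct OpenAI side looks roughly like this (a simplified sketch; ChronoScope's real validation is more thorough, and the required-field set here is an illustrative subset):

```python
import json

REQUIRED_FIELDS = {"date", "title"}  # illustrative subset of the event schema

def parse_events(raw: str) -> list:
    """Manually parse and validate the model's JSON output."""
    try:
        events = json.loads(raw)
    except json.JSONDecodeError:
        return []  # a structured parser could attempt correction instead
    # Keep only events that are objects containing every required field.
    return [e for e in events
            if isinstance(e, dict) and REQUIRED_FIELDS <= e.keys()]

good = '[{"date": "2020", "title": "BSc"}, {"title": "missing date"}]'
print(parse_events(good))  # keeps only the complete event
```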


Performance Considerations

Extraction Speed:

Both methods have similar speed:

  • Direct OpenAI: ~2-5 seconds per document
  • LangChain: ~2-6 seconds per document (slightly slower due to validation)

API Costs:

Identical! Both use the same OpenAI model (gpt-3.5-turbo by default) with the same token usage.

Memory Usage:

  • Direct OpenAI: ~50MB additional memory
  • LangChain: ~150MB additional memory (due to extra packages)

Switching Between Methods

You can switch at any time without losing data:

Remove LangChain (go back to Direct OpenAI):

pip uninstall langchain langchain-openai

Reinstall LangChain:

pip install langchain langchain-openai

Your timeline_events.json file remains unchanged. All previously extracted events are preserved regardless of which method was used.


Debugging with LangChain

When LangChain is enabled, you get enhanced debugging in Advanced Settings:

LLM Transparency Panel:

  • View exact prompts sent to OpenAI
  • See raw AI responses before parsing
  • Track token usage per document
  • Monitor processing time
  • Review confidence scores
  • Check extraction method used

Access it:

  1. Open ChronoScope
  2. Sidebar → Toggle "Advanced Settings"
  3. Scroll to "LLM Transparency"
  4. Expand extraction log entries


Recommendations

Use Direct OpenAI if:

  • ✅ You just want to build timelines (most users)
  • ✅ You prefer simpler installations
  • ✅ You don't need advanced debugging
  • ✅ Extraction works well for your documents

Use LangChain if:

  • ✅ You're customizing extraction prompts
  • ✅ You need detailed API usage tracking
  • ✅ You want better error recovery
  • ✅ You plan to switch AI providers later
  • ✅ You're a developer integrating ChronoScope
  • ✅ You experience frequent extraction failures

Our recommendation: Start with Direct OpenAI (default). Only install LangChain if you encounter specific needs for its advanced features.


Other Advanced Settings

[Content coming soon - other power user features]


Back to Documentation Home