
Advanced Features

Power user features and advanced configuration options for ChronoScope.


LangChain Integration (Optional)

ChronoScope supports two methods for AI-powered event extraction from your documents. Understanding the differences can help you choose the right approach for your needs.

Overview

| Feature | Direct OpenAI (Default) | LangChain Integration (Optional) |
| --- | --- | --- |
| Extraction Quality | ✅ Excellent | ✅ Excellent (identical) |
| Setup Complexity | ✅ Simple | ⚠️ Requires extra dependencies |
| Cost | ✅ Standard OpenAI pricing | ✅ Same pricing |
| Speed | ✅ Fast | ✅ Fast (similar) |
| Error Handling | ✅ Basic retry logic | ✅✅ Advanced retry & fallback |
| Debugging Tools | ⚠️ Limited | ✅✅ Comprehensive logging |
| Prompt Management | ⚠️ String-based | ✅✅ Template-based |
| Multi-Provider Support | ❌ OpenAI only | ✅ Easy to swap AI providers |

Direct OpenAI Integration (Default)

How it works:

ChronoScope sends your document text directly to OpenAI's API with extraction instructions. The AI reads the text and returns structured timeline events (dates, titles, locations, etc.).
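As a sketch of what that structured output looks like (the field names and sample values here are illustrative, not ChronoScope's exact schema):

```python
import json

# Hypothetical shape of the events returned by the extraction call;
# ChronoScope's real schema may use different or additional fields.
raw_response = """[
  {"date": "2021-06", "title": "Software Engineer, Acme Corp", "location": "Berlin"},
  {"date": "2023-01", "title": "Senior Engineer, Acme Corp", "location": "Berlin"}
]"""

events = json.loads(raw_response)
for event in events:
    print(event["date"], "-", event["title"])
```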

Pros:

  • Simple setup - No extra dependencies needed
  • Lightweight - Fewer packages to install and maintain
  • Reliable - Direct API calls with straightforward error handling
  • Easy to understand - Minimal abstraction layers

Cons:

  • ⚠️ Limited debugging - Less visibility into prompt/response details
  • ⚠️ Basic error handling - Simple retry logic only
  • ⚠️ Harder to customize - Prompts embedded in code

Confidence Score: 0.80 (80%)

Best for:

  • Most users who just want to build timelines
  • Simple extraction workflows
  • Users who don't need advanced debugging

LangChain Integration (Optional)

How it works:

LangChain acts as a sophisticated middleware layer between ChronoScope and OpenAI. It provides structured prompt templates, automatic retry logic, enhanced logging, and easier provider switching.

Pros:

  • Structured prompts - Template-based prompt management with validation
  • Better error recovery - Automatic retries with exponential backoff
  • Enhanced debugging - Detailed logs of prompts, responses, and token usage
  • Easy customization - Modify prompts without touching core code
  • Provider flexibility - Can switch from OpenAI to Anthropic, local models, etc.
  • Advanced features - Chain multiple LLM calls, output parsing validation

Cons:

  • ⚠️ Extra dependencies - Requires langchain and langchain-openai packages
  • ⚠️ Slightly more complex - Additional abstraction layer
  • ⚠️ Larger installation - More packages to download and update

Confidence Score: 0.85 (85%) - slightly higher due to structured parsing

Best for:

  • Developers customizing extraction prompts
  • Users needing detailed debugging information
  • Teams wanting to track API usage and costs
  • Advanced users experiencing extraction reliability issues
  • Future-proofing for multi-provider support

Installation

Check your current setup:

If you see this message in ChronoScope:

ℹ️ Using direct OpenAI integration. LangChain not installed.

You're using the Direct OpenAI method (default).

To enable LangChain:

  1. Activate your virtual environment:

    source .venv/bin/activate  # macOS/Linux
    # or
    .venv\Scripts\activate  # Windows
    

  2. Install LangChain packages:

    pip install langchain langchain-openai
    

  3. Restart ChronoScope:

    streamlit run timeline-mvp-pipeline.py
    

The info message will disappear and ChronoScope will automatically use LangChain for all document processing.

To verify:

Check the "Advanced Settings" panel (sidebar) → "LLM Transparency" section. You should see:

  • ✅ Method: langchain (instead of direct_openai)
  • ✅ Confidence scores: 0.85 (instead of 0.80)
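ChronoScope's own detection code isn't shown in this guide, but the check it performs at startup can be sketched with the standard library (the function name below is illustrative, not ChronoScope's actual API):

```python
import importlib.util

def langchain_available() -> bool:
    """Return True if the optional LangChain packages are importable."""
    return (
        importlib.util.find_spec("langchain") is not None
        and importlib.util.find_spec("langchain_openai") is not None
    )

# Mirrors the method label shown in the LLM Transparency panel.
method = "langchain" if langchain_available() else "direct_openai"
print(f"Extraction method: {method}")
```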

Technical Differences

Prompt Management

Direct OpenAI:

# Prompts are formatted strings
prompt = f"""Extract timeline events from this resume:
{document_text}
Return JSON array with dates, titles, locations..."""

response = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)

LangChain:

# Prompts are reusable templates
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

template = PromptTemplate(
    input_variables=["document_text", "doc_type"],
    template="""Extract timeline events from this {doc_type}:
    {document_text}
    Return JSON array with dates, titles, locations..."""
)

chain = LLMChain(llm=langchain_llm, prompt=template)
response = chain.run(document_text=text, doc_type="resume")

Error Handling

Direct OpenAI:

try:
    response = openai_client.chat.completions.create(...)
except Exception as e:
    st.error(f"Extraction failed: {e}")
    # Fall back to rule-based extraction

LangChain:

# Built-in retry logic with exponential backoff
# Automatic fallback chains
# Detailed error logging with context
# Token usage tracking
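LangChain ships this retry behaviour internally; as a rough illustration of what exponential backoff means, here is a minimal hand-rolled version (not LangChain's actual implementation):

```python
import time

def call_with_backoff(fn, max_retries=3, base_delay=1.0):
    """Retry fn(), doubling the wait after each failure."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Example: a call that succeeds on the third attempt.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient API error")
    return "ok"

result = call_with_backoff(flaky, base_delay=0.01)
print(result)
```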

Output Parsing

Direct OpenAI:

  • Parse JSON response manually
  • Basic validation
  • String matching for fields

LangChain:

  • Structured output parsers
  • Schema validation
  • Type checking
  • Automatic correction for common LLM output issues
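The manual path on the Direct OpenAI side looks roughly like this (a simplified sketch; ChronoScope's real validation is more thorough, and the required-field set here is an illustrative subset):

```python
import json

REQUIRED_FIELDS = {"date", "title"}  # illustrative subset of the event schema

def parse_events(raw: str) -> list:
    """Manually parse and validate the model's JSON output."""
    try:
        events = json.loads(raw)
    except json.JSONDecodeError:
        return []  # a structured parser could attempt correction instead
    # Keep only events that are objects containing every required field.
    return [e for e in events
            if isinstance(e, dict) and REQUIRED_FIELDS <= e.keys()]

good = '[{"date": "2020", "title": "BSc"}, {"title": "missing date"}]'
print(parse_events(good))  # keeps only the complete event
```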


Performance Considerations

Extraction Speed:

Both methods have similar speed:

  • Direct OpenAI: ~2-5 seconds per document
  • LangChain: ~2-6 seconds per document (slightly slower due to validation)

API Costs:

Identical! Both use the same OpenAI model (gpt-3.5-turbo by default) with the same token usage.

Memory Usage:

  • Direct OpenAI: ~50MB additional memory
  • LangChain: ~150MB additional memory (due to extra packages)

Switching Between Methods

You can switch at any time without losing data:

Remove LangChain (go back to Direct OpenAI):

pip uninstall langchain langchain-openai

Reinstall LangChain:

pip install langchain langchain-openai

Your timeline_events.json file remains unchanged. All previously extracted events are preserved regardless of which method was used.


Debugging with LangChain

When LangChain is enabled, you get enhanced debugging in Advanced Settings:

LLM Transparency Panel:

  • View exact prompts sent to OpenAI
  • See raw AI responses before parsing
  • Track token usage per document
  • Monitor processing time
  • Review confidence scores
  • Check extraction method used

Access it:

  1. Open ChronoScope
  2. Sidebar → Toggle "Advanced Settings"
  3. Scroll to "LLM Transparency"
  4. Expand extraction log entries


Recommendations

Use Direct OpenAI if:

  • ✅ You just want to build timelines (most users)
  • ✅ You prefer simpler installations
  • ✅ You don't need advanced debugging
  • ✅ Extraction works well for your documents

Use LangChain if:

  • ✅ You're customizing extraction prompts
  • ✅ You need detailed API usage tracking
  • ✅ You want better error recovery
  • ✅ You plan to switch AI providers later
  • ✅ You're a developer integrating ChronoScope
  • ✅ You experience frequent extraction failures

Our recommendation: Start with Direct OpenAI (default). Only install LangChain if you encounter specific needs for its advanced features.


Other Advanced Settings

[Content coming soon - other power user features]


Back to Documentation Home