
User Guide Overview

Welcome to Citation Compass! This guide helps you choose the right interface for your needs and get started with common workflows.


Platform Access

Choose your interface based on how you work:

| Interface | When to Use | Quick Start |
|-----------|-------------|-------------|
| 🖥️ Dashboard | Interactive exploration, demos, quick analysis | `streamlit run app.py` |
| 📓 Notebooks | Reproducible research, custom analysis, model training | `jupyter notebook notebooks/` |
| 🔌 Python API | Automation, integration with existing tools | `from src.services import get_ml_service` |

New to Citation Compass? Start with the dashboard: it requires no code and provides immediate visualization.


User Personas

Different users have different needs. We've designed Citation Compass to support three main personas, each with tailored workflows and entry points.

User Journey

🎓 Academic Researchers

You want: Citation analysis, research discovery, network visualization

Start here:

  1. Interactive Features - Dashboard walkthrough

  2. Network Analysis - Community detection and metrics

  3. Results Interpretation - Export for publications

🤖 Data Scientists

You want: Custom models, batch predictions, reproducible pipelines

Start here:

  1. Notebook Pipeline - 4-notebook workflow

  2. ML Predictions - Model training and evaluation

  3. Developer Guide - API customization

📊 Research Administrators

You want: Usage monitoring, team reports, performance tracking

Start here:

  1. Interactive Features - Dashboard overview

  2. Results Interpretation - Report generation

  3. Network Analysis - Performance metrics


The user journey diagram shows the typical flow from initial exploration through data analysis to final publication. Notice how most users start with demo mode to learn features before importing their own data.


Getting Started Paths

Goal: Try features with demo data (no setup required)

Steps:

  1. Launch: `streamlit run app.py`
  2. Load the demo dataset from the sidebar
  3. Try Interactive Features
  4. Export results

Time: 10 minutes

Goal: Analyze your citation network comprehensively

Steps:

  1. Configure the Neo4j database (setup guide)
  2. Import data via Data Import
  3. Follow the Notebook Pipeline
  4. Train models with ML Predictions

Time: 2-3 hours
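The Neo4j connection is usually configured through environment variables before launching the app. The exact variable names depend on your installation, so treat the entries below as a hypothetical sketch and check the setup guide for the names your install expects:

```bash
# Hypothetical .env entries — confirm names against the setup guide
NEO4J_URI=bolt://localhost:7687   # Bolt endpoint of your Neo4j instance
NEO4J_USERNAME=neo4j              # default Neo4j user
NEO4J_PASSWORD=<your-password>    # set during database creation
```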

Goal: Integrate into existing research workflows

Steps:

  1. Review the Developer Guide
  2. Check the API Reference
  3. Use the Python API for automation
  4. Build custom dashboards/reports

Time: Varies by complexity


Common Workflows

πŸ” Citation Discovery

Goal: Find related papers you might have missed

```mermaid
flowchart LR
    A[Input Paper] --> B[Generate Predictions]
    B --> C[Explore Embeddings]
    C --> D[Validate Results]
    D --> E[Export Reading List]

    style A fill:#e3f2fd
    style B fill:#fff3e0
    style C fill:#e8f5e8
    style D fill:#fce4ec
    style E fill:#f1f8e9
```
  1. Input: Choose a paper from your network
  2. Predict: Run ML predictions (guide)
  3. Explore: Visualize in embedding space
  4. Validate: Check against known citations
  5. Export: Save reading list with scores
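The export step above can be sketched in plain Python: rank predicted citations by score and emit a CSV reading list. The function name, tuple layout, and paper IDs below are illustrative, not the project's actual API:

```python
import csv
import io

def export_reading_list(predictions, top_k=10):
    """Rank predicted citations by score (descending) and return CSV text.

    predictions: list of (paper_id, title, score) tuples, e.g. collected
    from the ML prediction step. Write the returned text to a file as needed.
    """
    ranked = sorted(predictions, key=lambda p: p[2], reverse=True)[:top_k]
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["rank", "paper_id", "title", "score"])
    for rank, (pid, title, score) in enumerate(ranked, start=1):
        writer.writerow([rank, pid, title, f"{score:.3f}"])
    return buf.getvalue()

# Hypothetical predictions for one input paper
preds = [
    ("arxiv:2101.00001", "Graph Embeddings", 0.91),
    ("arxiv:1905.00002", "Link Prediction Survey", 0.87),
    ("arxiv:2005.00003", "TransE Revisited", 0.95),
]
# Rank 1 is the highest-scoring paper ("TransE Revisited")
print(export_reading_list(preds, top_k=2))
```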

πŸ•ΈοΈ Network Analysis

Goal: Understand citation communities and influence

```mermaid
flowchart LR
    A[Load Network] --> B[Compute Metrics]
    B --> C[Detect Communities]
    C --> D[Analyze Trends]
    D --> E[Generate Report]

    style A fill:#ffebee
    style B fill:#e0f2f1
    style C fill:#f3e5f5
    style D fill:#e8f5e8
    style E fill:#fff3e0
```
  1. Load: Import data or use demo
  2. Metrics: Calculate centrality (guide)
  3. Communities: Run Louvain or label propagation
  4. Trends: Analyze temporal patterns
  5. Report: Export LaTeX tables
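To illustrate the community-detection step, here is a minimal, deterministic label-propagation pass over a toy citation graph. This is a generic sketch in pure Python (no project dependencies), not the implementation Citation Compass ships; the graph and node IDs are made up:

```python
from collections import Counter

def label_propagation(adj, max_iter=20):
    """Minimal deterministic label propagation: every node repeatedly adopts
    the most common label among its neighbours (ties broken by the larger
    label), visiting nodes in sorted order, until no label changes."""
    labels = {n: n for n in adj}
    for _ in range(max_iter):
        changed = False
        for node in sorted(adj):  # fixed visit order => deterministic result
            if not adj[node]:
                continue
            counts = Counter(labels[nb] for nb in adj[node])
            best = max(counts.items(), key=lambda kv: (kv[1], kv[0]))[0]
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels

# Toy undirected citation graph: two tightly-knit triangles of papers
# (0-1-2 and 3-4-5) bridged by a single edge between 2 and 3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
labels = label_propagation(adj)
# Each triangle converges to a shared label, i.e. two communities.
```

The Louvain alternative mentioned above optimizes modularity instead of propagating labels; it generally gives more stable partitions at higher cost.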

🤖 Model Training

Goal: Train custom ML model on your data

```mermaid
flowchart LR
    A[Prepare Data] --> B[Train Model]
    B --> C[Evaluate]
    C --> D[Predict]
    D --> E[Deploy]

    style A fill:#f1f8e9
    style B fill:#e3f2fd
    style C fill:#fce4ec
    style D fill:#fff3e0
    style E fill:#e8f5e8
```
  1. Data: Import citation network
  2. Train: Use notebook 02 for TransE training
  3. Evaluate: Check MRR, Hits@K metrics
  4. Predict: Generate citations
  5. Deploy: Save model for dashboard use
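Notebook 02 handles the actual TransE training; the scoring function and the evaluation metrics from step 3 can be sketched generically as follows. This is a plain-Python illustration of the standard definitions, not the project's implementation, and the example ranks are hypothetical:

```python
def transe_score(h, r, t):
    """TransE plausibility score for a (head, relation, tail) triple:
    the closer h + r is to t (L1 distance), the higher the score."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def mrr(ranks):
    """Mean reciprocal rank over held-out citation edges."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of held-out edges whose true target ranks in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the true cited paper among all candidate tails,
# one rank per held-out citation.
ranks = [1, 3, 2, 10, 1]
print(round(mrr(ranks), 3))  # 0.587
print(hits_at_k(ranks, 3))   # 0.8
```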

Feature Matrix

Feature Availability by Interface

| Feature | Dashboard | Notebooks | API |
|---------|-----------|-----------|-----|
| Citation Prediction | ✅ Interactive | ✅ Customizable | ✅ Programmatic |
| Network Analysis | ✅ Visual | ✅ Detailed | ✅ Batch |
| Community Detection | ✅ Real-time | ✅ Multiple Algorithms | ✅ Scalable |
| Temporal Analysis | ✅ Interactive | ✅ Advanced | ✅ Automated |
| Export Capabilities | ✅ Basic | ✅ Advanced | ✅ Custom |
| Model Training | ❌ | ✅ Full Pipeline | ✅ Programmatic |
| Custom Visualization | ❌ | ✅ Matplotlib/Plotly | ✅ Programmatic |
| Batch Processing | ❌ | ✅ Yes | ✅ Scalable |

  • Dashboard: Best for exploration and demos
  • Notebooks: Best for research and custom analysis
  • API: Best for automation and integration


Best Practices

Analysis Tips

  • Start small: Use demo datasets before loading large networks
  • Validate results: Cross-check predictions with domain expertise
  • Document settings: Record parameters for reproducibility
  • Iterate: Dashboard for exploration β†’ notebooks for final analysis

Technical Tips

  • Monitor resources: Track memory usage with large datasets
  • Enable caching: Speeds up repeated analyses significantly
  • Check logs: Look in logs/ directory for debugging
  • Version control: Track notebooks and config files

Reporting Tips

  • Document methodology: Explain analysis approach clearly
  • Consistent styling: Use same color schemes across visualizations
  • Include metrics: Add confidence intervals and statistical tests
  • High resolution: Export figures at 300+ DPI for publications

Support & Community

Documentation:

Get Help:


Next Steps

Choose your path forward:


Happy analyzing! 🔬✨