
Quick Start Guide

Get up and running with your first citation analysis in under 10 minutes!

Before You Start

Choose your path:

🎭 Demo Mode First! (recommended for all users):

  • ✅ Installation - Install with pip install -e ".[all]"
  • ✅ No database required - Use demo datasets to explore features
  • ✅ Learn features - Get familiar before full setup

🏢 Production Setup (after mastering demo mode):

  • ✅ Installation - Install with pip install -e ".[all]"
  • ✅ Configuration - .env file configured with Neo4j credentials
  • ✅ Environment Setup - Database connection validated
  • ✅ Demo Experience - Understanding gained from hands-on exploration
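For orientation, a minimal .env sketch for the production path is shown below. The variable names here are assumptions based on common Neo4j driver conventions, not confirmed keys; check the configuration guide for the exact names your deployment expects.

```
# hypothetical .env for a local Neo4j instance (verify key names in the configuration guide)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password-here
```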

Your First Citation Analysis

Step 1: Launch the Platform

Choose your preferred interface:

# Launch Streamlit interface
streamlit run app.py

Your browser will open to http://localhost:8501 with the interactive dashboard.

# Start Jupyter
jupyter notebook notebooks/

Open 01_comprehensive_exploration.ipynb to begin analysis.

Step 2: Choose Your Data Source

Once the platform is running, choose how you want to explore citation networks:

Perfect for all users - start here!

  1. Navigate to Demo Datasets in the sidebar
  2. Browse curated datasets:
    • complete_demo: 13 high-impact papers across AI, neuroscience, physics
    • minimal_demo_5papers: Quick 5-paper network for fast testing
  3. Click "Load Dataset" to load sample data
  4. Explore all features with realistic academic data:
    • ML predictions with synthetic embeddings
    • Interactive network visualizations with clickable nodes
    • Community detection across research fields
    • Export capabilities for reports and analysis

Full Platform Experience

Demo mode provides complete functionality with curated academic papers spanning multiple research domains. Perfect for learning, testing, and demonstrating all platform capabilities!

Import your own paper collections easily:

  1. Navigate to Data Import → Paper IDs → 📁 File Upload
  2. Download sample files to see the format (sample_paper_ids.txt/csv)
  3. Upload your .txt/.csv files with Semantic Scholar paper IDs
  4. Monitor real-time progress with streaming updates and performance metrics
  5. Explore your imported data using all platform features
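If you want to see the format before downloading the samples: a .txt upload is simply one 40-character Semantic Scholar paper ID per line. The IDs below are illustrative only; the downloadable sample files show the canonical format (including the .csv variant).

```
649def34f8be52c8b66281af98ae884c09aef38f
0123456789abcdef0123456789abcdef01234567
```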

Start Small

Try with 10-50 papers first to learn the workflow, then scale up to larger collections!

Import papers by academic search:

  1. Navigate to Data Import β†’ Search Query
  2. Enter search terms: "machine learning", "neural networks", etc.
  3. Configure filters: citation count, year range, quality settings
  4. Start import with real-time progress tracking
  5. Analyze imported networks immediately

For large-scale production use:

  1. Complete demo experience first to understand workflows
  2. Configure Neo4j database following configuration guide
  3. Import data using search or file upload methods
  4. Train custom ML models with your domain-specific data

In the Interactive Dashboard:

  1. Navigate to the Home page - Overview of your citation network
  2. Check Network Analysis - View basic statistics about your data:
    • Number of papers and citations
    • Network density and connectivity
    • Top-cited papers and influential authors

In Jupyter Notebooks:

Run the first few cells of 01_comprehensive_exploration.ipynb to see:

# Quick network overview
from src.services.analytics_service import get_analytics_service

analytics = get_analytics_service()
overview = analytics.get_network_overview()

print("📊 Network Overview:")
print(f"Papers: {overview.num_papers:,}")
print(f"Citations: {overview.num_citations:,}")
print(f"Authors: {overview.num_authors:,}")
print(f"Average citations per paper: {overview.avg_citations:.2f}")

Step 3: Make Your First Citation Prediction

New! Citation predictions now work in demo mode with no setup required!

Works immediately with demo datasets:

  1. Load a demo dataset first (complete_demo recommended)
  2. Go to ML Predictions page
  3. Notice green status - Demo ML service is ready!
  4. Try a paper from your demo dataset:
    • For complete_demo: Try "649def34f8be52c8b66281af98ae884c09aef38f" (Attention Is All You Need)
    • Or search by title: "Attention"
  5. Click Generate Predictions
  6. Explore realistic results with confidence scores based on:
    • Research field similarity (ML papers cite ML papers)
    • Temporal patterns (newer papers cite foundational work)
    • Impact weighting (highly-cited papers get more predictions)

No Training Required!

Demo mode uses synthetic embeddings that cluster papers realistically by research field, providing an educational ML prediction experience without any model training!
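To make the idea concrete, here is a minimal stdlib sketch of field-clustered synthetic embeddings. This is a toy illustration of the concept, not the platform's actual embedding code: each field gets a random centroid, each paper's vector is its field centroid plus small noise, so within-field cosine similarity comes out higher than cross-field similarity.

```python
import math
import random

random.seed(42)
DIM = 32

def centroid() -> list[float]:
    # one random direction per research field
    return [random.gauss(0.0, 1.0) for _ in range(DIM)]

def synthetic_embedding(field_centroid: list[float], noise: float = 0.3) -> list[float]:
    # paper vector = field centroid + small per-paper noise
    return [c + random.gauss(0.0, noise) for c in field_centroid]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

ml, physics = centroid(), centroid()
ml_papers = [synthetic_embedding(ml) for _ in range(5)]
physics_papers = [synthetic_embedding(physics) for _ in range(5)]

intra = cosine(ml_papers[0], ml_papers[1])
cross = cosine(ml_papers[0], physics_papers[0])
print(f"within-field similarity: {intra:.2f}, cross-field: {cross:.2f}")
```

Because papers sharing a field centroid point in nearly the same direction, nearest-neighbor lookups in this space naturally surface same-field papers first, which is what makes the demo predictions feel realistic.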

For trained models with your data:

  1. Train model first using notebook pipeline
  2. Check ML service status (green = model loaded)
  3. Enter paper ID from your database
  4. Get predictions based on your trained model
# Works in both demo and production modes
from src.services.ml_service import get_ml_service

ml_service = get_ml_service()

# Demo mode: Use papers from loaded demo dataset
# Production: Use papers from your database
paper_id = "649def34f8be52c8b66281af98ae884c09aef38f9"  # Attention paper in demo
predictions = ml_service.predict_citations(paper_id, top_k=10)

print(f"🤖 Predictions for paper: {paper_id}")
for pred in predictions:
    print(f"📄 Target: {pred['target_id']}")
    print(f"   Confidence: {pred['confidence']:.3f}")
    print(f"   Field relationship: {pred.get('field_similarity', 'N/A')}")
    print()

Step 4: Analyze Citation Communities

Discover research communities in your network:

  1. Visit Enhanced Visualizations page
  2. Explore interactive network with clickable nodes!
    • Click any paper node to see detailed information
    • Trace citation paths visually
    • Filter by research field or publication year
  3. Try Community Detection:
    • Choose algorithm (Louvain recommended)
    • See research fields cluster together
    • Explore cross-field connections
  4. Export visualizations in high resolution
# Detect research communities
communities = analytics.detect_communities(
    method='louvain',
    resolution=1.0
)

print(f"🏘️ Found {len(communities.communities)} research communities")

# Show largest communities
for i, community in enumerate(communities.communities[:5]):
    print(f"\nCommunity {i+1}: {len(community.papers)} papers")
    print(f"Top papers: {community.top_papers[:3]}")

Step 5: Generate Your First Report

Export your analysis results:

  1. Navigate to Results Interpretation
  2. Select the analysis results you want to export
  3. Choose export format (PDF, LaTeX, CSV)
  4. Click Generate Report
from src.analytics.export_engine import ExportEngine

exporter = ExportEngine()

# Generate comprehensive report
report = exporter.generate_report(
    title="My First Citation Analysis",
    include_predictions=True,
    include_communities=True,
    format="latex"
)

print(f"📊 Report generated: {report.file_path}")

Sample Workflows

Try these common analysis patterns:

πŸ” Research Discovery Workflow

  1. Find a paper of interest in your network
  2. Generate citation predictions to find related work
  3. Explore the embedding space to visualize paper relationships
  4. Export reading list with confidence scores
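As a sketch of the last step, a reading list with confidence scores can be written out with the stdlib csv module. The prediction dicts below are hypothetical but mirror the target_id/confidence shape shown in the prediction snippet earlier.

```python
import csv
import io

# hypothetical predictions, shaped like the ML service output shown earlier
predictions = [
    {"target_id": "paper-b", "confidence": 0.74},
    {"target_id": "paper-a", "confidence": 0.91},
]

# write highest-confidence recommendations first
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["target_id", "confidence"])
writer.writeheader()
for pred in sorted(predictions, key=lambda p: p["confidence"], reverse=True):
    writer.writerow(pred)

reading_list_csv = buf.getvalue()
print(reading_list_csv)
```

Swap io.StringIO for an open file handle to save the list to disk, or use the dashboard's built-in CSV export for the same result.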

πŸ•ΈοΈ Network Analysis Workflow

  1. Compute network statistics (centrality, clustering)
  2. Detect research communities using graph algorithms
  3. Analyze temporal trends in citation patterns
  4. Generate LaTeX report for publication
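For the first step, network density is a quick sanity check on any imported collection: for a directed citation graph it is just edges over possible edges. The analytics service reports this for you; the underlying formula is simply:

```python
def directed_density(num_nodes: int, num_edges: int) -> float:
    """Fraction of possible directed edges present: E / (N * (N - 1))."""
    if num_nodes < 2:
        return 0.0  # density is undefined for fewer than two nodes
    return num_edges / (num_nodes * (num_nodes - 1))

# e.g. a 13-paper demo-sized network with 30 citation links
print(f"density: {directed_density(13, 30):.3f}")
```

Citation networks are typically sparse, so very low values are normal; a density near zero on a small import usually just means few of the papers cite each other.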

🤖 ML Pipeline Workflow

  1. Train custom TransE model on your data
  2. Evaluate model performance with standard metrics
  3. Generate predictions for paper recommendation
  4. Validate results against known citations
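For orientation on the first step: TransE scores a (head, relation, tail) triple by how close head + relation lands to tail. A toy stdlib sketch of the scoring function (illustrative only, not the platform's training code; the embeddings here are made up):

```python
import math

def transe_score(head: list[float], relation: list[float], tail: list[float]) -> float:
    # TransE plausibility: negative L2 distance of (head + relation) from tail.
    # Scores closer to zero mean the triple is more plausible.
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# toy 3-d embeddings; the "cites" relation translates a paper toward papers it cites
paper_a = [0.0, 1.0, 0.5]
cites = [0.5, -0.2, 0.1]
paper_b = [0.5, 0.8, 0.6]   # close to paper_a + cites, so a likely citation
paper_c = [3.0, -2.0, 1.0]  # far away, so an unlikely citation

print(transe_score(paper_a, cites, paper_b), transe_score(paper_a, cites, paper_c))
```

Training adjusts the embeddings so that known citations score near zero and corrupted triples score lower, which is what the evaluation metrics in step 2 measure.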

Next Steps

Now that you've completed your first analysis:

📚 Learn More

New User Path:

  • Demo Mode Guide - Master demo features and educational workflows
  • Demo Datasets - Explore all available demo datasets
  • File Upload Guide - Import your research collections easily

Advanced Features:

  • Interactive Features - Clickable nodes, real-time progress, enhanced UI
  • Data Import - Comprehensive import pipeline with streaming features
  • User Guide - Complete feature walkthrough
  • Notebook Pipeline - Complete analysis workflows
  • ML Predictions - Advanced prediction techniques

🔧 Customize Your Setup

🤝 Get Help

Quick Reference

Essential Commands

# Start interactive dashboard
streamlit run app.py

# Run complete analysis pipeline
jupyter notebook notebooks/01_comprehensive_exploration.ipynb

# Test your setup
python -m pytest tests/test_integration.py -v

# Validate configuration
python scripts/validate_environment.py

# Serve the documentation site locally
mkdocs serve --watch-theme

Key File Locations

  • Configuration: .env
  • Models: models/
  • Outputs: outputs/
  • Notebooks: notebooks/
  • Documentation: docs/

Important URLs

  • Interactive Dashboard: http://localhost:8501
  • Jupyter Notebooks: http://localhost:8888
  • Documentation: http://localhost:8000 (if running mkdocs serve)

Congratulations!

You've completed your first citation analysis! The platform is now ready for advanced research workflows and custom analysis projects.

Happy researching! 🔬✨