This directory contains a comprehensive suite of Python scripts for creating personalized Active Inference curricula. The scripts follow a modular, test-driven development approach and implement the full curriculum generation pipeline from research to translation.
The curriculum creation process follows these stages:

1. **Research**: analyze the target domain (`1_Research_Domain.py`) and audience (`1_Research_Entity.py`)
2. **Writing**: generate curricula from the research reports (`2_Write_Introduction.py`)
3. **Visualization**: create charts and diagrams (`3_Introduction_Visualizations.py`)
4. **Translation**: translate curricula into the configured languages (`4_Translate_Introductions.py`)
## Scripts

### 1_Research_Domain.py

**Purpose:** Analyzes domain characteristics to create domain-specific Active Inference curricula.
**Key Features:**

- Writes research reports to `data/domain_research/`

**Usage:**
```bash
python 1_Research_Domain.py
```
**Input:** Domain files from `Languages/Inputs_and_Outputs/Domain/Synthetic_*.md`

**Output:** Research reports in `data/domain_research/`
**Functions:**
- `get_domain_files(domain_dir)`: Finds domain files to process
- `main()`: Orchestrates the complete domain analysis workflow
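For orientation, here is a minimal sketch of what the discovery step could look like; the glob pattern and return type are assumptions based on the documented input location, not the script's actual code:

```python
from pathlib import Path

def get_domain_files(domain_dir: str) -> list[Path]:
    """Find domain files to process (hypothetical sketch)."""
    # Matches the documented input location:
    # Languages/Inputs_and_Outputs/Domain/Synthetic_*.md
    return sorted(Path(domain_dir).glob("Synthetic_*.md"))
```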
### 1_Research_Entity.py

**Purpose:** Researches target audience characteristics for personalized curriculum creation.

**Key Features:**
- Writes audience research reports to `data/audience_research/`

**Usage:**
```bash
python 1_Research_Entity.py
```
**Input:** Entity files from `Languages/Inputs_and_Outputs/Entity/*.py`

**Output:** Audience research reports in `data/audience_research/`
**Functions:**
- `get_entity_files(entity_dir)`: Finds entity files to process
- `main()`: Orchestrates the complete audience analysis workflow
### 2_Write_Introduction.py

**Purpose:** Converts research reports into comprehensive Active Inference curricula.

**Key Features:**
- Writes generated curricula to `data/written_curriculums/`

**Usage:**
```bash
python 2_Write_Introduction.py
```
**Input:** Research files from `data/domain_research/` and `data/audience_research/`

**Output:** Complete curricula in `data/written_curriculums/`
**Functions:**
- `get_research_files(research_dir)`: Finds research files to process
- `process_research_directory()`: Processes all files in a research directory
- `main()`: Orchestrates the complete curriculum generation workflow
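To make the orchestration concrete, here is a minimal sketch of how `process_research_directory` might iterate research reports and write per-entity curricula; `generate_curriculum` is a hypothetical placeholder for the LLM-backed generation step, and the file naming is illustrative:

```python
import json
from pathlib import Path

def generate_curriculum(report: dict) -> str:
    """Hypothetical placeholder for the LLM-backed generation step."""
    return f"# Curriculum\n\nBased on research for: {report.get('entity', 'unknown')}\n"

def process_research_directory(research_dir: str, output_dir: str) -> None:
    """Generate a curriculum for every research report in a directory (sketch)."""
    for report_path in sorted(Path(research_dir).glob("*_research_*.json")):
        report = json.loads(report_path.read_text(encoding="utf-8"))
        # Recover the entity name from the documented file naming scheme
        entity = report_path.stem.split("_research_")[0]
        out_dir = Path(output_dir) / entity
        out_dir.mkdir(parents=True, exist_ok=True)
        (out_dir / "complete_curriculum.md").write_text(
            generate_curriculum(report), encoding="utf-8"
        )
```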
### 3_Introduction_Visualizations.py

**Purpose:** Generates PNG charts and Mermaid diagrams for curriculum visualization.

**Key Features:**
- Writes charts and diagrams to `data/visualizations/`

**Usage:**
```bash
python 3_Introduction_Visualizations.py [--input INPUT_DIR] [--output OUTPUT_DIR]
```
**Options:**
- `--input`: Custom input directory (default: `data/written_curriculums`)
- `--output`: Custom output directory (default: `data/visualizations`)

**Outputs:**
- `curriculum_metrics.png`: Comprehensive metrics dashboard
- `curriculum_structure.mmd`: Overall curriculum structure diagram
- `{entity}_flow.mmd`: Individual learning flow diagrams
- `curriculum_metrics.json`: Detailed metrics data

**Functions:**
- `extract_curriculum_metadata()`: Analyzes curriculum content for metrics
- `create_curriculum_metrics_chart()`: Generates PNG analytics charts
- `create_curriculum_flow_mermaid()`: Creates learning progression diagrams
- `create_curriculum_structure_mermaid()`: Creates overview structure diagrams
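For intuition about the Mermaid output, here is a minimal sketch of how a learning-flow diagram could be assembled from section titles; the signature and node naming are assumptions, not the script's actual implementation:

```python
def create_curriculum_flow_mermaid(sections: list[str]) -> str:
    """Build a Mermaid flowchart chaining curriculum sections in order (sketch)."""
    lines = ["flowchart TD"]
    for i, title in enumerate(sections):
        lines.append(f'    s{i}["{title}"]')
    # Link each section to the next to show the learning progression
    lines.extend(f"    s{i} --> s{i + 1}" for i in range(len(sections) - 1))
    return "\n".join(lines)

print(create_curriculum_flow_mermaid(["Introduction", "Core Concepts", "Applications"]))
```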
### 4_Translate_Introductions.py

**Purpose:** Translates curricula into multiple configured languages.

**Key Features:**
- Writes translated curricula to `data/translated_curriculums/`

**Usage:**
```bash
python 4_Translate_Introductions.py [--input INPUT_DIR] [--output OUTPUT_DIR] [--languages LANG1 LANG2 ...]
```
**Options:**
- `--input`: Custom input directory (default: `data/written_curriculums`)
- `--output`: Custom output directory (default: `data/translated_curriculums`)
- `--languages`: Specific languages to translate (default: all configured languages)

**Functions:**
- `validate_languages()`: Validates requested languages against configuration
- `main()`: Orchestrates the complete translation workflow
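A minimal sketch of the validation step, assuming the configured languages come from `data/config/languages.yaml` (see Configuration below); the signature is an assumption:

```python
def validate_languages(requested: list[str], configured: list[str]) -> list[str]:
    """Return the requested languages if all are configured, else raise (sketch)."""
    unknown = [lang for lang in requested if lang not in configured]
    if unknown:
        raise ValueError(f"Unsupported languages: {', '.join(unknown)}")
    return requested

# Only configured languages pass validation
validate_languages(["Spanish", "French"], ["Chinese", "Spanish", "Arabic", "Hindi", "French"])
```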
## Data Organization

All scripts follow a consistent data organization pattern:

```
data/
├── audience_research/          # Entity/audience research reports
│   └── {entity}_research_{timestamp}.json
├── domain_research/            # Domain analysis reports
│   ├── {domain}_research_{timestamp}.json
│   └── {domain}_research_{timestamp}.md
├── written_curriculums/        # Generated curricula
│   └── {entity}/
│       ├── {section}_{timestamp}.md
│       └── complete_curriculum_{timestamp}.md
├── translated_curriculums/     # Translated curricula
│   └── {language}/
│       └── {entity}_curriculum_{language}_{timestamp}.md
└── visualizations/             # Charts and diagrams
    ├── curriculum_metrics.png
    ├── curriculum_structure.mmd
    └── {entity}_flow.mmd
```
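File names in this layout embed a timestamp; here is a minimal sketch of how such a path could be built (the helper name and timestamp format are hypothetical):

```python
from datetime import datetime
from pathlib import Path

def research_report_path(base: str, entity: str) -> Path:
    """Build a timestamped audience-research path matching the layout above (sketch)."""
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return Path(base) / "audience_research" / f"{entity}_research_{timestamp}.json"

print(research_report_path("data", "example_entity"))
```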
## Configuration

### Languages

Configure target languages in `data/config/languages.yaml`:
```yaml
target_languages:
  - Chinese
  - Spanish
  - Arabic
  - Hindi
  - French
  # ... more languages

script_mappings:
  Arabic: "Modern Standard Arabic"
  Chinese: "Simplified Chinese"
  # ... more mappings
```
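A minimal sketch of how a script might load this file with `pyyaml` and resolve a script mapping; everything except the file path and keys shown above is illustrative:

```python
import yaml  # provided by the pyyaml dependency

with open("data/config/languages.yaml", encoding="utf-8") as f:
    config = yaml.safe_load(f)

target_languages = config["target_languages"]
# Fall back to the plain language name when no script mapping is defined
arabic_script = config.get("script_mappings", {}).get("Arabic", "Arabic")
print(target_languages, arabic_script)
```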
### Prompts

Customize prompts in `data/prompts/`:

- `research_domain_analysis.md`: Domain analysis prompts
- `research_domain_curriculum.md`: Domain curriculum generation
- `research_entity.md`: Entity research prompts
- `curriculum_section.md`: Section generation prompts
- `translation.md`: Translation prompts

## Dependencies

- `openai`: API client for Perplexity and OpenRouter
- `pathlib`: Path handling
- `pydantic`: Data validation
- `pyyaml`: Configuration loading
- `matplotlib`: Chart generation
- `seaborn`: Statistical visualizations
- `pandas`: Data manipulation

Development dependencies:

- `pytest`: Testing framework
- `black`: Code formatting
- `ruff`: Linting

Install everything with:

```bash
uv sync --all-extras --dev
```
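For orientation, the `openai` package can be pointed at an OpenAI-compatible endpoint such as OpenRouter; the base URL and request below are a sketch based on the environment variables in the next section, not code taken from these scripts:

```python
import os

from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API at this publicly documented URL
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model=os.environ.get("OPENROUTER_MODEL", "anthropic/claude-3.5-sonnet"),
    messages=[{"role": "user", "content": "Summarize Active Inference in one sentence."}],
)
print(response.choices[0].message.content)
```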
## Environment Setup

Set the required API keys and model names:

```bash
export PERPLEXITY_API_KEY="your-perplexity-key"
export OPENROUTER_API_KEY="your-openrouter-key"
export PERPLEXITY_MODEL="llama-3.1-sonar-small-128k-online"
export OPENROUTER_MODEL="anthropic/claude-3.5-sonnet"
```
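Scripts that depend on these keys can fail fast when one is missing; here is a minimal sketch of such a check (the actual scripts may behave differently):

```python
import os

REQUIRED_VARS = ("PERPLEXITY_API_KEY", "OPENROUTER_API_KEY")

missing = [var for var in REQUIRED_VARS if not os.environ.get(var)]
if missing:
    raise SystemExit(f"Missing required environment variables: {', '.join(missing)}")
```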
## Running the Pipeline

Run all scripts in sequence:
```bash
# 1. Research domains and entities
python 1_Research_Domain.py
python 1_Research_Entity.py

# 2. Generate curricula
python 2_Write_Introduction.py

# 3. Create visualizations
python 3_Introduction_Visualizations.py

# 4. Translate to target languages
python 4_Translate_Introductions.py
```
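The same sequence can be driven from Python with `subprocess`; a minimal sketch (the ordering mirrors the commands above; the error handling is illustrative):

```python
import subprocess
import sys

PIPELINE = [
    "1_Research_Domain.py",
    "1_Research_Entity.py",
    "2_Write_Introduction.py",
    "3_Introduction_Visualizations.py",
    "4_Translate_Introductions.py",
]

for script in PIPELINE:
    print(f"Running {script} ...")
    # check=True aborts the run at the first failing stage
    subprocess.run([sys.executable, script], check=True)
```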
Generate visualizations for a specific input directory:

```bash
python 3_Introduction_Visualizations.py --input /path/to/curricula --output /path/to/viz
```
Translate only to specific languages:

```bash
python 4_Translate_Introductions.py --languages Spanish French German
```
## Error Handling and Quality Assurance

All scripts implement error handling and quality-assurance checks; common failure modes are covered in the troubleshooting notes below.
## Testing

The scripts include a comprehensive integration test suite. Run it with:

```bash
python tests/test_curriculum_scripts_integration.py
```
## Logging

All scripts use structured logging:
```python
logger = common_setup_logging()  # the project's shared logging helper

logger.info("Starting process")

# inside an except block, where `e` is the caught exception:
logger.error("Process failed", extra={"error": str(e)})
```
Logs include structured context supplied through the `extra` parameter, as in the `error` field above.
## Troubleshooting

**API Key Errors:** Verify that `PERPLEXITY_API_KEY` and `OPENROUTER_API_KEY` are exported in your environment (see Environment Setup above).

**File Not Found Errors:** Confirm that the expected input directories exist, e.g. `Languages/Inputs_and_Outputs/Domain/` and `Languages/Inputs_and_Outputs/Entity/`.

**Memory Issues:** Large curricula can exhaust memory; try processing fewer files at a time.

**Network Errors:** The research and translation steps call external APIs (Perplexity, OpenRouter); check connectivity and retry.
Enable verbose logging by setting the log level:

```python
import logging

logging.getLogger().setLevel(logging.DEBUG)
```
Additional issue categories to check:

**Configuration Errors:** Validate `data/config/languages.yaml` against the structure shown in Configuration.

**API Connection Issues:** Double-check the API keys and the `PERPLEXITY_MODEL` / `OPENROUTER_MODEL` settings.

**Content Quality Issues:** Adjust the prompt templates in `data/prompts/`.

**Processing Failures:** Enable DEBUG logging (see above) and inspect the structured log output.
## Contributing

Use `black` for code formatting and `ruff` for linting.

## License

This project follows the repository’s LICENSE terms.