This directory contains a comprehensive suite of Python scripts for creating personalized Active Inference curricula. The scripts follow a modular, test-driven development approach and implement the full curriculum generation pipeline from research to translation.
The curriculum creation process follows these stages, each implemented by one of the numbered scripts described below.
**`1_Research_Domain.py`**

Purpose: Analyzes domain characteristics to create domain-specific Active Inference curricula.
Key Features:
- Outputs to `data/domain_research/`
Usage:

```bash
python 1_Research_Domain.py
```

Input: Domain files from `Languages/Inputs_and_Outputs/Domain/Synthetic_*.md`

Output: Research reports in `data/domain_research/`
Functions:
- `get_domain_files(domain_dir)`: Finds domain files to process
- `main()`: Orchestrates the complete domain analysis workflow

**`1_Research_Entity.py`**

Purpose: Researches target audience characteristics for personalized curriculum creation.
Key Features:
- Outputs to `data/audience_research/`
Usage:

```bash
python 1_Research_Entity.py
```

Input: Entity files from `Languages/Inputs_and_Outputs/Entity/*.py`

Output: Audience research reports in `data/audience_research/`
Functions:
- `get_entity_files(entity_dir)`: Finds entity files to process
- `main()`: Orchestrates the complete audience analysis workflow

**`2_Write_Introduction.py`**

Purpose: Converts research reports into comprehensive Active Inference curricula.
Key Features:
- Outputs to `data/written_curriculums/`
Usage:

```bash
python 2_Write_Introduction.py
```

Input: Research files from `data/domain_research/` and `data/audience_research/`

Output: Complete curricula in `data/written_curriculums/`
Functions:
- `get_research_files(research_dir)`: Finds research files to process
- `process_research_directory()`: Processes all files in a research directory
- `main()`: Orchestrates the complete curriculum generation workflow

**`3_Introduction_Visualizations.py`**

Purpose: Generates PNG charts and Mermaid diagrams for curriculum visualization.
Key Features:
- Outputs to `data/visualizations/`
Usage:

```bash
python 3_Introduction_Visualizations.py [--input INPUT_DIR] [--output OUTPUT_DIR]
```
Options:
- `--input`: Custom input directory (default: `data/written_curriculums`)
- `--output`: Custom output directory (default: `data/visualizations`)

Outputs:
- `curriculum_metrics.png`: Comprehensive metrics dashboard
- `curriculum_structure.mmd`: Overall curriculum structure diagram
- `{entity}_flow.mmd`: Individual learning flow diagrams
- `curriculum_metrics.json`: Detailed metrics data

Functions:
- `extract_curriculum_metadata()`: Analyzes curriculum content for metrics
- `create_curriculum_metrics_chart()`: Generates PNG analytics charts
- `create_curriculum_flow_mermaid()`: Creates learning progression diagrams
- `create_curriculum_structure_mermaid()`: Creates overview structure diagrams

**`4_Translate_Introductions.py`**

Purpose: Translates curricula into multiple configured languages.
Key Features:
- Outputs to `data/translated_curriculums/`
Usage:

```bash
python 4_Translate_Introductions.py [--input INPUT_DIR] [--output OUTPUT_DIR] [--languages LANG1 LANG2 ...]
```
Options:
- `--input`: Custom input directory (default: `data/written_curriculums`)
- `--output`: Custom output directory (default: `data/translated_curriculums`)
- `--languages`: Specific languages to translate (default: all configured languages)

Functions:
- `validate_languages()`: Validates requested languages against configuration
- `main()`: Orchestrates the complete translation workflow

All scripts follow a consistent data organization pattern:
```
data/
├── audience_research/          # Entity/audience research reports
│   └── {entity}_research_{timestamp}.json
├── domain_research/            # Domain analysis reports
│   ├── {domain}_research_{timestamp}.json
│   └── {domain}_research_{timestamp}.md
├── written_curriculums/        # Generated curricula
│   └── {entity}/
│       ├── {section}_{timestamp}.md
│       └── complete_curriculum_{timestamp}.md
├── translated_curriculums/     # Translated curricula
│   └── {language}/
│       └── {entity}_curriculum_{language}_{timestamp}.md
└── visualizations/             # Charts and diagrams
    ├── curriculum_metrics.png
    ├── curriculum_structure.mmd
    └── {entity}_flow.mmd
```
Configure target languages in `data/config/languages.yaml`:
```yaml
target_languages:
  - Chinese
  - Spanish
  - Arabic
  - Hindi
  - French
  # ... more languages

script_mappings:
  Arabic: "Modern Standard Arabic"
  Chinese: "Simplified Chinese"
  # ... more mappings
```
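As a rough sketch of how this configuration might be consumed, the snippet below loads the file with `pyyaml` and resolves each target language through `script_mappings`. The function name and return shape are illustrative assumptions, not an excerpt from the scripts.

```python
from pathlib import Path

import yaml  # provided by the pyyaml dependency


def load_language_config(config_path: str = "data/config/languages.yaml") -> dict:
    """Load target languages and resolve script mappings from the YAML config."""
    with Path(config_path).open(encoding="utf-8") as f:
        config = yaml.safe_load(f)

    languages = config.get("target_languages", [])
    mappings = config.get("script_mappings", {})

    # Map each language to its more specific variant when configured,
    # e.g. "Chinese" -> "Simplified Chinese".
    return {lang: mappings.get(lang, lang) for lang in languages}


if __name__ == "__main__":
    print(load_language_config())
```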
Customize prompts in `data/prompts/`:
- `research_domain_analysis.md`: Domain analysis prompts
- `research_domain_curriculum.md`: Domain curriculum generation
- `research_entity.md`: Entity research prompts
- `curriculum_section.md`: Section generation prompts
- `translation.md`: Translation prompts

Dependencies:

- `openai`: API client for Perplexity and OpenRouter
- `pathlib`: Path handling
- `pydantic`: Data validation
- `pyyaml`: Configuration loading
- `matplotlib`: Chart generation
- `seaborn`: Statistical visualizations
- `pandas`: Data manipulation
- `pytest`: Testing framework
- `black`: Code formatting
- `ruff`: Linting

Install dependencies (including dev extras):

```bash
uv sync --all-extras --dev
```
Configure API keys and model selection via environment variables:

```bash
export PERPLEXITY_API_KEY="your-perplexity-key"
export OPENROUTER_API_KEY="your-openrouter-key"
export PERPLEXITY_MODEL="llama-3.1-sonar-small-128k-online"
export OPENROUTER_MODEL="anthropic/claude-3.5-sonnet"
```
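As a sketch of how these variables might be consumed, the `openai` client can be pointed at Perplexity or OpenRouter through their OpenAI-compatible endpoints. The base URLs and this wiring are assumptions about the setup, not code from the scripts themselves.

```python
import os

from openai import OpenAI

# OpenAI-compatible endpoints (assumed; both providers expose this style of API).
perplexity = OpenAI(
    api_key=os.environ["PERPLEXITY_API_KEY"],
    base_url="https://api.perplexity.ai",
)
openrouter = OpenAI(
    api_key=os.environ["OPENROUTER_API_KEY"],
    base_url="https://openrouter.ai/api/v1",
)

response = perplexity.chat.completions.create(
    model=os.environ.get("PERPLEXITY_MODEL", "llama-3.1-sonar-small-128k-online"),
    messages=[{"role": "user", "content": "Summarize Active Inference in one sentence."}],
)
print(response.choices[0].message.content)
```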
Run all scripts in sequence:
```bash
# 1. Research domains and entities
python 1_Research_Domain.py
python 1_Research_Entity.py

# 2. Generate curricula
python 2_Write_Introduction.py

# 3. Create visualizations
python 3_Introduction_Visualizations.py

# 4. Translate to target languages
python 4_Translate_Introductions.py
```
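If you prefer a single entry point, a small wrapper along these lines can run the same sequence and stop on the first failure. This wrapper is a sketch and is not part of the repository.

```python
import subprocess
import sys

PIPELINE = [
    "1_Research_Domain.py",
    "1_Research_Entity.py",
    "2_Write_Introduction.py",
    "3_Introduction_Visualizations.py",
    "4_Translate_Introductions.py",
]

for script in PIPELINE:
    print(f"Running {script} ...")
    # check=True aborts the pipeline if any stage exits with a non-zero status.
    subprocess.run([sys.executable, script], check=True)
```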
Generate visualizations for specific input:
```bash
python 3_Introduction_Visualizations.py --input /path/to/curricula --output /path/to/viz
```
Translate only to specific languages:
```bash
python 4_Translate_Introductions.py --languages Spanish French German
```
All scripts implement comprehensive error handling and quality assurance.
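The exact safeguards vary by script, but the general shape is a guarded entry point that logs failures and exits non-zero. The snippet below is an illustrative pattern under that assumption, not a copy of the scripts' code.

```python
import logging
import sys

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def main() -> None:
    ...  # script-specific workflow would go here


if __name__ == "__main__":
    try:
        main()
    except Exception as e:  # top-level guard so failures are always logged
        logger.error("Process failed", extra={"error": str(e)})
        sys.exit(1)
```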
The scripts include comprehensive tests. Run them with:

```bash
python tests/test_curriculum_scripts_integration.py
```
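Since `pytest` is listed as the testing framework, the suite can presumably also be collected and run through it:

```bash
pytest tests/ -v
```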
All scripts use structured logging:

```python
logger = common_setup_logging()
logger.info("Starting process")
logger.error("Process failed", extra={"error": str(e)})
```
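`common_setup_logging()` is a shared project helper that is not shown here. A minimal stand-in, assuming it simply configures the root logger with a level and message format, could look like this:

```python
import logging


def common_setup_logging(level: int = logging.INFO) -> logging.Logger:
    """Hypothetical stand-in for the shared logging helper."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s - %(message)s",
    )
    return logging.getLogger(__name__)
```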
Logs include structured context fields, passed via `extra` as shown in the logging example above.
Common issues:

- API Key Errors
- File Not Found Errors
- Memory Issues
- Network Errors
Enable verbose logging by setting the log level:

```python
import logging
logging.getLogger().setLevel(logging.DEBUG)
```
Other failure categories handled by the scripts:

- Configuration Errors
- API Connection Issues
- Content Quality Issues
- Processing Failures
Use `black` for code formatting and `ruff` for linting.

This project follows the repository's LICENSE terms.