Core Concepts

This page explains the fundamental concepts and architecture behind Prompt Spec.

Key Components

Prompt Spec consists of three key components that work together:

1. Agent Specifications

Agent specifications define the behavior and capabilities of AI agents. These specifications are written in YAML and include:

Metadata: Name, version, and description of the agent
Agent Configuration: Model, system prompt, and other settings
Tools: Definitions of tools the agent can use
Benchmarks: Test cases to evaluate agent performance
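To make the four parts of a specification concrete, here is a minimal sketch that models them as Python dataclasses. Every class name, field name, and value below is hypothetical and chosen only for illustration; the real specification is a YAML file, and its actual keys may differ from what is shown here.

```python
from dataclasses import dataclass, field
from typing import List

# NOTE: all names here are illustrative assumptions, not Prompt Spec's schema.

@dataclass
class Tool:
    name: str
    description: str

@dataclass
class Benchmark:
    name: str
    user_input: str
    expected_substring: str  # assumed criterion: the reply must contain this text

@dataclass
class AgentSpec:
    # Metadata
    name: str
    version: str
    description: str
    # Agent configuration
    model: str
    system_prompt: str
    # Tools the agent can use, and test cases that evaluate it
    tools: List[Tool] = field(default_factory=list)
    benchmarks: List[Benchmark] = field(default_factory=list)

spec = AgentSpec(
    name="support-agent",
    version="0.1.0",
    description="Answers billing questions",
    model="example-model",  # placeholder model identifier
    system_prompt="You are a helpful billing assistant.",
    tools=[Tool(name="lookup_invoice", description="Fetch an invoice by ID")],
    benchmarks=[
        Benchmark(
            name="refund-policy",
            user_input="Can I get a refund?",
            expected_substring="refund",
        )
    ],
)
```

Keeping the benchmarks inside the same object as the prompt mirrors the idea that an agent and its tests are defined together in one file.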

2. Benchmarking System

The benchmarking system runs test cases against agents and evaluates their performance based on predefined criteria. It includes:

Test Runner: Executes test cases against agents
Evaluation Engine: Assesses agent performance based on criteria
Reporting: Generates reports with performance metrics
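The three parts of the benchmarking system can be sketched as three small functions chained together. This is an illustration under stated assumptions, not Prompt Spec's implementation: the function names, the dictionary keys, and the "reply contains an expected substring" criterion are all made up for the example, and the agent is a stand-in function rather than a real model call.

```python
from typing import Any, Callable, Dict, List

# Hypothetical test case shape: {"name": ..., "input": ..., "expect": ...}
Record = Dict[str, Any]

def run_tests(agent: Callable[[str], str], cases: List[Record]) -> List[Record]:
    """Test runner: call the agent once per case and record its reply."""
    return [{**case, "output": agent(case["input"])} for case in cases]

def evaluate(results: List[Record]) -> List[Record]:
    """Evaluation: a case passes if the reply contains the expected text."""
    return [
        {**r, "passed": r["expect"].lower() in r["output"].lower()}
        for r in results
    ]

def report(evaluated: List[Record]) -> Record:
    """Reporting: reduce per-case results to summary metrics."""
    total = len(evaluated)
    passed = sum(1 for r in evaluated if r["passed"])
    return {
        "total": total,
        "passed": passed,
        "pass_rate": passed / total if total else 0.0,
    }

def echo_agent(message: str) -> str:
    # Stand-in for a real model call (assumption for the example).
    return f"You said: {message}"

cases = [
    {"name": "echo", "input": "hello", "expect": "hello"},
    {"name": "greeting", "input": "hi", "expect": "welcome"},
]
summary = report(evaluate(run_tests(echo_agent, cases)))
```

With the stand-in agent, the first case passes and the second fails, so `summary` reports one pass out of two.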

3. Optimization Engine

The optimization engine automatically improves agent prompts based on test results:

Analyzer: Identifies weaknesses in agent performance
Generator: Creates improved prompts based on analysis
Validator: Confirms that improved prompts actually perform better
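As a rough sketch of how these three roles divide the work, the example below implements each as a plain function. All names are hypothetical. The generator here simply appends a hint to the prompt, whereas a real generator would presumably use a model to rewrite it; the scoring function is a toy stand-in for a full benchmark run.

```python
from typing import Any, Callable, Dict, List

def analyze(evaluated: List[Dict[str, Any]]) -> List[str]:
    """Analyzer: collect the names of the test cases that failed."""
    return [r["name"] for r in evaluated if not r["passed"]]

def generate(prompt: str, weaknesses: List[str]) -> str:
    """Generator: naive rewrite that appends one hint covering the failures."""
    if not weaknesses:
        return prompt
    return f"{prompt}\nPay special attention to: {', '.join(weaknesses)}."

def validate(score: Callable[[str], float], old: str, new: str) -> bool:
    """Validator: accept the new prompt only if it scores strictly higher."""
    return score(new) > score(old)

def toy_score(prompt: str) -> float:
    # Assumption for the example: a prompt "passes" if it mentions the topic.
    return 1.0 if "refund-policy" in prompt else 0.0

evaluated = [
    {"name": "refund-policy", "passed": False},
    {"name": "greeting", "passed": True},
]
base_prompt = "You are a billing assistant."
weaknesses = analyze(evaluated)
candidate = generate(base_prompt, weaknesses)
accepted = validate(toy_score, base_prompt, candidate)
```

Requiring a strictly higher score in `validate` is a design choice for the sketch: it means a rewrite that changes nothing measurable is rejected rather than adopted.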

Architecture


┌───────────────┐      ┌───────────────┐      ┌───────────────┐
│               │      │               │      │               │
│  Agent Spec   │─────▶│  Test Runner  │─────▶│   Evaluation  │
│     YAML      │      │               │      │               │
└───────────────┘      └───────────────┘      └───────────────┘
                                                      │
                                                      ▼
┌───────────────┐      ┌───────────────┐      ┌───────────────┐
│               │      │               │      │               │
│  Improved     │◀─────│  Optimization │◀─────│    Reports    │
│   Agent       │      │    Engine     │      │               │
└───────────────┘      └───────────────┘      └───────────────┘

Core Philosophy

Prompt Spec is built on three key principles:

Declarative Definitions: Define agents and tests in a clear, readable format
Objective Evaluation: Evaluate agent performance using consistent, well-defined criteria
Continuous Improvement: Automatically improve agents through testing and feedback

Workflow

A typical workflow in Prompt Spec follows these steps:

1. Define an agent specification in YAML
2. Run benchmarks to evaluate agent performance
3. Analyze test results to identify weaknesses
4. Optimize the agent prompt to address these weaknesses
5. Re-run benchmarks to validate improvements
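The steps above form a loop, which can be sketched as follows. This is a minimal illustration, not Prompt Spec's actual control flow: `optimize_loop`, `toy_benchmark`, `toy_improve`, the round limit, and the rule "stop when the score no longer improves" are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# A benchmark maps a prompt to (score, names of failed checks).
BenchmarkFn = Callable[[str], Tuple[float, List[str]]]
ImproveFn = Callable[[str, List[str]], str]

def optimize_loop(
    prompt: str,
    benchmark: BenchmarkFn,
    improve: ImproveFn,
    max_rounds: int = 3,
) -> Tuple[str, float]:
    score, failures = benchmark(prompt)                 # run benchmarks
    for _ in range(max_rounds):
        if not failures:                                # nothing left to fix
            break
        candidate = improve(prompt, failures)           # analyze and optimize
        new_score, new_failures = benchmark(candidate)  # re-run benchmarks
        if new_score <= score:                          # no improvement: keep old prompt
            break
        prompt, score, failures = candidate, new_score, new_failures
    return prompt, score

TOPICS = ["refunds", "invoices"]  # toy criteria: topics the prompt must mention

def toy_benchmark(prompt: str) -> Tuple[float, List[str]]:
    missing = [t for t in TOPICS if t not in prompt.lower()]
    return 1 - len(missing) / len(TOPICS), missing

def toy_improve(prompt: str, failures: List[str]) -> str:
    # Address one weakness per round.
    return f"{prompt} Handle {failures[0]}."

final_prompt, final_score = optimize_loop(
    "You are a billing assistant.", toy_benchmark, toy_improve
)
```

In this toy run the prompt gains one missing topic per round, and the loop stops once no failures remain.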

Next Steps

Learn about Agent Specifications in detail
Explore Benchmarking capabilities
Understand the Optimization process