Agent Specifications

This page explains how to define agents using Prompt Spec’s YAML specification format.

YAML Specification Structure

Prompt Spec uses a declarative YAML format to define agents, tools, and benchmarks. The structure consists of the following main sections:

  • metadata: General information about the agent
  • agent: Configuration for the agent itself
  • tools: Definitions of tools the agent can use
  • benchmarks: Test cases to evaluate agent performance

Complete Example

metadata: name: "Customer Service Agent" version: "1.0" description: "An agent that handles customer service inquiries" author: "Prompt Spec Team" agent: model: gpt-4o systemPrompt: | You are a helpful customer service agent. Your job is to assist users with their inquiries, provide information about products, and help resolve issues. Be concise and professional. temperature: 0.2 maxTokens: 1000 toolChoice: "auto" maxSteps: 5 tools: checkOrderStatus: description: "Check the status of a customer order" inputSchema: type: object properties: orderId: type: string description: "The order ID to check" required: ["orderId"] outputSchema: type: object properties: status: type: string enum: ["processing", "shipped", "delivered", "canceled"] estimatedDelivery: type: string format: "date" implementation: type: "mock" responseMapping: - condition: "input.orderId.startsWith('A')" response: { status: "processing", estimatedDelivery: "2023-12-25" } - condition: "input.orderId.startsWith('B')" response: { status: "shipped", estimatedDelivery: "2023-12-15" } - default: { status: "delivered", estimatedDelivery: "2023-12-01" } benchmarks: - name: "Order Status Inquiry" messages: - role: "user" content: "Can you tell me the status of my order A12345?" expectedToolCalls: - tool: "checkOrderStatus" expectedArgs: { orderId: "A12345" } evaluationCriteria: - key: "correctTool" description: "Did the agent use the correct tool?" type: "boolean" - key: "informationProvided" description: "Did the agent provide all relevant information from the tool response?" type: "scale" min: 1 max: 5

Section Details

Metadata

The metadata section provides general information about the agent:

metadata: name: "Simple Question Answering Agent" version: "1.0" description: "A basic agent for testing question answering capabilities" author: "Optional author name" tags: ["question-answering", "basic"]

Agent Configuration

The agent section defines the core configuration of the agent:

agent: model: "gpt-4o" # LLM model to use systemPrompt: "You are..." # System prompt for the agent temperature: 0.2 # Temperature for generation maxTokens: 1000 # Maximum tokens to generate toolChoice: "auto" # Tool choice method (auto, required, none) maxSteps: 5 # Maximum conversation turns

Tools

The tools section defines tools that the agent can use:

tools:
  toolName:
    description: "Description of the tool"
    inputSchema:             # JSON Schema for tool input
      type: object
      properties:
        param1:
          type: string
          description: "Parameter description"
      required: ["param1"]
    outputSchema:            # JSON Schema for tool output
      type: object
      properties:
        result:
          type: string
    implementation:          # Tool implementation (mock, function, etc.)
      type: "mock"
      responseMapping:
        - condition: "input.param1 === 'value'"
          response: { result: "Response for value" }
        - default: { result: "Default response" }
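
The condition strings use JavaScript expression syntax (for example, input.orderId.startsWith('A') in the complete example), which suggests each condition is evaluated against the tool's input object, with entries presumably checked in order and default as the fallback. A sketch under that assumption, using a hypothetical getShippingQuote tool:

tools:
  getShippingQuote:
    description: "Quote a shipping cost based on package weight"
    inputSchema:
      type: object
      properties:
        weightKg:
          type: number
          description: "Package weight in kilograms"
      required: ["weightKg"]
    outputSchema:
      type: object
      properties:
        cost:
          type: number
    implementation:
      type: "mock"
      responseMapping:
        # Assumed: the first matching condition wins; `default` catches the rest
        - condition: "input.weightKg > 20"
          response: { cost: 49.0 }
        - default: { cost: 9.5 }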

Benchmarks

The benchmarks section defines test cases to evaluate agent performance:

benchmarks: - name: "Test Case Name" messages: # Conversation messages - role: "user" content: "User message" expectedToolCalls: # Optional expected tool calls - tool: "toolName" expectedArgs: { param1: "value" } evaluationCriteria: # Criteria for evaluation - key: "criterion1" description: "Description of criterion" type: "boolean" # boolean, scale, or custom - key: "criterion2" description: "Scale criterion" type: "scale" min: 1 max: 5
