Advanced Usage Examples

This section walks through two advanced Prompt Spec scenarios: multi-agent collaboration and custom evaluation functions.

Multi-Agent Collaboration

The following spec defines two agents, a researcher and a writer, and a workflow that passes the researcher's output to the writer through the research_results variable:


metadata:
  name: "Collaborative Agents"
  version: "1.0"
  description: "A benchmark for testing multi-agent collaboration"
 
agents:
  - id: "researcher"
    model: gpt-4o
    systemPrompt: |
      You are a research agent that finds and summarizes information.
    maxSteps: 3
 
  - id: "writer"
    model: gpt-4o
    systemPrompt: |
      You are a writing agent that creates well-structured content based on research.
    maxSteps: 2
 
workflow:
  - step: "research"
    agent: "researcher"
    input: "Find information about renewable energy sources."
    output: "research_results"
 
  - step: "write"
    agent: "writer"
    input: "${research_results}"
    output: "final_content"
 
benchmarks:
  - name: "Renewable Energy Report"
    evaluationCriteria:
      - key: "accuracy"
        description: "Is the information accurate?"
        type: "boolean"
      - key: "completeness"
        description: "Does the report cover all major renewable energy sources?"
        type: "scale"
        min: 1
        max: 5
      - key: "readability"
        description: "Is the report well-structured and easy to read?"
        type: "scale"
        min: 1
        max: 5
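
To make the data flow in the workflow above concrete, here is a minimal sketch of how `${...}` placeholders could be resolved between steps. It is not Prompt Spec's actual implementation: `interpolate`, `runWorkflow`, and the `runAgent` callback are hypothetical names, and the agent call is stubbed so the snippet runs without a model or API key.

```javascript
// Hypothetical sketch: resolve ${name} placeholders against earlier step outputs.
// Not taken from Prompt Spec's source; names and behavior are assumptions.
function interpolate(template, variables) {
  return template.replace(/\$\{(\w+)\}/g, (placeholder, name) => {
    if (!variables.has(name)) {
      throw new Error(`Unknown workflow variable: ${name}`);
    }
    return String(variables.get(name));
  });
}

// Run steps in order, storing each step's result under its `output` name.
// `runAgent(agentId, input)` stands in for a real model call.
async function runWorkflow(steps, runAgent) {
  const variables = new Map();
  for (const step of steps) {
    const input = interpolate(step.input, variables);
    variables.set(step.output, await runAgent(step.agent, input));
  }
  return variables;
}

// Steps mirroring the YAML workflow above
const workflow = [
  {
    step: "research",
    agent: "researcher",
    input: "Find information about renewable energy sources.",
    output: "research_results",
  },
  { step: "write", agent: "writer", input: "${research_results}", output: "final_content" },
];

// Stub agent that echoes its input, so the data flow is visible
const echoAgent = async (agentId, input) => `[${agentId}] ${input}`;
```

Calling `runWorkflow(workflow, echoAgent)` resolves to a Map whose `final_content` entry is the writer stub's output wrapped around the researcher's. Throwing on an unknown variable is a choice made for this sketch, so that a misspelled output name fails loudly instead of reaching the next agent as literal text.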

Custom Evaluation Functions

You can define custom JavaScript functions for evaluation:


// custom-evaluator.js
export function evaluateResponse(response, criteria) {
  // Stays empty if criteria.key is not handled below
  const results = {};

  // Score the fraction of expected keywords found in the response (case-insensitive)
  if (criteria.key === "keywordPresence") {
    const keywords = criteria.keywords || [];
    const text = response.toLowerCase();
    const matches = keywords.filter((keyword) =>
      text.includes(keyword.toLowerCase()),
    );
    // Avoid NaN from 0 / 0 when no keywords are configured
    results.score = keywords.length > 0 ? matches.length / keywords.length : 0;
    results.explanation = `Found ${matches.length} out of ${keywords.length} keywords`;
  }
 
  return results;
}

Then reference the file from a criterion of type "custom". The function reads the criterion's keywords list as criteria.keywords:


benchmarks:
  - name: "Keyword Test"
    messages:
      - role: "user"
        content: "Explain the greenhouse effect."
    evaluationCriteria:
      - key: "keywordPresence"
        description: "Checks for presence of key terms"
        type: "custom"
        evaluator: "./custom-evaluator.js"
        keywords:
          - "carbon dioxide"
          - "radiation"
          - "atmosphere"
          - "temperature"
          - "greenhouse gases"
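
To see what score the criterion above would produce, the sketch below calls the evaluator directly with a criteria object mirroring the YAML. It uses a compact version of the keyword check (without `export`, and with a guard for an empty keyword list) so the snippet runs standalone. The sample response is made up for illustration, and how Prompt Spec itself loads and invokes the function is not shown here.

```javascript
// Compact version of the keyword check from custom-evaluator.js, for a standalone run
function evaluateResponse(response, criteria) {
  const results = {};
  if (criteria.key === "keywordPresence") {
    const keywords = criteria.keywords || [];
    const text = response.toLowerCase();
    const matches = keywords.filter((keyword) => text.includes(keyword.toLowerCase()));
    // Guard against 0 / 0 when the keyword list is empty
    results.score = keywords.length > 0 ? matches.length / keywords.length : 0;
    results.explanation = `Found ${matches.length} out of ${keywords.length} keywords`;
  }
  return results;
}

// Criteria object mirroring the YAML entry above
const criteria = {
  key: "keywordPresence",
  type: "custom",
  keywords: ["carbon dioxide", "radiation", "atmosphere", "temperature", "greenhouse gases"],
};

// Made-up sample response, for illustration only
const sampleResponse =
  "Greenhouse gases such as carbon dioxide trap heat in the atmosphere, raising the temperature.";

const result = evaluateResponse(sampleResponse, criteria);
console.log(result.score, result.explanation); // 0.8 Found 4 out of 5 keywords
```

The sample mentions every term except "radiation", so four of the five keywords match. Because the check is a plain substring match, it rewards mentioning a term rather than using it correctly, which is worth keeping in mind when choosing keywords.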

Next Steps

Explore optimization techniques for improving agent performance
Learn about API integration for programmatic usage