An Evaluation is an assessment of the performance of a Target Agent on a Test Run. There is one Evaluation for each Metric defined in a Test.

An Evaluation can apply to one of three levels:

  • Conversation
  • Message
  • Sentence

An Evaluation consists of the following elements:

  • Result: Denotes whether the Evaluation passed or failed.
  • Score: A number between 0 (lowest performance) and 1 (highest performance).
  • Explanation: The reason behind a Score and Result, provided in natural language.
  • Confidence: A number between 0 (lowest confidence) and 1 (highest confidence) expressing how likely the Score is to be accurate.
  • Classification: TBC
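
As an illustration only, the levels and elements listed above could be modelled as a small record type. The following Python sketch is not the product's actual schema: the names Level, Result, Evaluation and the metric field are hypothetical, the 0 to 1 ranges are assumed to be inclusive, and Classification is left as an optional placeholder because its definition is still TBC.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Level(Enum):
    """The three levels an Evaluation can apply to."""
    CONVERSATION = "conversation"
    MESSAGE = "message"
    SENTENCE = "sentence"


class Result(Enum):
    """Whether the Evaluation passed or failed."""
    PASSED = "passed"
    FAILED = "failed"


@dataclass(frozen=True)
class Evaluation:
    metric: str          # hypothetical field: the Metric this Evaluation belongs to
    level: Level
    result: Result
    score: float         # 0 (lowest performance) to 1 (highest performance)
    explanation: str     # natural-language reason behind the Score and Result
    confidence: float    # 0 (lowest confidence) to 1 (highest confidence)
    classification: Optional[str] = None  # placeholder: definition is TBC

    def __post_init__(self) -> None:
        # Assumption: both ranges include their endpoints 0 and 1.
        for name in ("score", "confidence"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")


# Example: a message-level Evaluation for a hypothetical "politeness" Metric.
example = Evaluation(
    metric="politeness",
    level=Level.MESSAGE,
    result=Result.PASSED,
    score=0.9,
    explanation="The reply was courteous and addressed the user's question.",
    confidence=0.8,
)
```

Validating the ranges at construction time means an out-of-range Score or Confidence is rejected immediately rather than surfacing later in reporting.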