Quickstart - Maihem

What is Maihem?

Maihem creates AI agents that automatically test your conversational LLM applications.

Automatically find failures in your LLM applications before your users do

Build LLM applications that you and others can trust with our Quality Assurance Reports

Get started

Create an account

Create an account in our web application.

Generate an API key

Generate an API key for your organization in Organization settings.

Easily integrate via our SDKs

Python SDK

pip install maihem

Integrate into your codebase with our Python SDK

Want to integrate directly with our API?

API Quickstart

Core concepts


Agents

Target Agent: Your AI agent to be tested.
Maihem Agent: Simulates your users to test your Target Agent.

Metric

A dimension to evaluate a Target Agent on, such as hallucination or helpfulness.

See the list of supported metrics.

Test

A configuration with the Metrics and other parameters for evaluating your Target Agent.

Test Run

A single execution of a Test. A Test Run generates a set of Conversations.

Conversation

A series of Turns. A Turn contains a Message from the Target Agent and a Message from the Maihem Agent.
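
To make the Conversation structure concrete, here is a minimal sketch in plain Python. It is not the Maihem SDK: the class names, field names, and sender labels are hypothetical, chosen only to mirror the terms above. The order of the two Messages within a Turn (Maihem Agent first) is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    sender: str   # hypothetical labels: "maihem_agent" or "target_agent"
    content: str


@dataclass
class Turn:
    # One Message from each agent; the ordering is an assumption.
    maihem_message: Message
    target_message: Message


@dataclass
class Conversation:
    turns: List[Turn] = field(default_factory=list)

    def add_turn(self, maihem_text: str, target_text: str) -> None:
        """Append a Turn made of one Message per agent."""
        self.turns.append(
            Turn(
                Message("maihem_agent", maihem_text),
                Message("target_agent", target_text),
            )
        )

    def messages(self) -> List[Message]:
        """Flatten the Turns into a single ordered list of Messages."""
        flat: List[Message] = []
        for turn in self.turns:
            flat.extend([turn.maihem_message, turn.target_message])
        return flat


conversation = Conversation()
conversation.add_turn("Hi, can I get a refund?", "Sure, what is your order number?")
conversation.add_turn("It's 1234.", "Thanks, your refund is on its way.")
```

Here `conversation` holds two Turns and four Messages in total.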

Evaluation

An evaluation of a Message or Conversation with a Metric. An Evaluation contains a Score, Failure Flag, and Explanation.

Report

A Report contains the percentage of Conversations in a Test Run that passed the evaluation of a Metric.
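
The percentage in a Report can be illustrated with a short calculation. The sketch below is not the Maihem SDK; the `Evaluation` fields and the `pass_percentages` helper are hypothetical names, and it assumes one Conversation-level Evaluation per Conversation and Metric, with a Conversation counted as passed when its Failure Flag is not set.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Evaluation:
    conversation_id: str
    metric: str
    score: float       # hypothetical scale; not used in the pass rate below
    failed: bool       # the Failure Flag
    explanation: str


def pass_percentages(evaluations: List[Evaluation]) -> Dict[str, float]:
    """Return, per Metric, the percentage of Conversations that passed."""
    totals: Dict[str, int] = {}
    passed: Dict[str, int] = {}
    for ev in evaluations:
        totals[ev.metric] = totals.get(ev.metric, 0) + 1
        if not ev.failed:
            passed[ev.metric] = passed.get(ev.metric, 0) + 1
    return {m: 100.0 * passed.get(m, 0) / totals[m] for m in totals}


# Made-up evaluations for one Test Run, for illustration only.
evaluations = [
    Evaluation("c1", "hallucination", 0.9, False, "No unsupported claims."),
    Evaluation("c2", "hallucination", 0.2, True, "Invented a refund policy."),
    Evaluation("c3", "hallucination", 0.8, False, "Answers were grounded."),
    Evaluation("c4", "hallucination", 0.95, False, "Answers were grounded."),
    Evaluation("c1", "helpfulness", 0.7, False, "Resolved the request."),
    Evaluation("c2", "helpfulness", 0.6, False, "Resolved the request."),
]

report = pass_percentages(evaluations)
# report == {"hallucination": 75.0, "helpfulness": 100.0}
```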
