With auto-generated dataset

Get your Maihem API key and install the SDK before you start.

Integrate with your codebase

Add target agent (if you haven't already)

from maihem import Maihem

maihem_client = Maihem()

maihem_client.add_target_agent(
    name="financial-assistant-x",
    label="Financial Assistant Company X", # Optional
    role="AI Financial Assistant",
    description="An AI assistant that provides information and summaries from financial documents.",
    language="en" # (Optional) ISO 639 language code; default is "en" (English)
)

Add a decorator to each step of your workflow

This is an example of a basic RAG workflow. Add a decorator to each step of the workflow as shown below.

See the full list of supported evaluators and metrics, along with the input and output maps each one requires.

Example agent workflow in Python
from typing import List

import maihem
from maihem.evaluators import EndToEndEvaluator, AnswerGenerationEvaluator, ContextRetrievalEvaluator

class Agent:

    @maihem.workflow_step(
        target_agent_name="financial-assistant-x",
        workflow_name="2 step RAG workflow",
        evaluator=EndToEndEvaluator(
            input_query="message",
            output_answer=lambda x: x # Map the output of your function that contains the answer
        )
    )
    def run_workflow(self, conversation_id: str, message: str) -> str:
        """Trigger workflow to generate a response"""
        contexts = self.context_retrieval(conversation_id, message)
        answer = self.generate_answer(conversation_id, message, contexts)
        return answer

    @maihem.workflow_step(
        name="Context Retrieval", # (Optional) Name of the step to be displayed in Maihem's dashboard
        evaluator=ContextRetrievalEvaluator(
            input_query="message",
            output_contexts=lambda x: x # Map the output of your function that contains the retrieved contexts
        )
    )
    def context_retrieval(self, conversation_id: str, message: str) -> List:
        """Retrieve a list of chunks to be used as context for the LLM."""
        contexts = retrieve_contexts(message) # Placeholder for your own retrieval function
        return contexts

    @maihem.workflow_step(
        name="Answer Generation", # (Optional) Name of the step to be displayed in Maihem's dashboard
        evaluator=AnswerGenerationEvaluator(
            input_query="message",
            input_contexts="contexts",
            output_answer=lambda x: x # Map the output of your function that contains the answer
        )
    )
    def generate_answer(self, conversation_id: str, message: str, contexts: List) -> str:
        """Generate a response using a list of retrieved contexts"""
        answer = call_llm(message, contexts) # Example of an LLM call
        return answer
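
To make the idea of input and output maps more concrete, here is a minimal sketch of how a decorator could resolve them. It is a toy stand-in and not Maihem's implementation: `traced_step`, `input_map`, `output_map`, and `RECORDED_STEPS` are hypothetical names introduced only for illustration. A string such as "message" is looked up by parameter name among the arguments of the decorated function, and the lambda extracts the relevant part of the return value.

```python
import inspect
from functools import wraps
from typing import Any, Callable, Dict, List

# Hypothetical in-memory store standing in for an evaluation backend.
RECORDED_STEPS: List[Dict[str, Any]] = []


def traced_step(name: str, input_map: Dict[str, str], output_map: Callable[[Any], Any]):
    """Toy decorator (not Maihem's API) that records mapped inputs and outputs."""

    def decorator(func: Callable) -> Callable:
        signature = inspect.signature(func)

        @wraps(func)
        def wrapper(*args, **kwargs):
            # Bind positional and keyword arguments to their parameter names.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            result = func(*args, **kwargs)
            RECORDED_STEPS.append({
                "step": name,
                # Resolve each evaluator field from the argument it is mapped to.
                "inputs": {field: bound.arguments[arg] for field, arg in input_map.items()},
                # Extract the evaluated part of the return value.
                "output": output_map(result),
            })
            return result

        return wrapper

    return decorator


@traced_step(
    name="Answer Generation",
    input_map={"input_query": "message", "input_contexts": "contexts"},
    output_map=lambda x: x["answer"],
)
def generate_answer(conversation_id: str, message: str, contexts: List[str]) -> Dict[str, str]:
    # Dummy answer so the sketch runs without an LLM.
    return {"answer": f"{message} -> {len(contexts)} context(s)", "conversation_id": conversation_id}


generate_answer("conv-1", "When was Fund X created?", ["Fund X fact sheet"])
```

In this sketch the function returns a dictionary, so the output map is `lambda x: x["answer"]` rather than the identity `lambda x: x` used when the function returns the answer directly.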

Auto-generate test

Add documents to auto-generate questions from them

Maihem supports documents in the following formats: pdf, txt, docx, md.

Place all the documents in a single folder.

documents_path = "/path/to/folder/with/documents"
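
Before creating the test, it can help to check that the folder only contains files in the supported formats. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and it assumes that only files at the top level of the folder matter, which the text above does not confirm.

```python
from pathlib import Path
from typing import List

# Formats listed above as supported: pdf, txt, docx, md.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".docx", ".md"}


def list_supported_documents(folder: str) -> List[Path]:
    """Return top-level files in `folder` whose extension is supported."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a folder: {folder}")
    return sorted(
        path for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def list_unsupported_files(folder: str) -> List[Path]:
    """Return top-level files in `folder` that are not in a supported format."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a folder: {folder}")
    return sorted(
        path for path in root.iterdir()
        if path.is_file() and path.suffix.lower() not in SUPPORTED_EXTENSIONS
    )
```

Calling `list_unsupported_files(documents_path)` and moving or converting anything it returns keeps the folder limited to the listed formats.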

Create test with auto-generated data

maihem_client.create_test_auto_generated_data(
    name="test_auto_generated_1",
    label="Test auto-generated #1", # (Optional) Name of the test to be displayed
    target_agent_name="financial-assistant-x",
    documents_path=documents_path, # Path to folder with documents
    number_conversations=50,
    conversation_turns_max=5 # (Optional) Default is 10
)

Run the test

A test run will generate:

Simulated conversations between your target agent and Maihem
Evaluations of the conversations
A list of detected failures

maihem_client.run_test(
    name="modelX_prompt2_5_28-11-2024",
    label="Model X Prompt v2.5 (28/Nov/2024)", # Optional
    test_name="test_auto_generated_1",
    concurrent_conversations=10 # Optional
)
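
Test run names like the one above are easier to compare when they follow a fixed pattern. As a small sketch, the hypothetical helper below (not part of the Maihem SDK) builds a name from a model identifier, a prompt version, and a date, following the pattern of the example name.

```python
from datetime import date


def build_test_run_name(model: str, prompt_version: str, run_date: date) -> str:
    """Build a run name such as 'modelX_prompt2_5_28-11-2024' (hypothetical helper)."""
    # Replace dots so the version reads as a single token in the name.
    version = prompt_version.replace(".", "_")
    return f"model{model}_prompt{version}_{run_date.strftime('%d-%m-%Y')}"


run_name = build_test_run_name("X", "2.5", date(2024, 11, 28))
```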

See test run results in Maihem UI

See the results in your Maihem account.

Or retrieve the test results programmatically:

test_run_results = maihem_client.get_test_run_results(
    test_name="test_auto_generated_1",
    test_run_name="modelX_prompt2_5_28-11-2024"
)

Example of the values the returned object contains:

test_run_results.result = "failed"
test_run_results.score = 82.5
test_run_results.conversations[0].messages = [
    {
        "role": "user",
        "content": "When was Fund X created?"
    },
    {
        "role": "assistant",
        "content": "Sorry, I could not find this information.",
        "evaluation": {
            "result": "failed",
            "explanation": "Hallucination detected. Fund X was created in 2005."
        }
    }
]
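
Once the results are available, the failed evaluations can be collected for review. The sketch below assumes the results object has the shape shown in the example above (a `conversations` list whose items expose `messages` as a list of dictionaries). The `SimpleNamespace` objects are stand-ins for the real results object, and `failed_messages` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def failed_messages(results: Any) -> List[Dict[str, Any]]:
    """Collect every message whose evaluation result is 'failed'."""
    failures = []
    for index, conversation in enumerate(results.conversations):
        for message in conversation.messages:
            evaluation = message.get("evaluation")
            if evaluation and evaluation.get("result") == "failed":
                failures.append({
                    "conversation": index,
                    "content": message["content"],
                    "explanation": evaluation["explanation"],
                })
    return failures


# Stand-in data mirroring the example output above.
example_results = SimpleNamespace(
    result="failed",
    score=82.5,
    conversations=[
        SimpleNamespace(messages=[
            {"role": "user", "content": "When was Fund X created?"},
            {
                "role": "assistant",
                "content": "Sorry, I could not find this information.",
                "evaluation": {
                    "result": "failed",
                    "explanation": "Hallucination detected. Fund X was created in 2005.",
                },
            },
        ])
    ],
)

failures = failed_messages(example_results)
```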
