Type alias RunEvalConfig<T, U>

RunEvalConfig<T, U>: {
    customEvaluators?: U[];
    evalLlm?: string;
    evaluators?: (T | EvalConfig | U)[];
    formatEvaluatorInputs?: EvaluatorInputFormatter;
}

Configuration type for running evaluations on datasets.

Type Parameters

Type declaration

  • Optional customEvaluators?: U[]

    Custom evaluators to apply to a dataset run. Each evaluator is provided with a run trace containing the model outputs, as well as an "example" object representing a record in the dataset.

    ⚠️ Deprecated ⚠️

    Use evaluators instead. This option is deprecated and will be removed in a future release.

  • Optional evalLlm?: string

    The language model specification for evaluators that require one.

  • Optional evaluators?: (T | EvalConfig | U)[]

    LangChain evaluators to apply to a dataset run. Specify each one by name, or configure it with an EvalConfig object.

  • Optional formatEvaluatorInputs?: EvaluatorInputFormatter

    Converts the raw evaluation data into the format the evaluator expects; each converted value is most commonly a string. The function receives the raw input from the run, the raw output, the raw reference output, and the raw run.

    Example

    // Chain input: { input: "some string" }
    // Chain output: { output: "some output" }
    // Reference example output format: { output: "some reference output" }
    const formatEvaluatorInputs = ({
      rawInput,
      rawPrediction,
      rawReferenceOutput,
    }) => {
      return {
        input: rawInput.input,
        prediction: rawPrediction.output,
        reference: rawReferenceOutput.output,
      };
    };

    Returns

    The prepared data.
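
The formatter example above can be exercised on sample data. The following is a minimal sketch, not the library's own code: FormatterArgs and EvaluatorInputs are hypothetical stand-in types introduced here for illustration (the real EvaluatorInputFormatter signature is defined by the library), and the raw run argument is omitted for brevity.

```typescript
// Hypothetical stand-in types (assumptions, not the library's definitions).
type RawRecord = Record<string, unknown>;

interface FormatterArgs {
  rawInput: RawRecord;
  rawPrediction: RawRecord;
  rawReferenceOutput?: RawRecord;
}

interface EvaluatorInputs {
  input: unknown;
  prediction: unknown;
  reference?: unknown;
}

// Picks the fields the evaluator should see out of the raw records.
const formatEvaluatorInputs = ({
  rawInput,
  rawPrediction,
  rawReferenceOutput,
}: FormatterArgs): EvaluatorInputs => ({
  input: rawInput.input,
  prediction: rawPrediction.output,
  // Optional chaining tolerates examples that have no reference output.
  reference: rawReferenceOutput?.output,
});

// Sample records shaped like the comments in the example above.
const prepared = formatEvaluatorInputs({
  rawInput: { input: "some string" },
  rawPrediction: { output: "some output" },
  rawReferenceOutput: { output: "some reference output" },
});

console.log(prepared);
```

Keeping the formatter a pure function of the raw records makes it easy to check in isolation before wiring it into a configuration.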

Remarks

RunEvalConfig defines the parameters and evaluators applied when evaluating a dataset in LangSmith. A configuration can list evaluators by name or as EvalConfig objects, add custom evaluators, name the language model used by evaluators that require one, and supply a formatter that maps the raw inputs, predictions, and references to evaluator inputs.
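
To make the shape of a configuration concrete, here is a hedged sketch that mirrors the declaration above with local types instead of importing the library. Everything outside the four property names is an assumption made for illustration: the EvalConfig and EvaluatorInputFormatter shapes, the evaluator names, the model string, and the entryKind helper are all hypothetical.

```typescript
// Local stand-ins (assumptions); the real types come from the library.
type EvalConfig = { evaluatorType: string; [key: string]: unknown };
type EvaluatorInputFormatter = (
  args: Record<string, unknown>
) => Record<string, unknown>;

// Mirrors the type declaration documented above.
type RunEvalConfig<T, U> = {
  customEvaluators?: U[];
  evalLlm?: string;
  evaluators?: (T | EvalConfig | U)[];
  formatEvaluatorInputs?: EvaluatorInputFormatter;
};

// Hypothetical instantiations of the two type parameters.
type EvaluatorName = "exact_match" | "criteria";
type CustomEvaluator = (prediction: string, reference: string) => number;

// Hypothetical custom evaluator: ratio of prediction length to reference length.
const lengthRatio: CustomEvaluator = (prediction, reference) =>
  reference.length === 0 ? 0 : prediction.length / reference.length;

const config: RunEvalConfig<EvaluatorName, CustomEvaluator> = {
  evalLlm: "my-eval-model", // placeholder model specification
  evaluators: [
    "exact_match", // specified by name
    { evaluatorType: "criteria", criteria: "conciseness" }, // EvalConfig-style object
    lengthRatio, // custom evaluator
  ],
};

// Classifies each entry by the form in which it was specified.
function entryKind(
  entry: EvaluatorName | EvalConfig | CustomEvaluator
): "name" | "config" | "custom" {
  if (typeof entry === "string") return "name";
  if (typeof entry === "function") return "custom";
  return "config";
}

const kinds = (config.evaluators ?? []).map(entryKind);
console.log(kinds);
```

The sketch places the custom evaluator in evaluators rather than customEvaluators, in line with the deprecation notice on the latter.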

Typeparam

T - The type of evaluators.

U - The type of custom evaluators.
