Coherence Score

Evaluate text coherence and quality.

Signature

def coherence_score(
    predictions: List[str],
    return_confidence: bool = True,
    detailed_analysis: bool = True,
) -> Dict[str, Any]:
    ...

Parameters

predictions (List[str]): List of predicted texts to evaluate for coherence.
return_confidence (bool, optional): Whether to return confidence intervals. Defaults to True.
detailed_analysis (bool, optional): Whether to return detailed coherence components. Defaults to True.

Returns

Dictionary containing:

mean_coherence (float): Average coherence score (0.0 to 1.0).
median_coherence (float): Median coherence score.
std_coherence (float): Standard deviation of coherence scores.
min_coherence (float): Minimum coherence score.
max_coherence (float): Maximum coherence score.
scores (List[float]): List of individual coherence scores for each prediction.
components (Dict[str, Any], optional): Component-level analysis (if detailed_analysis is True).
- sentence_consistency: Scores related to sentence length and structure consistency.
- lexical_diversity: Scores related to vocabulary richness.
- flow_continuity: Scores related to discourse markers and transitions.
- topic_consistency: Scores related to keyword overlap between sentences.
coherence_confidence_interval (Tuple[float, float], optional): 95% confidence interval for coherence score (if return_confidence is True).

Usage

from benchwise import coherence_score

texts = [
    "The sky is blue. It's a nice day. The weather is good.",
    "Random words. Not coherent. Disjointed thoughts."
]

result = coherence_score(texts)
print(f"Mean Coherence: {result['mean_coherence']:.3f}")

In Evaluations

from benchwise import evaluate, coherence_score

@evaluate("gpt-4")
async def test_coherence(model, dataset):
    responses = await model.generate(dataset.prompts)
    scores = coherence_score(responses)

    return {
        "coherence": scores["mean_coherence"],
        "high_quality": scores["mean_coherence"] > 0.7
    }

Signature​

Parameters​

Returns​

Usage​

In Evaluations​

See Also​

Signature

Parameters

Returns

Usage

In Evaluations

See Also