docarray.math.evaluation module#

docarray.math.evaluation.r_precision(binary_relevance, **kwargs)[source]#

R-Precision determines the precision in the first R documents, where R is the number of documents relevant to the query.

Relevance is considered binary by this function (nonzero is relevant).

Note that relevance scores must be provided for all documents: the calculated metric is wrong if you apply it to the top-k scores only.

Parameters:

binary_relevance (List[int]) – binary relevancy in rank order

Return type:

float

Returns:

R-Precision
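
A minimal usage sketch with illustrative values; per the note above, the full ranked list must be passed:

    from docarray.math.evaluation import r_precision

    # Three of the five documents are relevant, so R = 3 and the score
    # is the precision over the first three ranks: 2/3.
    score = r_precision([0, 1, 1, 0, 1])  # ~0.667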

docarray.math.evaluation.precision_at_k(binary_relevance, k=None, **kwargs)[source]#

Precision @ k. If binary_relevance is empty, 0.0 is returned.

Parameters:
  • binary_relevance (List[int]) – binary relevancy in rank order

  • k (Optional[int]) – measured on top-k

Return type:

float

Returns:

precision @k
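
A minimal usage sketch with illustrative values:

    from docarray.math.evaluation import precision_at_k

    # Two of the top-3 documents are relevant: 2/3.
    score = precision_at_k([1, 0, 1, 0, 0], k=3)  # ~0.667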

docarray.math.evaluation.hit_at_k(binary_relevance, k=None, **kwargs)[source]#

Score is 1 if at least one relevant item occurs within the top-k results, otherwise 0.

Parameters:
  • binary_relevance (List[int]) – binary relevancy in rank order

  • k (Optional[int]) – measured on top-k

Return type:

int

Returns:

hit @ k: 1 if there is a hit in the top-k, otherwise 0
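
A minimal usage sketch with illustrative values:

    from docarray.math.evaluation import hit_at_k

    # A relevant document appears within the top-2, so the score is 1.
    hit = hit_at_k([0, 1, 0, 0], k=2)  # 1

    # No relevant document within the top-1.
    miss = hit_at_k([0, 1, 0, 0], k=1)  # 0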

docarray.math.evaluation.average_precision(binary_relevance, **kwargs)[source]#

Score is the average precision (area under the PR curve). Relevance is considered binary (nonzero is relevant).

Parameters:

binary_relevance (List[int]) – binary relevancy in rank order

Return type:

float

Returns:

Average precision
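
A minimal usage sketch with illustrative values, assuming the usual definition of AP as the mean of precision@k taken at the ranks of the relevant items:

    from docarray.math.evaluation import average_precision

    # Relevant items at ranks 1 and 3: AP = (1/1 + 2/3) / 2 = 5/6.
    score = average_precision([1, 0, 1])  # ~0.833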

docarray.math.evaluation.reciprocal_rank(binary_relevance, **kwargs)[source]#

Score is the reciprocal of the rank of the first relevant item.

Parameters:

binary_relevance (List[int]) – binary relevancy in rank order

Return type:

float

Returns:

Reciprocal rank
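
A minimal usage sketch with illustrative values:

    from docarray.math.evaluation import reciprocal_rank

    # The first relevant document sits at rank 3, so the score is 1/3.
    score = reciprocal_rank([0, 0, 1, 0])  # ~0.333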

docarray.math.evaluation.recall_at_k(binary_relevance, max_rel, k=None, **kwargs)[source]#

Score is the recall in the top-k results, i.e., the fraction of all relevant documents that appear within the first k. Relevance is considered binary (nonzero is relevant).

Parameters:
  • binary_relevance (List[int]) – binary relevancy in rank order

  • k (Optional[int]) – measured on top-k

  • max_rel (int) – Maximum number of documents that can be relevant

Return type:

float

Returns:

Recall score
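
A minimal usage sketch with illustrative values, assuming recall is the number of relevant hits in the top-k divided by max_rel:

    from docarray.math.evaluation import recall_at_k

    # Two of the three relevant documents appear in the top-3: 2/3.
    score = recall_at_k([0, 1, 1, 0], max_rel=3, k=3)  # ~0.667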

docarray.math.evaluation.f1_score_at_k(binary_relevance, max_rel, k=None, **kwargs)[source]#

Score is the harmonic mean of precision and recall at k. Relevance is considered binary (nonzero is relevant).

Parameters:
  • binary_relevance (List[int]) – binary relevancy in rank order

  • k (Optional[int]) – measured on top-k

  • max_rel (int) – Maximum number of documents that can be relevant

Return type:

float

Returns:

F1 score @ k
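
A minimal usage sketch with illustrative values, combining the precision and recall definitions above:

    from docarray.math.evaluation import f1_score_at_k

    # precision@2 = 1.0 and recall@2 = 2/4 = 0.5,
    # so F1 = 2 * 1.0 * 0.5 / (1.0 + 0.5) = 2/3.
    score = f1_score_at_k([1, 1, 0, 0], max_rel=4, k=2)  # ~0.667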

docarray.math.evaluation.dcg_at_k(relevance, method=0, k=None, **kwargs)[source]#

Score is the discounted cumulative gain (DCG). Relevance scores are positive real values; binary scores can be used as in the previous methods.

Example from http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf

Parameters:
  • relevance (List[float]) – Relevance scores (list or numpy) in rank order (first element is the first item)

  • k (Optional[int]) – measured on top-k

  • method (int) – If 0, weights are [1.0, 1.0, 0.6309, 0.5, 0.4307, …] (i.e., 1 / log2(max(rank, 2))); if 1, weights are [1.0, 0.6309, 0.5, 0.4307, …] (i.e., 1 / log2(rank + 1))

Returns:

Discounted cumulative gain
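
A minimal usage sketch with illustrative graded scores, using the method=0 weights listed above:

    from docarray.math.evaluation import dcg_at_k

    # DCG = 3 * 1.0 + 2 * 1.0 + 3 * 0.6309.
    score = dcg_at_k([3.0, 2.0, 3.0], method=0, k=3)  # ~6.89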

docarray.math.evaluation.ndcg_at_k(relevance, method=0, k=None, **kwargs)[source]#

Calculates the normalized discounted cumulative gain (NDCG). Relevance values can be positive real values; binary scores can also be used, as in the other evaluation methods.

Note that relevance scores must be provided for all documents: the calculated metric is wrong if you apply it to the top-k scores only.

Example from http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf

Parameters:
  • relevance (List[float]) – Relevance scores (list or numpy) in rank order (first element is the first item)

  • k (Optional[int]) – measured on top-k

  • method (int) – If 0, weights are [1.0, 1.0, 0.6309, 0.5, 0.4307, …] (i.e., 1 / log2(max(rank, 2))); if 1, weights are [1.0, 0.6309, 0.5, 0.4307, …] (i.e., 1 / log2(rank + 1))

Returns:

Normalized discounted cumulative gain
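
A minimal usage sketch with illustrative graded scores, assuming the usual normalization by the DCG of the ideally reordered list (here [3, 3, 2]); as noted above, pass the full relevance list:

    from docarray.math.evaluation import ndcg_at_k

    # DCG([3, 2, 3]) / DCG([3, 3, 2]) = 6.89 / 7.26.
    score = ndcg_at_k([3.0, 2.0, 3.0], method=0)  # ~0.949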