Note: TensorFlow version 1.1.0

The class_id argument is what lets you specify which class is treated as the positive class.

Below is the official documentation for tf.contrib.metrics.streaming_sparse_recall_at_k:

Signature: tf.contrib.metrics.streaming_sparse_recall_at_k(predictions, labels, k, class_id=None, weights=None, metrics_collections=None, updates_collections=None, name=None)

Computes recall@k of the predictions with respect to sparse labels.

If `class_id` is not specified, we'll calculate recall as the ratio of true
    positives (i.e., correct predictions, items in the top `k` highest
    `predictions` that are found in the corresponding row in `labels`) to
    actual positives (the full `labels` row).

If class_id is not specified, recall is the number of correct top-k predictions divided by the total number of actual positives. For single-label data that denominator is simply the number of samples, so recall@k reduces to the definition of accuracy@k.

If `class_id` is specified, we calculate recall by considering only the rows
    in the batch for which `class_id` is in `labels`, and computing the
    fraction of them for which `class_id` is in the corresponding row in
    `labels`.

`streaming_sparse_recall_at_k` creates two local variables,
`true_positive_at_<k>` and `false_negative_at_<k>`, that are used to compute
the recall_at_k frequency. This frequency is ultimately returned as
`recall_at_<k>`: an idempotent operation that simply divides
`true_positive_at_<k>` by total (`true_positive_at_<k>` +
`false_negative_at_<k>`).

For estimation of the metric over a stream of data, the function creates an
`update_op` operation that updates these variables and returns the
`recall_at_<k>`. Internally, a `top_k` operation computes a `Tensor`
indicating the top `k` `predictions`. Set operations applied to `top_k` and
`labels` calculate the true positives and false negatives weighted by
`weights`. Then `update_op` increments `true_positive_at_<k>` and
`false_negative_at_<k>` using these values.

If `weights` is `None`, weights default to 1. Use weights of 0 to mask values.

Args:
  predictions: Float `Tensor` with shape [D1, ... DN, num_classes] where
    N >= 1. Commonly, N=1 and predictions has shape [batch size, num_classes].
    The final dimension contains the logit values for each class. [D1, ... DN]
    must match `labels`.
  labels: `int64` `Tensor` or `SparseTensor` with shape
    [D1, ... DN, num_labels], where N >= 1 and num_labels is the number of
    target classes for the associated prediction. Commonly, N=1 and `labels`
    has shape [batch_size, num_labels]. [D1, ... DN] must match `predictions`.
    Values should be in range [0, num_classes), where num_classes is the last
    dimension of `predictions`. Values outside this range always count
    towards `false_negative_at_<k>`.
  k: Integer, k for @k metric.
  class_id: Integer class ID for which we want binary metrics. This should be
    in range [0, num_classes), where num_classes is the last dimension of
    `predictions`. If class_id is outside this range, the method returns NAN.
  weights: `Tensor` whose rank is either 0, or n-1, where n is the rank of
    `labels`. If the latter, it must be broadcastable to `labels` (i.e., all
    dimensions must be either `1`, or the same as the corresponding `labels`
    dimension).
  metrics_collections: An optional list of collections that values should
    be added to.
  updates_collections: An optional list of collections that updates should
    be added to.
  name: Name of new update operation, and namespace for other dependent ops.

Returns:
  recall: Scalar `float64` `Tensor` with the value of `true_positives` divided
    by the sum of `true_positives` and `false_negatives`.
  update_op: `Operation` that increments `true_positives` and
    `false_negatives` variables appropriately, and whose value matches
    `recall`.

Raises:
  ValueError: If `weights` is not `None` and its shape doesn't match
  `predictions`, or if either `metrics_collections` or `updates_collections`
  are not a list or tuple.
File:      d:\programdata\anaconda3\lib\site-packages\tensorflow\contrib\metrics\python\ops\metric_ops.py

Type:      function
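
To make the no-class_id vs. class_id behaviour concrete, here is a minimal sketch in TF 1.x graph mode; the tensor values and metric names are made up for illustration:

import tensorflow as tf

# 4 samples, 3 classes; each row of `labels` holds the single true class id.
predictions = tf.constant([[0.1, 0.6, 0.3],
                           [0.8, 0.15, 0.05],
                           [0.2, 0.3, 0.5],
                           [0.2, 0.7, 0.1]], dtype=tf.float32)
labels = tf.constant([[1], [0], [1], [2]], dtype=tf.int64)

# No class_id: recall@1 over all rows; for single-label data this equals accuracy@1.
recall, update_op = tf.contrib.metrics.streaming_sparse_recall_at_k(
    predictions, labels, k=1, name='recall_all')
# class_id=1: only rows whose labels contain class 1 are considered.
recall_c1, update_c1 = tf.contrib.metrics.streaming_sparse_recall_at_k(
    predictions, labels, k=1, class_id=1, name='recall_class1')

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())  # the TP/FN counters are local variables
    sess.run([update_op, update_c1])            # accumulate counts for this batch
    print(sess.run([recall, recall_c1]))        # expected: [0.5, 0.5]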

Below is the official documentation for tf.contrib.metrics.streaming_sparse_precision_at_k:

Signature: tf.contrib.metrics.streaming_sparse_precision_at_k(predictions, labels, k, class_id=None, weights=None, metrics_collections=None, updates_collections=None, name=None)
Docstring:
Computes precision@k of the predictions with respect to sparse labels.

If `class_id` is not specified, we calculate precision as the ratio of true
    positives (i.e., correct predictions, items in the top `k` highest
    `predictions` that are found in the corresponding row in `labels`) to
    positives (all top `k` `predictions`).

If class_id is not specified, precision is the number of correct top-k predictions divided by the total number of top-k predictions made, i.e. number of samples × k. As a result, precision@k generally decreases as k grows.

If `class_id` is specified, we calculate precision by considering only the
    rows in the batch for which `class_id` is in the top `k` highest
    `predictions`, and computing the fraction of them for which `class_id` is
    in the corresponding row in `labels`.

We expect precision to decrease as `k` increases.

`streaming_sparse_precision_at_k` creates two local variables,
`true_positive_at_<k>` and `false_positive_at_<k>`, that are used to compute
the precision@k frequency. This frequency is ultimately returned as
`precision_at_<k>`: an idempotent operation that simply divides
`true_positive_at_<k>` by total (`true_positive_at_<k>` +
`false_positive_at_<k>`).

For estimation of the metric over a stream of data, the function creates an
`update_op` operation that updates these variables and returns the
`precision_at_<k>`. Internally, a `top_k` operation computes a `Tensor`
indicating the top `k` `predictions`. Set operations applied to `top_k` and
`labels` calculate the true positives and false positives weighted by
`weights`. Then `update_op` increments `true_positive_at_<k>` and
`false_positive_at_<k>` using these values.

If `weights` is `None`, weights default to 1. Use weights of 0 to mask values.

Args:
  predictions: Float `Tensor` with shape [D1, ... DN, num_classes] where
    N >= 1. Commonly, N=1 and predictions has shape [batch size, num_classes].
    The final dimension contains the logit values for each class. [D1, ... DN]
    must match `labels`.
  labels: `int64` `Tensor` or `SparseTensor` with shape
    [D1, ... DN, num_labels], where N >= 1 and num_labels is the number of
    target classes for the associated prediction. Commonly, N=1 and `labels`
    has shape [batch_size, num_labels]. [D1, ... DN] must match
    `predictions`. Values should be in range [0, num_classes), where
    num_classes is the last dimension of `predictions`. Values outside this
    range are ignored.
  k: Integer, k for @k metric.
  class_id: Integer class ID for which we want binary metrics. This should be
    in range [0, num_classes), where num_classes is the last dimension of
    `predictions`. If `class_id` is outside this range, the method returns
    NAN.
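
As a companion sketch (same made-up setup as the recall example above, TF 1.x graph mode), this compares precision@1 and precision@2 and shows the denominator growing with k:

import tensorflow as tf

predictions = tf.constant([[0.1, 0.6, 0.3],
                           [0.8, 0.15, 0.05],
                           [0.2, 0.3, 0.5],
                           [0.2, 0.7, 0.1]], dtype=tf.float32)
labels = tf.constant([[1], [0], [1], [2]], dtype=tf.int64)  # one true class per row

prec1, update1 = tf.contrib.metrics.streaming_sparse_precision_at_k(
    predictions, labels, k=1, name='precision_at_1')
prec2, update2 = tf.contrib.metrics.streaming_sparse_precision_at_k(
    predictions, labels, k=2, name='precision_at_2')

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run([update1, update2])
    # precision@1: 2 hits / (4 samples * 1) = 0.5
    # precision@2: 3 hits / (4 samples * 2) = 0.375
    print(sess.run([prec1, prec2]))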

sklearn version: 0.19.1

The following is quoted from the sklearn documentation:

sklearn.metrics.recall_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)

Compute the recall

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.

The best value is 1 and the worst value is 0.

Read more in the User Guide.

Parameters:

y_true : 1d array-like, or label indicator array / sparse matrix

Ground truth (correct) target values.

y_pred : 1d array-like, or label indicator array / sparse matrix

Estimated targets as returned by a classifier.

labels : list, optional

The set of labels to include when average != 'binary', and their order if average is None. Labels present in the data can be excluded, for example to calculate a multiclass average ignoring a majority negative class, while labels not present in the data will result in 0 components in a macro average. For multilabel targets, labels are column indices. By default, all labels in y_true and y_pred are used in sorted order.

Changed in version 0.17: parameter labels improved for multiclass problem.

pos_label : str or int, 1 by default

The class to report if average='binary' and the data is binary. If the data are multiclass or multilabel, this will be ignored; setting labels=[pos_label] and average != 'binary' will report scores for that label only.

average : string, [None, ‘binary’ (default), ‘micro’, ‘macro’, ‘samples’, ‘weighted’]

This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

'binary':

Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

'micro':

Calculate metrics globally by counting the total true positives, false negatives and false positives.

'macro':

Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

'weighted':

Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

'samples':

Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).

sample_weight : array-like of shape = [n_samples], optional

Sample weights.

Returns:

recall : float (if average is not None) or array of float, shape = [n_unique_labels]

Recall of the positive class in binary classification or weighted average of the recall of each class for the multiclass task.

Examples

>>> from sklearn.metrics import recall_score
>>> y_true = [0, 1, 2, 0, 1, 2]
>>> y_pred = [0, 2, 1, 0, 0, 1]
>>> recall_score(y_true, y_pred, average='macro')
0.33...
>>> recall_score(y_true, y_pred, average='micro')
0.33...
>>> recall_score(y_true, y_pred, average='weighted')
0.33...
>>> recall_score(y_true, y_pred, average=None)
array([ 1.,  0.,  0.])

First point to note: y_true and y_pred can be 1-d arrays or multi-dimensional (label-indicator) arrays, but the two must use the same format.

Second, 'micro' counts global TP, FP and FN; 'macro' computes TP, FP and FN per class and takes the unweighted arithmetic mean of the per-class scores; 'weighted' is like 'macro' but weights each class by its number of true samples (its support). 'samples' computes the metric per instance and averages over instances, which is only meaningful for multilabel targets.
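
A small illustration (assuming sklearn 0.19, made-up data) of how the averaging modes diverge on imbalanced classes:

from sklearn.metrics import recall_score

y_true = [0, 0, 0, 0, 1, 2]   # class 0 dominates
y_pred = [0, 0, 0, 0, 2, 1]   # every class-0 sample is right, the rest are wrong

# Per-class recall: class 0 -> 4/4, class 1 -> 0/1, class 2 -> 0/1
print(recall_score(y_true, y_pred, average=None))        # [1., 0., 0.]
print(recall_score(y_true, y_pred, average='micro'))     # global TP/(TP+FN) = 4/6 ~ 0.667
print(recall_score(y_true, y_pred, average='macro'))     # (1 + 0 + 0) / 3 ~ 0.333
print(recall_score(y_true, y_pred, average='weighted'))  # (4*1 + 1*0 + 1*0) / 6 ~ 0.667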

Third, pos_label and labels are used to specify which label(s) count as the positive class.
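
For the binary case, a quick sketch (made-up labels) of what pos_label selects, plus the labels=[...] trick mentioned in the docs above for reporting a single class in the multiclass case:

from sklearn.metrics import recall_score

y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]

print(recall_score(y_true, y_pred, pos_label=1))  # class 1: TP=2, FN=1 -> 2/3
print(recall_score(y_true, y_pred, pos_label=0))  # class 0: TP=1, FN=1 -> 1/2

# Multiclass: labels=[c] with average != 'binary' reports recall for class c only.
print(recall_score([0, 1, 2, 0], [0, 2, 2, 1], labels=[2], average='macro'))  # 1.0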
