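SSD Network Explained: The Detection Output Layer

This is an original article. If you repost it, please cite https://blog.csdn.net/dan_teng/article/details/81561783

Without further ado, here is the outline: first an explanation of what the detection output layer does, then the layer's full source code with detailed comments, and finally the Caffe definition of each of the layer's parameters and their default values.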

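The detection output layer is the last layer of the SSD network. It combines three results (the prior boxes, the offsets relative to the prior boxes, and the scores) and outputs the detection boxes that meet the requirements, together with each object's label and score.
On the input side, mbox_priorbox is the concatenated output of all the priorbox layers in the network (the priorbox layer is covered in a separate post), which amounts to putting all prior boxes together; mbox_loc holds the offsets relative to the prior boxes; mbox_conf_flatten holds the score of every class on every box.
The output has shape [1, 1, x, 7], where x is the number of boxes that are finally kept. The last dimension stores:
[image_id, label, confidence, xmin, ymin, xmax, ymax]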
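
To make the output layout concrete, here is a minimal sketch of how a caller might unpack the [1, 1, x, 7] blob once it has a raw float pointer to it. The Detection struct and ParseDetections function are hypothetical names introduced for illustration only; they are not part of Caffe. The sketch assumes that rows with label -1 are the placeholder rows the layer writes when nothing was detected (see the num_kept == 0 branch in the code further down).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type for illustration; not part of Caffe.
struct Detection {
  int image_id;                  // index of the image inside the batch
  int label;                     // class id
  float score;                   // confidence
  float xmin, ymin, xmax, ymax;  // box corners as stored by the layer
};

// Reads num_rows rows of 7 floats each:
// [image_id, label, confidence, xmin, ymin, xmax, ymax].
std::vector<Detection> ParseDetections(const float* data, int num_rows) {
  std::vector<Detection> out;
  for (int r = 0; r < num_rows; ++r) {
    const float* row = data + r * 7;
    if (row[1] < 0) {
      continue;  // assumed placeholder row: label == -1 means no detection
    }
    Detection d;
    d.image_id = static_cast<int>(row[0]);
    d.label = static_cast<int>(row[1]);
    d.score = row[2];
    d.xmin = row[3];
    d.ymin = row[4];
    d.xmax = row[5];
    d.ymax = row[6];
    out.push_back(d);
  }
  return out;
}
```

In a Caffe program the pointer would typically come from the top blob's cpu_data() and num_rows from its height().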

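How it works:
1) Parse the location, confidence and priorbox data from the bottom blobs and store them in vectors.
2) Decode every prior box. Decoding simply means combining the inputs. As noted above, the output must contain a detection box for every object, but the inputs are prior boxes and offsets, so this step computes the final detection boxes from them. Decoding has to account for how the prior boxes were encoded, and there are three cases.

Let b denote the detection box (stored as b_xmin, b_ymin, b_xmax, b_ymax), p the prior box (stored as p_xmin, p_ymin, p_xmax, p_ymax), and t the offset (stored as t_x, t_y, t_width, t_height).
The width and height of b and p are the difference between the maximum and minimum of x and of y respectively; the center is the sum of the maximum and minimum divided by 2.
The encoding formulas for each type are then: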

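CodeType_CORNER:

               t = b - p (the same for each of the four coordinates)

CodeType_CENTER_SIZE:

               t_x = (b_center_x - p_center_x) / p_width
               t_y = (b_center_y - p_center_y) / p_height
               t_width = log(b_width / p_width)
               t_height = log(b_height / p_height)

CodeType_CORNER_SIZE:

               t_x = (b_x - p_x) / p_width (for both xmin and xmax)
               t_y = (b_y - p_y) / p_height (for both ymin and ymax)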

Decoding just solves these formulas for the components of b. If the variance has to be applied, multiply each component of t by the corresponding variance.
Taking the center_size encoding as an example:

               b_center_x = t_x * p_width + p_center_x
               b_center_y = t_y * p_height + p_center_y
               b_width = exp(t_width) * p_width
               b_height = exp(t_height) * p_height

If the variance has to be applied:

               b_center_x = prior_variance[0] * t_x * p_width + p_center_x
               b_center_y = prior_variance[1] * t_y * p_height + p_center_y
               b_width = exp(prior_variance[2] * t_width) * p_width
               b_height = exp(prior_variance[3] * t_height) * p_height

From these, compute b_xmin, b_ymin, b_xmax and b_ymax. See the code for details.
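
The center_size decoding can be sketched as a small standalone function. This is only an illustration of the formulas in this section, not the code from bbox_util.cpp: Box and DecodeCenterSize are hypothetical names, and the offset is assumed to be laid out as (t_x, t_y, t_width, t_height). The flag mirrors the layer's variance_encoded_in_target parameter: when it is false, the offsets are scaled by the variance before decoding.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical box type for illustration; not Caffe's NormalizedBBox.
struct Box {
  float xmin, ymin, xmax, ymax;
};

// Decodes one CENTER_SIZE offset t = (t_x, t_y, t_width, t_height)
// against a prior box.
Box DecodeCenterSize(const Box& prior, const float t[4],
                     const float variance[4],
                     bool variance_encoded_in_target) {
  const float p_width = prior.xmax - prior.xmin;
  const float p_height = prior.ymax - prior.ymin;
  const float p_center_x = (prior.xmin + prior.xmax) / 2.f;
  const float p_center_y = (prior.ymin + prior.ymax) / 2.f;

  // Scale the offsets by the variance unless it is already folded into t.
  float s[4];
  for (int i = 0; i < 4; ++i) {
    s[i] = variance_encoded_in_target ? t[i] : variance[i] * t[i];
  }

  const float b_center_x = s[0] * p_width + p_center_x;
  const float b_center_y = s[1] * p_height + p_center_y;
  const float b_width = std::exp(s[2]) * p_width;
  const float b_height = std::exp(s[3]) * p_height;

  // Convert center and size back to corner coordinates.
  Box b;
  b.xmin = b_center_x - b_width / 2.f;
  b.ymin = b_center_y - b_height / 2.f;
  b.xmax = b_center_x + b_width / 2.f;
  b.ymax = b_center_y + b_height / 2.f;
  return b;
}
```

A zero offset returns the prior box unchanged, which is a quick sanity check for any decoder of this kind.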

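3) Non-Maximum Suppression (NMS)
A detector usually outputs many boxes, and several of them often enclose the same object. NMS keeps the single best box for each object. Suppression is a process of iterating, traversing and eliminating.

(Image source: https://blog.csdn.net/shuzfan/article/details/52711706)

Call the set before processing the candidate set and the set after processing the keep set.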

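First, sort all boxes in the candidate set by score, pick the box with the highest score, remove it from the candidate set and put it into the keep set.

Then iterate:
* Remove the highest-scoring box from the current candidate set and compute its intersection over union (IoU) with every box in the keep set.
* If the IoU with any kept box exceeds the threshold, the two boxes overlap heavily and most likely enclose the same thing, so the box is not added to the keep set.
* If the IoU with every box in the keep set is below the threshold, the current box encloses a new object and is added to the keep set.

Keep iterating until the candidate set is empty. The boxes left in the keep set are the detection boxes of all detected objects.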
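
The iteration above can be sketched in a few lines of C++. This is a simplified illustration, not the ApplyNMSFast function the layer actually calls: judging by its arguments, that function also takes a confidence threshold, a top_k limit and an eta parameter, all of which are left out here. Box, Iou and GreedyNms are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical box type for illustration; not Caffe's NormalizedBBox.
struct Box {
  float xmin, ymin, xmax, ymax;
};

// Intersection over union of two axis-aligned boxes.
float Iou(const Box& a, const Box& b) {
  const float iw = std::min(a.xmax, b.xmax) - std::max(a.xmin, b.xmin);
  const float ih = std::min(a.ymax, b.ymax) - std::max(a.ymin, b.ymin);
  if (iw <= 0.f || ih <= 0.f) {
    return 0.f;  // the boxes do not overlap
  }
  const float inter = iw * ih;
  const float area_a = (a.xmax - a.xmin) * (a.ymax - a.ymin);
  const float area_b = (b.xmax - b.xmin) * (b.ymax - b.ymin);
  return inter / (area_a + area_b - inter);
}

// Returns the indices of the kept boxes, highest score first.
std::vector<int> GreedyNms(const std::vector<Box>& boxes,
                           const std::vector<float>& scores,
                           float iou_threshold) {
  // The candidate set: box indices sorted by descending score.
  std::vector<int> order(boxes.size());
  for (std::size_t i = 0; i < order.size(); ++i) {
    order[i] = static_cast<int>(i);
  }
  std::stable_sort(order.begin(), order.end(),
                   [&scores](int a, int b) { return scores[a] > scores[b]; });

  std::vector<int> keep;
  for (int idx : order) {
    bool suppressed = false;
    for (int kept : keep) {
      if (Iou(boxes[idx], boxes[kept]) > iou_threshold) {
        suppressed = true;  // overlaps a better box: same object
        break;
      }
    }
    if (!suppressed) {
      keep.push_back(idx);  // a new object
    }
  }
  return keep;
}
```

Because the candidates are visited in descending score order, the first box of every overlapping cluster is always the one that survives.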

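Jaccard overlap
A short aside on the Jaccard overlap used in the SSD network.
The Jaccard overlap is simply the intersection over union: the area where two detection boxes overlap (the intersection) divided by the area the two boxes cover together (the sum of their areas minus the overlap). As a formula:

J(A,B) = \frac{|A\cap B|}{|A\cup B|}
J = 0 means the two boxes do not overlap at all; J = 1 means they coincide completely.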
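
As a quick worked example (the two boxes are made up for illustration), take two 2 x 2 squares whose corners are offset by 1 in both x and y:

```latex
A = [0,2]\times[0,2], \qquad B = [1,3]\times[1,3]
|A \cap B| = (2-1)\cdot(2-1) = 1
|A \cup B| = |A| + |B| - |A \cap B| = 4 + 4 - 1 = 7
J(A,B) = \frac{1}{7} \approx 0.143
```

With an NMS threshold above 0.143, these two boxes would both be kept as separate objects.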

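4) Write out the results in the required output shape.

Code walkthrough:

Note: the code below is from detection_output_layer.cpp. It calls a number of helper functions, which live in
ssd/src/caffe/util/bbox_util.cpp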

#include <algorithm>
#include <fstream>  // NOLINT(readability/streams)
#include <map>
#include <string>
#include <utility>
#include <vector>

#include "boost/filesystem.hpp"
#include "boost/foreach.hpp"

#include "caffe/layers/detection_output_layer.hpp"

namespace caffe {

template <typename Dtype>
void DetectionOutputLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  const DetectionOutputParameter& detection_output_param =
      this->layer_param_.detection_output_param();
  CHECK(detection_output_param.has_num_classes()) << "Must specify num_classes";
  // Number of classes.
  num_classes_ = detection_output_param.num_classes();
  share_location_ = detection_output_param.share_location();
  num_loc_classes_ = share_location_ ? 1 : num_classes_;
  background_label_id_ = detection_output_param.background_label_id();
  // Box encoding type.
  code_type_ = detection_output_param.code_type();
  variance_encoded_in_target_ =
      detection_output_param.variance_encoded_in_target();
  // Maximum number of boxes to keep.
  keep_top_k_ = detection_output_param.keep_top_k();
  // Score threshold.
  confidence_threshold_ = detection_output_param.has_confidence_threshold() ?
      detection_output_param.confidence_threshold() : -FLT_MAX;
  // Parameters used in nms.
  nms_threshold_ = detection_output_param.nms_param().nms_threshold();
  CHECK_GE(nms_threshold_, 0.) << "nms_threshold must be non negative.";
  eta_ = detection_output_param.nms_param().eta();
  CHECK_GT(eta_, 0.);
  CHECK_LE(eta_, 1.);
  top_k_ = -1;
  if (detection_output_param.nms_param().has_top_k()) {
    top_k_ = detection_output_param.nms_param().top_k();
  }
  const SaveOutputParameter& save_output_param =
      detection_output_param.save_output_param();
  output_directory_ = save_output_param.output_directory();
  if (!output_directory_.empty()) {
    if (boost::filesystem::is_directory(output_directory_)) {
      boost::filesystem::remove_all(output_directory_);
    }
    if (!boost::filesystem::create_directories(output_directory_)) {
      LOG(WARNING) << "Failed to create directory: " << output_directory_;
    }
  }
  output_name_prefix_ = save_output_param.output_name_prefix();
  need_save_ = output_directory_ == "" ? false : true;
  output_format_ = save_output_param.output_format();
  if (save_output_param.has_label_map_file()) {
    string label_map_file = save_output_param.label_map_file();
    if (label_map_file.empty()) {
      // Ignore saving if there is no label_map_file provided.
      LOG(WARNING) << "Provide label_map_file if output results to files.";
      need_save_ = false;
    } else {
      LabelMap label_map;
      CHECK(ReadProtoFromTextFile(label_map_file, &label_map))
          << "Failed to read label map file: " << label_map_file;
      CHECK(MapLabelToName(label_map, true, &label_to_name_))
          << "Failed to convert label to name.";
      CHECK(MapLabelToDisplayName(label_map, true, &label_to_display_name_))
          << "Failed to convert label to display name.";
    }
  } else {
    need_save_ = false;
  }
  if (save_output_param.has_name_size_file()) {
    string name_size_file = save_output_param.name_size_file();
    if (name_size_file.empty()) {
      // Ignore saving if there is no name_size_file provided.
      LOG(WARNING) << "Provide name_size_file if output results to files.";
      need_save_ = false;
    } else {
      std::ifstream infile(name_size_file.c_str());
      CHECK(infile.good())
          << "Failed to open name size file: " << name_size_file;
      // The file is in the following format:
      //    name height width
      //    ...
      string name;
      int height, width;
      while (infile >> name >> height >> width) {
        names_.push_back(name);
        sizes_.push_back(std::make_pair(height, width));
      }
      infile.close();
      if (save_output_param.has_num_test_image()) {
        num_test_image_ = save_output_param.num_test_image();
      } else {
        num_test_image_ = names_.size();
      }
      CHECK_LE(num_test_image_, names_.size());
    }
  } else {
    need_save_ = false;
  }
  has_resize_ = save_output_param.has_resize_param();
  if (has_resize_) {
    resize_param_ = save_output_param.resize_param();
  }
  name_count_ = 0;
  visualize_ = detection_output_param.visualize();
  if (visualize_) {
    visualize_threshold_ = 0.6;
    if (detection_output_param.has_visualize_threshold()) {
      visualize_threshold_ = detection_output_param.visualize_threshold();
    }
    data_transformer_.reset(
        new DataTransformer<Dtype>(this->layer_param_.transform_param(),
                                   this->phase_));
    data_transformer_->InitRand();
    save_file_ = detection_output_param.save_file();
  }
  bbox_preds_.ReshapeLike(*(bottom[0]));
  if (!share_location_) {
    bbox_permute_.ReshapeLike(*(bottom[0]));
  }
  conf_permute_.ReshapeLike(*(bottom[1]));
}

// The output has shape [1, 1, x, 7].
// The 7 values of the last dimension are:
// [image_id, label, confidence, xmin, ymin, xmax, ymax]
template <typename Dtype>
void DetectionOutputLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  if (need_save_) {
    CHECK_LE(name_count_, names_.size());
    if (name_count_ % num_test_image_ == 0) {
      // Clean all outputs.
      if (output_format_ == "VOC") {
        boost::filesystem::path output_directory(output_directory_);
        for (map<int, string>::iterator it = label_to_name_.begin();
             it != label_to_name_.end(); ++it) {
          if (it->first == background_label_id_) {
            continue;
          }
          std::ofstream outfile;
          boost::filesystem::path file(
              output_name_prefix_ + it->second + ".txt");
          boost::filesystem::path out_file = output_directory / file;
          outfile.open(out_file.string().c_str(), std::ofstream::out);
        }
      }
    }
  }
  CHECK_EQ(bottom[0]->num(), bottom[1]->num());
  if (bbox_preds_.num() != bottom[0]->num() ||
      bbox_preds_.count(1) != bottom[0]->count(1)) {
    bbox_preds_.ReshapeLike(*(bottom[0]));
  }
  if (!share_location_ && (bbox_permute_.num() != bottom[0]->num() ||
      bbox_permute_.count(1) != bottom[0]->count(1))) {
    bbox_permute_.ReshapeLike(*(bottom[0]));
  }
  if (conf_permute_.num() != bottom[1]->num() ||
      conf_permute_.count(1) != bottom[1]->count(1)) {
    conf_permute_.ReshapeLike(*(bottom[1]));
  }
  num_priors_ = bottom[2]->height() / 4;
  CHECK_EQ(num_priors_ * num_loc_classes_ * 4, bottom[0]->channels())
      << "Number of priors must match number of location predictions.";
  CHECK_EQ(num_priors_ * num_classes_, bottom[1]->channels())
      << "Number of priors must match number of confidence predictions.";
  // num() and channels() are 1.
  vector<int> top_shape(2, 1);
  // Since the number of bboxes to be kept is unknown before nms, we manually
  // set it to (fake) 1.
  top_shape.push_back(1);
  // Each row is a 7 dimension vector, which stores
  // [image_id, label, confidence, xmin, ymin, xmax, ymax]
  top_shape.push_back(7);
  top[0]->Reshape(top_shape);
}

template <typename Dtype>
void DetectionOutputLayer<Dtype>::Forward_cpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  const Dtype* loc_data = bottom[0]->cpu_data();
  const Dtype* conf_data = bottom[1]->cpu_data();
  const Dtype* prior_data = bottom[2]->cpu_data();
  const int num = bottom[0]->num();

  // Retrieve all location predictions (the offsets).
  vector<LabelBBox> all_loc_preds;
  GetLocPredictions(loc_data, num, num_priors_, num_loc_classes_,
                    share_location_, &all_loc_preds);

  // Retrieve all confidences (the scores).
  vector<map<int, vector<float> > > all_conf_scores;
  GetConfidenceScores(conf_data, num, num_priors_, num_classes_,
                      &all_conf_scores);

  // Retrieve all prior bboxes. It is same within a batch since we assume all
  // images in a batch are of same dimension.
  vector<NormalizedBBox> prior_bboxes;
  vector<vector<float> > prior_variances;
  GetPriorBBoxes(prior_data, num_priors_, &prior_bboxes, &prior_variances);

  // Decode all loc predictions to bboxes.
  vector<LabelBBox> all_decode_bboxes;
  const bool clip_bbox = false;
  DecodeBBoxesAll(all_loc_preds, prior_bboxes, prior_variances, num,
                  share_location_, num_loc_classes_, background_label_id_,
                  code_type_, variance_encoded_in_target_, clip_bbox,
                  &all_decode_bboxes);

  int num_kept = 0;
  vector<map<int, vector<int> > > all_indices;
  for (int i = 0; i < num; ++i) {
    const LabelBBox& decode_bboxes = all_decode_bboxes[i];
    const map<int, vector<float> >& conf_scores = all_conf_scores[i];
    map<int, vector<int> > indices;
    int num_det = 0;
    for (int c = 0; c < num_classes_; ++c) {
      if (c == background_label_id_) {
        // Ignore background class.
        continue;
      }
      if (conf_scores.find(c) == conf_scores.end()) {
        // Something bad happened if there are no predictions for current label.
        LOG(FATAL) << "Could not find confidence predictions for label " << c;
      }
      const vector<float>& scores = conf_scores.find(c)->second;
      int label = share_location_ ? -1 : c;
      if (decode_bboxes.find(label) == decode_bboxes.end()) {
        // Something bad happened if there are no predictions for current label.
        LOG(FATAL) << "Could not find location predictions for label " << label;
        continue;
      }
      const vector<NormalizedBBox>& bboxes = decode_bboxes.find(label)->second;
      // Non-maximum suppression.
      ApplyNMSFast(bboxes, scores, confidence_threshold_, nms_threshold_, eta_,
          top_k_, &(indices[c]));
      num_det += indices[c].size();
    }
    // If more detections survive NMS than keep_top_k allows, keep only the
    // keep_top_k highest-scoring boxes.
    if (keep_top_k_ > -1 && num_det > keep_top_k_) {
      vector<pair<float, pair<int, int> > > score_index_pairs;
      for (map<int, vector<int> >::iterator it = indices.begin();
           it != indices.end(); ++it) {
        int label = it->first;
        const vector<int>& label_indices = it->second;
        if (conf_scores.find(label) == conf_scores.end()) {
          // Something bad happened for current label.
          LOG(FATAL) << "Could not find location predictions for " << label;
          continue;
        }
        const vector<float>& scores = conf_scores.find(label)->second;
        for (int j = 0; j < label_indices.size(); ++j) {
          int idx = label_indices[j];
          CHECK_LT(idx, scores.size());
          score_index_pairs.push_back(std::make_pair(
                  scores[idx], std::make_pair(label, idx)));
        }
      }
      // Keep top k results per image.
      std::sort(score_index_pairs.begin(), score_index_pairs.end(),
                SortScorePairDescend<pair<int, int> >);
      score_index_pairs.resize(keep_top_k_);
      // Store the new indices.
      map<int, vector<int> > new_indices;
      for (int j = 0; j < score_index_pairs.size(); ++j) {
        int label = score_index_pairs[j].second.first;
        int idx = score_index_pairs[j].second.second;
        new_indices[label].push_back(idx);
      }
      all_indices.push_back(new_indices);
      num_kept += keep_top_k_;
    } else {
      all_indices.push_back(indices);
      num_kept += num_det;
    }
  }

  vector<int> top_shape(2, 1);
  top_shape.push_back(num_kept);
  top_shape.push_back(7);
  Dtype* top_data;
  if (num_kept == 0) {
    // No object was detected.
    LOG(INFO) << "Couldn't find any detections";
    top_shape[2] = num;
    top[0]->Reshape(top_shape);
    top_data = top[0]->mutable_cpu_data();
    caffe_set<Dtype>(top[0]->count(), -1, top_data);
    // Generate fake results per image.
    for (int i = 0; i < num; ++i) {
      top_data[0] = i;
      top_data += 7;
    }
  } else {
    // Objects were detected.
    top[0]->Reshape(top_shape);
    top_data = top[0]->mutable_cpu_data();
  }

  // Handle the detected objects.
  int count = 0;
  boost::filesystem::path output_directory(output_directory_);
  for (int i = 0; i < num; ++i) {
    const map<int, vector<float> >& conf_scores = all_conf_scores[i];
    const LabelBBox& decode_bboxes = all_decode_bboxes[i];
    for (map<int, vector<int> >::iterator it = all_indices[i].begin();
         it != all_indices[i].end(); ++it) {
      int label = it->first;
      if (conf_scores.find(label) == conf_scores.end()) {
        // Something bad happened if there are no predictions for current label.
        LOG(FATAL) << "Could not find confidence predictions for " << label;
        continue;
      }
      const vector<float>& scores = conf_scores.find(label)->second;
      int loc_label = share_location_ ? -1 : label;
      if (decode_bboxes.find(loc_label) == decode_bboxes.end()) {
        // Something bad happened if there are no predictions for current label.
        LOG(FATAL) << "Could not find location predictions for " << loc_label;
        continue;
      }
      const vector<NormalizedBBox>& bboxes =
          decode_bboxes.find(loc_label)->second;
      vector<int>& indices = it->second;
      if (need_save_) {
        CHECK(label_to_name_.find(label) != label_to_name_.end())
            << "Cannot find label: " << label << " in the label map.";
        CHECK_LT(name_count_, names_.size());
      }
      // Write the results into the output blob.
      for (int j = 0; j < indices.size(); ++j) {
        int idx = indices[j];
        top_data[count * 7] = i;
        top_data[count * 7 + 1] = label;
        top_data[count * 7 + 2] = scores[idx];
        const NormalizedBBox& bbox = bboxes[idx];
        top_data[count * 7 + 3] = bbox.xmin();
        top_data[count * 7 + 4] = bbox.ymin();
        top_data[count * 7 + 5] = bbox.xmax();
        top_data[count * 7 + 6] = bbox.ymax();
        if (need_save_) {
          NormalizedBBox out_bbox;
          OutputBBox(bbox, sizes_[name_count_], has_resize_, resize_param_,
                     &out_bbox);
          float score = top_data[count * 7 + 2];
          float xmin = out_bbox.xmin();
          float ymin = out_bbox.ymin();
          float xmax = out_bbox.xmax();
          float ymax = out_bbox.ymax();
          ptree pt_xmin, pt_ymin, pt_width, pt_height;
          pt_xmin.put<float>("", round(xmin * 100) / 100.);
          pt_ymin.put<float>("", round(ymin * 100) / 100.);
          pt_width.put<float>("", round((xmax - xmin) * 100) / 100.);
          pt_height.put<float>("", round((ymax - ymin) * 100) / 100.);

          ptree cur_bbox;
          cur_bbox.push_back(std::make_pair("", pt_xmin));
          cur_bbox.push_back(std::make_pair("", pt_ymin));
          cur_bbox.push_back(std::make_pair("", pt_width));
          cur_bbox.push_back(std::make_pair("", pt_height));

          ptree cur_det;
          cur_det.put("image_id", names_[name_count_]);
          if (output_format_ == "ILSVRC") {
            cur_det.put<int>("category_id", label);
          } else {
            cur_det.put("category_id", label_to_name_[label].c_str());
          }
          cur_det.add_child("bbox", cur_bbox);
          cur_det.put<float>("score", score);

          detections_.push_back(std::make_pair("", cur_det));
        }
        ++count;
      }
    }
    if (need_save_) {
      ++name_count_;
      if (name_count_ % num_test_image_ == 0) {
        if (output_format_ == "VOC") {
          map<string, std::ofstream*> outfiles;
          for (int c = 0; c < num_classes_; ++c) {
            if (c == background_label_id_) {
              continue;
            }
            string label_name = label_to_name_[c];
            boost::filesystem::path file(
                output_name_prefix_ + label_name + ".txt");
            boost::filesystem::path out_file = output_directory / file;
            outfiles[label_name] = new std::ofstream(out_file.string().c_str(),
                std::ofstream::out);
          }
          BOOST_FOREACH(ptree::value_type &det, detections_.get_child("")) {
            ptree pt = det.second;
            string label_name = pt.get<string>("category_id");
            if (outfiles.find(label_name) == outfiles.end()) {
              std::cout << "Cannot find " << label_name << std::endl;
              continue;
            }
            string image_name = pt.get<string>("image_id");
            float score = pt.get<float>("score");
            vector<int> bbox;
            BOOST_FOREACH(ptree::value_type &elem, pt.get_child("bbox")) {
              bbox.push_back(static_cast<int>(elem.second.get_value<float>()));
            }
            *(outfiles[label_name]) << image_name;
            *(outfiles[label_name]) << " " << score;
            *(outfiles[label_name]) << " " << bbox[0] << " " << bbox[1];
            *(outfiles[label_name]) << " " << bbox[0] + bbox[2];
            *(outfiles[label_name]) << " " << bbox[1] + bbox[3];
            *(outfiles[label_name]) << std::endl;
          }
          for (int c = 0; c < num_classes_; ++c) {
            if (c == background_label_id_) {
              continue;
            }
            string label_name = label_to_name_[c];
            outfiles[label_name]->flush();
            outfiles[label_name]->close();
            delete outfiles[label_name];
          }
        } else if (output_format_ == "COCO") {
          boost::filesystem::path output_directory(output_directory_);
          boost::filesystem::path file(output_name_prefix_ + ".json");
          boost::filesystem::path out_file = output_directory / file;
          std::ofstream outfile;
          outfile.open(out_file.string().c_str(), std::ofstream::out);

          boost::regex exp("\"(null|true|false|-?[0-9]+(\\.[0-9]+)?)\"");
          ptree output;
          output.add_child("detections", detections_);
          std::stringstream ss;
          write_json(ss, output);
          std::string rv = boost::regex_replace(ss.str(), exp, "$1");
          outfile << rv.substr(rv.find("["), rv.rfind("]") - rv.find("["))
              << std::endl << "]" << std::endl;
        } else if (output_format_ == "ILSVRC") {
          boost::filesystem::path output_directory(output_directory_);
          boost::filesystem::path file(output_name_prefix_ + ".txt");
          boost::filesystem::path out_file = output_directory / file;
          std::ofstream outfile;
          outfile.open(out_file.string().c_str(), std::ofstream::out);

          BOOST_FOREACH(ptree::value_type &det, detections_.get_child("")) {
            ptree pt = det.second;
            int label = pt.get<int>("category_id");
            string image_name = pt.get<string>("image_id");
            float score = pt.get<float>("score");
            vector<int> bbox;
            BOOST_FOREACH(ptree::value_type &elem, pt.get_child("bbox")) {
              bbox.push_back(static_cast<int>(elem.second.get_value<float>()));
            }
            outfile << image_name << " " << label << " " << score;
            outfile << " " << bbox[0] << " " << bbox[1];
            outfile << " " << bbox[0] + bbox[2];
            outfile << " " << bbox[1] + bbox[3];
            outfile << std::endl;
          }
        }
        name_count_ = 0;
        detections_.clear();
      }
    }
  }
  if (visualize_) {
#ifdef USE_OPENCV
    vector<cv::Mat> cv_imgs;
    this->data_transformer_->TransformInv(bottom[3], &cv_imgs);
    vector<cv::Scalar> colors = GetColors(label_to_display_name_.size());
    VisualizeBBox(cv_imgs, top[0], visualize_threshold_, colors,
        label_to_display_name_, save_file_);
#endif  // USE_OPENCV
  }
}

#ifdef CPU_ONLY
STUB_GPU_FORWARD(DetectionOutputLayer, Forward);
#endif

INSTANTIATE_CLASS(DetectionOutputLayer);
REGISTER_LAYER_CLASS(DetectionOutput);

}  // namespace caffe

Caffe definition

message DetectionOutputParameter {
  // Number of classes to predict.
  optional uint32 num_classes = 1;
  // Whether box locations are shared among the different classes.
  optional bool share_location = 2 [default = true];
  // Background label id. Set to -1 if there is no background class.
  optional int32 background_label_id = 3 [default = 0];
  // Parameters used for nms.
  optional NonMaximumSuppressionParameter nms_param = 4;
  // Parameters used for saving detection results.
  optional SaveOutputParameter save_output_param = 5;
  // How bboxes are encoded and decoded.
  optional PriorBoxParameter.CodeType code_type = 6 [default = CORNER];
  // Whether the variance is encoded in the target.
  optional bool variance_encoded_in_target = 8 [default = false];
  // Number of boxes kept per image after nms.
  // -1 means keeping all boxes.
  optional int32 keep_top_k = 7 [default = -1];
  // Score threshold.
  optional float confidence_threshold = 9;
  // If true, visualize the detection results.
  optional bool visualize = 10 [default = false];
  // The threshold used to visualize the detection results.
  optional float visualize_threshold = 11;
  // If provided, save outputs to video file.
  optional string save_file = 12;
}
