Αναρτήθηκε από Savvas Chatzichristofis 

Recently, standard benchmark databases and evaluation campaigns have been created allowing a quantitative comparison of CBIR systems. These benchmarks allow the comparison of image retrieval systems under different aspects: usability and user interfaces, combination with text retrieval, or overall performance of a system.

1. WANG database

The WANG database is a subset of 1,000 images of the Corel stock photo database which have been manually selected and which form 10 classes of 100 images each. The WANG database can be considered similar to common stock photo retrieval tasks with several images from each category and a potential user having an image from a particular category and looking for similar images which have e.g. cheaper royalties or which have not been used by other media. The 10 classes are used for relevance estimation: given a query image, it is assumed that the user is searching for images from the same class, and therefore the remaining 99 images from the same class are considered relevant and the images from all other classes are considered irrelevant

Download

2. The MIRFLICKR-25000 Image Collection

The new MIRFLICKR-25000 collection consists of 25000 images downloaded from the social photography site Flickr through its public API.

Features:

  • OPEN 
    Access to the collection is simple and reliable, with image copyright clearly established. This is realized by selecting only images offered under the Creative Commons license. See the copyright section below.
  • INTERESTING 
    Images are also selected based on their high interestingness rating. As a result the image collection is representative for the domain of original and high-quality photography.
  • PRACTICAL 
    In particular for the research community dedicated to improving image retrieval. We have collected the user-supplied image Flickr tags as well as the EXIF metadata and make it available in easy-to-access text files. Additionally we provide manual imageannotations on the entire collection suitable for a variety of benchmarks.

MIRFLICKR-25000 is an evolving effort with many ideas for extension. So far the image collection, metadata and annotations can be downloaded below. If you enter your email address before downloading, we will keep you posted of the latest updates.

Download

3. UW database

The database created at the University of Washington consists of a roughly categorized collection of 1,109 images.These images are partly annotated using keywords. The remaining images were annotated by our group to allow the annotation to be used for relevance estimation; our annotations are publicly available10.The images are of various sizes and mainly include vacation pictures from various locations. There are 18 categories,for example “spring flowers”, “Barcelona”, and “Iran”. Some example images with annotations are shown in Figure 2. The complete annotation consists of 6,383 words with a vocabulary of 352 unique words. On the average, each image has about 6 words of annotation. The maximum number of key-words per image is 22 and the minimum is 1. The database is freely available11. The relevance assessment for the experiments with this database were performed using the annotation: an image is considered to be relevant w.r.t. a given query image if the two images have a common keyword in the annotation. On the average, 59.3 relevant images correspond to each image. The keywords are rather general; thus for example images showing sky are relevant w.r.t. each other,which makes it quite easy to find relevant images (high precision is likely easy) but it can be extremely difficult to obtain a high recall since some images showing sky might have hardly any visual similarity with a given query.This task can be considered a personal photo retrieval task,e.g. a user with a collection of personal vacation pictures is looking for images from the same vacation, or showing the same type of building.

Read More

4. IRMA-10000 database

The IRMA database consists of 10,000 fully annotated radio-graphs taken randomly from medical routine at the RWTH Aachen University Hospital. The images are split into 9,000training and 1,000 test images. The images are sub dividedinto 57 classes. The IRMA database was used in the ImageCLEF 2005 image retrieval evaluation for the automatic annotation task. For CBIR, the relevances are defined by the classes, given a query image from a certain class, all database images from the same class are considered relevant
Read More

5. ZuBuD database

The “Zurich Buildings Database for Image Based Recognition”(ZuBuD) is a database which has been created by the Swiss Federal Institute of Technology in Zurich. The database consists of two parts, a training part of 1,005images of 201 buildings, 5 of each building and a query part of 115 images. Each of the query images contains one of the buildings from the main part of the database. The pictures of each building are taken from different viewpoints and some of them are also taken under different weather conditions and with two different cameras. Given a query image, only images showing exactly the same building are considered relevant.

Download

6. UCID database (Suggested)

The UCID database13 was created as a benchmark database for CBIR and image compression applications. This database is similar to the UW database as it consists of vacation images and thus poses a similar task.For 264 images, manual relevance assessments among all database images were created, allowing for performance evaluation. The images that are judged to be relevant are images which are very clearly relevant, e.g. for an image showing a particular person, images showing the same person are searched and for an image showing a football game, images showing football games are considered to be relevant. The used relevance assumption makes the task easy on one hand,because relevant images are very likely quite similar, but on the other hand, it makes the task difficult, because there are likely images in the database which have a high visual similarity but which are not considered relevant. Thus, it can be difficult to have high precision results using the given rel-evance assessment, but since only few images are considered relevant, high recall values might be rather easy to obtain.

Download
7.Yaroslav Bulatov OCR dataset

<Yaroslav Bulatov> I've collected this dataset for a project that involves automatically reading bibs in pictures of marathons and other races. This dataset is larger than robust-reading dataset of ICDAR 2003 competition with about 20k digits and more uniform because it's digits-only. I believe it is more challenging than the MNIST digit recognition dataset. 
I'm now making it publicly available in hopes of stimulating progress on the task of robust OCR. Use it freely, with only requirement that if you are able to exceed 80% accuracy, you have to let me know ;) 
The dataset file contains raw data (images), as well as Weka-format ARFF file for simple set of features. 
For completeness I include matlab script used to for initial pre-processing and feature extraction, Python script to convert space-separated output into ARFF format. Check "readme.txt" for more details.

Download

8. Microsoft Object Class Recognition
  1. Database of thousands of weakly labelled, high-res images. Please, click here to download the database.
  2. Pixel-wise labelled image database v1 (240 images, 9 object classes). Please, clickhereto download the database. This database was used in paper 1 below and in the above demo video.
  3. Pixel-wise labelled image database v2(591 images, 23 object classes). Please, clickhereto download the database.
  4. Pixel-wise labelled image database of textile materials. Please, clickhere to download the database.

References:

1.  Deselaers, T., Keysers, D., and Ney, H. 2008. Features for image retrieval: an experimental comparison. Inf. Retr. 11, 2 (Apr. 2008), 77-107. DOI=http://dx.doi.org/10.1007/s10791-007-9039-3
2. S. A. Chatzichristofis, K Zagoris, Y. S. Boutalis and Nikolas Papamarkos, “ACCURATE IMAGE RETRIEVAL BASED ON COMPACT COMPOSITE DESCRIPTORS AND RELEVANCE FEEDBACK INFORMATION”, «International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI) », to Appear, 2009
from: http://savvash.blogspot.com/2008/12/benchmark-databases-for-cbir.html

基于内容的图像检索中常用的标准图像库 Benchmark databases for CBIR相关推荐

  1. 基于内容的图像检索系统设计与实现--颜色信息--纹理信息--形状信息--PHASH--SHFT特征点的综合检测项目,包含简易版与完整版的源码及数据!

    百度云提取源码以及数据包,直接下载压缩包解压就可以使用,数据就在压缩包文件dataset中. 简化版:只有-颜色信息–纹理信息–形状信息–PHASH–SHFT特征点的综合检测 [百度云链接,提取码:6 ...

  2. 图像处理(4)--基于内容的图像检索

    目录 1. 为什么需要基于内容的图像检索(CBIR) 2. 查询方式和现有系统 3. 具体内容 3.1 特征提取 3.2 颜色特征 3.3 纹理特征 3.4 形状特征 3.5 相关反馈 3.6 索引结 ...

  3. 基于内容的图像检索概述

    摘要:我们现在处于信息爆炸的时代,各种海量信息充斥在我们周围,如何能在海量的数据中搜索到我们想要的图像是个很有挑战性的研究课题.本文简要分析了目前基于内容的图像检索(CBIR)的几种主要方法,如颜色, ...

  4. 基于内容的图像检索软件库LIRE的特征提取方法综述

    LIRE(Lucene Image Retrieval ) 是利用Apache Lucene 建立索引进行图像检索的开源软件库.该软件项目的网址是 http://lire-project.net.LI ...

  5. 基于内容的图像检索技(CBIR)术相术介绍

    基于内容的图像检索技(CBIR)术相术介绍 kezunhai@gmail.com http://blog.csdn.net/kezunhai 近20年来,计算机与信号处理领域如火如荼地发展着,随着普通 ...

  6. 【CBIR】基于内容的图像检索技(CBIR)术相术介绍

    基于内容的图像检索技(CBIR)术相术介绍 转载之: kezunhai  出处: http://blog.csdn.net/kezunhai 近20年来,计算机与信号处理领域如火如荼地发展着,随着普通 ...

  7. 基于内容的图像检索系统(合集)

    基于内容的图像检索,即CBIR(Content-based image retrieval),是计算机视觉领域中关注大规模数字图像内容检索的研究分支.典型的CBIR系统,允许用户输入一张图片,以查找具 ...

  8. 图像特征计算与表示——基于内容的图像检索

    1️⃣作业需求 给定不少于100幅合适的图像集合,尺寸可不一,任意选一张图像,并人工给定图像中的一个目标区域,如人脸.楼房.狗等,要求设计一个基于内容的图像检索方法,它能在剩余的图像中找出5张包含最类 ...

  9. 基于内容的图像检索技术

    转:https://blog.csdn.net/u013087984/article/details/52038980 图像检索:基于内容的图像检索技术 2016年06月05日  图像检索  图像检索 ...

最新文章

  1. 基于px2rpx-loader,探讨一下loader的封装思想
  2. Javascript基础系列之(三)数据类型 (数值 Number)
  3. PLSQL Developer概念学习系列之如何正确登录连接上Oracle(图文详解)
  4. mfc指示灯报警显示_消防水炮需要外置声光报警吗
  5. java8 lambda python_【学习笔记】java8 Lambda表达式语法及应用
  6. php图片上传报502,PHPStrom上传文件报502错误原因,_PHP教程
  7. C语言程序练习-L1-015 跟奥巴马一起画方块 (15分)
  8. 通过rpm包安装、配置及卸载mysql的详细过程.
  9. 我如何向团队解释依赖注入
  10. zookeeper伪集群部署
  11. leetcode刷题六z字形变换
  12. Spring 阶段总结
  13. bzoj1565【NOI2009】植物大战僵尸(最小割)
  14. java 输入输出流知识_Java知识点总结(JavaIO-字节流)
  15. 国内pinterest模式昙花一现 社交电商不该这么玩
  16. 3.卷2(进程间通信)---System V IPC
  17. 是谁断送了网络工程师的前途
  18. Regex 量词Quantifier 分组group
  19. python spss modeler 比较_非常值得收藏的 IBM SPSS Modeler 算法简介
  20. C++Pollard_rho分解质因数及其例题—————Prime Test

热门文章

  1. 不到 200 行代码,教你如何用 Keras 搭建生成对抗网络(GAN)
  2. 基因组与数据整合:DNA应用开发正在临近
  3. java function获取参数_「Java容器」ArrayList源码,大厂面试必问
  4. Apache Kafka-生产者_批量发送消息的核心参数及功能实现
  5. Spring OXM-XStream流化对象
  6. mysql连接优先级设置_MySQL的按优先级等效连接
  7. 怎么解决线上CPU100%的问题
  8. 项目实战-linux下安装activeMQ
  9. python中有没有switch_Python为什么没有switch/case语句?
  10. python网页优化公司_使用python优化scipy.optimize.minimize公司