Seam Carving Algorithm: A Seemingly Impossible Way of Resizing an Image

Introduction

In this article, we take a deep dive into an interesting algorithm known as “Seam Carving”. It performs the seemingly impossible task of resizing an image without cropping it or distorting its contents. We will build our way up to implementing the seam carving algorithm from scratch, taking a peek at some of the interesting maths behind it along the way.

A little knowledge of calculus will help you follow along but is not required. So let’s begin. (This article is inspired by a lecture from Grant Sanderson at MIT.)

The Problem:

Let’s take a look at this image.


The painting, by Salvador Dali, is named “The Persistence of Memory”. Rather than its artistic value, we are more interested in the contents of the painting. We want to resize the picture by decreasing its width. Two obvious approaches are cropping the picture or squeezing its width.

But as we can see, cropping removes many of the objects, and squeezing distorts the picture. We want the best of both, i.e. to decrease the width without cropping out any object and without distorting the objects.

Along with the objects, there is also a lot of empty space in the picture. What we want to accomplish is to somehow remove those empty areas between the objects, so that the interesting parts of the picture remain while the unnecessary space is thrown away.

This is indeed a tough problem, and it’s easy to get lost. So it is always a good idea to split the problem into smaller, more manageable parts. We can split this problem into two parts.

  1. Identifying interesting parts (i.e. objects) in the picture.
  2. Identifying pixel paths which can be removed without distorting the picture.

Identifying the Objects:

Before moving forward, we want to convert our picture to greyscale. This will help with the operations we do later. Here is a simple formula to convert an RGB pixel to a greyscale value.

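    import numpy as np

    def rgbToGrey(arr):
        # Weighted sum of the R, G, B channels; any further channel (e.g. alpha) is ignored.
        greyVal = np.dot(arr[..., :3], [0.2989, 0.5870, 0.1140])
        return np.round(greyVal).astype(np.int32)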
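As a quick check of the formula, the sketch below applies the same weights to a tiny hand-made image. The name `rgb_to_grey` and the sample pixels are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def rgb_to_grey(arr):
    # Same weighted sum as rgbToGrey; channels beyond the first three are ignored.
    grey = np.dot(arr[..., :3], [0.2989, 0.5870, 0.1140])
    return np.round(grey).astype(np.int32)

# Hypothetical 1x3 image: pure red, pure green, pure blue.
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]])
grey = rgb_to_grey(pixels)
print(grey)  # one grey value per pixel: 76, 150 and 29
```

Green carries the largest weight, so a pure green pixel comes out brightest of the three.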

To identify the objects, we can use the following strategy. What if we could somehow identify all the edges in the picture? Then we could ask the seam carving algorithm to take pixel paths that don’t travel through the edges, so, by extension, any area enclosed by edges won’t be touched.

But then, how are we going to identify the edges? One observation is that whenever there is a sharp change in color between two adjacent pixels, it is most likely the edge of an object. We can rationalize this abrupt change in color as the start of a new object at that pixel.

The next problem to tackle is how to identify sharp changes in pixel value. For now, let’s think of a simple case: a single line of pixels. Say we denote this array of values as x.

We may take the difference between the pixels x[i+1] and x[i]. It shows how much our current pixel differs from its right-hand neighbor. Or we can take the difference of x[i] and x[i-1], which gives the change on the left-hand side. To denote the total change we can take the average of both, which yields,

$$\frac{(x[i+1] - x[i]) + (x[i] - x[i-1])}{2} = \frac{x[i+1] - x[i-1]}{2}$$

Anyone familiar with calculus can quickly recognize this expression as a finite-difference approximation of the derivative. That’s right: we need to measure sharp changes in the value of x, so we are calculating its derivative. One more keen observation: if we define a filter, say [-0.5, 0, 0.5], multiply it element-wise with the array [x[i-1], x[i], x[i+1]], and take the sum, it gives the derivative at x[i]. As our picture is 2D, we need a 2D filter. I won’t go into the details, but the 2D version of our filter looks like this,

$$\begin{bmatrix} -0.25 & 0 & 0.25 \\ -0.5 & 0 & 0.5 \\ -0.25 & 0 & 0.25 \end{bmatrix}$$
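The filter idea can be checked numerically. The sketch below slides [-0.5, 0, 0.5] over one row of made-up pixel values and compares the result with the averaged-difference formula. The 2D extension at the end assumes a vertical smoothing of [0.5, 1, 0.5], chosen because it reproduces the weights that getEdge uses later; the sample row and variable names are illustrative.

```python
import numpy as np

x = np.array([10.0, 10.0, 10.0, 80.0, 80.0, 80.0])  # one row of pixels with a sharp jump
deriv_filter = np.array([-0.5, 0.0, 0.5])

# Slide the filter over every interior pixel: multiply element-wise, then sum.
deriv = np.array([np.sum(deriv_filter * x[i - 1:i + 2]) for i in range(1, len(x) - 1)])

# The same values from the averaged-difference formula (x[i+1] - x[i-1]) / 2.
central = (x[2:] - x[:-2]) / 2

# Assumed 2D extension: smooth vertically with [0.5, 1, 0.5] while differentiating horizontally.
filter_2d = np.outer([0.5, 1.0, 0.5], deriv_filter)

print(deriv)      # non-zero only around the jump
print(filter_2d)
```

The response is zero in the flat regions and peaks at the two pixels that straddle the jump, which is exactly the behaviour we want from an edge detector.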

As our filter calculates the derivative for each pixel along the x-axis, it gives the vertical edges. Similarly, if we calculate the derivatives along the y-axis, we get the horizontal edges. The filter for that is the following (it is the transpose of the x-axis filter).

$$\begin{bmatrix} -0.25 & -0.5 & -0.25 \\ 0 & 0 & 0 \\ 0.25 & 0.5 & 0.25 \end{bmatrix}$$
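To see which filter reacts to which kind of edge, here is a minimal sketch that applies both filters to two 3x3 patches. The filter weights are an assumption consistent with the ones getEdge uses later, and the patches and the `response` helper are hypothetical.

```python
import numpy as np

# Assumed filter weights, consistent with the ones getEdge uses later.
filter_x = np.array([[-0.25, 0.0, 0.25],
                     [-0.5, 0.0, 0.5],
                     [-0.25, 0.0, 0.25]])
filter_y = filter_x.T

vertical_edge = np.array([[0, 0, 1]] * 3, dtype=float)  # dark-to-bright from left to right
horizontal_edge = vertical_edge.T                         # dark-to-bright from top to bottom

def response(filt, patch):
    # Element-wise multiply and sum, as described for a single pixel.
    return float(np.sum(filt * patch))

print(response(filter_x, vertical_edge), response(filter_y, vertical_edge))
print(response(filter_x, horizontal_edge), response(filter_y, horizontal_edge))
```

Each filter responds only to the edge orientation it was built for and returns zero for the other one.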

These filters are also known as Sobel Filters.


So, we have two filters that need to travel across the picture. For each pixel, we do an element-wise multiplication of the filter with the (3×3) submatrix surrounding the pixel and then take the sum. This operation is known as convolution.

Convolution:

Mathematically, for a signal f and a filter k, the convolution operation is defined as,

$$(f * k)(t) = \int_{-\infty}^{\infty} f(\tau)\, k(t - \tau)\, d\tau$$

Notice how we take a pointwise product of the two functions and then integrate it. Numerically, this corresponds to what we did earlier, i.e. an element-wise multiplication of the filter and the image followed by a sum.

Notice how the k function is written as k(t-τ). This is because, for the convolution operation, one of the signals needs to be flipped. You can picture it like this: imagine two trains on a straight horizontal track, running towards each other for an inevitable collision (don’t worry, nothing will happen to the trains, because of superposition). The heads of the trains face each other. Now imagine that you are scanning the track from left to right. For the left train, you scan from rear to head, while for the right train you scan from head to rear, i.e. in reverse.

Similarly, the computer needs to read our filters from the bottom-right (2,2) corner to the top-left (0,0) corner instead of from the top-left to the bottom-right. So the actual Sobel filters (named sX and sY in the getEdge code) are the following,

$$s_X = \begin{bmatrix} 0.25 & 0.5 & 0.25 \\ 0 & 0 & 0 \\ -0.25 & -0.5 & -0.25 \end{bmatrix}, \qquad s_Y = \begin{bmatrix} 0.25 & 0 & -0.25 \\ 0.5 & 0 & -0.5 \\ 0.25 & 0 & -0.25 \end{bmatrix}$$

on which we do a 180-degree rotation before the convolution operation.
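The 180-degree rotation is easy to verify with NumPy. This small sketch takes the two filters exactly as they are passed in getEdge and shows that rotating by 180 degrees is the same as reversing both rows and columns; the variable names other than `sX` and `sY` are illustrative.

```python
import numpy as np

# Filters as they are passed to the convolution in getEdge.
sX = np.array([[0.25, 0.5, 0.25], [0, 0, 0], [-0.25, -0.5, -0.25]])
sY = np.array([[0.25, 0, -0.25], [0.5, 0, -0.5], [0.25, 0, -0.25]])

# Two 90-degree turns reverse both the row order and the column order.
flipped_sY = np.rot90(sY, 2)
same_as_slicing = np.array_equal(flipped_sY, sY[::-1, ::-1])

# After the flip, the middle row of sY is the 1D derivative filter [-0.5, 0, 0.5].
print(same_as_slicing)
print(flipped_sY[1])
```

So the filters stored in the code are the flipped versions, and the rotation inside the convolution restores the familiar derivative filter.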

We can go on and write a simple, naive implementation of the convolution operation. It would look something like this,

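    def naiveConvolve(img, ker):
        res = np.zeros(img.shape)
        r, c = img.shape
        rK, cK = ker.shape
        halfHeight, halfWidth = rK // 2, cK // 2
        # Convolution flips the kernel: rotate it by 180 degrees.
        ker = np.rot90(ker, 2)
        # Zero-pad by half the kernel size so the output has the same shape as the input.
        img = np.pad(img, ((halfHeight, halfHeight), (halfWidth, halfWidth)), mode='constant')
        for i in range(halfHeight, r + halfHeight):
            for j in range(halfWidth, c + halfWidth):
                window = img[i - halfHeight:i + halfHeight + 1, j - halfWidth:j + halfWidth + 1]
                res[i - halfHeight, j - halfWidth] = np.sum(np.multiply(ker, window))
        return res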

This works fine but takes an excruciating amount of time to execute, as it performs nearly 9*r*c multiplications and additions in a Python loop to arrive at the result. But we can be smart and use more concepts from math to cut the running time drastically.

Fast Convolution:

Convolutions have an interesting property: convolution in the time domain corresponds to multiplication in the frequency domain, i.e.

$$f(t) * k(t) \;\longleftrightarrow\; F(w)\, K(w)$$

where F(w) and K(w) denote the functions in the frequency domain.

We know that the Fourier transform converts a signal from the time domain to the frequency domain. So what we can do is calculate the Fourier transform of the image and of the filter, multiply them, then take the inverse Fourier transform to get the convolution result. We can use the NumPy library for this.

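    def fastConvolve(img, ker):
        imgF = np.fft.rfft2(img)
        # Zero-pad the kernel to the image shape before transforming.
        kerF = np.fft.rfft2(ker, img.shape)
        # Pass the shape explicitly so images with an odd width are reconstructed correctly.
        return np.fft.irfft2(imgF * kerF, s=img.shape)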

(Note: Values might differ slightly from the naive method in some cases, as the fastConvolve function calculates a circular convolution: the image wraps around at the borders, and the output is shifted by one pixel down and to the right relative to naiveConvolve. In practice we can comfortably use the fast convolution without worrying about these small differences.)
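The relationship between the two methods can be checked directly. The sketch below re-implements both in compact form (`naive_convolve` and `fast_convolve` are illustrative names, and the random test image is made up), undoes the one-pixel offset of the FFT result, and compares the two away from the border, where neither zero-padding nor wrap-around matters.

```python
import numpy as np

def naive_convolve(img, ker):
    # Zero-padded, centered 2D convolution for a 3x3 kernel (same idea as naiveConvolve).
    flipped = np.rot90(ker, 2)
    padded = np.pad(img, 1, mode="constant")
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(flipped * padded[i:i + 3, j:j + 3])
    return out

def fast_convolve(img, ker):
    # Circular convolution through the FFT; s= keeps odd-sized images intact.
    img_f = np.fft.rfft2(img)
    ker_f = np.fft.rfft2(ker, img.shape)
    return np.fft.irfft2(img_f * ker_f, s=img.shape)

rng = np.random.default_rng(0)   # arbitrary seed, for repeatability only
img = rng.random((7, 9))         # hypothetical test image with an odd width
ker = np.array([[0.25, 0, -0.25], [0.5, 0, -0.5], [0.25, 0, -0.25]])

slow = naive_convolve(img, ker)
fast = fast_convolve(img, ker)

# Shift the FFT result back by one pixel in both directions, then compare the interior.
aligned = np.roll(fast, shift=(-1, -1), axis=(0, 1))
interior_matches = np.allclose(slow[1:-1, 1:-1], aligned[1:-1, 1:-1])
print(interior_matches)
```

For seam carving the offset and the wrap-around only affect a one-pixel frame of the edge image, which is why the article can ignore them.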

Cool! Now we have an efficient way to calculate the horizontal and the vertical edges, i.e. the x and y components. We can then calculate the edges in the image using,

    def getEdge(greyImg):
        sX = np.array([[0.25, 0.5, 0.25],
                       [0, 0, 0],
                       [-0.25, -0.5, -0.25]])
        sY = np.array([[0.25, 0, -0.25],
                       [0.5, 0, -0.5],
                       [0.25, 0, -0.25]])
        # edgeH = naiveConvolve(greyImg, sX)
        # edgeV = naiveConvolve(greyImg, sY)
        edgeH = fastConvolve(greyImg, sX)
        edgeV = fastConvolve(greyImg, sY)
        # Gradient magnitude from the two directional responses.
        return np.sqrt(np.square(edgeH) + np.square(edgeV))

Awesome. We have completed the first part. The edges are the interesting parts of the picture, and the black parts are what we can remove without worrying.
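A synthetic image makes the behaviour of the edge detector easy to inspect. The sketch below assumes a bright square on a dark background (made-up data) and uses compact re-implementations named `fast_convolve` and `get_edge`; it is an illustration, not the notebook's code.

```python
import numpy as np

def fast_convolve(img, ker):
    # FFT-based circular convolution, as in fastConvolve.
    return np.fft.irfft2(np.fft.rfft2(img) * np.fft.rfft2(ker, img.shape), s=img.shape)

def get_edge(grey):
    # Same filters and magnitude formula as getEdge.
    s_x = np.array([[0.25, 0.5, 0.25], [0, 0, 0], [-0.25, -0.5, -0.25]])
    s_y = np.array([[0.25, 0, -0.25], [0.5, 0, -0.5], [0.25, 0, -0.25]])
    return np.sqrt(fast_convolve(grey, s_x) ** 2 + fast_convolve(grey, s_y) ** 2)

# Hypothetical test image: a bright 8x8 square on a dark 16x16 background.
grey = np.zeros((16, 16))
grey[4:12, 4:12] = 1.0

edges = get_edge(grey)
flat_inside = edges[9, 9]    # well inside the square
flat_outside = edges[1, 1]   # well outside the square
on_left_side = edges[8, 5]   # at the square's left boundary (allowing for the one-pixel offset)
print(flat_inside, flat_outside, on_left_side)
```

Flat regions give a response of essentially zero whether they are bright or dark, while the boundary of the square lights up. Seams will therefore prefer to run through flat regions.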

Identifying Pixel Paths:

For a continuous path, we can define a rule that each pixel connects only to the 3 nearest pixels below it. This gives a continuous path of pixels from top to bottom. Our subproblem then becomes a basic pathfinding problem in which we have to minimize the cost. As edges have higher magnitude, if we keep removing the pixel paths with the lowest cost, they will avoid the edges.
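The connectivity rule can be written down in a couple of lines. In this sketch, `neighbours_below` is a hypothetical helper that lists the columns in the next row a pixel may connect to, clipped at the image borders in the same way the later code does.

```python
def neighbours_below(j, width):
    # Columns in the next row that pixel column j may connect to,
    # clipped at the left and right image borders.
    return list(range(max(j - 1, 0), min(width, j + 2)))

print(neighbours_below(0, 5))  # left border: only two choices
print(neighbours_below(2, 5))  # interior: three choices
print(neighbours_below(4, 5))  # right border: only two choices
```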

Let’s define a function “cost” which takes a pixel and calculates the cost of the minimum-cost pixel path from there to the bottom of the picture. We have the following observations,

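  1. In the bottom-most row (i.e. i = r-1),

$$\mathrm{cost}(r-1, j) = \mathrm{edge}(r-1, j)$$

  2. For any intermediate pixel,

$$\mathrm{cost}(i, j) = \mathrm{edge}(i, j) + \min\big(\mathrm{cost}(i+1, j-1),\ \mathrm{cost}(i+1, j),\ \mathrm{cost}(i+1, j+1)\big)$$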

    def findCostArr(edgeImg):
        r, c = edgeImg.shape
        cost = np.zeros(edgeImg.shape)
        # Bottom row: the cost is just the edge magnitude.
        cost[r - 1, :] = edgeImg[r - 1, :]
        # Fill the remaining rows from the bottom up.
        for i in range(r - 2, -1, -1):
            for j in range(c):
                c1, c2 = max(j - 1, 0), min(c, j + 2)
                cost[i][j] = edgeImg[i][j] + cost[i + 1, c1:c2].min()
        return cost
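A small worked example may help to follow the bottom-up fill. The sketch below runs the same recurrence on a made-up 3x3 array of edge magnitudes; `find_cost_arr` is an illustrative copy of the function above.

```python
import numpy as np

def find_cost_arr(edge_img):
    # Bottom-up dynamic programming, following findCostArr.
    r, c = edge_img.shape
    cost = np.zeros(edge_img.shape)
    cost[r - 1, :] = edge_img[r - 1, :]
    for i in range(r - 2, -1, -1):
        for j in range(c):
            c1, c2 = max(j - 1, 0), min(c, j + 2)
            cost[i, j] = edge_img[i, j] + cost[i + 1, c1:c2].min()
    return cost

# Hypothetical 3x3 edge magnitudes.
edge_img = np.array([[1.0, 4.0, 3.0],
                     [5.0, 2.0, 6.0],
                     [7.0, 8.0, 1.0]])
cost = find_cost_arr(edge_img)
print(cost)
# Row by row, cost is [4, 7, 6], [12, 3, 7], [7, 8, 1].
```

For instance, the middle entry is 2 + min(7, 8, 1) = 3, and the top-left entry is 1 + min(12, 3) = 4, which is the total of the cheapest path starting there (1 + 2 + 1).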

We can see the triangular shapes in the plot. They denote points of no return, i.e. if you reach such a pixel, there is no path to the bottom that doesn’t pass through an edge. And that is what we are trying to avoid.

From the cost matrix, finding a pixel path is easily done with a greedy algorithm. Find the min-cost pixel in the top row, then move downwards, at each step selecting the pixel with the least cost among the pixels connected to it. Because each cost entry already holds the best total achievable from that pixel down, this greedy walk yields the minimum-cost seam.

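    def findSeam(cost):
        r, c = cost.shape
        path = []
        # Start at the cheapest pixel of the top row.
        j = cost[0].argmin()
        path.append(j)
        for i in range(r - 1):
            c1, c2 = max(j - 1, 0), min(c, j + 2)
            # argmin is relative to the slice, so offset it by c1.
            j = c1 + cost[i + 1, c1:c2].argmin()
            path.append(j)
        return path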
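To illustrate the walk, the sketch below traces a seam through a small cost array. The edge magnitudes are made up, and the cost array is what the recurrence above produces for them; `find_seam` is an illustrative copy of findSeam.

```python
import numpy as np

def find_seam(cost):
    # Walk down from the cheapest top-row pixel, following findSeam.
    r, c = cost.shape
    j = int(cost[0].argmin())
    path = [j]
    for i in range(r - 1):
        c1, c2 = max(j - 1, 0), min(c, j + 2)
        j = c1 + int(cost[i + 1, c1:c2].argmin())
        path.append(j)
    return path

# Hypothetical edge magnitudes and the cost array they produce under the recurrence.
edge_img = np.array([[1.0, 4.0, 3.0], [5.0, 2.0, 6.0], [7.0, 8.0, 1.0]])
cost = np.array([[4.0, 7.0, 6.0], [12.0, 3.0, 7.0], [7.0, 8.0, 1.0]])

path = find_seam(cost)  # one column index per row: [0, 1, 2]
seam_energy = sum(edge_img[i, j] for i, j in enumerate(path))
print(path, seam_energy)
```

The summed edge magnitude along the seam (1 + 2 + 1 = 4) equals the cost at its starting pixel, confirming that the walk recovers the path the cost array promised.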

To remove the seam defined by the path, we just have to go through each row and drop the column given by the path array.

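    def removeSeam(img, path):
        r, c, _ = img.shape
        newImg = np.zeros((r, c, 3))
        for i, j in enumerate(path):
            # Copy everything left of the seam, then shift the rest left by one.
            newImg[i, 0:j, :] = img[i, 0:j, :]
            newImg[i, j:c - 1, :] = img[i, j + 1:c, :]
        # Drop the now-unused last column.
        return newImg[:, :-1, :].astype(np.int32)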
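The same removal can also be expressed with a boolean mask instead of a copy loop. This is an alternative sketch, not the article's implementation; `remove_seam_masked` and the tiny test image are illustrative.

```python
import numpy as np

def remove_seam_masked(img, path):
    # Mark one pixel per row for removal, then keep everything else.
    r, c, channels = img.shape
    keep = np.ones((r, c), dtype=bool)
    keep[np.arange(r), path] = False
    return img[keep].reshape(r, c - 1, channels)

# Hypothetical 2x3 RGB image whose pixel values encode their column index.
img = np.array([[[0, 0, 0], [1, 1, 1], [2, 2, 2]],
                [[0, 0, 0], [1, 1, 1], [2, 2, 2]]])
carved = remove_seam_masked(img, [1, 2])  # drop column 1 in row 0, column 2 in row 1
print(carved[:, :, 0])
```

Because exactly one pixel is dropped per row, the surviving pixels always reshape cleanly into an image that is one column narrower.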

And that is it. Here, I have pre-computed 100 seam carving operations.
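Putting the pieces together, repeated carving is a loop that recomputes the edges, the cost array and the seam after every removal. The sketch below is a compact, self-contained version under a few assumptions: the function names are illustrative, the cost array is filled one row at a time with NumPy instead of pixel by pixel, the seam is removed with a boolean mask, and a random array stands in for a real photo.

```python
import numpy as np

def rgb_to_grey(img):
    return np.dot(img[..., :3], [0.2989, 0.5870, 0.1140])

def fast_convolve(img, ker):
    return np.fft.irfft2(np.fft.rfft2(img) * np.fft.rfft2(ker, img.shape), s=img.shape)

def get_edge(grey):
    s_x = np.array([[0.25, 0.5, 0.25], [0, 0, 0], [-0.25, -0.5, -0.25]])
    s_y = np.array([[0.25, 0, -0.25], [0.5, 0, -0.5], [0.25, 0, -0.25]])
    return np.sqrt(fast_convolve(grey, s_x) ** 2 + fast_convolve(grey, s_y) ** 2)

def find_cost_arr(edge_img):
    # Same recurrence as findCostArr, computed a whole row at a time.
    cost = edge_img.copy()
    for i in range(edge_img.shape[0] - 2, -1, -1):
        below = cost[i + 1]
        left = np.concatenate(([np.inf], below[:-1]))   # cost of the lower-left neighbour
        right = np.concatenate((below[1:], [np.inf]))   # cost of the lower-right neighbour
        cost[i] += np.minimum(np.minimum(left, below), right)
    return cost

def find_seam(cost):
    r, c = cost.shape
    j = int(cost[0].argmin())
    path = [j]
    for i in range(1, r):
        c1, c2 = max(j - 1, 0), min(c, j + 2)
        j = c1 + int(cost[i, c1:c2].argmin())
        path.append(j)
    return path

def remove_seam(img, path):
    r, c, channels = img.shape
    keep = np.ones((r, c), dtype=bool)
    keep[np.arange(r), path] = False
    return img[keep].reshape(r, c - 1, channels)

def carve(img, num_seams):
    # Each pass recomputes the edges, because removing a seam changes the neighbours.
    for _ in range(num_seams):
        cost = find_cost_arr(get_edge(rgb_to_grey(img)))
        img = remove_seam(img, find_seam(cost))
    return img

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(20, 30, 3))  # stand-in for a real photo
carved = carve(img, 5)                         # the article removes 100 seams
print(img.shape, "->", carved.shape)
```

Each call to `carve` narrows the image by `num_seams` columns while leaving its height unchanged.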

We can see how the objects in the picture have come really close to each other. We have successfully decreased the size of the image using the Seam Carving algorithm without causing any distortion to the objects. I have attached the link to the notebook with the full code. Interested readers can take a look here.

Overall, Seam Carving is a fun algorithm to play with. It does have its caveats: it will fail if the image has too many details or too many edges in it. It is always amusing to tinker with different pictures using the algorithm and see what the final result is. If you have any doubts or suggestions, kindly leave them in the responses. Thank you for reading until the end.


Originally published at https://www.analyticsvidhya.com on September 15, 2020.


Translated from: https://medium.com/analytics-vidhya/seam-carving-algorithm-a-seemingly-impossible-way-of-resizing-an-image-b3a57469d33e
