I recently discovered that a blog called (seriously) "Google Chrome Browser" was reblogging my site. (It of course has NO relationship to Google or the lovely folks on the Chrome team.)

This is a splog, or "spam blog." It's less of a blog and more of a 'suck your feed in and reblog it' operation: basically every post is duplicated or pulled in via RSS from somewhere else. This happens to me many times a week, and has for years.

However, this particular site started showing up ahead of mine in searches and that's not cool.


Worse yet, they have almost 25k followers on Twitter. I've asked them a few times to stop doing this, but this time I got tired of it.

They're even 'hotlinking' my images, which means that all my PNGs are still hosted on my site. When you visit their site, the text is from my RSS but I pay for the image bandwidth. The irony of this is thick. Not to mention my copyright notice is intact on their site. ;)

When an image is linked to from another domain, the Referer request header (which the server exposes as HTTP_REFERER) is populated with the URL of the page that the image is linked from. That means when my web server gets a request for 'foo.png' from the Google Chrome Browser blog, I can see the page that asked for that image.

For example:


Request URL: http://www.hanselman.com/blog/content/binary/Windows-Live-Writer/How-to-run-a-Virtual-Conference-for-10_E53C/image_5.png
Request Method: GET
Referer: http://google-chrome-browser.com/penny-pinching-cloud-how-run-two-day-virtual-conference-10

Because the Referer differentiates this GET request from an ordinary one, I can do something about it. This brings up a couple of important things to remember about the web that I feel a lot of programmers forget:

  • The Internet is not a black box.


  • You can do something about it.

That said, I want to detect these requests and serve a different image.
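
As a rough illustration of the detection step, here is a minimal Python sketch that pulls the host out of a Referer value and checks it against a list of known sploggers. It is not the configuration used on the site: the function name `is_splog_request` and the `SPLOG_HOSTS` set are hypothetical, and folding `www.` into the bare domain is an assumption made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical list of offending hosts, for illustration only.
SPLOG_HOSTS = {"google-chrome-browser.com"}


def is_splog_request(referer):
    """Return True when the Referer points at a known splog host."""
    if not referer:
        # Direct visits and some privacy tools send no Referer at all.
        return False
    host = (urlsplit(referer).hostname or "").lower()
    # Assumption: treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[4:]
    return host in SPLOG_HOSTS
```

A request with an empty or missing Referer is deliberately treated as harmless here, since there is nothing to match against.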


If I were using Apache and had an .htaccess file, I might do this:


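# Tag requests from each splog by rewriting their Referer to a marker value.
# Note: RewriteHeader is an ISAPI_Rewrite-style directive; stock Apache mod_rewrite does not provide it.
RewriteCond %{HTTP:Referer} ^.*http://(?:www\.)?computersblogsexample\.info.*$
RewriteHeader Referer: .* damn\.spammers

RewriteCond %{HTTP:Referer} ^.*http://(?:www\.)?google-chrome-browser.*$
RewriteHeader Referer: .* damn\.spammers

# make more of these for each evil spammer

# Any image request carrying the marker gets the splog image instead.
RewriteCond %{HTTP:Referer} ^.*damn\.spammers.*$
RewriteRule ^.*\.(?:gif|jpg|png)$ /images/splog.png [NC,L]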

Since I'm using IIS, I'll do similar rewrites in my web.config. I could do a whitelist where I only allow hotlinking from a few places, or a blacklist where I only block a few folks. Here's a blacklist.

<system.webServer>
  <rewrite>
    <rules>
      <rule name="Blacklist block" stopProcessing="true">
        <match url="(?:jpg|jpeg|png|gif|bmp)$" />
        <conditions>
          <add input="{HTTP_REFERER}" pattern="^https?://(.+?)/.*$" />
          <add input="{DomainsBlackList:{C:1}}" pattern="^block$" />
          <add input="{REQUEST_FILENAME}" pattern="splog.png" negate="true" />
        </conditions>
        <action type="Redirect" url="http://www.hanselman.com/images/splog.png" appendQueryString="false" redirectType="Temporary"/>
      </rule>
    </rules>
    <rewriteMaps>
      <rewriteMap name="DomainsBlackList" defaultValue="allow">
        <add key="google-chrome-browser.com" value="block" />
        <add key="www.verybadguy.com" value="block" />
        <add key="www.superbadguy.com" value="block" />
      </rewriteMap>
    </rewriteMaps>
  </rewrite>
</system.webServer>

I could have just made a single rule and put this bad domain in it but it would have only worked for one domain, so instead my buddy Ruslan suggested that I make a rewritemap and refer to it from the rule. This way I can add more domains to block as the evil spreads.

It was important to exclude the splog.png file that I am going to redirect the bad guy to, otherwise I'll get into a redirect loop where I redirect requests for the splog.png back to itself!
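
To see how the three conditions work together, including the exclusion that prevents the redirect loop, here is a small Python model of the rule. It is only an approximation for reasoning about the logic offline, not how IIS evaluates rules internally: the function name `evaluate` is hypothetical, and the sketch checks the request path where the real rule checks `{REQUEST_FILENAME}`.

```python
import re

# Mirrors the <rewriteMap>: hosts that are not listed get the default value.
DOMAINS_BLACKLIST = {
    "google-chrome-browser.com": "block",
    "www.verybadguy.com": "block",
    "www.superbadguy.com": "block",
}
DEFAULT_VALUE = "allow"
SPLOG_URL = "http://www.hanselman.com/images/splog.png"

IMAGE_URL = re.compile(r"(?:jpg|jpeg|png|gif|bmp)$", re.IGNORECASE)
REFERER = re.compile(r"^https?://(.+?)/.*$", re.IGNORECASE)
SPLOG_FILE = re.compile(r"splog.png", re.IGNORECASE)


def evaluate(path, referer):
    """Return the redirect target, or None when the request is served as-is."""
    if not IMAGE_URL.search(path):
        return None  # <match url=...> failed: not an image request
    captured = REFERER.match(referer or "")
    if captured is None:
        return None  # condition 1 failed: no usable Referer
    host = captured.group(1).lower()
    if DOMAINS_BLACKLIST.get(host, DEFAULT_VALUE) != "block":
        return None  # condition 2 failed: host is not blacklisted
    if SPLOG_FILE.search(path):
        return None  # condition 3 (negated): never redirect the target itself
    return SPLOG_URL
```

Without the last check, a blacklisted page requesting the replacement image would be redirected to that same image again, which is exactly the loop described above.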


The result is effective. If you visit their site, I'll issue an HTTP 307 (Temporary Redirect) and then you'll see my splog.png image everywhere that they've hotlinked my images.

If you wanted to change the blacklist to a white list, you'd reverse the values of allow and block in the rewrite map:


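<rewriteMaps>
  <!-- The map keeps its name because the rule refers to it. In a real whitelist the keys
       would be the hosts you trust (your own site first); the keys below are carried over
       from the blacklist only to show the flipped values. -->
  <rewriteMap name="DomainsBlackList" defaultValue="block">
    <add key="google-chrome-browser.com" value="allow" />
    <add key="www.verybadguy.com" value="allow" />
    <add key="www.superbadguy.com" value="allow" />
  </rewriteMap>
</rewriteMaps>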
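
The effect of flipping the default can be sketched in Python as well. This is a simplified model under stated assumptions, not the IIS implementation: the `DOMAINS_WHITELIST` entries and the function name `is_blocked` are hypothetical, with `www.hanselman.com` standing in for the site itself and `feedreader.example` as a made-up trusted host.

```python
import re

# Hypothetical whitelist: the keys are hosts allowed to embed images.
DOMAINS_WHITELIST = {
    "www.hanselman.com": "allow",   # the site itself must be listed
    "feedreader.example": "allow",  # made-up trusted host
}
DEFAULT_VALUE = "block"  # anything not listed is treated as a hotlinker

REFERER = re.compile(r"^https?://(.+?)/.*$", re.IGNORECASE)


def is_blocked(referer):
    """Return True when an image request with this Referer would be redirected."""
    captured = REFERER.match(referer or "")
    if captured is None:
        # No parsable Referer: the rule's first condition fails, so nothing is blocked.
        return False
    host = captured.group(1).lower()
    return DOMAINS_WHITELIST.get(host, DEFAULT_VALUE) == "block"
```

Two things stand out in this model: requests that carry no Referer still get through, because the first condition never matches, and the lookup in the sketch is an exact match, so `hanselman.com` and `www.hanselman.com` would each need their own key.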

Nice, simple and clean. I don't plan on playing "whack-a-mole" with sploggers, as it's a losing game, but I will bring down the ban-hammer on particularly obnoxious examples of content theft, especially when they mess with my Google Juice.

Translated from: https://www.hanselman.com/blog/blocking-image-hotlinking-leeching-and-evil-sploggers-with-iis-url-rewrite


