Search Engine Hacking – Manual and Automation

Introduction:

Google, Yahoo and Bing need no introduction; we use them every day to answer routine questions.

Google and other search engines use automated programs called spiders or crawlers, and they maintain a large index of keywords and the locations where those words can be found. These crawling and indexing features make search engines powerful, but they also open the door for hackers to identify vulnerable targets over the Internet. This is called Search Engine Hacking.

Search Engine Hacking involves using advanced operator-based searching to identify exploitable targets and sensitive data using the search engines.

In this article, we learn to use various Google search operators to identify vulnerable targets over the Internet and also check out a new tool that can be used to automate this process.

Special Search Characters:

The Google search engine provides its users with various special search characters for advanced searching. See a partial list below:

  1. Quotes ["search query"]: Quotes are used to search for a specific phrase or set of words.

    E.g. The query ["The monk who sold his Ferrari"] will search for the exact phrase: The monk who sold his Ferrari.

  2. Minus Sign [-]: The minus sign tells the Google search engine to exclude the word that follows it.

    E.g. The query [-red apple] will display search results for apple that exclude the word red.

  3. Tilde operator [~]: Adding a tilde in front of a word searches for results containing that word as well as its synonyms. (Google has since retired this operator, so current searches may ignore it.)

    E.g. The query [~jokes] will display search results that include the word jokes as well as related words such as funny, humor, etc.


  4. OR operator or vertical bar [|]: Using OR (in uppercase) or the vertical bar between two or more keywords tells Google to search for pages that contain any of the words.

    E.g. The query [Android OR Apple] will display search results containing either of the words.

  5. Asterisk operator [*]: The asterisk is a wildcard that lets the search engine fill in that position with any word or phrase. It can also be used within double quotes for more precise searches.

    E.g. The query ["today is * day"] will display search results like “today is a good day” or “today is mother’s day”, etc.
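The special characters above can be combined in a single query. The following is a minimal sketch of composing such a query string in Python; the helper names (`quote_phrase`, `exclude`, `any_of`, `compose`) are hypothetical and only illustrate the syntax described in the list.

```python
def quote_phrase(phrase: str) -> str:
    """Wrap a phrase in straight double quotes for exact-phrase matching."""
    return f'"{phrase}"'


def exclude(word: str) -> str:
    """Prefix a word with the minus sign so results containing it are dropped."""
    return f"-{word}"


def any_of(*words: str) -> str:
    """Join keywords with the uppercase OR operator."""
    return " OR ".join(words)


def compose(*parts: str) -> str:
    """Join the non-empty query parts with single spaces."""
    return " ".join(part for part in parts if part)


query = compose(
    quote_phrase("today is * day"),  # exact phrase with a wildcard slot
    any_of("Android", "Apple"),      # either keyword
    exclude("red"),                  # drop results mentioning "red"
)
print(query)
```

The resulting string can be pasted into the search box as-is; note that straight quotes are used, since typographic quotes copied from a web page are not part of the query syntax.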

Basic Searching Techniques:

The Google search engine provides various operators for customizing search results.

The basic syntax of a Google advanced operator is:

operator:search_term

The list below provides some of the key operators useful in creating search queries to retrieve valuable information from the web.

  1. Intitle operator:

    The query [intitle:keyword] returns pages containing the keyword in the title.

    E.g. 1: The query [intitle:Google] will return web pages containing Google in the title.

    E.g. 2: Google Hacking using intitle operator

    The query [intitle:"Index of"] will return web pages containing “Index of” in the title. This can be used to check whether Directory Listing (a page that lists the contents of a directory) is enabled on a web server.

  2. Site operator:

    The query [site:www.site.com] narrows a search to a particular site, domain or sub-domain.

    E.g. 1: The query [news site:yahoo.com] will search for the keyword “news” on yahoo.com and its sub-domains.

    E.g. 2: Google Hacking – Information gathering on sub domains

    The query [site:yahoo.com] will display indexed pages from yahoo.com and all of its sub-domains. This operator is useful for gathering information on the sub-domains of a specific target site.

  3. Inurl operator:

    The query [inurl:keyword] returns pages containing the keyword in the URL.

    E.g. 1 – The query [inurl:contactus site:www.MySite.com] will search for pages on MySite whose URL contains the word “contactus”.

    E.g. 2 – Google Hacking – Looking for Admin Portals

    The query [inurl:admin.php] will search for websites that might have admin login pages. These pages attract hackers, who might brute force the login page to gain access to the admin interface.

  4. Cache operator:

    Google keeps a snapshot of the pages it has crawled. The query [cache:URL] displays Google’s cached version of that page.

    E.g. – The query [cache:www.yahoo.com] will display the cached page of the website Yahoo.com. This directive can be useful for gathering information from previously cached pages.

    Another very useful website for obtaining older copies of pages is http://archive.org/

    This website stores snapshots of websites in a calendar format and can be used to view a page as it appeared on a previous date. The screenshot below displays an archived page of Yahoo.com dated 9 Feb 2010.

  5. Filetype operator:

    The query [filetype:extension] searches for files with a particular file extension. Google can search for many different types of files such as pdf, doc, rtf, ppt, xls, etc.

    E.g. The query [filetype:pdf site:yahoo.com] will return links to the PDF files found on Yahoo.com.
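The operator:search_term syntax lends itself to simple automation. Below is a minimal sketch, assuming we only want to assemble query strings and the matching search URL; the function names are hypothetical, and the URL form `https://www.google.com/search?q=...` is the ordinary address Google uses for a search, not something prescribed by this article.

```python
from urllib.parse import quote_plus


def operator_term(operator: str, value: str) -> str:
    """Render one operator:search_term pair, quoting values that contain spaces."""
    if " " in value:
        value = f'"{value}"'
    return f"{operator}:{value}"


def build_query(keywords: str = "", **operators: str) -> str:
    """Combine free keywords with any number of operator=value pairs."""
    parts = [keywords] if keywords else []
    parts += [operator_term(op, val) for op, val in operators.items()]
    return " ".join(parts)


def search_url(query: str) -> str:
    """URL-encode the query so it can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


print(build_query(intitle="Index of"))              # directory listings
print(build_query("news", site="yahoo.com"))        # keyword limited to a site
print(search_url(build_query(filetype="pdf", site="yahoo.com")))
```

Keyword arguments keep their order in Python, so the operators appear in the query in the order they were passed.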

Google Hacking through keyword search

Let’s look at some of the keyword searches and the operators that can be used to build search queries to carry out Google Hacking.

  1. Digging Google for Configuration Files:

    Configuration files hold the initial settings for computer programs. An attacker with access to a configuration file can gain a detailed understanding of how the program is deployed.

    E.g. a Google query like [filetype:ini inurl:ws_ftp.ini] would retrieve the configuration file used by the WS_FTP client program.

  2. Digging Google for Log Files:

    Web servers log information such as IP addresses, timestamps, HTTP requests, usernames and passwords to log files. These log files are usually stored on the server with the extension .log and may be accessible over the Internet due to inadequate protection.

    E.g. a Google query like [filetype:log cron.log] would retrieve UNIX cron logs.

  3. Digging Google for database leakage information from web applications:

    Google Hackers search Google for pieces of database information leaked from vulnerable servers. This information can be used to identify a vulnerable target and launch a more sophisticated attack against it.

    E.g. a Google query like [filetype:inc intext:mysql_connect] will retrieve .inc files that contain MySQL user credentials and details of other functions used to connect to the database.

  4. Digging Google for leakage of information through error messages:

    Information leaked through error messages is very useful for information gathering and for launching further attacks on a website. If the application lacks exception/error handling mechanisms, its error messages might leak sensitive details such as database details, stack traces, etc.

    E.g. a Google query like [intitle:"Apache Tomcat" "Error Report"] will display search results containing Apache Tomcat error messages.
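When checking your own organization for these exposures, each of the four keyword searches can be restricted to one domain with the site operator. The sketch below is illustrative only: the query strings come from the examples in this list, while the `DORKS` table, the `scoped_queries` helper and the domain `example.com` are assumptions made for the example.

```python
from typing import Dict

# Query strings taken from the four examples in the list above.
DORKS: Dict[str, str] = {
    "configuration files": "filetype:ini inurl:ws_ftp.ini",
    "log files": "filetype:log cron.log",
    "database details": "filetype:inc intext:mysql_connect",
    "error messages": 'intitle:"Apache Tomcat" "Error Report"',
}


def scoped_queries(domain: str) -> Dict[str, str]:
    """Prefix every query with site:<domain> so results stay within one target."""
    return {category: f"site:{domain} {query}" for category, query in DORKS.items()}


for category, query in scoped_queries("example.com").items():
    print(f"{category}: {query}")
```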

We have briefly discussed the directives that can be used to carry out search engine hacking. Manually trying out each of these directives can be cumbersome. To automate the process of search engine hacking and retrieving juicy information, we make use of automated tools.

Automated tools available for Google Hacking:

  • Gooscan – Gooscan is a tool that automates queries against Google search appliances, but with a twist. These particular queries are designed to find potential vulnerabilities on web pages.

    Ref:
    http://www.securitytube-tools.net/index.php@title=Gooscan.html

  • SiteDigger – SiteDigger searches Google’s cache to look for vulnerabilities, errors, configuration issues, proprietary information, and interesting security nuggets on web sites.

    Ref:
    http://www.mcafee.com/in/downloads/free-tools/sitedigger.aspx

  • Wikto – This is a multipurpose tool developed by Sensepost which can be used for automating Google Hacking.

    Ref:
    http://research.sensepost.com/tools/web/wikto

The above tools are useful for Google Hacking. However, let’s look at a new tool called Search Diggity, which provides a graphical user interface and is useful for retrieving a lot of information from both the Bing and Google search engines.

Search Diggity:

Search Diggity is Stach & Liu’s MS Windows GUI application that serves as a front-end to the most recent versions of the Diggity tools:

  • GoogleDiggity
  • BingDiggity
  • Bing LinkFromDomainDiggity
  • CodeSearchDiggity
  • DLPDiggity
  • FlashDiggity
  • MalwareDiggity
  • PortScanDiggity
  • SHODANDiggity
  • BingBinaryMalwareSearch
  • NotInMyBackYard Diggity

More information on these modules can be found here:
http://www.stachliu.com/resources/tools/google-hacking-diggity-project/attack-tools/

Let’s explore a few of the above key modules of interest to learn about the art of search engine hacking.

GoogleDiggity:

The GoogleDiggity tool automates the Google Hacking process. It queries the search engine using the Google JSON/ATOM Custom Search API to identify vulnerabilities and information disclosures.

The Google search engine uses a bot detection technique. As a result, querying Google with automated tools for Google hacking tends to get blocked. This is overcome by using the Google JSON/ATOM Custom Search API, which requires an API key. A user can register for an API key with a valid Gmail account and get 100 free requests per day. Additional queries are available at a cost (Google charges $5 per 1000 queries).

The tool provides a well-structured interface that allows the user to:

  • Select the search queries from the list
  • Feed the API key
  • Specify the target site/domain/IP address
  • Kick off the scan with the Scan button, etc.
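To make the API-key workflow and the quota figures above concrete, here is a rough sketch of how a request to the Custom Search JSON API could be assembled and how the daily cost could be estimated. It is not GoogleDiggity's code. The endpoint and the `key`, `cx`, `q` and `start` parameters are those of Google's public Custom Search JSON API; the placeholder credentials, the function names and the pro-rata billing in `daily_cost_usd` are assumptions. No network request is made.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"
FREE_QUERIES_PER_DAY = 100   # free quota quoted in the text
USD_PER_1000_EXTRA = 5.0     # price quoted in the text


def request_url(api_key: str, engine_id: str, query: str, start: int = 1) -> str:
    """Build the GET URL for one page of results (start is the 1-based result index)."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def daily_cost_usd(queries_per_day: int) -> float:
    """Estimate the daily cost, assuming extra queries are billed pro rata."""
    extra = max(0, queries_per_day - FREE_QUERIES_PER_DAY)
    return extra * USD_PER_1000_EXTRA / 1000


# YOUR_API_KEY and YOUR_ENGINE_ID are placeholders, not real credentials.
print(request_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "site:example.com filetype:log"))
print(daily_cost_usd(80))
print(daily_cost_usd(1100))
```

Under these assumptions, a scan that issues 1100 queries in a day uses the 100 free requests and pays for the remaining 1000.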

Bing Diggity:

Similar to GoogleDiggity, BingDiggity is a Bing search engine hacking tool. It utilizes the Bing 2.0 API (which allows 1000 results per query) and Stach & Liu’s newly developed Bing Hacking Database (BHDB) to find vulnerabilities and sensitive information disclosures related to your organization that are exposed via Microsoft’s Bing search engine.

The tool provides a well-structured interface that allows the user to:

  • Select the search queries specific to Bing search engine from the list
  • Feed the API key
  • Specify the target site/domain/IP address
  • Kick off the scan with the Scan button, etc.

DLPDiggity:

DLPDiggity is a data loss prevention tool that leverages Google/Bing to identify exposures of sensitive info (e.g. SSNs, credit card numbers, etc.) via common document formats such as .doc, .xls, and .pdf. First, GoogleDiggity and BingDiggity are used to locate and download files belonging to target domains/sites on the Internet. Then, DLPDiggity is used to analyze those downloaded files for sensitive information disclosures.

DLPDiggity utilizes IFilters (an IFilter is a plugin that allows the Windows Indexing Service and the newer Windows Desktop Search to index different file formats so that they become searchable) to search through the actual contents of files, as opposed to just the metadata. Using .NET regular expressions, DLPDiggity can find almost any type of sensitive data within common document file formats.

Over the last few years, there has been a tremendous increase in the volume of office documents that have been indexed and made searchable by Google and Bing. DLPDiggity taps into that in order to find documents containing sensitive information.

The tool provides a well-structured interface that allows the user to:

  • Select the DLPDiggity search queries from the list that are used to query the Google/Bing search engines for documents
  • Select the regular expressions that will be used to search through the documents in the target directory for leaks of sensitive information such as SSNs and credit card numbers
  • Analyze the documents with the Search button
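The regular-expression step can be illustrated with a short sketch. DLPDiggity itself uses .NET regular expressions and IFilters; the example below swaps in Python's `re` module and plain text that has already been extracted from a document. The two patterns and the Luhn checksum filter are assumptions chosen for illustration, not the tool's actual rules.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; the tool's real expressions are not shown in this article.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    checksum = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:       # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return checksum % 10 == 0


def find_sensitive(text: str) -> List[Tuple[str, str]]:
    """Return (label, match) pairs, dropping card-like numbers that fail Luhn."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group()
            if label == "credit_card" and not luhn_valid(value):
                continue
            hits.append((label, value))
    return hits


sample = "SSN 123-45-6789, card 4111 1111 1111 1111, order 4111 1111 1111 1112"
print(find_sensitive(sample))
```

The checksum filter is a design choice to cut false positives: long digit runs such as order numbers often look like card numbers but rarely pass the Luhn check.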

FlashDiggity:

FlashDiggity automates Google searching/downloading/decompiling/analysis of SWF files to identify Flash vulnerabilities and information disclosures.

FlashDiggity first leverages the GoogleDiggity tool in order to identify Adobe Flash SWF applications for target domains via Google searches, such as ext:swf. Next, the tool is used to download all of the SWF files in bulk for analysis. The SWF files are decompiled back to ActionScript source code, and then analyzed for code-based vulnerabilities.

The tool provides a well-structured interface that allows the user to:

  • Select the FlashDiggity search queries from the list that are used to query the Google search engine for SWF files
  • Select the regular expressions that will be used to search through the ActionScript of decompiled SWF Flash files for code-based vulnerabilities and information disclosures
  • Decompile and analyze the SWF files with the Search button
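The analysis stage, running regular expressions over decompiled ActionScript, might look roughly like the sketch below. It assumes the SWF has already been decompiled to text by a separate tool. The three patterns are hypothetical examples of things a reviewer might flag (a hard-coded credential, and calls to `ExternalInterface.call`, `getURL` or `navigateToURL`); they are not FlashDiggity's actual rule set.

```python
import re
from typing import List, Tuple

# Hypothetical review patterns, applied one source line at a time.
AS_PATTERNS = {
    "hardcoded credential": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\s*(?::\s*String\s*)?=\s*[\"'][^\"']+[\"']"
    ),
    "ExternalInterface.call": re.compile(r"\bExternalInterface\.call\s*\("),
    "URL navigation": re.compile(r"\b(?:getURL|navigateToURL)\s*\("),
}


def scan_actionscript(source: str) -> List[Tuple[int, str]]:
    """Return (line number, label) for every pattern that matches a line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in AS_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


sample = "\n".join([
    'var password:String = "s3cret";',
    "ExternalInterface.call(callback, data);",
    'trace("hello");',
    "getURL(userSuppliedUrl);",
])
for lineno, label in scan_actionscript(sample):
    print(f"line {lineno}: {label}")
```

A match is only a lead for manual review; whether a flagged call is exploitable depends on where its arguments come from.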
