
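1. APWG (Anti-Phishing Working Group): an international anti-phishing organization that publishes quarterly statistics and analysis on phishing attacks worldwide.
2. Microsoft Computing Safer Index Report: describes the financial losses caused by phishing attacks each year.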
3. Phishing URL Detection with ML

A phisher has full control over the sub-domain portion of a URL and can set it to any value. The URL may also contain path and file components, which the phisher can likewise change at will. In short, the sub-domain name and the path are fully controllable by the phisher.

1)The sub-domain portion of the URL can be controlled and set to any value.
2)The path and file components of the URL can be changed by the phisher at will.
3)The attacker can register any domain name that has not been registered before. The phisher can change the FreeURL (that is, the sub-domain and path portions) at any time to create a new URL. Security defenders struggle to detect phishing domains because of this unique, constantly changing part of the URL.
4)The phisher tries to make the domain look like the domain of the legitimate URL.
5)Other methods often used by attackers are cybersquatting and typosquatting.
Cybersquatting (also known as domain squatting) is registering, trafficking in, or using a domain name with bad-faith intent to profit from the goodwill of a trademark belonging to someone else.
That is, the phisher can register domains similar to your company’s URL. For example, if your company is named “abcompany” and you register abcompany.com, phishers can register abcompany.net, abcompany.org, and abcompany.biz and use them for fraudulent purposes.
Typosquatting, also called URL hijacking, is a form of cybersquatting that relies on typographical errors made by Internet users when entering a website address into a web browser, or on misspellings that are hard to notice when reading quickly.
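
The two squatting patterns above can be told apart with a simple string comparison. The following is only a minimal sketch, not a production detector: the table LEGITIMATE_DOMAINS, the function names, and the distance threshold of 1 are all assumptions made for illustration, and the input is assumed to be a bare registered domain of the form name.tld.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical table: brand name -> the domain the brand really owns.
LEGITIMATE_DOMAINS = {"abcompany": "abcompany.com"}


def squatting_kind(domain: str, max_distance: int = 1) -> str:
    """Return 'legitimate', 'cybersquat', 'typosquat', or 'none'."""
    domain = domain.lower()
    # Assumption: input looks like "name.tld" with no sub-domains.
    name, _, _tld = domain.partition(".")
    for brand, real_domain in LEGITIMATE_DOMAINS.items():
        if domain == real_domain:
            return "legitimate"
        if name == brand:
            # Same brand name registered under a different TLD.
            return "cybersquat"
        if 0 < edit_distance(name, brand) <= max_distance:
            # Name is a near-miss of the brand (e.g. one wrong letter).
            return "typosquat"
    return "none"


for candidate in ["abcompany.com", "abcompany.net", "abcompamy.com"]:
    print(candidate, squatting_kind(candidate))
```

Plain Levenshtein distance counts a swap of two adjacent letters as two edits, so a threshold of 1 misses transposition typos; raising the threshold catches them at the cost of more false positives.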

Features Used for Phishing Domain Detection
1)URL-Based Features
2)Domain-Based Features
3)Page-Based Features
4)Content-Based Features

URL-Based Features
    Digit count in the URL
    Total length of the URL
    Whether the URL is typosquatted
    Whether it includes a legitimate brand name
    Number of sub-domains in the URL
    Whether the top-level domain (TLD) is one of the commonly used ones
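
Several of the URL-based features listed above can be computed from the URL string alone. The sketch below is illustrative only: COMMON_TLDS and BRAND_NAMES are small made-up sets, the function name is hypothetical, and the registered domain is naively assumed to be the last two labels of the host name (which is wrong for suffixes such as co.uk and for IP-address hosts).

```python
from urllib.parse import urlsplit

# Assumption: a tiny illustrative set of "commonly used" TLDs.
COMMON_TLDS = {"com", "org", "net", "edu", "gov"}
# Hypothetical brand names to look for inside the URL.
BRAND_NAMES = {"abcompany"}


def url_features(url: str) -> dict:
    """Compute a few URL-based features from the raw URL string."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    # Naive assumption: the last two labels form the registered domain,
    # so everything before them counts as sub-domains.
    subdomains = labels[:-2]
    tld = labels[-1] if labels else ""
    lowered = url.lower()
    return {
        "digit_count": sum(ch.isdigit() for ch in url),
        "url_length": len(url),
        "subdomain_count": len(subdomains),
        "has_common_tld": tld in COMMON_TLDS,
        "contains_brand": any(brand in lowered for brand in BRAND_NAMES),
    }


print(url_features("http://login.secure.abcompany.example123.biz/path/index.html"))
```

A real feature extractor would likely resolve the registered domain against the Public Suffix List instead of counting labels.
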
Domain-Based Features
    Is the domain name or its IP address in the blacklists of well-known reputation services?
    How many days have passed since the domain was registered?
    Is the registrant name hidden?
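
The domain-based features above depend on external data (reputation blacklists and WHOIS records). The sketch below assumes that data has already been fetched and only shows how the three features might be derived from it; BLACKLIST, PRIVACY_MARKERS, and the function signature are hypothetical and stand in for whatever reputation service and WHOIS source is actually used.

```python
from datetime import date
from typing import Optional

# Hypothetical local copy of a blacklist (domains and IP addresses).
BLACKLIST = {"bad-domain.example", "203.0.113.7"}
# Assumption: strings that suggest the registrant is privacy-protected.
PRIVACY_MARKERS = ("redacted", "privacy", "whoisguard")


def domain_features(domain: str, ip: str, registered_on: date,
                    registrant_name: Optional[str], today: date) -> dict:
    """Derive domain-based features from already-fetched WHOIS data."""
    registrant = (registrant_name or "").strip().lower()
    return {
        "blacklisted": domain.lower() in BLACKLIST or ip in BLACKLIST,
        "age_days": (today - registered_on).days,
        # Hidden if missing, empty, or matching a privacy marker.
        "registrant_hidden": (not registrant
                              or any(m in registrant for m in PRIVACY_MARKERS)),
    }


print(domain_features("bad-domain.example", "198.51.100.1",
                      date(2024, 1, 1), None, today=date(2024, 1, 11)))
```
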
Page-Based Features
    Global Page rank
    Country Page rank
    Position at the Alexa Top 1 Million Site
    Estimated Number of Visits for the domain on a daily, weekly, or monthly basis
    Average Page views per visit
    Average Visit Duration
    Web traffic share per country
    Count of reference from Social Networks to the given domain
    Category of the domain
    Similar websites etc.

Content-Based Features
    Page Titles
    Meta Tags
    Hidden Text
    Text in the Body
    Images etc
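
The content-based features above require fetching and parsing the page itself. As a rough sketch of the parsing step only, the class below uses Python's standard html.parser to collect the title, meta tags, visible body text, hidden text, and image sources. The class name is hypothetical, and "hidden" is narrowly assumed to mean an element with the hidden attribute or an inline display:none style; hiding via external CSS or scripts is not detected.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not put on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ContentFeatures(HTMLParser):
    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}          # meta name -> content
        self.images = []        # img src values
        self.body_text = []     # visible text fragments
        self.hidden_text = []   # text inside hidden elements
        self._stack = []        # (tag, is_hidden) for open elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        if tag in VOID_TAGS:
            return
        style = (attrs.get("style") or "").replace(" ", "").lower()
        hidden = "hidden" in attrs or "display:none" in style
        parent_hidden = self._stack[-1][1] if self._stack else False
        self._stack.append((tag, hidden or parent_hidden))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return
        tag, hidden = self._stack[-1]
        if tag == "title":
            self.title += text
        elif tag in ("script", "style"):
            return
        elif hidden:
            self.hidden_text.append(text)
        else:
            self.body_text.append(text)


SAMPLE_HTML = """<html><head><title>Sign in</title>
<meta name="description" content="Account login">
</head><body><p>Enter your password</p>
<div style="display: none">free prize</div>
<img src="logo.png"></body></html>"""

parser = ContentFeatures()
parser.feed(SAMPLE_HTML)
print(parser.title, parser.meta, parser.body_text, parser.hidden_text, parser.images)
```

The collected strings would then typically be turned into numeric features (for example, keyword counts or the presence of a brand name in the title) before being passed to a classifier.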

What is URL Filtering?

