Tehničko veleučilište u Zagrebu · Zagreb

Percentage of known words from the dictionary in phishing domain name as a possible indication of its compromised status

Category conference contribution (in proceedings)
Type extended abstract of a conference presentation
Year 2024
Parent publication Eighth International Scientific Conference on Recent Advances in Information Technology, Tourism, Economics, Management and Agriculture ITEMA 2023
Status accepted

Abstract

Recognising phishing domains has become increasingly important for protecting users and systems from cyber threats. Phishing is a fraudulent attempt, usually via email or messaging applications, to steal personal information. Phishing domains pose as well-known organisations and ask for personal information such as a credit card number, national insurance number, bank account number or password. The best protection against phishing is learning how to recognise a phish. Suspicious signs include a generic greeting, a missing link, a request for personal information and a sense of urgency. A link to an insecure domain without a certificate is almost a sure sign of phishing. All these signs are difficult for ordinary users to recognise, which is why automatic methods for evaluating domain names are being developed. Traditional approaches such as blacklisting and heuristic methods often cannot keep up with the ever-changing landscape of phishing sites. These methods can assess the validity of a domain and warn the user to look out for certain indicators. Phishing domains try to resemble real domains as closely as possible in name and appearance. Recognition algorithms use a variety of features and make the final decision on compromise by assigning an importance coefficient to each feature. Recent advances have introduced more sophisticated detection techniques that can analyse a variety of features, including lexical features, DNS records and WHOIS information. Lexical features include replaced letters and added characters such as digits or special characters. Over time, certain features become more or less pronounced, depending on the availability of domain names and on how networks of compromised computers (botnets) are organised. One of the interesting features is the comparison of words from domain names with an extended dictionary.
The dictionary is extended to include words that are expected in a domain name, such as the www tag or top-level domain tags. In this paper, several well-known dictionaries were tested, and results for known phishing domains are reported in comparison with legitimate domains. Our approach not only addresses the dynamic nature of phishing but also provides a scalable feature that can adapt to emerging threats, offering better protection to Internet users and organisations alike.
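
The dictionary-based feature described above can be illustrated with a minimal sketch. The abstract does not state how the percentage is computed, so the token-level definition used here (share of dot- or hyphen-separated tokens found in the dictionary) is an assumption, as are the function name `known_word_percentage`, the tiny word list and the example domains.

```python
import re

# Hypothetical miniature word list; the study used several well-known dictionaries.
BASE_WORDS = {"secure", "login", "bank", "account", "update"}
# Extension with tokens expected in a domain name (www tag, top-level domain tags).
DOMAIN_TOKENS = {"www", "com", "net", "org", "hr"}
EXTENDED_DICTIONARY = BASE_WORDS | DOMAIN_TOKENS


def known_word_percentage(domain: str, dictionary: set[str]) -> float:
    """Return the percentage of domain-name tokens found in the dictionary.

    Assumption: a token is any run of letters or digits between dots,
    hyphens or other separators. The source does not define tokenisation.
    """
    tokens = [t for t in re.split(r"[^a-z0-9]+", domain.lower()) if t]
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in dictionary)
    return 100.0 * known / len(tokens)


if __name__ == "__main__":
    # Invented example domains, not taken from the study's data set.
    for name in ("www.secure-login.com", "www.xk3j9q.com"):
        print(name, round(known_word_percentage(name, EXTENDED_DICTIONARY), 1))
```

Under this definition, a name built entirely from dictionary words scores 100, while a name with a random-looking label scores lower. How such a score relates to compromised status is the question the study examines, and the sketch makes no claim about it.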

Keywords

Botnet; phishing; lexical features