Multimodal Large Language Models for Phishing Webpage Detection and Identification

Read original: arXiv:2408.05941 - Published 8/13/2024 by Jehyun Lee, Peiyuan Lim, Bryan Hooi, Dinil Mon Divakaran

Overview

Researchers have developed numerous machine learning-based solutions to detect phishing webpages, particularly those that use computer vision models to determine whether a webpage is imitating a well-known brand.
However, such brand-based models are costly to maintain as they require regularly updating the labeled dataset and reference list of well-known websites.
This paper explores the use of large language models (LLMs), especially multimodal LLMs, to detect phishing webpages by identifying the brand of a given webpage and comparing it with the domain name in the URL.

Plain English Explanation

Phishing attacks, where scammers create fake websites that look like real, trusted ones to trick people into sharing sensitive information, are a major problem on the internet. Researchers have developed various machine learning-based solutions to detect these phishing webpages, with a particular focus on brand-based phishing detection. These models use computer vision techniques to examine a webpage and determine if it's imitating a well-known brand or company.

While these brand-based models can be effective, they have significant drawbacks. They are expensive to maintain, because new data must be regularly collected and labeled so the models can be retrained and kept up to date. They also depend on a good reference list of well-known websites and their associated metadata, which must itself be kept current.

To address these issues, the researchers in this paper explore the use of large language models (LLMs) – powerful AI systems that have been trained on vast amounts of online data. The researchers hypothesize that these LLMs, particularly the multimodal ones that can process both text and images, can be used to identify the brand of a given webpage and then compare that to the domain name in the URL to detect phishing attempts.

The researchers propose a two-phase system that uses LLMs in both phases: the first identifies the brand of the webpage, and the second verifies that the domain name matches the identified brand. In comprehensive evaluations on a newly collected dataset, this LLM-based system achieves a high detection rate at high precision while also providing interpretable evidence for its decisions. It also performs significantly better than a state-of-the-art brand-based phishing detection system, and it is robust against two known adversarial attacks designed to fool phishing detectors.

Technical Explanation

The researchers propose a two-phase system that employs large language models (LLMs) to detect phishing webpages. In the first phase, the system uses the LLM to identify the brand or company associated with the given webpage, based on various elements like the logo, theme, favicon, and other visual and textual cues. In the second phase, the system compares the identified brand with the domain name in the URL to detect any mismatch, which would indicate a phishing attempt.
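The two phases described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: `detect_phishing`, `Verdict`, `PageEvidence`, and both stub functions are hypothetical names, the stubs stand in for the LLM calls with simple keyword and lookup logic, and treating a page with no identified brand as "not flagged" is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional
from urllib.parse import urlparse

# Hypothetical container for what a multimodal LLM would be shown
# (for example visible text plus a screenshot reference).
PageEvidence = Dict[str, str]


@dataclass
class Verdict:
    brand: Optional[str]
    domain: str
    is_phishing: bool
    reason: str  # short human-readable evidence for the decision


def detect_phishing(
    url: str,
    evidence: PageEvidence,
    identify_brand: Callable[[PageEvidence], Optional[str]],
    verify_domain: Callable[[str, str], bool],
) -> Verdict:
    """Run brand identification, then domain verification."""
    domain = (urlparse(url).hostname or "").lower()

    # Phase 1: ask the model which brand the page presents itself as.
    brand = identify_brand(evidence)
    if brand is None:
        # Assumption: with no brand there is nothing to compare, so do not flag.
        return Verdict(None, domain, False, "no brand identified")

    # Phase 2: ask whether the URL's domain is consistent with that brand.
    if verify_domain(brand, domain):
        return Verdict(brand, domain, False, f"{domain} is consistent with {brand}")
    return Verdict(brand, domain, True, f"page presents as {brand} but is served from {domain}")


def stub_identify_brand(evidence: PageEvidence) -> Optional[str]:
    """Stand-in for the phase 1 LLM call: a keyword check on the page text."""
    return "PayPal" if "paypal" in evidence.get("text", "").lower() else None


def stub_verify_domain(brand: str, domain: str) -> bool:
    """Stand-in for the phase 2 LLM call: a tiny hard-coded lookup."""
    known = {"PayPal": {"paypal.com"}}
    return any(domain == d or domain.endswith("." + d) for d in known.get(brand, ()))


if __name__ == "__main__":
    page = {"text": "Log in to your PayPal account"}
    for url in ("https://www.paypal.com/signin",
                "https://paypal-secure-login.example.net/signin"):
        print(detect_phishing(url, page, stub_identify_brand, stub_verify_domain))
```

Passing the two phases in as callables keeps the control flow separate from any particular model or prompt, so the stubs could be swapped for real LLM calls without changing `detect_phishing`.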

To evaluate their approach, the researchers collected a new dataset of phishing and legitimate webpages, and conducted comprehensive experiments. The results show that the LLM-based system achieves a high detection rate with high precision, outperforming a state-of-the-art brand-based phishing detection system. Importantly, the LLM-based system also provides interpretable evidence for its decisions, making it more transparent and trustworthy.
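For readers unfamiliar with the two metrics named above, the following sketch shows how they are usually computed from confusion counts. It assumes "detection rate" means the true positive rate (recall); the function name `detection_metrics` is hypothetical, and the counts in the usage lines are made-up illustrations, not results from the paper.

```python
from typing import Tuple


def detection_metrics(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (detection_rate, precision) from confusion counts.

    tp: phishing pages correctly flagged
    fp: legitimate pages wrongly flagged
    fn: phishing pages that were missed
    """
    # Detection rate (recall): share of all phishing pages that were caught.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of flagged pages that really were phishing.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return detection_rate, precision


if __name__ == "__main__":
    # Illustrative counts only, not taken from the paper.
    rate, prec = detection_metrics(tp=90, fp=5, fn=10)
    print(f"detection rate = {rate:.3f}, precision = {prec:.3f}")
```

A detector can trade one metric for the other, for example by flagging more aggressively, which is why the two are reported together.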

Furthermore, the researchers tested the system's robustness against two known adversarial attacks designed to fool phishing detection models. The results indicate that the LLM-based system is more resilient to these attacks compared to the brand-based model, demonstrating its potential for real-world deployment.

Critical Analysis

The researchers have presented a promising approach to phishing detection that leverages the power of large language models. By exploiting the multimodal understanding of LLMs, the system can effectively identify the brand associated with a webpage and compare it to the domain name, a crucial step in detecting phishing attempts.

One potential limitation of the study is the reliance on a newly collected dataset, which may not fully represent the diversity and evolving nature of phishing webpages in the real world. The researchers acknowledge this and suggest that further evaluation on larger, more comprehensive datasets would be valuable.

Additionally, while the system's interpretability and robustness to adversarial attacks are notable strengths, it would be important to investigate its performance in the face of other types of evasion techniques that may emerge as the technology advances.

Overall, the researchers have made a significant contribution by demonstrating the potential of LLMs in the context of phishing detection, a critical problem with important implications for internet security and user safety. Further research and refinement of this approach could lead to more effective and practical solutions for combating phishing attacks.

Conclusion

This paper explores the use of large language models (LLMs), particularly multimodal LLMs, as a solution for detecting phishing webpages. The researchers propose a two-phase system that leverages the LLM's understanding of various webpage elements to identify the brand associated with a given website and then compare it with the domain name to detect phishing attempts.

The comprehensive evaluation on a newly collected dataset shows that the LLM-based system achieves high detection rates with high precision, outperforming a state-of-the-art brand-based phishing detection system. Importantly, the system also provides interpretable evidence for its decisions and demonstrates robustness against known adversarial attacks.

This research is a step forward against phishing, a persistent and costly problem on the internet. Because it draws on the knowledge of pretrained LLMs rather than on a regularly retrained vision model and a curated reference list, the proposed approach promises to be more scalable and maintainable than traditional brand-based models, with the potential to improve internet security and protect users from these scams.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Multimodal Large Language Models for Phishing Webpage Detection and Identification

Jehyun Lee, Peiyuan Lim, Bryan Hooi, Dinil Mon Divakaran

To address the challenging problem of detecting phishing webpages, researchers have developed numerous solutions, in particular those based on machine learning (ML) algorithms. Among these, brand-based phishing detection that uses models from Computer Vision to detect if a given webpage is imitating a well-known brand has received widespread attention. However, such models are costly and difficult to maintain, as they need to be retrained with labeled dataset that has to be regularly and continuously collected. Besides, they also need to maintain a good reference list of well-known websites and related meta-data for effective performance. In this work, we take steps to study the efficacy of large language models (LLMs), in particular the multimodal LLMs, in detecting phishing webpages. Given that the LLMs are pretrained on a large corpus of data, we aim to make use of their understanding of different aspects of a webpage (logo, theme, favicon, etc.) to identify the brand of a given webpage and compare the identified brand with the domain name in the URL to detect a phishing attack. We propose a two-phase system employing LLMs in both phases: the first phase focuses on brand identification, while the second verifies the domain. We carry out comprehensive evaluations on a newly collected dataset. Our experiments show that the LLM-based system achieves a high detection rate at high precision; importantly, it also provides interpretable evidence for the decisions. Our system also performs significantly better than a state-of-the-art brand-based phishing detection system while demonstrating robustness against two known adversarial attacks.

8/13/2024

KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

Yuexin Li, Chengyu Huang, Shumin Deng, Mei Lin Lock, Tri Cao, Nay Oo, Hoon Wei Lim, Bryan Hooi

Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that they rely on a manually constructed brand knowledge base, making it infeasible to scale to a large number of brands, which results in false negative errors due to the insufficient brand coverage of the knowledge base. To address this issue, we propose an automated knowledge collection pipeline, using which we collect a large-scale multimodal brand knowledge base, KnowPhish, containing 20k brands with rich information about each brand. KnowPhish can be used to boost the performance of existing RBPDs in a plug-and-play manner. A second limitation of existing RBPDs is that they solely rely on the image modality, ignoring useful textual information present in the webpage HTML. To utilize this textual information, we propose a Large Language Model (LLM)-based approach to extract brand information of webpages from text. Our resulting multimodal phishing detection approach, KnowPhish Detector (KPD), can detect phishing webpages with or without logos. We evaluate KnowPhish and KPD on a manually validated dataset, and a field study under Singapore's local context, showing substantial improvements in effectiveness and efficiency compared to state-of-the-art baselines.

6/18/2024

Utilizing Large Language Models to Optimize the Detection and Explainability of Phishing Websites

Sayak Saha Roy, Shirin Nilizadeh

In this paper, we introduce PhishLang, an open-source, lightweight language model specifically designed for phishing website detection through contextual analysis of the website. Unlike traditional heuristic or machine learning models that rely on static features and struggle to adapt to new threats, and deep learning models that are computationally intensive, our model leverages MobileBERT, a fast and memory-efficient variant of the BERT architecture, to learn granular features characteristic of phishing attacks. PhishLang operates with minimal data preprocessing and offers performance comparable to leading deep learning anti-phishing tools, while being significantly faster and less resource-intensive. Over a 3.5-month testing period, PhishLang successfully identified 25,796 phishing URLs, many of which were undetected by popular antiphishing blocklists, thus demonstrating its potential to enhance current detection measures. Capitalizing on PhishLang's resource efficiency, we release the first open-source fully client-side Chromium browser extension that provides inference locally without requiring to consult an online blocklist and can be run on low-end systems with no impact on inference times. Our implementation not only outperforms prevalent (server-side) phishing tools, but is significantly more effective than the limited commercial client-side measures available. Furthermore, we study how PhishLang can be integrated with GPT-3.5 Turbo to create explainable blocklisting -- which, upon detection of a website, provides users with detailed contextual information about the features that led to a website being marked as phishing.

9/11/2024

Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance

Het Patel, Umair Rehman, Farkhund Iqbal

Phishing, a prevalent cybercrime tactic for decades, remains a significant threat in today's digital world. By leveraging clever social engineering elements and modern technology, cybercrime targets many individuals, businesses, and organizations to exploit trust and security. These cyber-attackers are often disguised in many trustworthy forms to appear as legitimate sources. By cleverly using psychological elements like urgency, fear, social proof, and other manipulative strategies, phishers can lure individuals into revealing sensitive and personalized information. Building on this pervasive issue within modern technology, this paper aims to analyze the effectiveness of 15 Large Language Models (LLMs) in detecting phishing attempts, specifically focusing on a randomized set of 419 Scam emails. The objective is to determine which LLMs can accurately detect phishing emails by analyzing a text file containing email metadata based on predefined criteria. The experiment concluded that the following models, ChatGPT 3.5, GPT-3.5-Turbo-Instruct, and ChatGPT, were the most effective in detecting phishing emails.

6/10/2024