Phish or legit? quantifying web-based threat signals through predictive analytics and feature attribution

Daniel Duah; Bismark Kofi Owusu Sarfo; Collins Boakye; Dina Duku

Authors: Daniel Duah, Bismark Kofi Owusu Sarfo, Collins Boakye, Dina Duku
Published in: International Journal of Professional Business Review: Int. J. Prof. Bus. Rev., ISSN 2525-3654, ISSN-e 2525-3654, Vol. 10, No. 5, 2025 (Issue dedicated to: Continuous publication; e05512)
Language: English
DOI: 10.26668/businessreview/2025.v10i5.5472
Abstract
- Objective: This study investigates web-based threat signals using predictive analytics and feature attribution to determine whether a webpage is phishing or legitimate.
  
  Theoretical Framework: The research is grounded in Protection Motivation Theory (PMT), which offers a behavioral lens to interpret phishing indicators. PMT connects web features to users’ cognitive threat and coping appraisals, providing a theoretical rationale for selecting and organizing features.
  
  Method: A logistic regression model, regularized with L1 (Lasso), was developed for its interpretability and ability to handle feature sparsity and convergence issues. Using a dataset of 11,055 labeled websites, the model incorporates three core feature sets: structural (e.g., IP-based URLs, SSL status), behavioral (e.g., redirection, form handler anomalies), and domain metadata (e.g., traffic rank, Google indexing).
  
  Results and Discussion: The results reject the null hypothesis that website-level features are non-predictive, confirming that structural, behavioral, and metadata-based signals significantly distinguish phishing from legitimate sites. This thematic decomposition supports both the conceptual framework and the empirical model design.
  
  Research Implications: The findings offer actionable insights for cybersecurity professionals, especially those in regulated industries. The model enhances detection capability while maintaining transparency, crucial for compliance and risk management.
  
  Originality/Value: This study contributes to the literature by integrating PMT into a predictive modeling framework for phishing detection, an approach that bridges behavioral theory and machine learning. Its originality lies in aligning cognitive appraisal theory with interpretable statistical methods. The results are relevant to cybersecurity practice, offering scalable, transparent tools that support real-time decision-making and inform strategic defenses in high-risk sectors.
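
The Method paragraph describes an L1-regularized (Lasso) logistic regression over structural, behavioral, and domain-metadata features. The following is a minimal sketch of that idea, not the authors' implementation: it uses synthetic data rather than the 11,055-site dataset, the feature names (`ip_in_url`, `no_ssl`, and so on) and the "true" weights are hypothetical, and the solver (proximal gradient descent with soft-thresholding) is an assumption chosen so the example needs only the Python standard library.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.02, lr=0.5, epochs=300):
    # Minimizes mean log-loss + lam * ||w||_1 by proximal gradient descent.
    # The intercept b is not penalized.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# Hypothetical binary indicators, grouped as in the abstract:
# structural, behavioral, domain metadata, plus one pure-noise column.
FEATURES = ["ip_in_url", "no_ssl", "redirects",
            "abnormal_form_handler", "low_traffic_rank", "noise"]
TRUE_W = [2.0, 1.5, 1.0, 1.5, 1.0, 0.0]  # assumed, for data generation only
TRUE_B = -3.0

random.seed(0)
X, y = [], []
for _ in range(600):
    row = [1 if random.random() < 0.5 else 0 for _ in FEATURES]
    p = sigmoid(TRUE_B + sum(w * x for w, x in zip(TRUE_W, row)))
    X.append(row)
    y.append(1 if random.random() < p else 0)  # 1 = phishing, 0 = legitimate

w, bias = fit_l1_logistic(X, y)
weights = dict(zip(FEATURES, w))

# Training accuracy at a 0.5 probability threshold.
preds = [1 if sigmoid(bias + sum(wj * xj for wj, xj in zip(w, xi))) >= 0.5 else 0
         for xi in X]
accuracy = sum(int(p == t) for p, t in zip(preds, y)) / len(y)

# Feature attribution: each coefficient as an odds ratio exp(w_j).
for name, coef in weights.items():
    print(f"{name:24s} coef={coef:+.3f}  odds_ratio={math.exp(coef):.2f}")
print(f"training accuracy: {accuracy:.3f}")
```

In this sketch, a coefficient driven to (or near) zero by the L1 penalty marks a feature the model treats as uninformative, while `exp(coef)` reads as the multiplicative change in the odds of the phishing class when that indicator is present. A production analysis would more likely use an established library and held-out evaluation data.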