MODIFIED TF-IDF ALGORITHM FOR SEO OPTIMIZATION BASED ON CONTEXTUAL ANALYSIS OF THE BERT MODEL AND GROQ API

Authors

Keywords:

SEO-optimization of web content, statistical methods, neural network models, TF-IDF, BERT filtering, lemmatization, headline generation

Abstract

The article presents an integrated approach to automated SEO optimization of web content that combines classical statistical methods with modern neural network models. The aim of the study was to improve the accuracy of keyword selection and reduce the amount of “noise” in the analysis without significant human intervention. At the initial stage, the sitemap was analyzed, auxiliary documents were excluded from it, and a large corpus of texts was formed. Lemmatization and stop-word removal were then applied, which significantly reduced the variety of word forms and prepared the data for deeper analysis. The main tool for selecting terms was the classical TF-IDF (term frequency-inverse document frequency) ranking, which successfully identifies the most frequently used words but does not take into account the semantic relations within multi-word phrases. To increase semantic relevance, a BERT-based neural model with its own filtering of low-value constructions was integrated. Additionally, a sliding window method was used to form contextual triplets with the keyword in the center, which helps to identify truly meaningful phrases. At the final stage, a generative text model automatically produced and corrected titles, meta tags, paragraphs, and image descriptions in accordance with current recommendations. The results demonstrate a noticeable increase in the accuracy of keyword selection, a reduction in the level of “noise”, and the alignment of text characteristics with practical standards, which ensures a predictable increase in click-through rates. The automated audit also revealed an imbalance in the use of headings of different levels, which affects indexing; based on this finding, recommendations were formulated to optimize the page structure. In general, the combination of statistical algorithms and neural network tools creates an effective platform for the development of autonomous SEO assistants that can adapt to dynamic changes in the digital environment without constant human control.
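
The statistical part of the pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the function names, the sample documents, and the plain tf * log(N / df) weighting are all assumptions made for illustration. Lemmatization, BERT-based filtering, and the text-generation step are omitted, and the triplets are formed after stop-word removal, which may differ from the procedure in the article.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a full list and a lemmatizer.
STOP_WORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "is", "are", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: score} dict per document, scored as tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return scores


def context_triplets(tokens, keyword):
    """Slide a 3-token window over tokens and keep windows centered on keyword."""
    return [
        tuple(tokens[i - 1:i + 2])
        for i in range(1, len(tokens) - 1)
        if tokens[i] == keyword
    ]


# Illustrative sample texts (assumed, not taken from the studied corpus).
docs = [
    "Solar panels for autonomous solar power stations",
    "Installation of solar inverters and batteries",
    "Contact the sales team",
]
scores = tf_idf(docs)
triplets = context_triplets(tokenize(docs[0]), "solar")
```

Here `scores[0]` holds the TF-IDF weights for the first sample text, and `triplets` holds the three-word contexts in which "solar" appears as the middle token. A neural filter of the kind the abstract mentions would then score such triplets for semantic relevance.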

Published

2025-05-29

How to Cite

Сілін, І. Д., Потапова, К. Р., & Наливайчук, М. В. (2025). MODIFIED TF-IDF ALGORITHM FOR SEO OPTIMIZATION BASED ON CONTEXTUAL ANALYSIS OF THE BERT MODEL AND GROQ API. Taurida Scientific Herald. Series: Technical Sciences, (2), 180-192. Retrieved from http://journals.ksauniv.ks.ua/index.php/tech/article/view/883

Issue

Section

COMPUTER SCIENCE AND INFORMATION TECHNOLOGY