Wikipedia Text Generation
Abstract
Wikipedia, one of the most popular online knowledge repositories, faces the persistent challenge of content that is inconsistent, incomplete, and out of date. This study examines the automatic generation of Wikipedia-style articles using web retrieval and abstractive summarisation to address these challenges. Building on recent advances in natural language generation (NLG) and retrieval-augmented generation (RAG), the work combines information retrieval, content structuring, and factual grounding. A modular architecture is used to develop an automated system that produces consistent, reliable, and up-to-date encyclopaedia entries. The method employs a multi-stage pipeline: topic-to-outline generation, targeted web search using the Serper API, semantic filtering using sentence embeddings, abstractive summarisation using transformer-based models such as PEGASUS and BART, and quality evaluation using the ROUGE-1 metric. Experiments using "Data Structures" as the test topic yield ROUGE-1 F1 scores between 0.26 and 0.44, with higher precision suggesting factual accuracy and lower recall reflecting summarisation loss. The findings identify high outline coverage, modularity, scalability, and semantic accuracy as key strengths, while limited recall, content drift, and reliance on retrieval quality remain drawbacks. The study demonstrates that automated production of Wikipedia-style articles is feasible and recommends future work on adaptive user feedback, classifier-based section mapping, long-context summarisation, and content quality assessment. Educational and academic applications stand to benefit from structured, accessible summaries and scalable knowledge synthesis systems.
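The retrieval, filtering, and summarisation stages described above can be illustrated with a minimal sketch. The Serper endpoint is its public search API, but the model choices (all-MiniLM-L6-v2 for sentence embeddings, facebook/bart-large-cnn for summarisation), the similarity threshold, and the SERPER_API_KEY environment variable are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of the described pipeline: web retrieval via the Serper
# API, semantic filtering with sentence embeddings, abstractive summarisation
# with BART. Model names and the threshold are assumptions for illustration.
import os
import requests
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

def search_web(query: str, num_results: int = 5) -> list[str]:
    """Query the Serper API and return organic-result snippets."""
    resp = requests.post(
        "https://google.serper.dev/search",
        headers={"X-API-KEY": os.environ["SERPER_API_KEY"]},
        json={"q": query, "num": num_results},
        timeout=30,
    )
    resp.raise_for_status()
    return [item.get("snippet", "") for item in resp.json().get("organic", [])]

def filter_relevant(topic: str, snippets: list[str], threshold: float = 0.4) -> list[str]:
    """Keep snippets whose embedding is semantically close to the topic."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    topic_emb = encoder.encode(topic, convert_to_tensor=True)
    snippet_embs = encoder.encode(snippets, convert_to_tensor=True)
    scores = util.cos_sim(topic_emb, snippet_embs)[0]
    return [s for s, score in zip(snippets, scores) if float(score) >= threshold]

def summarise(texts: list[str]) -> str:
    """Abstractively summarise the filtered snippets with a BART checkpoint."""
    summariser = pipeline("summarization", model="facebook/bart-large-cnn")
    joined = " ".join(texts)
    return summariser(joined, max_length=180, min_length=60, do_sample=False)[0]["summary_text"]

if __name__ == "__main__":
    topic = "Data Structures"  # the paper's test topic
    snippets = search_web(f"{topic} overview")
    relevant = filter_relevant(topic, snippets)
    print(summarise(relevant))
```

The resulting section text could then be scored against a reference Wikipedia article with a ROUGE-1 implementation such as the rouge-score package, mirroring the evaluation step described in the abstract.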