Image captioning is a dynamic and crucial research area focused on automatically generating image textual descriptions. Traditional models. primarily employing an encoder-decoder framework with Convolutional Neural Networks (CNNs). often struggle to capture the complex spatial and sequential relationships inherent in visual data. https://www.nacrack.com/