ImageNet Classification with Deep Convolutional Neural Networks
by Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
In this paper, the authors present a large, deep convolutional neural network trained on the 1.2 million high-resolution images of the ImageNet LSVRC-2010 training set, which spans 1000 classes. On the test data, the network achieves top-1 and top-5 error rates of 37.5% and 17.0% respectively, considerably better than the previous state of the art.
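To make the two metrics concrete: top-k error is the fraction of test images whose correct label is not among the k classes the model scores highest. The following is a minimal pure-Python sketch of that definition. The function name `top_k_error` and the toy scores are assumptions made for illustration; they are not taken from the paper.

```python
def top_k_error(scores, labels, k):
    """Return the fraction of examples whose true label is NOT among
    the k highest-scoring classes."""
    errors = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            errors += 1
    return errors / len(labels)


# Hypothetical scores for 4 images over 4 classes (illustrative values only).
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: ranked first
    [0.5, 0.1, 0.3, 0.1],  # true class 2: ranked second
    [0.4, 0.3, 0.2, 0.1],  # true class 3: ranked last
    [0.1, 0.2, 0.3, 0.4],  # true class 3: ranked first
]
toy_labels = [1, 2, 3, 3]

top1 = top_k_error(toy_scores, toy_labels, k=1)
top2 = top_k_error(toy_scores, toy_labels, k=2)
```

Top-k error can only stay the same or fall as k grows, which is why the reported top-5 figure is lower than the top-1 figure.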
The architecture consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers ending in a 1000-way softmax. Local response normalization and overlapping pooling each lowered the error rates, while splitting the network across two GPUs made a model of this size practical to train.
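The layer stack can be sketched by tracing feature-map sizes through the network. The kernel sizes, strides, and channel counts below follow the paper as it is commonly described, not this summary. Two things are assumptions: the 227x227 input (the paper states 224x224, but 227 is the size usually used so that the first-layer arithmetic works out) and the padding values. The names `out_size`, `LAYERS`, and `trace_shapes` are hypothetical helpers for this example.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kind, output channels, kernel, stride, padding)
# Padding values are assumptions chosen so the sizes match common descriptions.
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),  # overlapping: 3x3 window, stride 2
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool5", "pool", None, 3, 2, 0),
]


def trace_shapes(input_size=227, in_channels=3):
    """Return (name, channels, height, width) after each layer."""
    size, channels = input_size, in_channels
    shapes = []
    for name, kind, out_ch, kernel, stride, pad in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if kind == "conv":
            channels = out_ch  # pooling leaves the channel count unchanged
        shapes.append((name, channels, size, size))
    return shapes


shapes = trace_shapes()
# Flattened size fed into the first fully-connected layer.
_, final_channels, final_h, final_w = shapes[-1]
fc_input = final_channels * final_h * final_w
```

Under these assumptions the last pooling layer yields a 256x6x6 volume, which is flattened before the three fully-connected layers. Each pooling layer uses a 3x3 window with stride 2; because the window is larger than the stride, neighbouring windows overlap, which is what "overlapping pooling" refers to.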
This architecture became foundational for modern computer vision tasks like object detection, segmentation, and image classification, influencing the development of deeper models like ResNet.
The network, widely known as AlexNet, represents a breakthrough in deep learning: it demonstrated that large datasets combined with GPU computing can deliver substantial improvements in visual recognition tasks.