Abstract
Over the past years, convolutional neural networks (CNNs) have achieved remarkable success in deep learning. The performance of CNN-based models has driven major advances in a wide range of tasks, from computer vision to natural language processing. However, the theoretical calculations behind the convolution operation are rarely presented in detail. This study aims to provide a better understanding of the convolution operation by diving into the theory of how the backpropagation algorithm works for CNNs. To explain the training of CNNs clearly, the convolution operation on images is described in detail and backpropagation in CNNs is highlighted. In addition, the Labeled Faces in the Wild (LFW) dataset, which is frequently used in face recognition applications, is employed to visualize what CNNs learn. The intermediate activations of a CNN trained on the LFW dataset are visualized to gain insight into how CNNs perceive the world. Thus, the feature maps are interpreted visually as well, alongside the training process.