Identifying Near Duplicate Images Utilizing Convolutional Neural Networks

This paper proposes a new method for identifying near-duplicate images using a convolutional neural network and a k-nearest neighbor algorithm. Although easy for humans, identifying near-duplicates of a given image is difficult for computers. Near-duplicate images are defined as images whose visual context is essentially the same but which were acquired at different points in space or time. Traditional approaches to this problem derive invariant image features and measure the distance between them, and they have not been entirely successful. Instead of this classical approach, we propose a method that uses a pre-trained network to extract the features of an image and then a k-nearest neighbor algorithm to determine whether a near-duplicate of that image exists in the dataset. We tested this approach on the Holidays dataset and achieved much better results than previous networks and traditional approaches. We then analyze the performance of different combinations of pre-trained models and k-nearest neighbor algorithms for further optimization.
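
The retrieval step described above can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a feature vector by some pre-trained network, and it covers only the nearest-neighbor lookup. The abstract does not state the distance metric or the decision rule, so the cosine similarity, the `threshold` value, and the function names below are illustrative assumptions rather than the paper's actual choices. The toy 4-dimensional vectors stand in for real CNN features, which would be far longer.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def k_nearest(query, database, k=1):
    """Return the k (similarity, index) pairs most similar to the query."""
    q = l2_normalize(query)
    scored = []
    for idx, feat in enumerate(database):
        f = l2_normalize(feat)
        similarity = sum(a * b for a, b in zip(q, f))
        scored.append((similarity, idx))
    # Highest cosine similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def find_near_duplicate(query, database, threshold=0.9):
    """Return the index of the nearest neighbor if it is similar enough, else None.

    The threshold is a hypothetical value chosen for this example only.
    """
    neighbors = k_nearest(query, database, k=1)
    if not neighbors:
        return None
    best_similarity, best_index = neighbors[0]
    return best_index if best_similarity >= threshold else None


# Toy stand-ins for feature vectors extracted by a pre-trained network.
database = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
query_dup = [0.9, 0.1, 0.0, 0.0]  # close to the first database entry
query_new = [0.0, 0.0, 1.0, 0.0]  # not close to any entry

print(find_near_duplicate(query_dup, database))
print(find_near_duplicate(query_new, database))
```

For a dataset of realistic size, the brute-force loop would normally be replaced by a vectorized or indexed nearest-neighbor search; the pure-Python version is used here only to keep the example self-contained.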

Author: Francisco Reveriano
School: University of Houston-Clear Lake
Department: Computer Engineering
Research Advisor: Dr. Volodymyr Kindratenko
Research Advisor's Department: Supercomputing Applications
Year of Publication: 2019