Caffe nomacs Plugin | Computer Vision Lab

Status: closed
Supervisor: Markus Diem

nomacs – Image Lounge is a lightweight image viewer developed at CVL. It is written in C++ with Qt and OpenCV and published under the GPL. For the past two years, nomacs has had a plugin system that allows modules dedicated to specific tasks to be added.

Over the last decade, deep learning has achieved good results in object classification and is therefore well suited to image labeling tasks.

Task

An image search plugin will be implemented as part of the practical work. To this end, a trained Caffe model is applied to associate images with tags and their corresponding weights. The labeling is carried out as a batch process in which the user decides which images to label. In addition, text is collected from the filename and from Exif tags such as the comment field. Once a database associating these tags with images has been created, the user can search for specific tags (e.g. Eiffel Tower). The retrieved images are then presented in the thumbnail preview.
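The tag database described above could be sketched as a small in-memory inverted index that maps each tag to the images carrying it, together with a weight. This is only an illustration under stated assumptions: the names TagHit and TagIndex are hypothetical and belong to neither nomacs nor Caffe, weights are assumed to lie in [0, 1] (for example a classifier score, or 1.0 for text taken from the filename or Exif comment), and tag matching is assumed to be case-insensitive. A real plugin would most likely persist the index rather than keep it in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; not part of nomacs or Caffe.
struct TagHit {
    std::string imagePath;  // image associated with the tag
    float weight;           // assumed to be in [0, 1]
};

class TagIndex {
public:
    // Associate a tag (from a classifier, a filename, or an Exif field)
    // with an image and a weight.
    void add(const std::string& imagePath, const std::string& tag, float weight) {
        assert(weight >= 0.0f && weight <= 1.0f);
        mIndex[normalize(tag)].push_back({imagePath, weight});
    }

    // Return all images carrying the tag, highest weight first.
    // An unknown tag yields an empty result.
    std::vector<TagHit> search(const std::string& tag) const {
        auto it = mIndex.find(normalize(tag));
        if (it == mIndex.end())
            return {};

        std::vector<TagHit> hits = it->second;
        std::stable_sort(hits.begin(), hits.end(),
                         [](const TagHit& a, const TagHit& b) {
                             return a.weight > b.weight;
                         });
        return hits;
    }

private:
    // Lower-case the tag so that "Eiffel Tower" and "eiffel tower" match.
    static std::string normalize(std::string tag) {
        std::transform(tag.begin(), tag.end(), tag.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        return tag;
    }

    std::map<std::string, std::vector<TagHit>> mIndex;
};
```

With such an index, the batch labeling step would call add() once per image and tag, and a search for "Eiffel Tower" would return the matching image paths ordered by weight, ready to be handed to the thumbnail preview.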

Objectives

The student gains basic knowledge of deep learning and, specifically, of the Caffe framework. In addition, the work deepens programming skills in C++ with libraries such as Qt. If the work is carried out as a master's thesis, models for specific sub-tasks will be trained.

Requirements

C++ knowledge
Computer Vision knowledge (e.g. Machine Learning for Visual Computing)
Experience in coding with Qt and/or OpenCV (nice to have)