PPGCC - Programa de Pós-graduação em Ciência da Computação
Permanent URI for this community: http://www.hml.repositorio.ufop.br/handle/123456789/596
2 results
Search Results
Item
Logo detection with second judge single shot multibox. (2017)
Coelho, Leonardo Bombonato Simões; Cámara Chávez, Guillermo; Bianchi, Andrea Gomes Campos; Cámara Chávez, Guillermo; Ferreira, Anderson Almeida; Bianchi, Andrea Gomes Campos; Schwartz, William Robson
With the increasing popularity of social networks, the way people interact has changed, and the huge amount of data generated opens doors to new strategies and marketing analyses. According to Instagram and Tumblr, an average of 95 and 35 million photos, respectively, are published every day. These pictures contain several implicit or explicit brand logos, which allows us to study how a brand can be spread more widely based on regional, temporal, and cultural criteria. Using advanced computer vision techniques for object detection and recognition, we can extract information from these images, making it possible to understand the impact and reach of a specific brand. This thesis proposes a logo detection technique based on a Convolutional Neural Network (CNN), also used as a second judge. Our proposal is built on the Single Shot MultiBox Detector (SSD). In our research, we explored several approaches to the second judge and managed to significantly reduce the number of false positives in comparison with the original approach. Our method outperformed all other approaches on two different datasets: FlickrLogos-32 and Logos-32plus. On FlickrLogos-32, we surpass the current state-of-the-art method by 5.2% in F-score, and on Logos-32plus by 3.0% in F-score.

Item
Exploring deep learning representations for biometric multimodal systems. (2019)
Luz, Eduardo José da Silva; Gomes, David Menotti; Moreira, Gladston Juliano Prates; Ferreira, Anderson Almeida; Moreira, Gladston Juliano Prates; Gomes, David Menotti; Cavalin, Paulo; Cámara Chávez, Guillermo; Santos, Thiago Oliveira dos
Biometrics is an important area of research today.
A complete biometric system comprises sensors, feature extraction, pattern matching algorithms, and decision making. Biometric systems demand high accuracy and robustness, and researchers are combining several biometric sources, two or more pattern matching algorithms, and different decision-making systems. These systems are called multimodal biometric systems and today represent the state of the art in biometrics. However, feature extraction in multimodal biometric systems remains a major challenge. Deep learning has been used by researchers in the machine learning field to automate the feature extraction process, and several advances have been achieved, as in the case of face recognition. However, deep learning based methods require a large amount of data and, with the exception of face recognition, there are no databases large enough for the other biometric modalities, which hinders the application of deep learning in multimodal methods. In this thesis, we propose a set of contributions to favor the use of deep learning in multimodal biometric systems. First, we explore data augmentation and transfer learning techniques for training deep convolutional networks on biometric databases with a limited number of labeled images. Second, we propose a simple protocol, aimed at reproducibility, for the creation and evaluation of chimeric (or synthetic) multimodal databases. This protocol allows the investigation of combinations of multiple biometric modalities, even for less common and novel modalities. Finally, we investigate the impact of fusing modalities in multimodal biometric systems in which all modalities are represented by means of deep descriptors. In this work, we show that it is possible to bring the expressive gains already obtained with the face modality to four other biometric modalities by exploring deep learning techniques.
We also show that the fusion of modalities is a promising path, even when they are represented by means of deep learning. We advance the state of the art for important databases in the literature, such as FRGC (periocular region), NICE / UBIRIS.V2 (periocular region and iris), MobBio (periocular region and face), CYBHi (off-the-person ECG), UofTDB (off-the-person ECG) and Physionet (EEG signal). Our best multimodal approach, on the chimeric database, resulted in an impressive decidability of 9.15±0.16 and perfect recognition (i.e., an EER of 0.00%±0.00) for the intra-session multimodal scenario. For the inter-session scenario, we reported a decidability of 7.91±0.19 and an EER of 0.03%±0.03, which represents a gain of more than 22% over the best inter-session unimodal case.
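
The abstract above reports decidability and equal error rate (EER) without defining them. As a rough illustration only, the sketch below shows one common way to compute both from lists of genuine and impostor similarity scores. It assumes the usual decidability index d' = |mean_g - mean_i| / sqrt((var_g + var_i) / 2) with population variances, and it approximates the EER by sweeping observed scores as thresholds. The function names and the score convention (higher means more likely genuine) are hypothetical, and the thesis may use a different formulation.

```python
import math
from statistics import mean, pvariance


def decidability(genuine, impostor):
    """Decidability index d' between genuine and impostor score lists.

    Assumption: population variance is used; the thesis may differ.
    """
    spread = math.sqrt((pvariance(genuine) + pvariance(impostor)) / 2.0)
    return abs(mean(genuine) - mean(impostor)) / spread


def equal_error_rate(genuine, impostor):
    """Approximate EER for similarity scores (higher = more likely genuine).

    Every observed score is tried as a threshold; the result is the mean of
    FAR and FRR at the threshold where the two rates are closest.
    """
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)      # genuine rejected
        far = sum(s >= t for s in impostor) / len(impostor)   # impostor accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```

Under these definitions, well-separated score distributions give a large decidability and an EER near zero, which is the pattern the reported intra-session results describe.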