Inference of vine structures from images.
Inferring and extracting information from images is one of the most important tasks that computer vision methods face. Among the many techniques that exist, Markov Random Fields have been found to be a powerful mathematical tool that is both flexible, in that it can be adapted easily to different applications, and able to model, at different levels of complexity, the uncertainty found in many vision problems.
In this PhD I propose and comparatively evaluate a novel Hidden Markov Model for modeling and extracting vine structure from images, that is, the hierarchical connectivity of vine canes. Extracting canes from vine images is a challenging problem because such images contain many occluded regions and overlapping canes. Previous research on modeling trees and plants in images either relies on manual intervention to resolve these issues or makes assumptions about the input images that are not valid in my setup. My proposed model directly tackles the inference of connectivity between visible parts of canes; it is fully automatic and can be adapted to structures other than vines. Here, connectivity inference can be done using any MAP inference method. Therefore, I have selected four methods for comparison: Iterated Conditional Modes, Simulated Annealing, a heuristic random search based on Gibbs Sampling, and Belief Propagation. These four methods are commonly used in computer vision for solving tasks similar to vine structure retrieval. In this thesis, I show comparative results of my proposed methods against manually annotated ground truth data. My Markov model and MAP inference methods generalize and achieve twice the precision of prior research. They also perform comparably to heuristic methods for vine structure extraction that were developed as part of the same project to which this thesis belongs. Furthermore, I experimentally analyzed the convergence of the selected inference methods using vine images for which I know the true optimum value, and I conclude that the Gibbs-sampling-based search performs better than the other methods, which usually get stuck at local optima.
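The local-optimum behavior mentioned above can be illustrated with a minimal sketch of Iterated Conditional Modes on a toy pairwise energy. This is not the model proposed in the thesis: the two-node graph, the cost tables, and the function names (`energy`, `icm`, `brute_force_map`) are all hypothetical and chosen only so that the true optimum can be checked by exhaustive search.

```python
import itertools


def energy(labels, unary, pairwise, edges):
    """Total energy: unary cost of each node plus pairwise cost of each edge."""
    total = sum(unary[i][labels[i]] for i in range(len(labels)))
    total += sum(pairwise[labels[i]][labels[j]] for i, j in edges)
    return total


def icm(unary, pairwise, edges, init, max_sweeps=100):
    """Iterated Conditional Modes: greedily relabel one node at a time."""
    labels = list(init)
    n_labels = len(unary[0])
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(labels)):
            current = energy(labels, unary, pairwise, edges)
            for k in range(n_labels):
                candidate = labels[:i] + [k] + labels[i + 1:]
                cand_energy = energy(candidate, unary, pairwise, edges)
                # Accept only strict improvements, so the loop must terminate.
                if cand_energy < current:
                    labels, current, changed = candidate, cand_energy, True
        if not changed:
            break  # no single-node change lowers the energy
    return labels


def brute_force_map(unary, pairwise, edges):
    """Exhaustive search for the minimum-energy labeling (tiny problems only)."""
    n_nodes, n_labels = len(unary), len(unary[0])
    best = min(
        itertools.product(range(n_labels), repeat=n_nodes),
        key=lambda labels: energy(labels, unary, pairwise, edges),
    )
    return list(best)


# Toy problem (assumed values): two nodes that prefer label 0 and are
# penalized heavily when they disagree.
unary = [[0, 1], [0, 1]]
pairwise = [[0, 2], [2, 0]]
edges = [(0, 1)]

stuck = icm(unary, pairwise, edges, init=[1, 1])
optimum = brute_force_map(unary, pairwise, edges)
print(stuck, energy(stuck, unary, pairwise, edges))
print(optimum, energy(optimum, unary, pairwise, edges))
```

Started from the labeling `[1, 1]`, the greedy update cannot improve by changing a single node, because any single change creates a disagreement that costs more than it saves. It therefore stops at energy 2, while exhaustive search finds `[0, 0]` at energy 0. Stochastic methods such as Gibbs-sampling-based search or Simulated Annealing can accept such temporarily worse moves, which is one plausible reason for the difference in convergence reported here.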
Finally, the architecture of the system proposed in this thesis is similar to current methods in image parsing and scene understanding in computer vision. The results indicate that my proposed Markov model, together with the selected Maximum A Posteriori (MAP) inference framework, is a state-of-the-art computer vision approach to the problem of vine structure extraction.