Learning strategies to select point cloud descriptors for large-scale 3-D object classification
Machine learning is a broad field of computer science concerned with algorithms that learn from data. Its goal is not only the acquisition of knowledge; there is also a growing need for self-learning, intelligent systems that are able to act under dynamic conditions in their environment. One common way to structure the field is the division into supervised, unsupervised, and reinforcement learning methods. The last of these is especially suited to creating autonomous agents that learn while interacting with their environment. A reinforcement learning problem can be outlined as a long-term interaction between a learning agent and a dynamic environment whose underlying model is not visible to the agent, so that the agent learns only from its observations of the environment. On the one hand, reinforcement learning serves as a theoretical tool for studying the principles of agents learning to act. On the other hand, there are many examples of practical implementations, e.g., autonomous robots or industrial systems that improve themselves with experience. Another field of application is combinatorial search problems. While most practical cases of such search problems in the context of reinforcement learning are restricted to games, one motivation of this thesis is to show that reinforcement learning can also be used to solve such search problems at the frontier of computer vision research. This is done with a special focus on an important part of computer vision that currently faces unsolved challenges, for example because of the large amounts of data that accumulate in practical applications: the classification of 3-D point clouds. A wide range of applications, such as scene understanding, navigation, or robotics tasks like grasping or scene manipulation, would benefit from a reliable and efficient classification of 3-D point clouds.
Most current 3-D classification systems use so-called 3-D feature descriptors to compute lower-dimensional descriptions of the entire 3-D point cloud or of local parts of it. Some of these descriptors are fast but inaccurate, while others have a large number of parameters or are expensive to compute and compare. No 3-D feature descriptor works satisfactorily in all situations, which suggests combining different feature descriptors. This thesis investigates the benefit of successively applying state-of-the-art 3-D feature descriptors to the classification of 3-D objects. This can be regarded as a combinatorial search problem arising from the question of which feature descriptors should be used to classify an object, and in which order. To translate this into a reinforcement learning problem and to overcome the known limitations of reinforcement learning, the thesis addresses fundamental obstacles such as a possibly large state space, and it shows how on-line learning can lead to an adaptive system that is able, for example, to integrate new feature descriptors into its learned classification strategy. Finally, it is shown that the proposed approach of combining several algorithms leads to better classification results than the application of any single algorithm alone. Thus, through the example of 3-D point cloud classification, the thesis shows how a self-learned combination of different approaches to the same problem can improve the final result. Besides this more theoretical insight, the main contribution of this work lies in its practical value, namely in the demonstration that the proposed procedure of adaptive 3-D object classification via reinforcement learning is a viable approach to a current challenge in computer vision.
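As a rough illustration of how the question "which descriptors, and in which order" can be posed as a reinforcement learning problem, the following minimal sketch uses tabular Q-learning on a toy environment. Everything in it is an assumption made for illustration: the descriptor names `A`, `B`, `C`, their costs, the made-up accuracy function, and the choice of tabular Q-learning itself. The abstract does not specify the state representation, reward, or learning algorithm actually used in the thesis.

```python
import random

# Hypothetical descriptors with made-up per-use costs (illustrative only).
COSTS = {"A": 0.05, "B": 0.10, "C": 0.30}
STOP = "classify"  # hypothetical action: stop and classify the object


def toy_accuracy(applied):
    """Made-up classification score for a tuple of applied descriptors."""
    gain = {"A": 0.4, "B": 0.3, "C": 0.2}
    return min(1.0, sum(gain[d] for d in applied))


def actions(state):
    """Descriptors not yet applied, plus the option to stop and classify."""
    return [d for d in COSTS if d not in state] + [STOP]


def step(state, action):
    """Return (next_state, reward, done) for the toy environment."""
    if action == STOP:
        return state, toy_accuracy(state), True
    # The state is the ordered tuple of descriptors applied so far.
    return state + (action,), -COSTS[action], False


def train(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state, done = (), False
        while not done:
            acts = actions(state)
            if rng.random() < epsilon:
                action = rng.choice(acts)
            else:
                action = max(acts, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            target = reward
            if not done:
                target += gamma * max(q.get((nxt, a), 0.0) for a in actions(nxt))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)
            state = nxt
    return q


def greedy_strategy(q):
    """Follow the learned Q-values greedily until the stop action is chosen."""
    state = ()
    while True:
        action = max(actions(state), key=lambda a: q.get((state, a), 0.0))
        if action == STOP:
            return state
        state = state + (action,)


if __name__ == "__main__":
    print(greedy_strategy(train()))
```

Because the state here is the ordered tuple of descriptors already applied, the number of states grows factorially with the number of descriptors. This is one way to see why the large state space mentioned above is an obstacle for a plain tabular approach.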
Rights
Use and reproduction:
All rights reserved