Visual navigation for mobile robots using the Bag-of-Words algorithm
Thesis DisciplineComputer Science
Degree GrantorUniversity of Canterbury
Degree NameDoctor of Philosophy
Robust long-term positioning for autonomous mobile robots is essential for many applications. In many environments this task is challenging, as errors accumulate in the robot’s position estimate over time. The robot must also build a map so that these errors can be corrected when mapped regions are re-visited; this is known as Simultaneous Localisation and Mapping, or SLAM.
Successful SLAM schemes have been demonstrated which accurately map tracks of tens of kilometres, however these schemes rely on expensive sensors such as laser scanners and inertial measurement units. A more attractive, low-cost sensor is a digital camera, which captures images that can be used to recognise where the robot is, and to incrementally position the robot as it moves. SLAM using a single camera is challenging however, and many contemporary schemes suffer complete failure in dynamic or featureless environments, or during erratic camera motion. An additional problem, known as scale drift, is that cameras do not directly measure the scale of the environment, and errors in relative scale accumulate over time, introducing errors into the robot’s speed and position estimates.
Key to a successful visual SLAM system is the ability to continue operation despite these difficulties, and to recover from positioning failure when it occurs. This thesis describes the development of such a scheme, which is known as BoWSLAM. BoWSLAM enables a robot to reliably navigate and map previously unknown environments, in real-time, using only a single camera. In order to position a camera in visually challenging environments, BoWSLAM combines contemporary visual SLAM techniques with four new components. Firstly, a new Bag-of-Words (BoW) scheme is developed, which allows a robot to recognise places it has visited previously, without any prior knowledge of its environment. This BoW scheme is also used to select the best set of frames to reconstruct positions from, and to find efficient wide-baseline correspondences between many pairs of frames. Secondly, BaySAC, a new outlier- robust relative pose estimation scheme based on the popular RANSAC framework, is developed. BaySAC allows the efficient computation of multiple position hypotheses for each frame. Thirdly, a graph-based representation of these position hypotheses is proposed, which enables the selection of only reliable position estimates in the presence of gross outliers. Fourthly, as the robot explores, objects in the world are recognised and measured. These measurements enable scale drift to be corrected. BoWSLAM is demonstrated mapping a 25 minute 2.5km trajectory through a challenging and dynamic outdoor environment in real-time, and without any other sensor input; considerably further than previous single camera SLAM schemes.