This work was motivated by the goal of building a navigation system that can guide people or robots through large, complex urban environments, even where the Global Positioning System (GPS) cannot provide navigational information. Such environments include indoor spaces and crowded city areas with no line of sight to the GPS satellites. Because installing active badges or beacon systems involves substantial effort and expense, we have developed a system that navigates solely on the basis of naturally occurring landmarks. The only sensory input is a panoramic camera system that provides omnidirectional images of the environment. During the training stage, the system is led around the environment while recording images at constant time intervals. Offline, these images are automatically organized into a world model. Unlike traditional approaches, we do not build a Euclidean metric map. Instead, the world model is a graph reflecting the topological structure of the environment: for indoor environments, for example, rooms are nodes and corridors are edges. Images are compared using both global color measures and matching of specially developed local features. These measures are designed to be invariant both to image distortions caused by viewpoint changes and to illumination changes. This leads to a system that can recognize a place even if its location is not exactly the location from which the reference image was taken, and even if the illumination is substantially different. Using this world model, localization is done by comparing a new query image with the images in the model. A Bayesian framework makes it possible to track the system's position in quasi real time. Once the present location is known, a path to a target location can be planned easily using the topological map.
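
The localization and path-planning steps described above can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes a discrete Bayes filter over the graph nodes, with a hypothetical motion model in which the system stays in place with probability `stay_prob` and otherwise moves to a uniformly chosen neighbor, and it uses breadth-first search for planning. The function names, the example graph, and the likelihood values (which stand in for image-comparison scores) are all assumptions made for illustration.

```python
from collections import deque


def bayes_update(belief, graph, likelihood, stay_prob=0.5):
    """One predict/update step of a discrete Bayes filter over graph nodes.

    belief:     dict node -> prior probability of being at that node
    graph:      dict node -> list of neighboring nodes (topological map)
    likelihood: dict node -> score of the query image against that node
                (assumed to come from some image-comparison measure)
    stay_prob:  assumed probability of remaining at the same node
    """
    # Prediction: spread each node's probability over itself and its neighbors.
    predicted = {node: 0.0 for node in graph}
    for node, p in belief.items():
        neighbors = graph[node]
        if not neighbors:
            predicted[node] += p
            continue
        predicted[node] += stay_prob * p
        share = (1.0 - stay_prob) * p / len(neighbors)
        for nb in neighbors:
            predicted[nb] += share

    # Correction: weight the prediction by the image-match likelihood.
    posterior = {n: predicted[n] * likelihood.get(n, 0.0) for n in graph}
    total = sum(posterior.values())
    if total == 0.0:
        # No node explains the observation; fall back to the prediction.
        return predicted
    return {n: v / total for n, v in posterior.items()}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nb in graph[node]:
            if nb not in parents:
                parents[nb] = node
                queue.append(nb)
    return None


# Hypothetical map: four rooms (nodes) connected in a chain by corridors (edges).
rooms = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Start with no knowledge of the position.
belief = {room: 1.0 / len(rooms) for room in rooms}

# Made-up image-match scores: the query image looks most like room C.
likelihood = {"A": 0.1, "B": 0.1, "C": 0.7, "D": 0.1}

belief = bayes_update(belief, rooms, likelihood)
current = max(belief, key=belief.get)
route = shortest_path(rooms, current, "D")
```

Repeating `bayes_update` for each new query image tracks the position over time. The belief concentrates on nodes that are consistent with both the graph connectivity and the image evidence, so a single poor match is less likely to cause a jump to a distant node.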