In AI, the idea of expert systems - programs which use predefined rules to sort through facts and arrive at a conclusion - gained popularity in the seventies but were later found to be insufficient to tasks involving large quantities of data, as in detecting and tracking objects in video containing millions of pixels per second. In the past ten years, great advances have been made using statistics-based approaches to discriminate and detect such objects, but only recently have rule-based systems been making a comeback in the form of graphical models. Graphical models are a technique for propagating inferences between nodes in a graph, each of which represents a concept to be reasoned about. In this case, we would like to infer the number and 3D position of pedestrians in a video. We use the available data to infer these parameters for each frame in the video: the number and location of each pedestrian from the previous frame, and the output of statistics-based image-processing algorithm run on the current frame. In the near future, I expect to see the scope of these inferences expanding, taking into account and tracking a holistic model of the scene. For example, we should be able to teach the computer that a pedestrian is much more likely to be found walking on a sidewalk than walking 10 feet up in thin air, but we have to model and track the location and appearance of the sidewalk in order to do that. This research will appear in an upcoming issue of PAMI: A. Ess, B. Leibe, K. Schindler, L. Van Gool, "Robust Multi-Person Tracking from a Mobile Platform", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. to appear, 2009.