Human observers make a variety of perceptual inferences about pictures of places based on prior knowledge and experience. In this paper we apply computational vision techniques to the task of predicting the perceptual characteristics of places by leveraging recent work on visual features along with a geo-tagged dataset of images associated with crowd-sourced urban perception judgments for wealth, uniqueness, and safety. We perform extensive evaluations of our models, training and testing on images of the same city as well as training and testing on images of different cities to demonstrate generalizability. In addition, we collect a new densely sampled dataset of streetview images for 4 cities and explore joint models to collectively predict perceptual judgments at city scale. Finally, we show that our predictions correlate well with ground truth statistics of wealth and crime.


Vicente Ordonez, Tamara L. Berg.  Learning High-level Judgments of Urban Perception.
European Conference on Computer Vision (ECCV) 2014.  Zurich, Switzerland.  September 2014.

  title     = {Learning High-level Judgments of Urban Perception},
  author    = {Vicente Ordonez and Tamara L. Berg},
  year      = {2014},
  booktitle = {ECCV}


Utilities for downloading Street View images only (417 KB) [Download]
Code to run the classification and regression experiments (337 MB) [Download]
Code including the additional experiment on collective prediction [coming soon]