CS 6501-003: Computational Visual Recognition

Instructor: Vicente Ordóñez R (vicente at cs.virginia.edu)

Instructor's Office: Rice Hall 310
Time: Mondays & Wednesdays between 10:30AM and 11:45AM.
Office Hour: Monday between 2pm and 3pm.
Teaching Assistant: Fuwen Tan, TA Office Hour: Tuesdays from 2 to 3pm at Rice 224
Link to Piazza: Piazza

How can we use computers to recognize objects, people, actions, animals, places, etc. from images? This seemingly trivial task that people perform without much effort has remained one of the core problems in Computer Vision. In this class we will study, play with, and implement algorithms for computational visual recognition using machine learning and deep learning. The class sessions will be a mix of classic lectures for the most foundational topics, and student-led paper review sessions for more advanced topics. We will also be reading recent research papers on computational visual recognition, ranging from classifying images to detecting and outlining every object in an image. In summary, after successful completion of this course you should be able to teach a robot how to distinguish dogs from cats.

More about this class

Topics: Sign up for this class if you are interested in any of the following:

Prerequisites: This course requires no previous background in computer vision or machine learning, but knowledge of either will be helpful. You need to know about matrices, calculating derivatives, and probabilities (Bayes' rule). You will also need to be a proficient programmer in a high-level language. There will be two homework assignments designed to be accessible to almost anybody fitting the previous criteria. These assignments will show you the basics of a modern general visual recognition system, and will give you the tools for implementing more advanced ones. There will also be a series of quizzes directly related to the assignments and material covered during class. We will also have a class project where you will be able to work on something beyond your assignments, with more freedom to pursue a focused problem that interests you and better matches your background. Finally, we will be using Lua/Torch in the lecture notes, so running the following tutorial before the first day of class is recommended: [html] [notebook].

Grading: Labs: 25pts, Paper presentation: 15pts, Paper summaries: 10pts, Project: 50pts.
Summary of Hands-on Activities:
1. Image Processing Lab: [html] [notebook] (Requires iTorch)
2. Linear Classifier + SGD Lab: [html] [notebook] (Requires iTorch)
3. Convolutional Networks Lab: [html] [notebook] (Requires iTorch)
4. Recurrent Neural Networks Lab: [html] [notebook] (Requires iPython + Keras + Tensorflow)

Syllabus

Date     Topic
Wed, August 24th Lecture: Introduction [slides]
  • Welcome
  • Why is Visual Recognition hard?
  • Challenges in Computer Vision
  • Problems and applications
  • Lab for next class -- Image Processing: [html] [notebook]
    (Requires installing iTorch)
Instructions for running iTorch from Docker: [here] (Useful if you have trouble installing dependencies or if you are a Windows user)
Mon, August 29th Lecture: Image Processing Basics & Introduction to Computer Vision [slides]
  • Overview of the field
  • Basic Image Processing
  • Convolutions, and filtering
  • Lab for the week after next -- Linear Classifier + SGD (Sept 12th): [html] [notebook]
Reading: Szeliski Book, Chapter 3.
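As a warm-up for the convolution and filtering topic, here is a minimal sketch of a 2D convolution in plain Python. It is not taken from the course labs (which use Lua/Torch); the function name `convolve2d`, the box-blur kernel, and the toy image are all assumptions made for illustration. It computes only the "valid" region, so the output is smaller than the input.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution of a list-of-lists image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flip the kernel along both axes; without this flip the operation
    # would be cross-correlation rather than convolution.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for i in range(ih - kh + 1):
        out_row = []
        for j in range(iw - kw + 1):
            acc = 0.0
            # Weighted sum of the image window under the flipped kernel.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * flipped[di][dj]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical example: a 3x3 box blur applied to a bright square.
box_blur = [[1.0 / 9.0] * 3 for _ in range(3)]
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0],
]
blurred = convolve2d(image, box_blur)  # 3x3 output, brightest at the center
```

The nested loops are slow but make the definition explicit; a real image library would vectorize this and offer padding options.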
Wed, August 31st Lecture: Machine Learning for Vision I [slides]
  • Discussion: Supervised vs Unsupervised Learning
  • Supervised learning: k-Nearest neighbors
  • Pedestrian Detection using Histogram of Oriented Gradients
  • Unsupervised learning: Clustering
For more on train/validation/test splitting, watch Andrew Ng's lecture on model selection [video]
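The k-nearest neighbors idea from this lecture can be sketched in a few lines of plain Python. This is an illustrative sketch, not the lecture's code: `knn_predict`, the 2D toy features, and the "cat"/"dog" labels are hypothetical, and real image features would have many more dimensions.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbors."""
    # Squared Euclidean distance preserves the neighbor ordering,
    # so the square root is not needed.
    dists = [
        (sum((a - b) ** 2 for a, b in zip(point, query)), label)
        for point, label in zip(train_points, train_labels)
    ]
    dists.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest training points.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
prediction = knn_predict(train_points, train_labels, (0.15, 0.15), k=3)
```

There is no training step: all the work happens at query time, which is why the choice of k is usually made on a validation split.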
Mon, September 5th No classes this day -- Work on Linear Softmax + SGD Lab.
Wed, September 7th Lecture: Machine Learning for Vision II [slides]
  • Supervised learning: Linear models
  • Gradient Descent
  • Stochastic Gradient Descent
  • Regularization
Useful related material: [Linear regression (Nando de Freitas/Oxford)] [Softmax vs Max-margin Loss (Andrej Karpathy/Stanford)]
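To make the gradient descent, stochastic gradient descent, and regularization topics concrete, here is a minimal sketch of SGD on a one-dimensional linear model with a squared loss and an optional L2 penalty. It is an assumption-laden illustration rather than the lab's method: the lab uses a softmax classifier in Lua/Torch, while `sgd_linear_regression`, its default hyperparameters, and the noise-free toy data below are made up for this example.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, l2=0.0, epochs=500, seed=0):
    """Fit y ~ w * x + b with per-sample SGD on 0.5 * err**2 + 0.5 * l2 * w**2."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Visit the samples in a fresh random order every epoch.
        rng.shuffle(indices)
        for i in indices:
            err = (w * xs[i] + b) - ys[i]
            # Gradients of the per-sample loss; the L2 term penalizes only w.
            grad_w = err * xs[i] + l2 * w
            grad_b = err
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_regression(xs, ys)  # w should approach 2 and b should approach 1
```

Each update uses the gradient from a single sample instead of the whole dataset, which is the difference between SGD and plain gradient descent. Setting `l2` above zero pulls `w` toward zero at the cost of a worse fit on the training data.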
Mon, September 12th Lecture: Deep Learning for Vision I [slides]
  • The perceptron model
  • Neural Networks
  • Convolutional Networks
  • Lab for next week -- Convolutional Networks (Sept 21st): [html] [notebook]
Wed, September 14th Paper Review: Image Classification
  • ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012. [pdf]
    by Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton.
  • Very Deep Convolutional Networks for Large-Scale Visual Recognition, ICLR 2015. [arxiv] by Karen Simonyan and Andrew Zisserman.
Presenters:
Vicente [slides]

Vicente [slides]
Submit a 1 or 2 page project proposal in PDF (Deadline September 19th).
Mon, September 19th Student Paper Review: Scene Recognition and Sense of Place.
  • Learning High-level Judgments of Urban Perception, ECCV 2014. [pdf]
    by Vicente Ordonez and Tamara L. Berg.
  • Learning Deep Features for Scene Recognition using the Places Database, NIPS 2014. [pdf] by Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva
Presenters:
Leandra, Ian [slides]

Luke, Vijay [slides]
Wed, September 21st Student Paper Review: Object Detection.
  • Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014. [pdf]
    by Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik.
  • You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016. [arxiv]
    by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi.
Presenters:
Tom, Nick [slides]

Xuwang [slides]
Meet with the instructor this week to define your Final Project Proposal
(Monday 26th office hours will be from 2pm to 5pm for this purpose)
Mon, September 26th Student Paper Review: Image Parsing and Semantic Segmentation
  • Fully Convolutional Networks for Semantic Segmentation, CVPR 2015. [arxiv]
    by Jon Long*, Evan Shelhamer*, Trevor Darrell.
  • Instance-sensitive Fully Convolutional Networks, ECCV 2016. [arxiv]
    by Jifeng Dai, Kaiming He, Yi Li, Shaoqing Ren, and Jian Sun.
Presenters:
Haina [slides]

Hao
Wed, September 28th Lecture: Adversarial Examples and Generative Adversarial Networks.
  • Explaining and Harnessing Adversarial Examples, [arxiv].
    by Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy.
  • Generative adversarial nets, NIPS 2014. [arxiv]
    by Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio.
Vicente [slides]

Torch code to train a generative adversarial network [link], and to obtain Deep Dream images [link]
Mon, October 3rd Fall Break - no classes this day.
Wed, October 5th Student Paper Review: Recurrent Neural Networks
  • Show and Tell: A Neural Image Caption Generator, CVPR 2015. [arxiv]
    by Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan.
  • Exploring Models and Data for Image Question Answering, NIPS 2015. [arxiv]
    by Mengye Ren, Ryan Kiros & Richard Zemel.
Presenters:
Tianlu, Yin [slides]

Weiqiang [slides]
Mon, October 10th Student Paper Review: Action Recognition from Videos.
  • Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding. ECCV 2016. [arxiv] by Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta
  • The Open World of Micro-Videos. arXiv 2016. [pdf] by Phuc Xuan Nguyen, Gregory Rogez, Charless Fowlkes, Deva Ramanan
Presenters:
Luyao, Jonathan [slides]

Yujia, Sharon [slides]
Wed, October 12th Student Paper Review: Topic of your choice
  • Deep Mask: Learning to Segment Object Candidates, arXiv 2015. [arxiv] by Pedro O. Pinheiro, Ronan Collobert, Piotr Dollar
  • DeepBox: Learning Objectness with Convolutional Networks, ICCV 2015. [arxiv] by Weicheng Kuo, Bharath Hariharan, Jitendra Malik
Presenters:
Siva [slides]

Divya [slides]
Mon, October 17th Student Paper Review: Topic of your choice
  • Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks, arXiv 2015. [arxiv] by Quanzeng You, Jiebo Luo, Hailin Jin, and Jianchao Yang
  • Learning to Compare Image Patches via Convolutional Neural Networks, CVPR 2015. [pdf] by Sergey Zagoruyko and Nikos Komodakis
  • Lab for next week -- Recurrent Neural Networks (October 24th): [html] [notebook]
    (Requires iPython + Keras + Tensorflow)
Presenters:
Abhimanyu [slides]

Gautam [slides]
Wed, October 19th Guest Lecture
  • Fuwen's Presentation on Tensorflow.
  • Vicente's Presentation on Torch's nngraph, Keras.
Mon, October 24th Student Paper Review: Unsupervised learning of Deep Neural Networks.
  • Unsupervised Visual Representation Learning by Context Prediction, ICCV 2015. [arxiv] by Carl Doersch, Abhinav Gupta, Alexei A. Efros
  • Learning Visual Groups From Co-occurrences in Space and Time, ICLR 2016. [arxiv] by Phillip Isola, Daniel Zoran, Dilip Krishnan, Edward H. Adelson
Presenters:
Colin, Joel [slides]

Di Fang, Jeff [slides]
Wed, October 26th Student Paper Review: Multi-label Image Classification.
  • Deep Convolutional Ranking for Multilabel Image Annotation, ICLR 2014. [arxiv] by Yunchao Gong, Yangqing Jia, Thomas Leung, Alexander Toshev, Sergey Ioffe
  • CNN-RNN: A Unified Framework for Multi-label Image Classification, CVPR 2016. [arxiv] by Jiang Wang, Yi Yang, Junhua Mao, Zhiheng Huang, Chang Huang, Wei Xu
Presenters:
Jiayu, Xiao [slides]

Jiankun, Xueying [slides]
Submit a 3-page progress report of your Project
(Monday October 31st)
Mon, October 31st Lecture: Revisiting the Principles of Categorization [slides].
  • The Principles of Categorization
  • Pictures and Names: Making the Connection
  • Predicting Entry-level Categories
[Rosch,Lloyd,1978]
[Jolicoeur,Gluck,Kosslyn,1984]
Wed, November 2nd Student Paper Review: Pushing the Limits in Visual Recognition
  • Deep Residual Learning for Image Recognition, CVPR 2016. [arxiv] by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
  • Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. arXiv. [arxiv] by Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi
Presenters:
Jingyun [slides]

Tianyi, Haoran [slides]
Mon, November 7th Student Paper Review: Vision and Location.
  • Learning Deep Representations for Ground-to-Aerial Geolocalization, CVPR 2015. [pdf] by Tsung-Yi Lin, Yin Cui, Serge Belongie, James Hays
  • Ontological Supervision for Fine Grained Classification of Street View Storefronts, CVPR 2015. [pdf] by Yair Movshovitz-Attias, Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, Liron Yatziv
Presenters:
Yutong, Zheyuan [slides]

Minghua [slides]
Wed, November 9th Brainstorming Session for New Tasks & Methods in Visual Recognition
Mon, November 14th No classes this day -- Please use this time to work on your projects.
Wed, November 16th Student Paper Review: Efficient Deep Models.
  • XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, ECCV 2016. [arxiv] by Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi
  • EIE: Efficient Inference Engine on Compressed Deep Neural Network, ISCA 2016. [arxiv] by Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark Horowitz, Bill Dally
Presenters:
Naveen [slides]

Sihang [slides]
Mon, November 21st Last Lecture: Course Recap [slides].
  • Recap on Recurrent Neural Networks
  • Course Overview and recap
  • Scholarships and Ethics in AI
Wed, November 23rd Day Before Thanksgiving - no classes this day.
Mon, November 28th Project Presentations
Wed, November 30th Project Presentations
Mon, December 5th Project Presentations
Project Deadline including Report or Technical Blog post (Mon, December 5th)

Other similar courses that might be of interest:

Department of Computer Science, University of Virginia, 2016.