Whitepaper
-
Facial Recognition Technologies in the Wild: A Call for a Federal Office
Facial Recognition Technologies: A Primer [Companion Document]
This whitepaper makes the case for a federal office in charge of regulating Face Recognition Technologies (FRTs). We argue that benchmarks are insufficient for determining the appropriateness of FRTs and that a more holistic approach is needed, one that takes into account technical, societal, and legal challenges. May 29th, 2020. https://www.ajlunited.org/federal-office-call
Preprints
-
Taming Data and Transformers for Audio Generation
arXiv:2406.19388 June 2024. [project page] [arxiv] -
Generative Visual Instruction Tuning
arXiv:2406.11262 June 2024. [github] [arxiv] -
Learning from Models and Data for Visual Grounding
arXiv:2403.13804 March 2024. [project page] [arxiv]
Publications
-
NEW! PropTest: Automatic Property Testing for Improved Visual Programming
Conf. on Empirical Methods in Natural Language Processing. EMNLP 2024 (Findings). [project page] [arxiv] -
NEW! Zero-Shot Controllable Image-to-Video Animation via Motion Decomposition
ACM Multimedia. MM 2024. Melbourne, Australia. [project page] [openreview] -
NEW! ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders
European Conference on Computer Vision. ECCV 2024. Milan, Italy. [project page] [arxiv] [github] -
NEW! Grounding Language Models for Visual Entity Recognition
European Conference on Computer Vision. ECCV 2024. Milan, Italy. [github] [arxiv] -
NEW! Improved Visual Grounding through Self-Consistent Explanations
Conf. on Computer Vision and Pattern Recognition. CVPR 2024. Seattle, WA. [project page] [arxiv] -
NEW! ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
Conf. on Computer Vision and Pattern Recognition. CVPR 2024. Seattle, WA. [project page] [arxiv] [code] [demo] -
NEW! SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data
Winter Conference on Applications of Computer Vision WACV 2024. Waikoloa, HI. [arxiv] [code] [demo] -
Variation of Gender Biases in Visual Recognition Models Before and After Finetuning
Workshop on Algorithmic Fairness through the Lens of Time at NeurIPS 2023. New Orleans, LA. [arxiv] [code] -
Going Beyond Nouns With Vision & Language Models Using Synthetic Data
International Conference on Computer Vision. ICCV 2023. Paris, France. [project page] [arxiv] [code] -
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations
Conf. on Computer Vision and Pattern Recognition. CVPR 2023. Vancouver, Canada. [arxiv] [code] [demo] -
Estimating and Maximizing Mutual Information for Knowledge Distillation
Workshop on Fair, Data Efficient and Trusted Computer Vision at CVPR 2023. Vancouver, Canada. [arxiv] -
CLIP-Lite: Information Efficient Visual Representation Learning from Textual Annotations
International Conference on Artificial Intelligence and Statistics AISTATS 2023. Valencia, Spain (Hybrid). [arxiv] -
On the Transferability of Visual Features in Generalized Zero-Shot Learning
arXiv:2211.12494 November 2022. [arxiv] [github] -
SimVQA: Exploring Simulated Environments for Visual Question Answering
Conf. on Computer Vision and Pattern Recognition CVPR 2022. [project page] [arxiv] [bibtex]
-
Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation
Language Resources and Evaluation Conference LREC 2022. [arxiv]
-
Backpropagation-Based Decoding for Multimodal Machine Translation
Frontiers in Artificial Intelligence. January 2022. [link] [bibtex] -
Evolving Image Compositions for Feature Representation Learning
British Machine Vision Conference. BMVC 2021. November 2021. [project page] [arxiv] [bibtex] -
Visual News: Benchmark and Challenges in Entity-aware Image Captioning
Empirical Methods in Natural Language Processing. EMNLP 2021. Virtual / Punta Cana, Dominican Republic. November 2021. [arxiv] [code] [bibtex] (~Oral presentation) -
Instance-level Image Retrieval using Reranking Transformers
International Conference on Computer Vision ICCV 2021. [arxiv] -
MEDIRL: Predicting the Visual Attention of Drivers via Maximum Entropy Deep Inverse Reinforcement Learning.
International Conference on Computer Vision ICCV 2021. [project page] [code] [arxiv] -
General Multi-label Image Classification with Transformers
Conference on Computer Vision and Pattern Recognition CVPR 2021. [arxiv] [bibtex] -
Black-box Explanation of Object Detectors via Saliency Maps
Conference on Computer Vision and Pattern Recognition CVPR 2021. [arxiv] (~Oral presentation) -
Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning
The 35th AAAI Conference on Artificial Intelligence. AAAI 2021. February 2021 [arxiv] [code] [bibtex] -
Enabling AI at the Edge with XNOR-Networks
Communications of the ACM. December 2020 (Vol. 63, No. 12). (~Research Highlight)
[link] [bibtex] -
Chair Segments: A Compact Benchmark for the Study of Object Segmentation
arXiv:2011.14027 Nov 2020. [project page] [code] [arxiv] [bibtex] -
Using Visual Feature Space as a Pivot Across Languages
Findings of the Association for Computational Linguistics: EMNLP 2020. [pdf] [project page] [code] [bibtex] -
Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
Association for Computational Linguistics. ACL 2020. Seattle, Washington. July 2020. [arxiv] -
Generative-discriminative Feature Representations for Open-set Recognition
Conference on Computer Vision and Pattern Recognition CVPR 2020. [pdf] [bibtex] -
Testing DNN Image Classifiers for Confusion & Bias Errors
International Conference on Software Engineering. ICSE 2020. Seoul, South Korea, October 2020. [arxiv] [bibtex] -
Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries
Conf. on Neural Information Processing Systems. NeurIPS 2019. Vancouver, Canada. December 2019. [arxiv] [code] [bibtex] -
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations
International Conference on Computer Vision. ICCV 2019. Seoul, South Korea. October 2019. [arxiv] [project page] [code] [demo] [bibtex] -
Text2Scene: Generating Compositional Scenes from Textual Descriptions
Intl. Conference on Computer Vision and Pattern Recognition. CVPR 2019. Long Beach, California. June 2019. [arxiv] [code] [demo] [bibtex]
(~Oral presentation + Best Paper Finalist -- top 1% of submissions)
- IBM Research Blog Coverage
- NVIDIA News Coverage -
Moviescope: Large-scale Analysis of Movies using Multiple Modalities
arXiv:1908.03180. August 2019. [arxiv] [project page] [bibtex]
- TechXplore News Coverage -
Gender Bias in Contextualized Word Embeddings
North American Chapter of the Association for Computational Linguistics. NAACL 2019. short. Minneapolis, Minnesota. June 2019. [arxiv] [bibtex] (~Oral presentation) -
Chat-crowd: A Dialog-based Platform for Visual Layout Composition
North American Chapter of the Association for Computational Linguistics. NAACL 2019. System Demonstrations. Minneapolis, Minnesota. June 2019. [arxiv] [project page] [code] -
Deep Feature Aggregation and Image Re-ranking with Heat Diffusion for Image Retrieval
IEEE Transactions on Multimedia 2019 (Journal). [Accepted October 2018].
[arxiv] [bibtex] -
Feedback-prop: Convolutional Neural Network Inference under Partial Evidence
Intl. Conference on Computer Vision and Pattern Recognition. CVPR 2018. Salt Lake City, Utah. June 2018. [pdf] [project page] [arXiv] [code] [bibtex] -
Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods
North American Chapter of the Association for Computational Linguistics. NAACL 2018. short. New Orleans, Louisiana. June 2018. [pdf] [arXiv] [code] [bibtex] -
Building Discriminative CNN Image Representations for Object Retrieval using the Replicator Equation
Pattern Recognition 2018 (Journal). Volume 83. Pages 150-160.
[link] [code] [bibtex] -
Where and Who? Automatic Semantic-Aware Person Composition
Winter Conference on Applications of Computer Vision. WACV 2018. Lake Tahoe, Nevada. March 2018.
[pdf] [arXiv] [supp. material] [code] [bibtex] -
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints.
Empirical Methods in Natural Language Processing. EMNLP 2017. Copenhagen, Denmark. September 2017. [pdf] [code] [bibtex] (~Oral presentation + Best Long Paper Award!)
- WIRED News Coverage
- Daily Mail News Coverage
- Times of London News Coverage -
Obj2Text: Generating Visually Descriptive Language from Object Layouts
Empirical Methods in Natural Language Processing. EMNLP 2017. Copenhagen, Denmark. September 2017. [pdf] [arxiv] [code] [bibtex] (~Oral presentation) -
Commonly Uncommon: Semantic Sparsity in Situation Recognition
Intl. Conference on Computer Vision and Pattern Recognition. CVPR 2017. Honolulu, Hawaii. July 2017. [pdf] [arXiv] [bibtex] [demo] -
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
European Conference on Computer Vision. ECCV 2016. Amsterdam, The Netherlands. October 2016. [arXiv] [project page] [code] [bibtex] (~Oral presentation)
- New York Times News Coverage
- Article on University of Washington News -
Stating the Obvious: Extracting Visual Common Sense Knowledge
North American Chapter of the Association for Computational Linguistics. NAACL 2016. short. San Diego, CA. June 2016. [pdf] [bibtex] (~Oral presentation) -
Learning to Name Objects
Communications of the ACM. March 2016 (Vol. 59, No. 3). (~Research Highlight)
[pdf] [link] [technical perspective] [bibtex] -
Predicting Entry-Level Categories
International Journal of Computer Vision - Marr Prize Special Issue. IJCV 2015.
[pdf] [link] [bibtex] -
Large Scale Retrieval and Generation of Image Descriptions
International Journal of Computer Vision. IJCV 2015. [August 2016 Issue]. [pdf] [link] [bibtex] -
ReferItGame: Referring to Objects in Photographs of Natural Scenes
Empirical Methods in Natural Language Processing. EMNLP 2014. Doha, Qatar. October 2014. [pdf] [project page] [game] [bibtex] (~Oral presentation) -
Learning High-level Judgments of Urban Perception
European Conference on Computer Vision. ECCV 2014. Zurich, Switzerland. September 2014. [pdf] [project page] [bibtex] -
TreeTalk: Composition and Compression of Trees for Image Descriptions
Transactions of the Association for Computational Linguistics. TACL 2014.
To be presented at EMNLP 2014 in Doha, Qatar. October 2014. [pdf] [bibtex] -
Furniture-Geek: Understanding Fine-Grained Furniture Attributes from Freely Associated Text and Tags
IEEE Winter Conference on Applications of Computer Vision. WACV 2014. Steamboat Springs, CO. March 2014. [pdf] [bibtex] -
From Large Scale Image Categorization to Entry-Level Categories
IEEE International Conference on Computer Vision. ICCV 2013. Sydney, Australia. December 2013. [pdf] [supplemental material] [slides] [project page] [bibtex] (~Oral Presentation + Best Paper Award - Marr Prize!) -
Generalizing Image Captions for Image-Text Parallel Corpus
Association for Computational Linguistics. ACL 2013. short. Sofia, Bulgaria. August 2013. [pdf] [data+results] [bibtex] -
Baby Talk: Understanding and Generating Simple Image Descriptions
IEEE Transactions on Pattern Analysis and Machine Intelligence. PAMI 2013
[pdf] [link] [bibtex] -
Collective Generation of Natural Image Descriptions
Association for Computational Linguistics. ACL 2012. Jeju, South Korea. July 2012.
[pdf] [data] [bibtex] (~Oral presentation) -
Im2Text: Describing Images Using 1 Million Captioned Photographs
Conf. on Neural Information Processing Systems. NeurIPS 2011. Granada, Spain. December 2011. [pdf] [code+dataset] [poster] [search tool] [bibtex] (~Spotlight presentation) -
High Level Describable Attributes for Predicting Aesthetics and Interestingness
IEEE Computer Vision and Pattern Recognition. CVPR 2011. Colorado Springs, CO. June 2011. [pdf] [related code for saliency + low DoF attributes] [bibtex] -
The Ariadne Infrastructure for Managing and Storing Metadata
IEEE Internet Computing 2009. Emerging Internet Technologies and Applications for E-learning. [link]