Read a summary of our workshop on the Follow the Crowd blog.

Keynote Speakers

Meredith Ringel Morris

Microsoft Research

Combining Human and Machine Intelligence to Describe Images to People with Visual Impairments

Download Slides

Digital imagery pervades modern life. More than a billion images per day are produced and uploaded to social media sites, and we also encounter digital images within websites, apps, digital documents, and eBooks. Engaging with digital imagery is part of the fabric of participation in contemporary society, including education, the professions, e-commerce, civic participation, entertainment, and social interactions. However, most digital images remain inaccessible to the 39 million people worldwide who are blind. By some estimates, nearly half of online images lack any alternative text descriptions that can be read aloud by screen reader software, and many images that do contain alt text have captions of poor quality; many popular emerging platforms, such as social media and mobile apps, do not even offer content authors the ability to specify caption information. Emerging AI techniques, such as vision-to-language systems, offer a cheap, scalable means of labeling digital images; however, these technologies have a long way to go before they can be a reliable information source for people who are visually impaired. To help supplement, correct, and train AI captioning systems, human-in-the-loop techniques such as crowdsourcing and friendsourcing can play an important role in advancing caption coverage and quality. In this talk, I will discuss the tradeoffs of various image-description techniques and present example hybrid intelligence systems for making digital imagery accessible to screen reader users.

Meredith Ringel Morris is a Principal Researcher at Microsoft Research; she is also an Affiliate Professor at the University of Washington in both the School of Computer Science & Engineering and the School of Information. Dr. Morris’s research focuses on human-computer interaction, specializing in computer-supported cooperative work and social computing. Her past research contributions have included interaction techniques to support group work around large, shared displays and novel systems supporting collaborative and social web search. Her current research focuses on the intersection of accessibility and social technologies. Dr. Morris earned her Ph.D. and Master’s degrees in computer science from Stanford University, and her Sc.B. in computer science from Brown University. More information about her research, including her full list of publications, can be found at http://merrie.info.


Walter Lasecki

University of Michigan

Real-Time Crowdsourcing for On-Demand Training of Computer Vision Systems

Systems that see and understand visual scenes can help people with disabilities better access the world around them, help complete dangerous jobs in hazardous conditions, and generally allow us more control over our physical environments. Computer vision has had significant and widespread success with machine learning-based approaches for specific classes of problems, but effectively transferring that knowledge to new, more general domains is an open problem. Thus, generating the massive, tailored training data sets that are needed to retrain these ML algorithms for use in new settings is a critical challenge. Crowdsourcing has provided a means of collecting data at scale, but it is typically an offline/batch process that takes days or weeks to generate data. Significant prior work has focused on improving efficiency in this context. In this talk, I argue that the future of large-scale computer vision lies in on-demand streams of data that allow training to be done on the fly. The resulting systems are more robust, more flexible, and more efficient than those using a priori training data. I will discuss my lab’s work on real-time crowdsourcing and show how human insight can be brought to bear by intelligent systems when and where it is needed (within seconds or less) in real-world scenarios.

Walter S. Lasecki is an Assistant Professor of Computer Science and Engineering at the University of Michigan, Ann Arbor, where he directs the Crowds+Machines (CROMA) Lab. He and his students create interactive intelligent systems that are robust enough to be used in real-world settings by combining human and machine intelligence to exceed the capabilities of either alone. These systems let people be more productive and help improve access to the world for people with disabilities. Dr. Lasecki received his Ph.D. and M.S. from the University of Rochester in 2015 and a B.S. in Computer Science and Mathematics from Virginia Tech in 2010. He has previously held visiting research positions at CMU, Stanford, Microsoft Research, and Google[x].


Accepted Papers

Schedule

Location: Beauport Meeting Room on the 2nd floor of the Hilton Quebec Hotel

9:00 Welcome
9:15 Keynote - Meredith Ringel Morris
10:00 2 paper discussions
10:30 Break (with posters)
11:00 3 paper discussions
12:00 Lunch sponsored by Evolv
1:15 Keynote - Walter Lasecki
2:00 2 paper discussions
2:30 Themed breakout groups
3:30 Break (with posters)
4:00 Next steps and awards
5:00 End