The notion of task-oriented vision was pioneered by Katsushi Ikeuchi in the early 1990s as an alternative to the traditional general-purpose approach of the Marr school of thought. In task-oriented vision, the appropriate architecture depends on the goals of the system and the environment in which the task is to be achieved. This workshop centers on the importance of task-oriented vision and its influence on computer vision as a whole. Researchers who are familiar with Katsushi Ikeuchi's work are invited to provide their perspectives.
| Time | Speaker | Title |
|---|---|---|
| 09:45-10:20 | Takeo Kanade | Research on Vision Algorithm Compiler |
| 10:20-10:55 | Olivier Faugeras | Revisiting the Geometry of Low-level Vision from the Point of View of Biological and Machine Vision |
| 11:15-11:50 | Harry Shum | Internet Vision: Challenges and Opportunities |
| 11:50-12:25 | Makoto Nagao | Annotation and Retrieval of Pictures for Use in Digital Libraries |
| 14:00-14:35 | Martial Hebert | Steps Toward Modeling and Understanding a User's Environment |
| 14:35-15:10 | Berthold Horn | Closed Loop System Test for Machine Vision |
| 15:10-15:45 | In So Kweon | Robust Computer Vision Techniques and Applications |
| 16:00-16:35 | Shree Nayar | A Digital Camera for Education |
Revisiting the Geometry of Low-level Vision from the Point of View of Biological and Machine Vision
Common practice in image processing and computer vision, together with evidence from the neurophysiology of vision, indicates that a great deal of the processing performed on the input flow of images by artificial or biological systems can be represented, at a suitably abstract level, by a set of mathematical equations of a certain type. There is a deep and quite complex interplay between the geometry of the space of solutions of these equations and the perception of the visual space they are meant to represent. We describe in a pedestrian way the underlying neurophysiology and theory, which involves bifurcation theory and Euclidean and hyperbolic geometry. We illustrate our ideas with several examples drawn from the perception/processing of edges, textures, and motion, and comment on their implications for biological and machine vision.
Olivier Faugeras is a graduate from the Ecole Polytechnique, France (1971). He holds a PhD in Computer Science and Electrical Engineering from the University of Utah (1976) and a Doctorate of Science from Paris VI University (1981). He is currently Research Director at INRIA (National Research Institute in Computer Science and Control Theory), where he leads the Odyssee laboratory located in Sophia-Antipolis and at the Ecole Normale Superieure, Paris. His research interests include the application of mathematics to computer and biological vision, shape representation and recognition, the use of functional imaging (MR, MEG, EEG) for understanding brain activity and in particular visual perception.
He has published extensively in archival journals and international conferences, has contributed chapters to many books, and is the author of "Artificial 3-D Vision", published in 1993 by MIT Press, and, with Quang-Tuan Luong and Theo Papadopoulo, of "The Geometry of Multiple Images", which appeared in March 2001, also at MIT Press. He has co-edited with Nikos Paragios and Yunmei Chen "The Handbook of Mathematical Models in Computer Vision", published in 2005 by Springer.
He was an adjunct Professor from 1996 to 2001 in the Electrical Engineering and Computer Science Department of the Massachusetts Institute of Technology and a member of the AI Lab. He has served as Associate Editor for IEEE PAMI from 1987 to 1990 and as co-Editor-in-Chief of the International Journal of Computer Vision from 1991 to 2004. He is currently the Editor-in-Chief of the forthcoming Encyclopedia of Computer Vision http://refworks.springer.com/CV/.
In April 1989 he received the "Institut de France - Fondation Fiat" award from the French Academy of Sciences for his work in Vision and Robotics. In July 1998 he received the "France Telecom" award from the French Academy of Sciences for his work on Computer Vision and Geometry.
In November 1998 he was elected a member of the French Academy of Sciences and was in 2000 one of the founding members of the French Academy of Technology.
Steps Toward Modeling and Understanding a User's Environment
Understanding the environment of a user from images involves estimating the user's location in the environment, recognizing the identities of the objects in the environment, and reconstructing the environment's geometric structure. Roughly speaking, these vision tasks attempt to answer, respectively: Where am I? Which objects are there? What is around me? Extensive work has been done in the past, but the resulting systems are still brittle. In this talk, I will review some of the computer vision projects that we have recently undertaken in these areas. I will show new approaches that (hopefully) achieve higher performance than existing methods or address tasks that currently cannot be addressed. Insights used in these approaches include the use of reasoning techniques from early work in computer vision, better use of 3D geometry, and the use of various sources of visual data.
Martial Hebert is a professor at the Robotics Institute, Carnegie Mellon University. His current research interests include object recognition in images, video, and range data, scene understanding using context representations, and model construction from images and 3D data. His group has explored applications in the areas of autonomous mobile robots, both in indoor and in unstructured, outdoor environments, automatic model building for 3D content generation, and video monitoring. He has served on the program committees of the major conferences in computer vision and robotics. He is a member of the IEEE.
Berthold K.P. Horn
Closed Loop System Test for Machine Vision
From the beginning, one common view of the task of machine vision was that it was to derive some sort of description of the world from images. However, since there are an infinite number of possible descriptions, this view in itself is not very satisfactory unless some use for that description is also indicated. The task-oriented approach to machine vision evolved from this germ of an idea. One version of the notion is that if machine vision is part of an overall closed loop system, and the overall system "works", then perhaps the machine vision part must have "worked" also. Early exemplars of such systems include those in robotics experiments such as SRI's Shakey robot, MIT's "Copy Demo" and Katsushi Ikeuchi's work on "Bin Picking".
Berthold K. P. Horn received the B.Sc.Eng. degree from the University of the Witwatersrand, Johannesburg, South Africa, in 1965 and the S.M. and Ph.D. degrees from the Massachusetts Institute of Technology (MIT) in 1968 and 1970, respectively. Aside from a short period teaching at the University of the Witwatersrand and at an early computer software company, he has been on the Faculty of the Electrical Engineering and Computer Science Department, MIT. He also has been a Member of the Artificial Intelligence Laboratory at MIT since his student years. He is the author or coauthor of three books on machine vision and programming. Dr. Horn was awarded the Rank Prize for pioneering work leading to practical vision systems in 1989 and was elected a Fellow of the American Association of Artificial Intelligence in 1990. He was elected to the National Academy of Engineering in 2002.
Research on Vision Algorithm Compiler
Given an object and/or its CAD model, producing a program to recognize it in an image, dubbed a vision algorithm compiler, was one of the important computer vision goals set by Katsu Ikeuchi in the mid-1980s. I will talk about the efforts toward that goal, and how those efforts have changed between then and now.
Takeo Kanade is the U. A. and Helen Whitaker University Professor of Computer Science and Robotics, and the Director of Quality of Life Technology Engineering Research Center at Carnegie Mellon University. He received his Doctoral degree in Electrical Engineering from Kyoto University, Japan, in 1974. After holding a faculty position in the Department of Information Science, Kyoto University, he joined Carnegie Mellon University in 1980, where he was the Director of the Robotics Institute from 1992 to 2001.
Dr. Kanade works in multiple areas of robotics: computer vision, multi-media, manipulators, autonomous mobile robots, and sensors. He has written more than 250 technical papers and reports in these areas, and holds more than 15 patents. He has been the principal investigator of more than a dozen major vision and robotics projects at Carnegie Mellon.
Dr. Kanade has been elected to the National Academy of Engineering and the American Academy of Arts and Sciences. He is a Fellow of the IEEE, a Fellow of the ACM, a Founding Fellow of American Association of Artificial Intelligence (AAAI), and the former and founding editor of International Journal of Computer Vision. The awards he received include the C&C Award, the Okawa Prize, IEEE PAMI-TC A. Rosenfeld Life Time Achievement Award, Joseph Engelberger Award, IEEE Robotics and Automation Society Pioneer Award, FIT Funai Accomplishment Award, Allen Newell Research Excellence Award, JARA Award, IEEE Computer Vision Longuet-Higgins Prize, and Marr Prize Award. Dr. Kanade has served on government, industry, and university advisory or consultant committees, including the Aeronautics and Space Engineering Board (ASEB) of the National Research Council, NASA's Advanced Technology Advisory Committee, the PITAC Panel for Transforming Healthcare Panel, and the Advisory Board of the Canadian Institute for Advanced Research.
In So Kweon
Robust Computer Vision Techniques and Applications
Research in the KAIST Robotics and Computer Vision (RCV) Lab has focused on developing robust methods for important computer vision problems: 3D structure recovery, image processing, and object recognition. In this talk, we first present robust methods for finding feature correspondences between an image pair with significant deformation. We then introduce a new theory for modeling the sensor noise of CCD cameras for low-level image processing tasks such as edge and corner detection. Robustness against illumination variations will be demonstrated through extensive experiments. Finally, we will present a graphical-model-based object recognition framework for recognizing objects against strongly cluttered backgrounds. The framework is designed to resemble characteristics of the human visual system. Experimental results on standard databases and real images show the feasibility of the proposed method for real-world applications such as intelligent service robots.
In So Kweon received the B.S. and M.S. degrees in Mechanical Design and Production Engineering from Seoul National University, Seoul, Korea, in 1981 and 1983, respectively, and the M.S. and Ph.D. degrees in Robotics from the Robotics Institute at Carnegie Mellon University, Pittsburgh, U.S.A., in 1986 and 1990, respectively. He worked at the Toshiba R&D Center, Japan, and joined the Department of Automation and Design Engineering at KAIST in 1992. He is now a Professor in the Department of Electrical Engineering at KAIST.
His research interests are in computer vision, robotics, pattern recognition, and automation. Specific research topics include invariant-based vision for recognition and assembly, 3D sensors and range data analysis, color modeling and analysis, robust edge detection, and moving object segmentation and tracking. He is a member of ICASE, IEEE, and ACM.
Makoto Nagao
Annotation and Retrieval of Pictures for Use in Digital Libraries
Nowadays digital cameras and video devices are widely used, and a great number of still and motion pictures are stored on computers. Everyone confronts the difficulty of utilizing these pictures because a good retrieval system for pictures is not available. Pictures contain a variety of features, such as objects, their properties and functions, and the relations between objects in a picture, besides bibliographic information such as author, date, theme, and genre. These features are not easily extracted automatically from pictures. When the number of pictures becomes very large, retrieval accuracy must be high enough to retrieve only the pictures that are wanted. When retrieval requests relate to meanings implied by pictures, such as a picture of my son running at an athletic meet, automatic analysis of pictures cannot give any satisfactory answer. Therefore annotation by human hands is unavoidable. This talk will discuss these problems.
Dr. Nagao graduated from the Graduate School of Engineering, Kyoto University, and received his Ph.D. in Information Engineering from Kyoto University in 1966.
He was appointed the President of Kyoto University in 1997 and became Emeritus Professor in 2003. He took up the position of President of the National Institute of Information and Communications Technology in 2004 and has been acting as the President of the National Diet Library since April 2007.
He also served as the President of the Japan Association of National Universities; as the founder President of International Association for Machine Translation (IAMT) and the Association for Natural Language Processing (NLP); and as the President of the Institute of Electronics, Information and Communication Engineers (IEICE), Information Processing Society of Japan (IPSJ), and Japan Library Association (JLA).
Dr. Nagao's research activities cover a variety of topics, including natural language processing, image processing, machine translation, information engineering, digital library system, and intelligence information science.
His academic contributions were recognized through the IEEE Emanuel R. Piore Award (1993) and the Medal with Purple Ribbon honored by the Japanese Government (1997). He was Japan Prize Laureate in the prize category of Information and Media Technology (2005), and Chevalier de la Legion d'honneur, France (2005). He was appointed as the Person of Culture Merit in 2008.
A Digital Camera for Education
Today, the camera is almost exclusively designed for, and marketed to, adults. A typical consumer camera comes with a sleek silver or black exterior and is densely packed with components and features. If one tries to open up one of these devices to study its innards, it is unlikely to function when put back together.
We believe camera manufacturers have overlooked a large demographic in kids and a compelling application in education. In this talk, I will present a digital camera that has been designed to expose students to important concepts in science and engineering. Our target audience is under-privileged students between the ages of 8 and 16 years.
Shree K. Nayar received his PhD degree in Electrical and Computer Engineering from the Robotics Institute at Carnegie Mellon University in 1990. He is currently the T. C. Chang Professor of Computer Science at Columbia University. He co-directs the Columbia Vision and Graphics Center. He also heads the Columbia Computer Vision Laboratory (CAVE), which is dedicated to the development of advanced computer vision systems. His research is focused on three areas: the creation of novel cameras, the design of physics-based models for vision, and the development of algorithms for scene understanding. His work is motivated by applications in the fields of digital imaging, computer graphics, and robotics.
He has received best paper awards at ICCV 1990, ICPR 1994, CVPR 1994, ICCV 1995, CVPR 2000 and CVPR 2004. He is the recipient of the David Marr Prize (1990 and 1995), the David and Lucile Packard Fellowship (1992), the National Young Investigator Award (1993), the NTT Distinguished Scientific Achievement Award (1994), the Keck Foundation Award for Excellence in Teaching (1995) and the Columbia Great Teacher Award (2006). In February 2008, he was elected to the National Academy of Engineering.
Internet Vision: Challenges and Opportunities
Dr. Harry Shum, former Managing Director of Microsoft Research Asia and now a Corporate Vice President at Microsoft, has taken on the new role of leading Microsoft's Core Search Development.
Dr. Shum is an Institute of Electrical and Electronics Engineers (IEEE) Fellow and an Association for Computing Machinery (ACM) Fellow. He serves on the editorial board of the International Journal of Computer Vision, and is a Program Chair of the International Conference of Computer Vision (ICCV) 2007. Dr. Shum has published more than 100 papers in computer vision, computer graphics, pattern recognition, statistical learning, and robotics. He holds more than 50 U.S. patents.
Dr. Shum joined Microsoft Research in 1996, working in Redmond, WA as a researcher on computer vision and computer graphics. In 1999, he moved to Beijing to help start Microsoft Research China (later renamed Microsoft Research Asia). He began his tenure there as a research manager and subsequently moved up to Assistant Managing Director, Managing Director of Microsoft Research Asia, Distinguished Engineer, and Corporate Vice President. In 2007, Shum became a Microsoft Corporate Vice President and was lauded for his leadership in technology and management.
Dr. Shum received a doctorate in robotics from the School of Computer Science at Carnegie Mellon University in Pittsburgh, PA. In his spare time, he enjoys playing basketball, rooting for the Pittsburgh Steelers, and spending quality time with his family.
- Sing Bing Kang (Microsoft Research)
- Yoichi Sato (University of Tokyo)
- Shree Nayar (Columbia University)
- Martial Hebert (Carnegie Mellon University)
- Heung-Yeung (Harry) Shum (Microsoft)
Local Arrangements Chair
- Jun Takamatsu (Nara Institute of Science and Technology)
Yoichi Sato <ysato AT iis.u-tokyo.ac.jp>
Institute of Industrial Science, The University of Tokyo
4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, JAPAN