Location: Kumo Conference Room, FX Palo Alto Laboratory (FXPAL), 3174 Porter Drive, Palo Alto, California 94304 (Parking is available around the building)

Time: Nov. 7, Thursday, 1:30pm - 4:30pm

Keynote Speaker - Towards Mobile Augmented Reality

Bernd Girod

Stanford University

Mobile devices are expected to become ubiquitous platforms for visual search and mobile augmented reality applications. For object recognition on mobile devices, a visual database is typically stored in the cloud. Hence, for a visual comparison, information must be either uploaded from, or downloaded to, the mobile device over a wireless link. The response time of the system critically depends on how much information must be transferred in both directions, and efficient compression is the key to a good user experience. We review recent advances in mobile visual search, using compact feature descriptors, and show that dramatic speed-ups and power savings are possible by considering recognition, compression, and retrieval jointly. For augmented reality applications, where image matching is performed continually at video frame rates, interframe coding of SIFT descriptors achieves bit-rate reductions of 1-2 orders of magnitude relative to advanced video coding techniques. We will use real-time implementations for different example applications, such as recognition of landmarks, media covers or printed documents, to show the benefits of implementing computer vision algorithms on the mobile device, in the cloud, or both.




Introduction to Large Scale Nearest Neighbor Search Problems and Methods

Junfeng He


We are witnessing the era of big data, in which billions or more high-dimensional data items can easily be found in Web multimedia, social networks, enterprise data centers, surveillance sensor systems, etc. Nearest neighbor (NN) search is fundamental to many applications dealing with such large-scale data sets, including content-based retrieval, ranking, recommendation, graph/social network research, and other machine learning problems. This talk will give a brief overview of the large-scale nearest neighbor search problem and methods for solving it, and will show some applications and demos on a visual search engine.



Keynote Speaker - Reshaping User Experiences with Analytics

Haohong Wang

General Manager, TCL Research America

In the past few years, devices with screens have become much smarter, but the progress is far from sufficient for large screens. Almost all the industry giants have tried and failed, some hurt badly, to bring pleasant user experiences to home screens, so this trillion-dollar market has not really been conquered so far. Now that we are marching into the era of Ultra High-Definition (UHD), screen size and resolution will again increase significantly, yet the pace of user interaction development seems to lag behind. In this talk, we discuss using data analytics to improve user experiences for home entertainment. By incorporating analytics components, such as user behavior learning and mining, user preference understanding, extraction of low-level media features and high-level semantics, object detection and recognition, media recognition, and real-time recommendation, we show that user experience innovations can make devices with screens much more user friendly.


Supporting Media Bricoleurs with Cemint

Scott Carter


When expository video is made interactive it can be useful above and beyond non-interactive video because it can be repeated, accessed randomly, and annotated. FXPAL and many other labs have in the past explored a variety of such interactive video tools. But we would like to go a step further than interaction to facilitate what Lévi-Strauss described as a “dialogue with the materials”. By applying to video the same techniques and metaphors we apply to other media, such as cut-and-paste, drag-and-drop, and spatial editing, we can support the construction of a new type of multimedia document in which spatial and temporal layouts have equal weight, can influence one another, and through which content can flow in any direction.

In our lab we are beginning to develop a suite of tools to support such seamless inter-media synthesis in multimedia documents. The suite, called Cemint (for Component Extraction from Media for Interaction, Navigation, and Transformation), includes mobile- and web-based tools that allow users to create temporal content from spatial resources and vice versa. In this talk I will discuss some tools we have built within this framework as well as opportunities for future work.


What's BAMMF?

BAMMF is a bi-monthly Bay Area Multimedia Forum series. Experts from both academia and industry are invited to exchange ideas and information through talks, tutorials, panel discussions and networking sessions. Topics of the forum will include emerging areas in multimedia, advances in algorithms and development, demonstrations of new inventions, product innovation, business opportunities, etc. If you are interested in giving a talk at the forum, please contact us.

