Thursday, 17 December 2015

Four years of Schema.org - Recent Progress and Looking Forward



In 2011, we announced schema.org, a new initiative from Google, Bing and Yahoo! to create and support a common vocabulary for structured data markup on web pages. Since that time, schema.org has been a resource for webmasters looking to add markup to their pages so that search engines can use that data to index content better and surface it in new experiences like rich snippets, Gmail, and the Google App.

Schema.org, which provides a growing vocabulary for describing various kinds of entities in terms of properties and relationships, has become increasingly important as the Web transitions to a multi-device, mobile-oriented world. We are now seeing schema.org being used on many millions of Web sites, defining data types and properties common across applications, platforms and products, in order to enhance the user experience by delivering the most relevant information users need, when they need it.
Schema.org in Google Rich Snippets
Schema.org in Google Knowledge Graph panels
Schema.org in Recipe carousels
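To make the markup described above concrete, here is a minimal sketch of schema.org structured data for a recipe page, assembled in Python and emitted as JSON-LD. The recipe and its property values are invented placeholders; real pages would embed an equivalent block directly in their HTML.

```python
import json

# A minimal schema.org "Recipe" entity described via its properties.
# The values below are invented placeholders.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Simple Banana Bread",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "recipeIngredient": ["3 ripe bananas", "2 cups flour", "1 egg"],
    "recipeInstructions": "Mash the bananas, mix in the remaining "
                          "ingredients, and bake for 60 minutes.",
    "suitableForDiet": "https://schema.org/VegetarianDiet",
}

# Embedding this block in a page makes the recipe machine-readable to
# search engines and other consumers of schema.org markup.
print('<script type="application/ld+json">')
print(json.dumps(recipe, indent=2))
print('</script>')
```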
In Schema.org: Evolution of Structured Data on the Web, an overview article published this week by the ACM, we report some key schema.org adoption metrics from a sample of 10 billion pages from a combination of the Google index and Web Data Commons. In this sample, 31.3% of pages have schema.org markup, up from 22% one year ago. Structured data markup is now a core part of the modern web.

The schema.org group at W3C is now among the largest active W3C communities, serving as a hub for diverse groups exploring schemas for topics such as sports, healthcare, e-commerce, food packaging, bibliography and digital archive management. Other companies also make use of the same data to build different applications, and as new use cases arise, further schemas are integrated via community discussion at W3C. These topics in turn have subtle inter-relationships - for example, schemas for food packaging, flight reservations, recipes and restaurant menus each take a different approach to describing food restrictions and allergies. Rather than trying to force a common unified approach across these domains, schema.org's evolution is pragmatic, driven by the combination of available Web data and the likelihood of mainstream consuming applications.

Schema.org is also finding new kinds of uses. One exciting line of work is the use of schema.org marked-up pages as a training corpus for machine learning. John Foley, Michael Bendersky and Vanja Josifovski used schema.org data to build a system that can learn to recognize events that may be geographically local to a particular user. Other researchers are looking at using schema.org pages with similar markup, but in different languages, to automatically create parallel corpora for machine translation.

Four years after its launch, Schema.org is entering its next phase, with more of the vocabulary development taking place in a more distributed fashion, as extensions. As schema.org adoption has grown, a number of groups with more specialized vocabularies have expressed interest in extending schema.org with their terms. Examples of this include real estate, product, finance, medical and bibliographic information. A number of extensions, for topics ranging from automobiles to product details, are already underway. In such a model, schema.org itself is just the core, providing a unifying vocabulary and congregation forum as necessary.

Tuesday, 15 December 2015

Text-to-Speech for low resource languages (episode 2): Building a parametric voice



This is the second episode in the series of posts reporting on the work we are doing to build text-to-speech (TTS) systems for low resource languages. In the previous episode, we described the crowdsourced data collection effort for Project Unison. In this episode, we describe our work to construct a parametric voice based on that data.

In our previous episode, we described building TTS systems for low resource languages, and how one of the objectives of data collection for such systems was to quickly build a database representing multiple speakers. There are two main justifications for this approach. First, professional voice talents are often not available for under-resourced languages, so we need to record ordinary people, who tire of reading tedious text rather quickly. Hence, the amount of text a person can record is rather limited, and we need multiple speakers to build a reasonably sized database that can be used by others as well. Second, we wanted to be able to create a voice that sounds human but is not identifiable as a real person. Various concatenative approaches to speech synthesis, such as unit selection, are not well suited to this problem, because the selection algorithm may join acoustic units from different speakers, generating a very unnatural-sounding result.

Parametric speech synthesis techniques are an attractive fit for the multi-speaker corpora described above. This is because in parametric synthesis, the training stage of the statistical component takes care of multiple speakers by estimating an averaged-out representation of the acoustic parameters of each individual speaker. Depending on the number of speakers in the corpus, their acoustic similarity and the ratio of speaker genders, the resulting acoustic model can represent an average voice that is indistinguishable from human and yet cannot be traced back to any actual speaker recorded during the data collection.
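To see why averaging across speakers yields a voice that matches no one in particular, consider the toy sketch below. It simply averages per-speaker acoustic parameter trajectories with NumPy; the real training procedure estimates statistical models of these parameters rather than taking a plain mean, and all shapes and values here are invented.

```python
import numpy as np

# Toy illustration: acoustic parameter vectors (e.g., mel-cepstral
# coefficients) extracted for the same phonetic context from five speakers.
# Shapes and values are invented; a real system estimates distributions
# over these parameters during HMM or neural-network training.
rng = np.random.default_rng(0)
num_speakers, num_frames, num_coeffs = 5, 100, 40
per_speaker = rng.normal(size=(num_speakers, num_frames, num_coeffs))

# The "average voice" blends all speakers: no single speaker's
# characteristics survive, yet the result stays within the range of
# plausible human acoustic parameters.
average_voice = per_speaker.mean(axis=0)

print(average_voice.shape)  # (100, 40)
```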

We decided to use two different approaches to acoustic modeling in our experiments. The first approach uses Hidden Markov Models (HMMs). This well-established technique was pioneered by Prof. Keiichi Tokuda at Nagoya Institute of Technology, Japan and has been widely adopted in academia and industry. It is also supported by a dedicated open-source HMM synthesis toolkit. The resulting models are small enough to fit on mobile devices.

The second approach relies on Recurrent Neural Networks (RNNs) and vocoders that jointly mimic the human speech production system. Vocoders mimic the vocal apparatus to provide a parametric representation of speech audio that is amenable to statistical mapping. RNNs provide a statistical mapping from the text to the audio and have feedback loops in their topology, allowing them to model temporal dependencies between various phonemes in human speech. In 2015, Yannis Agiomyrgiannakis proposed Vocaine, a vocoder that outperforms the state-of-the-art technology in speed as well as quality. In 2013, Heiga Zen, Andrew Senior and Mike Schuster proposed a neural network-based model that mimics the deep structure of human speech production for speech synthesis. The model has since been extended into a Long Short-Term Memory (LSTM) RNN, which allows long-term memorization and is well suited to speech applications. Earlier this year, Heiga Zen and Hasim Sak described an LSTM RNN architecture designed specifically for fast speech synthesis. LSTM RNNs are also used in our Automatic Speech Recognition (ASR) systems, as recently mentioned on our blog.
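As a rough illustration of what such an acoustic model looks like, the sketch below builds a small stacked LSTM in present-day TensorFlow/Keras that maps per-frame linguistic features to vocoder parameters. The layer sizes and feature dimensions are invented, and this is not the architecture from the papers cited above.

```python
import tensorflow as tf

# Hypothetical feature sizes: per-frame linguistic features in,
# vocoder parameters (spectral coefficients, F0, aperiodicity) out.
num_linguistic_features = 420
num_vocoder_params = 67

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, num_linguistic_features)),  # variable-length sequences
    tf.keras.layers.LSTM(256, return_sequences=True),
    tf.keras.layers.LSTM(256, return_sequences=True),
    # Linear output layer: one vector of vocoder parameters per frame.
    tf.keras.layers.Dense(num_vocoder_params),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```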

Using the Hidden Markov Model (HMM) and LSTM RNN synthesizers described above, we experimented with a multi-speaker Bangla corpus totaling 1526 utterances (waveforms and corresponding transcriptions) from five different speakers. We also built a third system that utilizes an LSTM RNN acoustic model, but this time we made it small and fast enough to run on a mobile phone.

We synthesized the following Bangla sentence, "এটি একটি বাংলা বাক্যের উদাহরণ", translated from “This is an example sentence in Bangla”. Though the HMM synthesizer output can sound intelligible, it exhibits some classic downsides: the voice sounds buzzy and muffled. With the LSTM RNN configuration for mobile devices, the resulting audio sounds clearer and has improved intonation over the HMM version. We also tried an LSTM RNN configuration with more network nodes (and thus not suitable for low-end mobile devices) to generate this waveform; the quality is slightly better but not a huge improvement over the more lightweight LSTM RNN version. We hypothesize that this is because a neural network with many nodes has more parameters and thus requires more data to train.

These early results are encouraging for several reasons. First, they confirm that natural-sounding speech synthesis based on multiple speakers is practically possible. It is also significant that the total number of recordings used was relatively small, yet we were able to build intelligible parametric speech synthesis. This means that it is possible to collect training data for such a speech synthesizer by engaging the help of volunteers who are not professional voice artists, each for a short period of time. Using multiple volunteers is an advantage: it results in more diverse data, and the resulting synthetic voice does not represent any specific individual. This approach may well be the foundation for bringing speech technology to many more traditionally under-served languages.

NEXT UP: But can it say, “Google”? (Ep.3)

Monday, 14 December 2015

Making online learning even easier with a re-envisioned Course Builder



(Cross-posted on the Google for Education blog)

The Course Builder team believes in enabling new and better ways to learn (for both the instructor and learner). Today's release of Course Builder v1.10 furthers these goals in three ways, by being easier to use, embeddable and applicable to more types of content.

Easier to use
We took a step back and re-envisioned the menus and navigation of the administrative interface based on the steps instructors take as they create a course. These are designed to help you through the process of creating, styling, publishing and managing your courses. This re-imagined design gives a solid foundation for future versions of Course Builder.
A completely redesigned navigation simplifies content authoring and configuration.
To support this redesign, we’ve also completely revamped our documentation. There’s now one home for all of Course Builder’s materials: Google Open Online Education. Here, you’ll find everything you need to conceptualize and construct your content, create a course using Course Builder, and even develop new modules to extend Course Builder’s capabilities. The content now reflects the latest features and organization.

Embeddable assessment support
What if you want to use some of Course Builder’s features but already have an existing learning site? To help with these situations, Course Builder now supports embeddable assessments (graded questions and answers with an optional due date). Simply create your assessments in Course Builder, copy the JavaScript snippet and paste it on any site. Your users will be able to complete the assessments from the comfort of your existing site and you’ll be able to benefit from Course Builder’s per-question feedback, auto-grading and analytics with just two short lines of code that are automatically generated for you.

We started with embeddable assessments because evaluation is so important to learning, but we don’t plan to stop there. Watch for additional embeddable components in the future.

Applicable to more types of content
Many types of online learning content, like tutorials, exercises and documentation, are a lot like online courses. For instance, they might involve presenting content to users, having them do exercises or assessments and allowing them to stop and return later. Yet, you might not think of them as traditional courses.

To make Course Builder a better fit for a broader set of online content, we’ve added a new “guides” experience. Guides are a new way for students to browse and consume your content. Compared to typical online courses -- which can enforce a strict linear path (from unit 1 to unit 2, etc.) -- guides present your content as a non-numbered list. Users are free to enter and exit in any order. It also allows you to show the content for many courses together.

You could imagine each guide being a documentation page or tutorial section. Guides also work with any existing Course Builder units and can be made available by simply enabling that feature in the dashboard. Here are a couple of our courses, when viewed as guides:

Within each guide, the user is led through a series of steps, which could be portions of a docs page or lessons in a unit, as in this example from the “Power Searching with Google” sample course:

By letting users jump in and out of the content as they like, guides are ideally suited to the on-the-go learner and look great on phones and tablets. It’s our first foray into responsive mobile design... but it won’t be our last.

Guides currently support public courses, but we’ll be adding registration, enhanced statefulness and interface customization, as well as elements of dynamic learning (think of a personalized list of guides).

This release has focused on making Course Builder easier to use and more relevant. It sets up the framework to give future features a natural home. It adds embeddable assessments to make Course Builder useful in more places. And it introduces guides, a new, less linear format for consuming content.

For a full list of features, see the release notes, and let us know what you think. Keep on learning!

Thursday, 10 December 2015

Use Smart Goals, powered by Google Analytics, to optimize in AdWords

To advertise smart, you have to measure smart.  And a key metric for almost any business is conversions, also known as “that moment when users do the thing that you want them to do.”  

Many AdWords advertisers are already measuring their website conversions, using either AdWords Conversion Tracking or imported Google Analytics Ecommerce transactions.  Measuring actual conversions is ideal, because it allows you to optimize your bids, your ads and your website with a clear goal in mind.

However, hundreds of thousands of small and medium businesses aren't measuring their website conversions today.  Some businesses may not have a way for users to convert on their website and others may not have the time or the technical ability to implement conversion tracking.

The Google Analytics team is committed to helping our users use their data to drive better marketing and advertising performance.  So, for businesses that don’t measure conversions in AdWords today, we’ve created an easy-to-use solution: Smart Goals. Smart Goals help you identify the highest-quality visits to your website and optimize for those visits in AdWords. 

"Smart Goals helped us drive more engaged visits to our website. It gave us something meaningful to optimize for in AdWords, without having to change any tags on our site. We could tell that optimizing to Smart Goals was working, because we had higher sales than usual across our channels during the testing period."

- Richard Bissell, President/Owner, Richard Bissell Fine Woodworking, Inc

How Smart Goals Work

To generate Smart Goals, we apply machine learning across thousands of websites that use Google Analytics and have opted in to share anonymized conversion data.  From this information, we can distill dozens of key factors that correlate with likelihood to convert: things like session duration, pages per session, location, device and browser.  We can then apply these key factors to any website.  The easiest way to think about Smart Goals is that they reflect your website visits that our model indicates are most likely to lead to conversions. 
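Google has not published the Smart Goals model itself, but the description above is enough to sketch the general idea: a classifier over session-level signals that scores each visit’s likelihood to convert, with the top-scoring slice treated as the “highest-quality visits.” The sketch below uses scikit-learn with invented feature names and random placeholder data, purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Invented session-level features of the kind mentioned above:
# [session duration (s), pages per session, is_mobile, is_returning].
rng = np.random.default_rng(42)
X = np.column_stack([
    rng.exponential(120, size=1000),   # session duration
    rng.poisson(3, size=1000),         # pages per session
    rng.integers(0, 2, size=1000),     # device: mobile or not
    rng.integers(0, 2, size=1000),     # returning visitor or not
])
y = rng.integers(0, 2, size=1000)      # placeholder conversion labels

# Score every visit; the top slice of scores plays the role of the
# "highest-quality visits" that would become Smart Goals.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)
conversion_likelihood = model.predict_proba(X)[:, 1]
threshold = np.quantile(conversion_likelihood, 0.95)
print("visits flagged:", int((conversion_likelihood >= threshold).sum()))
```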

Step 1: Activate Smart Goals in Google Analytics

To activate Smart Goals in Google Analytics, simply go to the Admin section of your Google Analytics account, click Goals (under the View heading) and select Smart Goals.  The highest-quality visits to your website will now be turned into Smart Goals automatically.  No additional tagging or customization is required; Smart Goals just work.  

To help you see how Smart Goals perform before you activate them, we’ve built a Smart Goals report in the “Conversions” section of Google Analytics.  The behavior metrics in this report indicate the engagement level of Smart Goals visits compared to other visits, helping you evaluate Smart Goals before you activate the feature.



Step 2: Import Smart Goals into AdWords

Like any other goal in Google Analytics, Smart Goals can be imported into AdWords to be used as an AdWords conversion.  Once you’ve defined a conversion in AdWords, you’re able to optimize for it.


Step 3: Optimizing for Smart Goals in AdWords

One of the benefits of measuring conversions in your AdWords account is the ability to set a target cost per acquisition (CPA) as opposed to just setting a cost per click (CPC).  If you aren’t measuring actual conversions today, importing Smart Goals as conversions in AdWords allows you to set a target CPA.  In this way, you’re able to optimize your AdWords spend based on the likelihood of conversion as determined by our model.
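As a quick back-of-the-envelope illustration of why conversion measurement unlocks CPA-based bidding, the relationship between cost per click, conversion rate, and cost per acquisition is simple arithmetic. The numbers below are invented.

```python
# Invented example numbers.
target_cpa = 50.00        # what you are willing to pay per conversion
conversion_rate = 0.04    # Smart Goal completions per click (4%)

# CPA = CPC / conversion rate, so the affordable CPC follows directly:
max_cpc = target_cpa * conversion_rate
print(f"Max affordable CPC: ${max_cpc:.2f}")  # $2.00
```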

Smart Goals will be rolling out over the next few weeks. To be eligible for Smart Goals, your Google Analytics property must be linked to your AdWords account(s).  Learn how to link your Google Analytics property to your AdWords account(s) in the Analytics Help Center or the AdWords Help Center.  Note that your Google Analytics view must receive at least 1,000 clicks from AdWords over a 30-day period to ensure the validity of your data.

Posted by Abishek Sethi (Software Engineer) and Joan Arensman (Product Manager)

Tuesday, 8 December 2015

When can Quantum Annealing win?



During the last two years, the Google Quantum AI team has made progress in understanding the physics governing quantum annealers. We recently applied these new insights to construct proof-of-principle optimization problems and programmed these into the D-Wave 2X quantum annealer that Google operates jointly with NASA. The problems were designed to demonstrate that quantum annealing can offer runtime advantages for hard optimization problems characterized by rugged energy landscapes.

We found that for problem instances involving nearly 1000 binary variables, quantum annealing significantly outperforms its classical counterpart, simulated annealing: it is more than 10^8 times faster than simulated annealing running on a single core. We also compared the quantum hardware to another algorithm called Quantum Monte Carlo. This is a method designed to emulate the behavior of quantum systems, but it runs on conventional processors. While the scaling with size between these two methods is comparable, they are again separated by a large factor, sometimes as high as 10^8.
Time to find the optimal solution with 99% probability for different problem sizes. We compare Simulated Annealing (SA), Quantum Monte Carlo (QMC) and D-Wave 2X. Shown are the 50, 75 and 85 percentiles over a set of 100 instances. We observed a speedup of many orders of magnitude for the D-Wave 2X quantum annealer for this optimization problem characterized by rugged energy landscapes. For such problems quantum tunneling is a useful computational resource to traverse tall and narrow energy barriers.
While these results are intriguing and very encouraging, there is more work ahead to turn quantum-enhanced optimization into a practical technology. The design of next-generation annealers must facilitate the embedding of problems of practical relevance. For instance, we would like to increase the density and control precision of the connections between the qubits, as well as their coherence. Another enhancement we wish to engineer is support for representing not only quadratic optimization but also higher-order optimization, which requires that not just pairs of qubits but larger sets of qubits can interact directly. Our quantum hardware group is working on these improvements, which will make it easier for users to input hard optimization problems. For higher-order optimization problems, rugged energy landscapes will become typical. Problems with such landscapes stand to benefit from quantum optimization because quantum tunneling makes it easier to traverse tall and narrow energy barriers.

We should note that there are algorithms, such as techniques based on cluster finding, that can exploit the sparse qubit connectivity in the current generation of D-Wave processors and still solve our proof-of-principle problems faster than the current quantum hardware. But due to the denser connectivity of next-generation annealers, we expect those methods to become ineffective. Also, in our experience we find that lean stochastic local search techniques such as simulated annealing are often the most competitive for hard problems with little structure to exploit. Therefore, we regard simulated annealing as the generic classical competitor that quantum annealing needs to beat. We are optimistic that the significant runtime gains we have found will carry over to commercially relevant problems as they occur in tasks relevant to machine intelligence.
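For readers unfamiliar with the classical baseline, here is a minimal, self-contained sketch of simulated annealing on a toy rugged energy landscape. It is unrelated to the specific proof-of-principle instances in the paper; it only illustrates the thermal hill-climbing that quantum tunneling is being compared against.

```python
import math
import random

random.seed(0)

def energy(x):
    # Toy rugged landscape: a broad quadratic well plus tall, narrow barriers.
    return 0.01 * x * x + 5.0 * math.cos(3.0 * x) ** 2

x = 20.0                          # start far from the optimum
temperature, cooling = 10.0, 0.999

for step in range(20000):
    candidate = x + random.uniform(-0.5, 0.5)
    delta = energy(candidate) - energy(x)
    # Accept downhill moves always; accept uphill moves with Boltzmann
    # probability, which shrinks as the temperature is lowered.
    if delta < 0 or random.random() < math.exp(-delta / temperature):
        x = candidate
    temperature *= cooling

print(f"final x = {x:.3f}, energy = {energy(x):.3f}")
```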

For details please refer to http://arxiv.org/abs/1512.02206.

Monday, 7 December 2015

How to Classify Images with TensorFlow



Prior to joining Google, I spent a lot of time trying to get computers to recognize objects in images. At Jetpac my colleagues and I built mustache detectors to recognize bars full of hipsters, blue sky detectors to find pubs with beer gardens, and dog detectors to spot canine-friendly cafes. At first, we used the traditional computer vision approaches that I'd used my whole career, writing a big ball of custom logic to laboriously recognize one object at a time. For example, to spot sky I'd first run a color detection filter over the whole image looking for shades of blue, and then look at the upper third. If it was mostly blue, and the lower portion of the image wasn't, then I'd classify that as probably a photo of the outdoors.
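To make that "big ball of custom logic" concrete, here is a rough, hedged reconstruction of the kind of hand-tuned sky heuristic described above, using NumPy and Pillow. The thresholds are guesses, which is exactly the weakness of this style of computer vision.

```python
import numpy as np
from PIL import Image

def looks_like_outdoor_sky(path, blue_fraction_threshold=0.5):
    """Crude hand-tuned heuristic: is the top third of the image mostly blue?"""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
    height = rgb.shape[0]
    top, bottom = rgb[: height // 3], rgb[height // 3 :]

    def blue_fraction(region):
        r, g, b = region[..., 0], region[..., 1], region[..., 2]
        # "Shades of blue": blue channel dominates and the pixel is bright-ish.
        blueish = (b > 1.2 * r) & (b > 1.1 * g) & (b > 100)
        return blueish.mean()

    # Mostly blue on top, noticeably less blue below -> probably outdoors.
    return blue_fraction(top) > blue_fraction_threshold and \
           blue_fraction(bottom) < blue_fraction_threshold / 2

# Usage (hypothetical file):
# print(looks_like_outdoor_sky("pub_garden.jpg"))
```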

I'd been an engineer working on vision problems since the late '90s, and the sad truth was that unless you had a research team and plenty of time behind you, this sort of hand-tailored hack was the only way to get usable results. As you can imagine, the results were far from perfect; each detector I wrote was a custom job and didn't help me with the next thing I needed to recognize. This probably seems laughable to anybody who didn't work in computer vision in the recent past! It's such a primitive way of solving the problem, it sounds like it should have been superseded long ago.

That's why I was so excited when I started to play around with deep learning. It became clear as I tried them out that the latest approaches using convolutional neural networks were producing far better results than my hand-tuned code on similar problems. Not only that, the process of training a detector for a new class of object was much easier. I didn't have to think about what features to detect, I'd just supply a network with new training examples and it would take it from there.

Those experiences converted me into a deep learning enthusiast, and so when Jetpac was acquired and I had the chance to join Google and work with many of the stars of the field, I couldn't resist. What impressed me more than anything was the team's willingness to share their knowledge with the rest of the world.

I'm especially happy that we've just managed to release TensorFlow, our internal machine learning framework, because it gives me a chance to show practical, usable examples of why I'm so convinced deep learning is an essential tool for anybody working with images, speech, or text in ML.

Given my background, my favorite first example is using a deep network to spot objects in an image. One of the early showcases for the new approach to neural networks was an annual competition to recognize 1,000 different classes of objects from the ImageNet data set, and TensorFlow includes a pre-trained network for that task. If you look inside the examples folder in the source code, you'll see “label_image”, which is a small C++ application for using that network.

The README has the instructions for building TensorFlow on your machine, downloading the binary files defining the network, and compiling the sample code. Once it's all built, just run it with no arguments, and you should see a list of results showing "Military Uniform" at the top. This is running on the default image of Admiral Grace Hopper, and correctly spots her attire.
Image via Wikipedia
After that, try pointing it at your own images using the “--image” command line flag, and you should see a set of labels for each. If you want to know more about what's going on under the hood, the C++ section of the TensorFlow Inception tutorial goes into a lot more detail.
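If you prefer Python to the C++ label_image tool, the same exercise can be sketched with a pre-trained Inception model from the Keras applications module. This is not the binary described above, just the equivalent idea in a few lines; the image path is a placeholder you would replace with your own photo.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_v3 import (
    InceptionV3, preprocess_input, decode_predictions)

# Load an ImageNet-pretrained Inception network (downloads weights on first use).
model = InceptionV3(weights="imagenet")

# Placeholder path; swap in your own photo.
img = tf.keras.utils.load_img("grace_hopper.jpg", target_size=(299, 299))
x = preprocess_input(np.expand_dims(tf.keras.utils.img_to_array(img), axis=0))

# Print the top five ImageNet labels, roughly what label_image reports.
for _, label, score in decode_predictions(model.predict(x), top=5)[0]:
    print(f"{label}: {score:.3f}")
```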

The only things it will spot are those that are in the original 1,000 ImageNet classes, and it will always try to find something, which can lead to some funny results. There are no people categories, so on portraits you'll often see objects that are associated with people, like seat belts or oxygen masks, or in Lincoln’s case, a bow tie!
Image via U.S. History Images
If the image is poorly lit, then “nematode” is usually the top pick, since most training photos of those are taken in very dim surroundings. It's also not perfect in its identification, with an error rate of 5.6% for getting the right label in the top five results. However, that’s not all that bad considering that Stanford’s Andrej Karpathy found that even a person trained for the task could only achieve a slightly better 5.1% error doing it manually. We can do even better if we combine the outputs of four trained models into an "ensemble", with an error rate of just 3.5%.
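The "ensemble" mentioned above is conceptually simple: average the probability outputs of several independently trained models and pick the top labels from the averaged distribution. A tiny sketch with made-up numbers:

```python
import numpy as np

# Made-up softmax outputs from four trained models over five classes.
predictions = np.array([
    [0.60, 0.20, 0.10, 0.05, 0.05],
    [0.55, 0.25, 0.10, 0.05, 0.05],
    [0.30, 0.40, 0.15, 0.10, 0.05],
    [0.50, 0.30, 0.10, 0.05, 0.05],
])

# Averaging smooths out individual models' mistakes; the ensemble's
# top-1/top-5 picks come from the mean distribution.
ensemble = predictions.mean(axis=0)
print("ensemble distribution:", np.round(ensemble, 3))
print("top label index:", int(ensemble.argmax()))
```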

It's unlikely that the set of labels it produces is exactly what you need for your application, so the next step would be to train your own network. That is a much bigger task than running a pre-trained one like this, but one of the things I like about TensorFlow is that it spans the whole lifecycle of a machine learning model, from experimentation, to training, and into production, as this example shows. To get started training, I'd recommend looking at this simple tutorial on recognizing hand-drawn digits from the MNIST data set.
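As a taste of what that tutorial covers, here is a hedged, minimal Keras rendition of training a digit classifier on the MNIST data set. The tutorial itself walks through TensorFlow's lower-level API, so treat this only as the same exercise in compressed form.

```python
import tensorflow as tf

# Load the MNIST handwritten-digit data set (downloads on first use).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A deliberately small model: flatten the 28x28 image, one hidden layer,
# softmax over the ten digit classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))
```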

I hope that sharing this framework will help developers build amazing user experiences we’d never even think of. We’ve been having a massive amount of fun with TensorFlow, and I can’t wait to see what interesting image tools you build using it!

Sunday, 6 December 2015

NIPS 2015 and Machine Learning Research at Google



This week, Montreal hosts the 29th Annual Conference on Neural Information Processing Systems (NIPS 2015), a machine learning and computational neuroscience conference that includes invited talks, demonstrations and oral and poster presentations of some of the latest in machine learning research. Google will have a strong presence at NIPS 2015, with over 140 Googlers attending in order to contribute to and learn from the broader academic research community by presenting technical talks and posters, in addition to hosting workshops and tutorials.

Research at Google is at the forefront of innovation in Machine Intelligence, actively exploring virtually all aspects of machine learning including classical algorithms as well as cutting-edge techniques such as deep learning. Focusing on both theory as well as application, much of our work on language understanding, speech, translation, visual processing, ranking, and prediction relies on Machine Intelligence. In all of those tasks and many others, we gather large volumes of direct or indirect evidence of relationships of interest, and develop learning approaches to understand and generalize.

If you are attending NIPS 2015, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. You can also learn more about our research being presented at NIPS 2015 in the list below (Googlers highlighted in blue).

Google is a Platinum Sponsor of NIPS 2015.

PROGRAM ORGANIZERS
General Chairs
Corinna Cortes, Neil D. Lawrence
Program Committee includes:
Samy Bengio, Gal Chechik, Ian Goodfellow, Shakir Mohamed, Ilya Sutskever

ORAL SESSIONS
Learning Theory and Algorithms for Forecasting Non-stationary Time Series
Vitaly Kuznetsov, Mehryar Mohri

SPOTLIGHT SESSIONS
Distributed Submodular Cover: Succinctly Summarizing Massive Data
Baharan Mirzasoleiman, Amin Karbasi, Ashwinkumar Badanidiyuru, Andreas Krause

Spatial Transformer Networks
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu

Pointer Networks
Oriol Vinyals, Meire Fortunato, Navdeep Jaitly

Structured Transforms for Small-Footprint Deep Learning
Vikas Sindhwani, Tara Sainath, Sanjiv Kumar

Spherical Random Features for Polynomial Kernels
Jeffrey Pennington, Felix Yu, Sanjiv Kumar

POSTERS
Learning to Transduce with Unbounded Memory
Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Phil Blunsom

Deep Knowledge Tracing
Chris Piech, Jonathan Bassen, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas Guibas, Jascha Sohl-Dickstein

Hidden Technical Debt in Machine Learning Systems
D Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, Dan Dennison

Grammar as a Foreign Language
Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton

Stochastic Variational Information Maximisation
Shakir Mohamed, Danilo Rezende

Embedding Inference for Structured Multilabel Prediction
Farzaneh Mirzazadeh, Siamak Ravanbakhsh, Bing Xu, Nan Ding, Dale Schuurmans

On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators
Changyou Chen, Nan Ding, Lawrence Carin

Spectral Norm Regularization of Orthonormal Representations for Graph Transduction
Rakesh Shivanna, Bibaswan Chatterjee, Raman Sankaran, Chiranjib Bhattacharyya, Francis Bach

Differentially Private Learning of Structured Discrete Distributions
Ilias Diakonikolas, Moritz Hardt, Ludwig Schmidt

Nearly Optimal Private LASSO
Kunal Talwar, Li Zhang, Abhradeep Thakurta

Learning Continuous Control Policies by Stochastic Value Gradients
Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Tom Erez, Yuval Tassa

Gradient Estimation Using Stochastic Computation Graphs
John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer

Teaching Machines to Read and Comprehend
Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom

Bayesian dark knowledge
Anoop Korattikara, Vivek Rathod, Kevin Murphy, Max Welling

Generalization in Adaptive Data Analysis and Holdout Reuse
Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, Aaron Roth

Semi-supervised Sequence Learning
Andrew Dai, Quoc Le

Natural Neural Networks
Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu

Revenue Optimization against Strategic Buyers
Andres Munoz Medina, Mehryar Mohri


WORKSHOPS
Feature Extraction: Modern Questions and Challenges
Workshop Chairs include: Dmitry Storcheus, Afshin Rostamizadeh, Sanjiv Kumar
Program Committee includes: Jeffrey Pennington, Vikas Sindhwani

NIPS Time Series Workshop
Invited Speakers include: Mehryar Mohri
Panelists include: Corinna Cortes

Nonparametric Methods for Large Scale Representation Learning
Invited Speakers include: Amr Ahmed

Machine Learning for Spoken Language Understanding and Interaction
Invited Speakers include: Larry Heck

Adaptive Data Analysis
Organizers include: Moritz Hardt

Deep Reinforcement Learning
Organizers include: David Silver
Invited Speakers include: Sergey Levine

Advances in Approximate Bayesian Inference
Organizers include: Shakir Mohamed
Panelists include: Danilo Rezende

Cognitive Computation: Integrating Neural and Symbolic Approaches
Invited Speakers include: Ramanathan V. Guha, Geoffrey Hinton, Greg Wayne

Transfer and Multi-Task Learning: Trends and New Perspectives
Invited Speakers include: Mehryar Mohri
Poster presentations include: Andres Munoz Medina

Learning and privacy with incomplete data and weak supervision
Organizers include: Felix Yu
Program Committee includes: Alexander Blocker, Krzysztof Choromanski, Sanjiv Kumar
Speakers include: Nando de Freitas

Black Box Learning and Inference
Organizers include: Ali Eslami
Keynotes include: Geoff Hinton

Quantum Machine Learning
Invited Speakers include: Hartmut Neven

Bayesian Nonparametrics: The Next Generation
Invited Speakers include: Amr Ahmed

Bayesian Optimization: Scalability and Flexibility
Organizers include: Nando de Freitas

Reasoning, Attention, Memory (RAM)
Invited speakers include: Alex Graves, Ilya Sutskever

Extreme Classification 2015: Multi-class and Multi-label Learning in Extremely Large Label Spaces
Panelists include: Mehryar Mohri, Samy Bengio
Invited speakers include: Samy Bengio

Machine Learning Systems
Invited speakers include: Jeff Dean


SYMPOSIA
Brains, Mind and Machines
Invited Speakers include: Geoffrey Hinton, Demis Hassabis

Deep Learning Symposium
Program Committee Members include: Samy Bengio, Phil Blunsom, Nando De Freitas, Ilya Sutskever, Andrew Zisserman
Invited Speakers include: Max Jaderberg, Sergey Ioffe, Alexander Graves

Algorithms Among Us: The Societal Impacts of Machine Learning
Panelists include: Shane Legg


TUTORIALS
NIPS 2015 Deep Learning Tutorial
Geoffrey E. Hinton, Yoshua Bengio, Yann LeCun

Large-Scale Distributed Systems for Training Neural Networks
Jeff Dean, Oriol Vinyals

Monday, 30 November 2015

Digital Analytics Association San Francisco Symposium: ‘Tis the Season for Data

The fourth annual Digital Analytics Association (DAA) San Francisco Symposium is coming up! Join us on Tuesday, December 8th as we host the symposium at Google’s San Francisco office. This year’s event is focused on how all businesses use data to optimize, personalize, and succeed through the holidays. 


Our lineup of great speakers includes:
  • Jim Sterne, Target Marketing and the DAA
  • Kristina Bergman, Ignition Partners
  • Adam Singer, Analytics Advocate, Google
  • Prolet Miteva, Senior Manager Web Analytics Infrastructure, Autodesk
  • Joshua Anderson, Senior Manager Analytics, BlueShield
  • Michele Kiss, Senior Partner, Analytics Demystified
  • David Meyers, Co-Founder/CEO, AdoptAPet
  • and other great speakers

Theme: Optimization, personalization, and how to succeed through the holidays
When: Tuesday, December 8th, 2015. Registration starts at 12:30. Program runs from 1:00 to 5:30, followed by a networking reception. 
Where: Google San Francisco, 345 Spear Street, 7th Floor, San Francisco, CA 94105
Cost: $25 for DAA members/$75 for non-members
Event website and registration: register here

Space is limited so register early!

San Francisco locals, this Symposium is organized by local DAA members and volunteers. We encourage you to become a member of the DAA and join our local efforts. Become a member and reach out to one of the local chapter leaders, Krista, Charles or Feras.

Happy Holidays!

Posted by Krista Seiden, Google Analytics Advocate

Tuesday, 24 November 2015

Engagement Jumps 30% for Wyndham Vacation Rentals With Help from Google Analytics Premium

Wyndham Vacation Rentals runs 9,000 North American rental properties from the mountains of Utah to the beaches of South Carolina.  As you can imagine, that many guests and destinations creates some interesting challenges for Wyndham's online booking system.  

They turned to Google Analytics Premium and Google Tag Manager for help, and we've just published a new case study showing the results. (Spoiler alert: property search CTRs are up by 30%.)

Wyndham did some very clever things with both tools. For instance, they used Google Tag Manager to implement Google Analytics Premium Custom Dimensions to capture user behavior around metrics like rental dates and length of stay. Then they used Google Analytics Premium to dig into the details and gather insights. That's how they learned that, while a "good view" is one of the top things customers included in searches, the scenic view attribute actually had a lower conversion rate than other features offered in their suites.  

As a result, Wyndham redesigned its search results to put the properties with the most profitable mix of attributes on the first page. The Wyndham team also learned how far in advance people begin searching for various vacations, and has adjusted campaigns and spending to match the peaks in demand.

With changes like these, Wyndham's customers are maintaining more interest through all stages of the funnel. Wyndham says its property search CTR has skyrocketed by more than 30%. Here's what Nadir Ali, their Director of eCommerce Analytics, has to say about this success:
“Google Analytics Premium is helping us connect the dots. As a data-driven organization, we strive to approach each business challenge objectively and back our assumptions with data. Google Analytics Premium gives us the flexibility to customize the data we collect in a manner that makes it easy to answer our business questions.” 
We're always happy to see the creative ways partners use Google Analytics Premium and tools like Google Tag Manager. Congrats to Wyndham on some excellent (and ongoing) results.


Friday, 20 November 2015

Cancer.org donations rise 5.4% with help from Google Analytics

The American Cancer Society has been working for more than 100 years to find a cure for cancer and to help patients fight back, get well and stay well. Today, the Society uses a number of websites and mobile apps to provide information on cancer detection and treatment, offer volunteer opportunities, and accept donations. 

The Society knew they were being visited by users with different needs and goals, but it was a challenge to isolate these customer segments and to help them achieve their goals. The Society also wanted to address concerns with the Google Analytics implementation on its sites, monitor how its users changed behavior over time, and remarket to all segments once they were identified. 

In order to find the data and insights necessary to address the challenges above, the Society partnered with Search Discovery, a Google Analytics Certified Partner. To achieve these goals, they analyzed the website's user segments and created personas to represent them. Then, they used segmentation and custom metrics to score each group based on how it behaved on the website.

To learn how the American Cancer Society and Search Discovery worked together to implement a process to understand, optimize and monitor the overall health of the site for each user segment, download the detailed case study. And if you want to help save lives, donate today.




According to Ashleigh Bunn, Director of Digital Analytics: 
“The insights we’ve gained from Google Analytics and working with Search Discovery continue to influence the Society business decisions for the positive. Not only are our marketing decisions well informed, but our digital content is driven by user experience and engagement. We’re looking forward with enthusiasm and optimism.”

Posted by Daniel Waisberg, Analytics Advocate

Tuesday, 17 November 2015

Progressive Builds a Better Mobile App with Google Analytics Premium

You've probably seen Progressive Insurance's terrific commercials with Flo the enthusiastic cashier. But have you seen their terrific new mobile app?

It's a story of perseverance. Their full site at Progressive.com has been rated as America's best insurance carrier website for more than a decade.1 But a few years back, as consumers shifted to mobile, the company realized it needed to make a whole new push to build a mobile app that matched its customers' changing behavior. They had a mobile app—they just didn't have a great mobile app. 

Progressive recognized that, to get started, they'd need sophisticated analytics to really understand what their customers wanted most from a mobile experience. They turned to Google Analytics Premium, in combination with other Google measurement tools, for the solution. Features like Custom Reports, Custom Dimensions and the integration with BigQuery let them streamline the app testing process, spot the root causes of app crashes, and simplify user logins.

“The Google Analytics Premium user interface lets us easily understand the consumer experience on apps. Both our IT and Business organizations rely on this data.” — Kaitlin Marvin, Digital Analytics Architect, Progressive Insurance
Google Analytics Premium helped Progressive move fast. Their team lowered app testing time by 20% and boosted successful customer logins by 30%. The result is a mobile app that may not be quite as famous as Flo, but continues the best-in-class tradition that keeps Progressive customers happy and loyal. 

1Progressive Insurance, "Keynote Recognizes Progressive Insurance for the 24th Time as Premiere Insurance Carrier Website," March 17, 2015.

Thursday, 12 November 2015

Share Google Analytics data and remarketing lists more efficiently using manager accounts (MCC)

The following was originally posted on the AdWords Blog.

From monitoring account performance at scale to making cross-account campaign changes, manager accounts help many of the most sophisticated AdWords advertisers get more done in less time. To deliver more insightful reporting and scale your remarketing efforts, we’re introducing two new enhancements to manager accounts: Google Analytics account linking, and remarketing tag and list sharing.

Access your data with a single link

You can now link your Google Analytics or Google Analytics Premium account directly to your AdWords manager account using the new setup wizard in AdWords under Account Settings. This streamlined workflow for linking accounts eliminates the need to link each of your Google Analytics and AdWords accounts individually.


Now when you import your goals, website metrics, remarketing lists, or other data from Google Analytics, you'll only need to do it once. And whenever you add a new AdWords account to your manager account, it will automatically be linked with the same Analytics properties.

These enhancements save time so you can focus on optimizing your campaigns. You can learn more about linking your Google Analytics account into your manager account in the AdWords Help Center.

Scale your remarketing strategy

Many advertisers are seeing tremendous success re-engaging customers and finding new ones using Display remarketing, remarketing lists for search ads, and similar audiences. To help scale these efforts across the AdWords accounts you manage, you now have options for creating and sharing remarketing lists directly in your manager account from the new “Audiences” view, including any lists imported from Google Analytics or Customer Match.

You can also create remarketing lists using a manager-level remarketing tag and use them across your managed accounts. This eliminates the need to retag your website and manage multiple lists in each AdWords account. If any of your managed accounts have their own lists, they can be made available for use in your other managed accounts.

These enhancements make it easier and faster than ever before to get your remarketing strategy up and running. You can learn more about sharing remarketing tags and lists in the AdWords Help Center.


Posted by Vishal Goenka, Senior Product Manager, AdWords

Happy 10th Birthday, Google Analytics!

Today marks the 10th anniversary of the launch of Google Analytics. So much has changed over the last decade! If you think back ten years, the most popular smartphone was the BlackBerry and a 128-megabyte flash drive cost about $30. Today you can get 250X the storage for half the price.

During that time, the world we now refer to as “digital analytics” has changed significantly. 

Our mission when Google acquired Urchin Software was to empower a broad range of website developers and marketers to better understand and improve their business through powerful, yet easy-to-use, analytics tools. In pursuit of that goal we've continued to bring new digital analytics capabilities to the market with Google Analytics Premium, Google Tag Manager, and Adometry. I’m really proud of the effort and innovation we have put forth along the way.

Looking back on the journey, here are my top ten Google Analytics highlights in no particular order:
  1. Event Tracking - A powerful feature added early on that allows users to track visitor actions that don't correspond directly to a pageview. With event tracking, specific actions like PDF downloads and video views are easily tracked, categorized, and analyzed. This feature has become critical as websites have moved away from a page-structured model. Read how PUMA uses custom filters, event tracking, and advanced segments to kick up order rates by 7%.
  2. Real-Time Reporting - Providing insights into what is happening at any given moment, real-time reporting is a powerful set of reports that are invaluable when checking campaign tagging, launching new campaigns, or understanding the immediate impact of social media. Read how Obama for America used Google Analytics to democratize rapid, data-driven decision making.
  3. Multi-Channel Funnels, Attribution Modeling, and Data-Driven Attribution - Multi-channel funnels was the first step in helping marketers move from last-click attribution and gain insights into the full path to conversion. Next up was Attribution Modeling, which helps businesses distribute credit to all marketing touch-points in the conversion process. We currently provide algorithmic models and a new set of reports designed to take the guesswork out of attribution and make it more accurate. Watch how attribution modeling increases profit for Baby Supermall.
  4. Tag Management - As the complexity of digital marketing and data collection continued to increase, it became clear that our users needed better tools for managing tags. Google Tag Manager consolidates website tags with a single snippet of code and lets users manage everything from an easy-to-use interface. Read how Domino’s Increased Monthly Revenue by 6% with Google Analytics Premium and Google Tag Manager.
  5. Analytics Academy - A great product is only as good as its users' ability to take advantage of all it offers. As the world of marketing and analytics increased in complexity, it became more important for Google Analytics users to be able to stay up to speed on all the changes. In response to this need, the Analytics Academy offers users a hub to participate in free, online, community-based video courses about digital analytics and, specifically, Google Analytics.
  6. Universal Analytics - Universal Analytics (UA) was a big step for Google Analytics on two dimensions. First, UA helps address the challenges of today’s multi-screen, multi-device world by combining visitor activities across devices into a single view. Second, it is the foundation of people-centric analytics and enables features like User ID, Lifetime Value, and Cohorts reporting. Read how 1stdibs luxury marketplace hit new heights with Google Analytics Premium.
  7. Measurement Protocol - This feature was one of the foundations for Universal Analytics. It helped Google Analytics evolve from a “web analytics” tool into a “digital analytics” platform. The measurement protocol allows users to send, store, and visualize interaction data via an HTTP request. This enables developers to measure how users interact with their business from almost any environment, including offline transactions and IoT (Internet of Things) devices (a minimal sketch of such a request appears after this list). Read how AccuWeather unlocks cross-channel impact using Google Analytics Premium.
  8. Mobile App Analytics - With the explosion of mobile apps and devices, being able to measure and improve both app marketing performance and the app experience is critical. Mobile App Analytics reports are tailored for mobile app developers and marketers to measure the entire mobile customer journey—from discovery to download to engagement. And, when used with the User ID feature, businesses can better understand cross-device user behavior. Read how Certain Affinity used Google’s Mobile App Analytics to improve game design.
  9. Enhanced Ecommerce - A complete revamp of how Google Analytics measures the ecommerce experience. It provides clear insight into new, important metrics about shopper behavior and conversions including: product detail views, ‘add to cart’ actions, internal campaign clicks, the success of internal merchandising tools, the checkout process, and purchase. Read how Brian Gavin Diamonds saw a 60% Increase in customer checkouts with enhanced ecommerce.
  10. Remarketing - Remarketing with Google Analytics helps you easily create audiences based on behaviors of people who visit your website and mobile app. Those audiences are then made available for remarketing campaigns in AdWords, GDN and DoubleClick Bid Manager. Read how TransUnion sees drastic cost efficiencies and conversion improvement with Google Analytics Premium.
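The Measurement Protocol mentioned in item 7 really is just an HTTP request, so a minimal sketch fits in a few lines of Python; the tracking ID and client ID below are placeholders.

```python
import requests

# Send one "event" hit to Google Analytics via the Measurement Protocol.
payload = {
    "v": 1,                   # protocol version
    "tid": "UA-XXXXXXX-Y",    # your property's tracking ID (placeholder)
    "cid": "555",             # anonymous client ID
    "t": "event",             # hit type
    "ec": "kiosk",            # event category, e.g. an offline touchpoint
    "ea": "purchase",         # event action
    "ev": 42,                 # event value
}
response = requests.post("https://www.google-analytics.com/collect", data=payload)
print(response.status_code)  # the endpoint returns 200 even for invalid hits
```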
I want to thank all the Googlers, our certified partners, and most importantly, our users who have made this past decade such a fantastic experience. Rest assured, we’re not done yet. We continue to enhance Google Analytics, build new products, and provide new and innovative ways to help all businesses make better decisions. Stay tuned and cheers to the possibilities of the next decade.


Posted by Paul Muret, VP Engineering