Machine Learning for the Layperson

Audience: Non-technical business professionals who have a hard time understanding technology and are trying to implement new technology in their companies.

Machine learning is creeping into companies of all sizes, and I’ve found that many of the people who want to implement it aren’t in IT. Business clients are able to explain the results of machine learning within an application or set of applications, yet have trouble understanding exactly what machine learning is, how it works, and why it takes longer than they expect to get the results they want.

So I am here to explain what machine learning is, in a way that I hope will help connect the dots for those of you who struggle with it.

Google’s “Did you mean”

Let’s Start Simple

Most readers are familiar with Google Search. You search for something and it returns the answer. You search for something and misspell it, and it gives you a suggestion like the one in the image above: I searched for “vegn” and it asked if I meant “vegan”. Well, yes, I did mean “vegan”! My next action is to click on the “vegan” link, and I get the results that I originally intended to receive. Easy!

What’s Happening Here

How did Google come to know that “vegan” is what I wanted to search for? Why not “vegas” or the acronym “VEGB”?

Before Google Search learned that those who type “vegn” usually mean “vegan”, something very important had to happen in order for Google Search to come to that prediction. So, how did it learn?

Let’s take a look at what happens when I butcher the word “vegan” a little bit differently:

Not all misspellings are recognized by Google Search, nor should they be

Well, you learn something new every day! My misspelling of “vegan” to “vengan” apparently is a Spanish word meaning “come”.

Why doesn’t it ask me “Did you mean vegan” like it did before?

Google Search and Supervised Training

In order for Google Search to learn to ask me “Did you mean vegan,” I first need to realize that I spelled “vegan” incorrectly, remove “vengan” from the search bar and type the correct spelling of the word “vegan”. In more of a step-by-step format, the cycle looks like this:

  1. Type “vengan”, hit enter, get the wrong information back
  2. Realize that I misspelled it and that’s why I’m seeing Spanish translations
  3. Immediately re-type “vegan”, hit enter, and get the right information back

When more people like me repeat this small cycle of typing the wrong search term “vengan” and replacing it with the right search term “vegan”, Google will eventually begin to display the “Did you mean vegan” suggestion. It is at this point that Google Search has finally learned that people who type “vengan” usually mean “vegan”. The suggestion doesn’t appear when I search for “vengan” today because not enough people other than myself have followed steps 1 to 3 above for these particular search terms. If few people ever misspell “vegan” as “vengan”, then Google Search will likely never show “Did you mean vegan.”

This is a good example of supervised training: people supply the right answer (“vegan”) for a given input (“vengan”), and the system learns from those examples. When you think about how many times this cycle must be repeated to get to the “Did you mean…” suggestion, you can start to see why training a machine learning model, especially one more complex than this, may take more time than you think.
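For readers who would like to see the cycle of steps 1 to 3 written down, here is a minimal Python sketch. It is only an illustration of the idea of counting corrections, not how Google Search actually works; the names `record_correction`, `did_you_mean`, and the `MIN_CORRECTIONS` threshold are all made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many times the same correction must be seen
# before a suggestion is shown. The real criteria are not public.
MIN_CORRECTIONS = 3

# corrections[typed][retyped] counts how often people typed `typed`
# and then immediately retyped `retyped`.
corrections = defaultdict(Counter)

def record_correction(typed, retyped):
    """Record one pass through steps 1 to 3: a search, then a retyped search."""
    corrections[typed][retyped] += 1

def did_you_mean(typed):
    """Return a suggestion only if enough people made the same correction."""
    seen = corrections.get(typed)
    if not seen:
        return None
    best, count = seen.most_common(1)[0]
    return best if count >= MIN_CORRECTIONS else None

# Many people corrected "vegn" to "vegan"; only one corrected "vengan".
for _ in range(5):
    record_correction("vegn", "vegan")
record_correction("vengan", "vegan")

print(did_you_mean("vegn"))    # vegan
print(did_you_mean("vengan"))  # None
```

The second search returns nothing because a single correction is below the assumed threshold, which mirrors why “vengan” does not yet trigger a suggestion.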

One Step Further

Keeping the cycle of misspellings in mind, now think about a time when you asked a full question in Google and subsequently asked many other questions. It is quite possible that those questions were related to the first question. It is also possible that others have asked those same related questions in a similar manner, just like you.

Let’s say that I’m having a conversation over a short work lunch with a close friend of mine and I’ve completely blanked on what “a person who doesn’t eat meat” is called. I grab my phone off the table, type “a person who doesn’t eat meat” into Google and, lo and behold, my results include a new “People also ask” section:

Google Search returns suggested related questions

This is a truer representation of machine learning, in that Google Search may use a combination of two things: the related questions that people have asked after searching for “a person who doesn’t eat meat” (supervised learning), plus a machine learning algorithm that has learned on its own which questions are related and begins to cluster similar questions together without a human to teach it.

This now becomes an example of unsupervised training, where the machine learning algorithm learns from a very large set of data (all Google Search questions) and creates its own clusters of data (questions about people who don’t eat, or only eat, certain types of foods) related to our initial search (a person who doesn’t eat meat). As soon as Google Search has enough data (enough examples of “this is the question and that is the answer”), it can build relationships between those questions and others that may be similar.
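To make the idea of clustering a little more concrete, here is a small Python sketch that groups questions by how many words they share, with no human telling it which questions belong together. It is a deliberately simple stand-in (word overlap, also called Jaccard similarity, with a greedy grouping step) and not the method Google Search uses; the sample questions, the list of common words, and the `threshold` value are all assumptions chosen for illustration.

```python
# Very common words that say little about what a question is about
# (a hypothetical, hand-picked list).
COMMON = {"a", "is", "what", "who", "the", "do", "does", "you", "to", "in"}

def words(question):
    """Turn a question into a set of lowercase words, minus the common ones."""
    return {w for w in question.lower().replace("?", "").split() if w not in COMMON}

def similarity(q1, q2):
    """Jaccard similarity: shared words divided by all words in either question."""
    a, b = words(q1), words(q2)
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(questions, threshold=0.3):
    """Put each question in the first group whose first question is similar enough."""
    groups = []
    for q in questions:
        for group in groups:
            if similarity(q, group[0]) >= threshold:
                group.append(q)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([q])
    return groups

questions = [
    "what is a person who does not eat meat called",
    "how tall is the eiffel tower",
    "what is a person who does not eat dairy called",
    "where is the eiffel tower",
]

for group in cluster(questions):
    print(group)
```

With these sample questions, the two food questions end up in one group and the two Eiffel Tower questions in another, even though nobody labeled them.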

In order for it to get to that point, however, Google Search needs an initial set of good data (with good human training) to make sure that it gets the answer right. What happens when it gets bad supervised training? Although this is rarer these days, you may want to search for “Google Bombing” to get an idea. Businesses can get the best results in their own machine learning environments through careful collaboration between their human trainers and the machine learning algorithms.

Why does any of this matter?

Time and Money. In the age of everything now, one thing that Google Search has done is allow us to get to our answer faster via a machine learning algorithm or set of algorithms (which is a fancy word for a set of rules or step-by-step processes).

Yes, I may have only shaved 1 or 2 seconds off my search by clicking Google’s “Did you mean vegan” link instead of retyping the right word, but seconds these days are very valuable, especially in business, when those seconds are multiplied across the thousands of employees in an organization who waste time searching for answers from and about internal documents, manuals, products, knowledge bases, customers, tasks, suggestions, etc. It’s money down the drain.
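To see how quickly those seconds add up, here is a back-of-the-envelope calculation in Python. Every number in it is a made-up assumption for illustration (the employee count, the searches per day, and the working days per year are not from any real organization), so swap in your own figures.

```python
def hours_saved_per_year(employees, searches_per_day, seconds_saved, working_days):
    """Convert a few seconds saved per search into hours saved per year."""
    total_seconds = employees * searches_per_day * seconds_saved * working_days
    return total_seconds / 3600  # 3600 seconds in an hour

# Illustrative assumptions only: 5,000 employees, 20 searches each per day,
# 2 seconds saved per search, 250 working days per year.
print(round(hours_saved_per_year(5_000, 20, 2, 250)))  # 13889
```

Under these assumed figures, 2 seconds per search grows into roughly 13,889 hours a year across the organization.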

If you’re interested in learning more about how machine learning can help your organization, reach out to me via Twitter or LinkedIn. There are no stupid questions and I am happy to help educate anyone wanting to learn about this fascinating technology.

Syndi Espinoza is a solutions consultant & sales professional who aligns business and technology objectives.