Source: http://fortune.com/2015/01/22/the-algorithmic-ceo/

Why is Machine Learning so Hard?

Machine Learning is a fascinating, rapidly emerging field that is heavily marketed as a solution to many of today’s problems. Yet applying Machine Learning in a real-world production setting and getting promising results can be quite difficult. In this post, let’s examine what Machine Learning is and why it can be so hard.

What is Machine Learning?

Machine Learning (ML) is the concept that computers can be given a framework, in the form of algorithms, that lets them learn on their own without the need for rules-based programming. Cheap computing power and the large data sets made available by the internet propelled Machine Learning from a concept into a scientific field.1

It is important to note that Machine Learning is different from Artificial Intelligence (AI). As Bernard Marr writes for Forbes, these two topics are hot buzzwords that cause confusion when used interchangeably. Marr explains that Artificial Intelligence is the more general concept of machines being able to execute tasks in a manner that is considered smart – aka intelligent. The concept of AI has been around since Greek mythology and continued to advance with very early European computers reproducing simple arithmetic calculations. As technology has advanced, so too has our expectation of what makes a machine smart; we now want machines to complete tasks in a more human manner.

AI consists of two fundamental schools of thought – applied and general. Marr states that applied AI is the more common of the two: systems designed to carry out specific tasks such as trading stocks or driving cars. The second category is general AI, which is defined by algorithms or machines that can handle any task. General AI is much less common, but all the more fascinating. It is in this area that Machine Learning has developed. Marr advises that we think of ML not as a subset of AI, but rather as a “current state-of-the-art” version of AI.2

Why is it so hard?

It is important to remember that Machine Learning is just another tool at your disposal to solve a problem. Said differently, “don’t put Machine Learning on a pedestal,” writes Jason Brownlee for MachineLearningMastery.com.3 This is just one of the mistakes Brownlee says programmers make when starting out in Machine Learning. So after we knock ML out of its ivory tower, where it is relegated to academic publications, and really start thinking about how to apply it to our real-world problems, what other challenges will we face?

We need to trust Machine Learning to use it effectively. Peter Norvig, Director of Research at Google, stated at a recent AI conference, “What is produced [by machine learning] is not code but more or less a black box – you can peek in a little bit, we have some idea of what’s going on, but not a complete idea.”4 To build trust between ourselves and ML, we must understand how to evaluate the output while conceding that we won’t know or control the exact rules the algorithms follow, just as we would build trust between ourselves and another individual.
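
To make “evaluating the output” a little more concrete, here is a minimal sketch in Python. It is not taken from Norvig or any of the sources cited here; the model, the data, and every name in it are hypothetical stand-ins. The idea is simply that we can score a model on examples it has not seen before, without ever reading the rules inside it.

```python
import random


def black_box_model(x):
    # Hypothetical stand-in for a trained model. In practice we could not
    # read its rules; here it is a simple threshold so the sketch runs.
    return 1 if x > 0.5 else 0


def accuracy(model, examples):
    # Fraction of (input, true_label) pairs the model predicts correctly.
    # Only the model's outputs are used, never its internals.
    correct = sum(1 for x, label in examples if model(x) == label)
    return correct / len(examples)


# Hypothetical held-out data: the true rule (x > 0.6) differs slightly
# from what the model learned, so the model will make some mistakes.
rng = random.Random(0)
held_out = [(x, 1 if x > 0.6 else 0) for x in (rng.random() for _ in range(1000))]

print(f"Held-out accuracy: {accuracy(black_box_model, held_out):.3f}")
```

A score like this does not explain why the model behaves as it does, but tracking it over time on fresh data is one way to decide how much to rely on the model.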

From a technical perspective, Machine Learning can be considered a “fundamentally hard debugging problem,” according to S. Zayd Enam.5 Enam is the Founder of Stealth and a Stanford University PhD candidate. He goes on to write that ML is tough because either the algorithm doesn’t work, or it doesn’t work well enough. These issues are then compounded when problems exist with the actual model and/or the data. Enam does offer hope: humans can build an intuition for investigating how a model is going wrong, debug the issue, and produce a working ML pipeline.

Another consideration is whether your Machine Learning model achieves the intended result. Jure Leskovec, Chief Scientist at Pinterest, recently discussed with our Stanford class what he learned from implementing ML to improve user retention at Pinterest. The specific business metric his team wanted to improve was WAR28 – weekly active repinner after 28 days. Pinterest looks to WAR28 as a vital sign of its business and a measurement of how “sticky” users find the platform after 4 weeks of membership. The hypothesis was that if they could suggest relevant topics to a user at initial sign-up, then the user would stay more engaged with Pinterest. To test this hypothesis, Leskovec and team overcame many technical challenges in building an accurate ML model and created a user experience that suggests relevant topics curated to each user’s individual interests. Unfortunately, this enhancement had no effect on WAR28. Leskovec and team re-evaluated their assumptions and proposed a new hypothesis: that a user’s interests at initial sign-up were meaningless, and that it was their early interactions with Pinterest that would determine the content they wanted to see. This hypothesis, together with ML models built to apply it, positively impacted WAR28 and produced a successful result for the business.

Despite the challenges, Machine Learning offers a tremendous opportunity on all fronts. We should not view these obstacles as barriers to entry, but rather as caution signs on our road to discovery. Its significance in the competitive business landscape is paramount. Ram Charan writes that “any organization that is not a math house now or is unable to become one soon is already a legacy company”.6

 

1 Source: McKinsey – An executive’s guide to machine learning

2 Source: Forbes – What is the Difference Between Artificial Intelligence And Machine Learning?

3 Source: Machine Learning Mastery – 5 Mistakes Programmers Make when Starting in Machine Learning

4 Source: Forbes – 12 Observations About Artificial Intelligence From The O’Reilly AI Conference

5 Source: ai.Stanford.edu – Why is machine learning ‘hard’?

6 Source: Ram Charan, The Attacker’s Advantage: Turning Uncertainty into Breakthrough Opportunities, New York: PublicAffairs, February 2015.


3 comments on “Why is Machine Learning so Hard?”

  1. Great article Murphy!

    I just have a question: you said that we still are searching for practical usages of ML. You provided one example with Dr. Leskovec’s use of it in Pinterest (in WAR28). Do you have any other examples of other uses of the technology, not only in social media companies but in other areas as well? Or is ML only limited to “cybernetical” tasks?

    Thank you!

    1. Luca – I’m not Will but I work in the energy industry and I feel I can answer this.

      Many groups, especially in the heavy industries, are interested in understanding more about their equipment uptimes. ML is being investigated industry-wide to look for leading indicators for equipment failures so that repairs can be made prior to expensive down-time events.

  2. Hi Murphy. I think you provided an excellent overview of machine learning and the challenges that this field has been facing. I agree with you that sometimes machine learning can be a black box and that is why many data scientists would always say “understand your data very very well”. It is definitely a trend that more and more industries are going to use machine learning algorithms. I work in finance, and I can see the strong trend that financial firms rely more and more on the data and being the first to implement these new algorithms would bring the firm a lot of competitive advantage. However, it is true that people should always be very careful about these machine learning models because sometimes the maths can be misleading if you don’t understand it well and as you said, there is still a big part of human intuition that AI still cannot replace with current technology.
