Machine Learning is useless

Preamble

I would like to say “recently”, but I have actually been hearing a lot about Machine Learning for a few years now, and I didn’t want to believe it until now. Believe me, I truly didn’t want to believe it, but here we are: Machine Learning has officially replaced Big Data as the buzzword of the past few years, it will most probably still be the buzzword of next year, and I could not be more sad, frustrated, and worried about it. Please, haters, don’t hate me; Internet, don’t misunderstand me; companies, don’t hire me; but first of all, please, don’t teach anything to your machines before finishing this post (!) 🤓 They have never learnt anything until now and they have always felt good about it, so please keep them the simple operating systems they are, or at least talk with them before enrolling them in any advanced analytics course.

Introduction

The first question everyone should ask themselves, before even going to Google to search for the latest super cool tool to solve their problem, is: when is an ML tool a good way to solve a problem? I’ll say it again, quoting the question because it’s crucial:

When is a Machine Learning tool a good way to solve a problem?

The answer is quite simple, but it becomes complicated because there are many considerations to make before reaching the right answer. To answer this question properly, it helps to remember what Machine Learning is and what it is not.

What Machine Learning is not

Machine Learning is not a one-step solution, like “I need to make a cake. I need flour, eggs, sugar and lemon cream. Done”. It cannot solve all business problems or turn struggles into successes. I’ll say it again, just to be sure it gets printed on next year’s T-shirt:

Machine Learning cannot solve all business problems or turn struggles into successes.

The 4 commandments.

Cons number 1

This is to say that no, you can’t go to your customers (internal or external, it doesn’t matter) and convince them with a sentence like “I can solve your problem with Machine Learning”, because the answer is “No, you can’t”. And if you’re now worried because you think you’re not smart enough / prepared enough / young enough / Batman enough to do it, then again my opinion is “No, you are not Batman enough for sure, and this is not your fault, but please DON’T spread the Machine Learning religion more than governments, taxi drivers and even ice-cream sellers already have”.

Cons number 2

Machine Learning is not a tool to increase customer satisfaction. Yes, I know you’re thinking about a beautiful recommendation system that provides insights and gets your customer to spend 200 dollars on the services / products / whatever you sell. But no, machine learning will not give your customer more money to spend, so most probably no, you will not increase your revenue. It will not magically transform millions into billions just because you now know, from your past data, that you wasted a lot of money on doing / producing / party-rocking / whatever. Unfortunately, you and your customer will remain as poor as you are right now.

Cons number 3

If a problem requires identifying causality, then Machine Learning probably won’t be a good solution. What do I mean by causality? Sorry, wrong question. Why? Well, it’s super simple, just think about it for a second before going ahead. The answer is, of course, that you really don’t know the causes of almost anything inside your business. And if you’re thinking “No dude, I know perfectly well!”, then you are most probably making a lot of assumptions about things outside your perimeter: things you probably don’t know about or, even worse, are not interested in, regarding the problem you want to solve in the business you find yourself in.

Why even worse? Because if something is out of your scope, you are unlikely to jump into it for any reason, so it will remain outside your perimeter. The bigger your business is, the more difficult it is to have a clear, detailed big picture of the causes and consequences of everything, and thus to make the right assumptions about something.

Cons number 4

If there is not a lot of relevant data to feed a machine learning model, then it will not produce a valuable solution. This one is pretty simple: how can you imagine producing valuable information from irrelevant data? I mean, it’s already the challenge of a lifetime to produce valuable insights from cleaned-approved-by-NASA data! Don’t get me wrong, but relevant data is a must; without it, skip the idea without even investigating a machine learning solution. And please, don’t forget that even when ML is the right solution, the model you build is no more valuable than the data you provide to it. Let me repeat it:

The model built is no more valuable than the data you provide to it.

OK then… what can machine learning do?

What Machine Learning can do

If you follow AWS/Google/Microsoft/YourFamilyDoctor guidance, they will all agree in saying that it opens doors to innovation and true collaboration, and can help applications provide smarter solutions. And no, I will not quote this kind of supermarket sentence. Yes, but then… what can we say about Machine Learning before going for some insights? Because, as far as we all know, there are many general problems (more on that later) being solved by people in companies around the world, but only a portion of these companies succeed in taking advantage of machine learning models.

Machine Learning is a tool that can provide you with solutions to persistent business problems. OK, fair enough in the end, because this is the same approach we use for automation. You don’t automate something that has to be done only once. OK, maybe we both do, but it’s only because we have a problem, OK?!

When

Starting from that, let’s make a bullet list of when a Machine Learning tool is a good way to solve a problem, which was our initial question.

If the problem you want to solve is persistent (already discussed);
If the team that aims to solve the problem faces persistent challenges (first evaluate the challenges they need to face and then, starting from the solution they want to put in place, evaluate the pros and cons);
If the solution needs to scale;
If the problem requires personalization in order to be solved.

How

Still, you have to identify whether your problem and team fit these points and, even if you are able to, you should then start worrying about what a successful ML solution requires in order to be applied.

People: several skill sets are necessary to correctly address problems that ML can solve. These include Machine Learning Scientists, Applied Scientists, Data Scientists, Data Engineers, Software Engineers, Program Managers and Technical Program Managers, only to name a few;
Time: designing an ML solution, building it, testing it in production and evaluating it along the way is a super time-consuming activity. This can take weeks, months and possibly even years depending on the problem. And this implies taking into consideration human factors, discussions, alignment, etc.;
Money: there are costs not only for infrastructure, but also in terms of the right skill sets, technologies to be learnt, etc.

Not only Data: the Six Questions

Much of the data is useless, we all know this. The problem is that even the results of an ML model are often hard to understand, and if they seem easy to understand they could still be wrong. Machine Learning cannot help you identify the team that can provide the data, the team that can clean it correctly or the team that can correlate datasets with problems. More generally, there are a few questions you should ask everyone who wants to use ML inside your company, before going for an ML solution to a business problem.

What assumptions were made? Ask for a detailed explanation of the assumptions about the data and the algorithm used, to identify critical blockers that could stop your ML solution from performing well.
What is your learning target? The learning target of an algorithm is the value it should output, that is, the hypothesis. If you show an ad to a particular customer, will they buy the product? Hypothesis testing over huge amounts of data is the basis for ML success.
What type of ML problem is it? Many kinds of problems have already been solved, and identifying similar problems can raise good discussion points.
Why did you choose this algorithm? Asking whoever made a decision why it was made that way, and what the rationale behind it is, is key. There could be a baseline in the literature to compare against, etc.
How will you evaluate the model performance? Depending on the business problem, the performance of the Machine Learning model can be evaluated in different ways, and it is useful for every team to know a priori how they will evaluate results.
How confident are you that you can generalize the results? If the ML model only works on a specific dataset, maybe it is not such a good solution.
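
To make the last three questions a bit more concrete, here is a minimal sketch of what “compare against a baseline, on data the model has never seen” can look like. Everything in it is an assumption made up for illustration: the data is synthetic, the “model” is just a cut-off on a single feature, and the helper names (make_rows, fit_majority, fit_threshold, accuracy) are hypothetical. It is not a recipe, only the shape of the conversation you want to have with whoever proposes an ML solution.

```python
import random

# Hypothetical helpers and synthetic data, for illustration only.

def make_rows(n, seed):
    """Return n (feature, label) pairs; the label is a noisy function of the feature."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x + rng.gauss(0, 0.1) > 0.5 else 0
        rows.append((x, label))
    return rows

def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def fit_majority(rows):
    """Baseline: always predict the most frequent training label."""
    ones = sum(y for _, y in rows)
    majority = 1 if 2 * ones >= len(rows) else 0
    return lambda x: majority

def fit_threshold(rows):
    """'Model': pick the cut-off on the feature that maximises training accuracy."""
    candidates = [i / 20 for i in range(21)]
    best = max(candidates, key=lambda t: accuracy(lambda x: int(x > t), rows))
    return lambda x: int(x > best)

# Decide the evaluation protocol up front: fit on one set, score on another.
train_rows = make_rows(400, seed=1)
held_out_rows = make_rows(200, seed=2)

baseline_acc = accuracy(fit_majority(train_rows), held_out_rows)
model_acc = accuracy(fit_threshold(train_rows), held_out_rows)

print(f"baseline accuracy on held-out data: {baseline_acc:.2f}")
print(f"model accuracy on held-out data:    {model_acc:.2f}")
```

If the model cannot clearly beat the dumb baseline on the held-out rows, that is a good moment to go back to the first question and ask about the assumptions again.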

Scientist

It is important to understand that contributing to new algorithms, and even open-sourcing them, can be a main driver for scientists to apply to, or even just stay at, a particular company. The opportunity to collaborate with the open source community creates the best solutions. Since Machine Learning grows fast, another crucial aspect is that building good ML models requires scientists who constantly learn and pick up the latest trends in ML. I will quote this because it’s CRUCIAL:

Another crucial aspect is that building good ML models requires scientists who constantly learn and pick up the latest trends in ML.

Scientists should have access to relevant literature, and the opportunity to attend relevant tech talks, conferences and workshops.

Conclusion

Understanding your processes and your business internally is mandatory. Figuring out whether you can correlate the data you have with the problem you want to solve is mandatory. Remember also that no ML model will help you do this.

Thank you very much, Lauren Thomas, for your speech.

Thank you everybody for reading!
