Demystifying AI & Data Science

As AI technologies keep getting better, we hear more and more about them. Yet while everybody talks about AI, very few people can give a clear, easy-to-understand explanation of what it is. On one side are those who talk without knowing; on the other are the experts who think they are making it easy to understand when their explanation is still too complicated.

A second buzzword we see a lot is Data Science. Most people do not know what it is, and many of those who think they know get it wrong. One of the biggest mistakes people make about Data Science is thinking that it is the same thing as AI. The two are related, but they are not the same.

So it has become clear to me that the first thing to do, before thinking about innovation or hiring people inside an organisation, is to define our terms and get a clear understanding of these areas of expertise.

What is Artificial Intelligence?

AI, or Artificial Intelligence, is about making machines intelligent. But what does that mean? What is intelligence? And how do experts transfer this ability to a machine?

Intelligence

Intelligence is the ability to adapt to new events or situations based on our past experience. We, as humans, accumulate experiences throughout our lives and use them to act and react in the present moment to the situations we face every day.

We make predictions based on the past data we’ve accumulated in order to decide on the best action to take.

Intelligence is also the ability to understand connections and patterns in information that we have never seen before. To do that, we either use our experience, by relating what we are facing to something we know, or we enter a trial-and-error mode, which helps us discover what works and what doesn’t.

So let’s recap the basis of intelligence:

Adapting to new situations based on past experience
Understanding new patterns and concepts by relating them to our current knowledge base
When facing a brand-new situation that it cannot relate to anything known, intelligence enters a test mode to discover information by itself

Artificial Intelligence

With that in mind, Artificial Intelligence is a subfield of Computer Science that tries to replicate human intelligence and capabilities using mathematical programs.

I talk about capabilities here because Artificial Intelligence refers to something bigger than just the imitation of human intelligence.

There are things that, until the rise of AI technology, only humans could do, and computer science had a hard time finding solutions that could even match human performance.

For instance, there is everything related to perception, such as vision and sound processing. Another example is language processing. Perception and natural human language are hard for machines to understand.

The reason is that, for developers to build programs that perform decently in those areas, they would need to implement thousands of rules by extracting all their knowledge about the subject and putting it into code. This is a very long and complicated task, and it is very prone to human error.

So AI technologies also rose from the need to have the machine learn those rules, rather than us spending countless hours implementing them and still missing the ones we don’t know about.
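To make the contrast between hand-written rules and learned rules concrete, here is a minimal sketch in Python. It assumes a toy spam-detection task that is not part of this post: the word list, the four training messages and the function names are all hypothetical, and the scoring is far simpler than what a real system would use.

```python
from collections import Counter

# Hand-written approach: every rule has to be thought of and typed in by a person.
HAND_WRITTEN_SPAM_WORDS = {"free", "winner"}  # hypothetical rule list


def rule_based_is_spam(message):
    """Flag a message only if it contains a word someone remembered to list."""
    return any(word in HAND_WRITTEN_SPAM_WORDS for word in message.lower().split())


# Learned approach: the "rules" (word scores) are derived from labelled examples.
def learn_word_scores(examples):
    """Score each word by how much more often it appeared in spam than in normal mail."""
    spam_counts, normal_counts = Counter(), Counter()
    for text, is_spam in examples:
        target = spam_counts if is_spam else normal_counts
        target.update(text.lower().split())
    all_words = set(spam_counts) | set(normal_counts)
    return {word: spam_counts[word] - normal_counts[word] for word in all_words}


def learned_is_spam(message, scores):
    """Flag a message when its words lean towards spam overall; unseen words count as 0."""
    return sum(scores.get(word, 0) for word in message.lower().split()) > 0


# Toy labelled data (made up for illustration).
TRAINING_EXAMPLES = [
    ("free prize inside", True),
    ("claim your prize now", True),
    ("meeting moved to monday", False),
    ("see you at the meeting", False),
]

scores = learn_word_scores(TRAINING_EXAMPLES)
```

With this toy data, a message such as "claim your prize" slips past the hand-written rules because nobody listed "prize", while the learned scores catch it because that word showed up in the spam examples.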

General AI and Narrow AI

When we talk about AI, a lot of people imagine some super-intelligence that will take over human jobs in the near future. This misunderstanding simply comes from a lack of knowledge about the subject.

In AI we have two types of development. The one that is primarily used today, and the one people usually mean when they talk about AI, is Narrow AI. The other, which is still at a very early stage of development, is General AI.

I can’t stress this enough: nearly all of the AI technology that you hear about is Narrow AI, and only a tiny fraction is about General AI. Now that this is clear, let’s define what those two are.

Narrow AI refers to the set of tools and technologies that target a specific task. Development in this area focuses on building highly specialised systems, each for one specific domain. For instance, NLP (Natural Language Processing) is a Narrow AI technology that targets the comprehension of human language.

Here we are far from the general idea of intelligence: we focus on a specific domain and try to make a machine intelligent within it. Which brings us to the concept of General AI.

General AI refers to intelligence as we defined it previously. This is the idea most people have of AI: an intelligent machine that can do everything a human can do, but faster and more efficiently.

This is an active area of research, and small steps are made each year to get closer to it. But we have yet to see significant results, and the technology is by no means ready to be used in a business environment.

Machine Learning & Deep Learning

Now that the difference between Narrow and General AI is clarified, let’s explore the most common use of AI technology nowadays: Machine Learning.

Machine Learning is a subfield of AI that attempts to build algorithms able to learn from data in order to do three kinds of things:

Predict something – Supervised Learning
Understand something about the data – Unsupervised Learning
Take efficient actions – Reinforcement Learning

Supervised, Unsupervised and Reinforcement Learning are the three main ways machines learn from data. That data can come from a wide range of sources: images, sounds, text, …

I will not go into the details of each type of learning here. I will do that in a future post.
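To give a feel for the three ways of learning listed above, here is a deliberately tiny sketch in plain Python. The three functions, their names and the toy numbers are illustrative assumptions, and each one is a simplified stand-in (nearest neighbour, a two-group split, explore-then-exploit) rather than a production algorithm.

```python
def predict_nearest(labelled_examples, x):
    """Supervised: predict the label of x from examples that already carry labels."""
    _, label = min(labelled_examples, key=lambda pair: abs(pair[0] - x))
    return label


def two_groups(values, iterations=10):
    """Unsupervised: split unlabelled numbers into two groups around two moving centres.

    Assumes the values are not all identical.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


def learn_best_action(reward_of, actions, rounds=10):
    """Reinforcement: try actions, observe rewards, and favour what has paid off so far."""
    totals = {action: 0.0 for action in actions}
    counts = {action: 0 for action in actions}

    def average(action):
        return totals[action] / max(counts[action], 1)

    for step in range(rounds):
        # Explore each action once, then exploit the best average reward seen so far.
        action = actions[step] if step < len(actions) else max(actions, key=average)
        totals[action] += reward_of(action)
        counts[action] += 1
    return max(actions, key=average)


# Toy data, made up for illustration.
labelled = [(1, "small"), (2, "small"), (10, "large"), (12, "large")]
rewards = {"a": 0.2, "b": 1.0}
```

The supervised function needs labels to learn from, the unsupervised one finds structure without any labels, and the reinforcement one only ever sees the reward that follows its own actions.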

We also hear a lot about Deep Learning (DL), but what is it? Deep Learning is a subfield of Machine Learning, which, as I mentioned just above, is itself a subfield of AI. So DL is a very specific way to go about solving the tasks we talked about earlier.

DL primarily uses different variations of neural networks. Put simply, these are loosely inspired by the way the human brain works using neurons. By having thousands or millions of them, the machine is able to create very abstract representations of the problem in order to find a solution.

The reason we hear about it so much is that this approach has produced very promising results in certain areas of computer science, such as computer vision, where machines can now reach human-level performance on some tasks.

But there are three primary issues with this technology:

A large amount of quality data is required
It is extremely computationally intensive
The interpretability of the model can be an issue in risk-averse domains

The first and second points come from the fact that models with millions of parameters need a lot of time and a lot of data samples to become accurate.
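To see how quickly the number of parameters grows, here is a small sketch that counts the weights and biases of a plain fully connected network. The layer sizes are hypothetical (784 inputs, two hidden layers, 10 outputs), and real architectures count their parameters differently depending on the kind of layer.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network.

    Each layer has (inputs * outputs) weights plus one bias per output.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical layer sizes, chosen only for illustration.
small_network = count_parameters([784, 512, 512, 10])
wide_network = count_parameters([784, 2048, 2048, 10])

print(f"small network: {small_network:,} parameters")
print(f"wide network:  {wide_network:,} parameters")
```

Under these assumptions, making the two hidden layers four times wider takes the network from roughly 670 thousand parameters to a little under six million, each of which has to be adjusted during training.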

The last point concerns the interpretability of the model. The model creates a very abstract representation of the problem in order to find a solution. We call these algorithms “black boxes” because the approximation of reality they generate is so complex that it is not easy to understand how they actually reach a decision.

This can be a “no go” for some businesses, for instance when results need to be explained to regulatory agencies or during audits. It really depends on the circumstances of each business to decide how and why they need to explain their models.

On the other side, the rest of Machine Learning relies on simpler algorithms that can be easier to understand and explain.

Now that AI is clear, let’s move on to Data Science.

What is Data Science?

Definition of Data Science

Data Science is the science of answering questions using data. Those questions come from the issues and problems that businesses are facing.

The questions that a data scientist may need to answer come in three forms:

Question about the past – What happened?
Question about the present – What is happening?
Question about the future – What is going to happen?

Depending on whether the question is about the past, the present or the future, Data Scientists have different tools in their toolkit to answer it.

But regardless of the question they need to answer, the process a Data Scientist goes through stays mostly the same:

Data understanding & preparation – Getting to know the data, exploring it and preparing it for the model
Modeling – Building the statistical or probabilistic model to answer the question
Putting the model in production – Deploying the model inside the company’s information system so it can be used in real situations
Monitoring – Monitoring the model over time to keep it relevant and to check that no drift is occurring
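As an illustration of the monitoring step, here is a minimal sketch of one possible drift check: it compares the average of a feature in recent data with its average in the training data. The function names, the threshold of 2.0 and the numbers are assumptions made for this example; real monitoring usually tracks several features and the model’s own performance as well.

```python
from statistics import mean, stdev


def drift_score(training_values, recent_values):
    """Distance between the recent mean and the training mean, in training standard deviations.

    Assumes the training values are not all identical (otherwise the spread is zero).
    """
    spread = stdev(training_values)
    return abs(mean(recent_values) - mean(training_values)) / spread


def has_drifted(training_values, recent_values, threshold=2.0):
    """Flag drift when the recent data has moved further than the (hypothetical) threshold."""
    return drift_score(training_values, recent_values) > threshold


# Made-up values of one input feature, e.g. an average basket size.
training = [10, 12, 11, 9, 10, 11, 9, 10]
recent_stable = [10, 11, 9, 10]
recent_shifted = [15, 16, 14, 15]
```

Here `has_drifted(training, recent_stable)` stays false while `has_drifted(training, recent_shifted)` becomes true, which would be the signal to investigate and possibly retrain the model.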

What makes Data Science different from AI & Machine Learning

AI & Machine Learning are tools that a Data Scientist can use to solve problems and answer questions.

We have already mentioned the toolkit of a Data Scientist, but what exactly is in it?

Analytics
Visualization
Optimisation
Statistics
Machine Learning (ML)
… Just to name a few

As we can see, ML is just one tool among others. That’s why Data Scientists need a wide range of skills to accomplish their tasks.

But this comes at a cost. Being good at everything means you are not the best at anything.

What about Data Analysts & ML Engineers?

On the opposite side we find the Data Analyst and the ML Engineer. Unlike Data Scientists, these experts are highly specialised.

Data Analysts seek to understand the data in order to create comprehensive reports for the business, while ML Engineers focus on ML technology in order to build models.

Data Scientists fill the gap that exists between the Data Analyst and the ML Engineer. If you know exactly what you need in your project and how to do it, calling in an expert Data Analyst or an expert ML Engineer might be the right call.

But this is most likely not going to be the case. Since most businesses are not experts in data, having a Data Scientist who can judge which tool to use in which situation is very useful.

This is the primary reason behind the rise in popularity of the Data Scientist job. It fills a need born from the rise of AI technology: most businesses want to use it but don’t know how, and Data Scientists have the answers.

So to summarise, doing Data Science doesn’t necessarily mean doing Machine Learning or AI. A Data Scientist will help you assess your current level of maturity in order to find the best solution to the problem you are trying to solve, given your current level of development.

Myths surrounding AI & Data Science

I want to finish today’s post by talking about some of the most damaging myths surrounding AI & Data Science, myths that prevent organisations from taking full advantage of these techniques. In the information age, this can be the difference between staying in business and going out of business.

Myth 1 – Data Science means doing Machine Learning

This is just a reminder. As I explained above, Machine Learning is a tool that Data Scientists can use to solve business problems. They will weigh whether or not to use it according to the level of digital maturity of the business.

It all comes down to the 7 stages of automation I talked about in my previous post. Where are you currently? And where do you need to go next? Those are the important questions.

Myth 2 – End-to-End Model – We can do anything using AI, right?

This myth emerges from a fundamental lack of understanding of what AI and Data Science do. So I am going to make it simple so you never make this mistake again.

What you want to build in order to innovate is a complete system. Some parts of it are going to be Systematic and others are going to be Discretionary.

The Systematic elements of a system are the parts that can be developed using basic algorithms and rule-based logic. Given the set of inputs we have, we know exactly what set of outputs we are going to get. All cases are handled systematically and we know exactly what to expect from our system.

In this part, AI & Data Science are not required. I would even suggest avoiding them altogether in the systematic part of a system, because their role is to introduce discretion into the system.

Now let’s talk about the Discretionary part. The discretionary part of a system is the part that, until the rise of AI, was primarily handled by humans. These are the areas where experience and learning from past examples come into play. More than that, this is where we don’t know for sure, or where we would need millions of rules in order to decide.

This is where AI and Data Science are going to deliver the most value: we build statistical and probabilistic models to help make the same decisions that were previously made by humans.

So to wrap things up on the “End To End” AI model: it doesn’t exist. You first need to build a system that handles the Systematic part of your business processes. Once this system is built, you will be able to identify the gray areas where innovation using AI and Data Science will actually bring you the most value.
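The split between the Systematic and the Discretionary part can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the expense-claim scenario, the thresholds and the function names are hypothetical, and the "model" is a simple approval rate over similar past claims standing in for a real statistical model.

```python
def systematic_decision(amount, has_receipt):
    """Rule-based part: the same inputs always give the same, explainable output."""
    if not has_receipt:
        return "reject"
    if amount <= 50:
        return "approve"
    return None  # gray area: the rules do not decide


def discretionary_decision(amount, past_claims):
    """Stand-in for a statistical model: share of similar past claims a human approved."""
    similar = [
        approved
        for past_amount, approved in past_claims
        if abs(past_amount - amount) <= 100
    ]
    if not similar:
        return "send to a human"
    approval_rate = sum(similar) / len(similar)
    return "approve" if approval_rate >= 0.8 else "send to a human"


def decide(amount, has_receipt, past_claims):
    """Run the systematic rules first; only the gray area reaches the model."""
    decision = systematic_decision(amount, has_receipt)
    if decision is not None:
        return decision
    return discretionary_decision(amount, past_claims)


# Made-up history of (amount, approved_by_a_human) pairs.
PAST_CLAIMS = [(120, True), (150, True), (180, True), (200, True), (900, False)]
```

In this sketch the rules settle the obvious cases on their own, and the data-driven part is only consulted for what the rules leave open, which is where the gray area sits.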

Myth 3 – AI is going to replace jobs

I think most people involved in organisations that want to innovate using AI don’t actually believe that. Generally, this is a belief held by people who don’t really understand the technology.

But after reading this blog post, I hope that is no longer your case.

The reason I still included this one is that, as a manager or leader of an organisation, you should consider that this may be a subconscious fear among your employees. Even if they consciously accept that it is not the case, part of them may still fear for their jobs.

So I think it is important to share the vision behind why you are pushing for the development and use of AI, and how you believe it is going to improve your employees’ working conditions rather than just eliminate their jobs.

AI & Data Science require strong involvement from the business in the process of designing models. If people in the business subconsciously fear this progress, it creates massive friction in the innovation process and can lead to the failure of many projects.

Share your vision, and share the purpose that drives this innovation. And if you know you should do it but don’t know how to explain why, think about the fact that most innovation using AI & Data Science aims at the same thing: automating low-value, time-consuming tasks so the business can focus on the most valuable ones, and as a result making better use of the business’s time.

Conclusion

Artificial intelligence, Narrow & General AI, Machine Learning, Deep Learning, Data Science, the difference between Data Science & AI …

That is a lot of terms that are far from easy to understand at first glance. But I hope this post helped you understand them better. Understanding is the first step toward the efficient use of knowledge.

So whether you are looking to innovate inside an organisation or are just curious about the subject, I think you now have a solid basis to follow along when people talk about these subjects and to make better, well-informed decisions in the future.

Throughout this year, I am going to dive into more detail on each of these domains to help you further your knowledge.

Thanks for reading and see you next time!

Thomas Duforest