Artificial Intelligence - Modelling and Evaluation

Overview

During the Modelling and Evaluation phase of the AI pipeline, the Artificial Intelligence Engineer explores a taxonomy of AI cores that can be used to solve the AI business case at hand. There are many types of AI core methodology, and numerous attempts have been made to classify techniques into major genres. The list is an ever-evolving one, and the taxonomies that have been proposed differ depending on the perspective from which one views the algorithms.

Supervised learning

Any AI whose learning relies on historical knowledge of the outcomes of the underlying target variables that map to business objectives can be regarded as supervised learning. In other words, before training the AI cores, the machine learning engineer has available a labelled dataset that models how the system has behaved historically, and can ask the AI cores to optimise their decision performance against that prior knowledge. The list of AI cores that can be used in this learning genre includes Neural Networks and Deep Neural Networks and their derivatives, simpler algorithms such as polynomial and logistic regressions, decision trees and all their derivatives, and statistical techniques such as Naïve Bayes.
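
The sketch below illustrates the idea in miniature, assuming scikit-learn and a synthetic dataset standing in for the historical, labelled business data; the choice of logistic regression and all parameter values are illustrative, not prescriptive.

    # Minimal supervised-learning sketch: the labelled dataset plays the role
    # of the historical knowledge of outcomes described above.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for a historical, labelled business dataset.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)  # learn from the prior outcomes
    print(accuracy_score(y_test, model.predict(X_test)))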


Unsupervised learning

In unsupervised learning the machine learning engineer has no historical knowledge available concerning the underlying target variable that maps to business objectives. What is available is a dataset that models the general historical behaviour of the system within an ecosystem. In this case the AI cores discover patterns within the data that can be used to optimise decision performance in the future. Numerous AI methodologies belong to this genre: deep neural networks can be used in all types of unsupervised learning, while other techniques are oriented towards specific tasks, such as OPTICS, k-means, or hierarchical clustering. It is often the case that, as unsupervised learning accumulates inference, it is followed by an early form of supervised learning, a genre that could be called semi-supervised learning.
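
As a minimal sketch of the genre, the example below runs k-means, one of the task-specific techniques named above, on synthetic data with the labels deliberately discarded; the cluster count and other parameters are illustrative assumptions.

    # Minimal unsupervised-learning sketch: no target labels are provided;
    # the algorithm discovers cluster structure purely from the data.
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=500, centers=3, random_state=42)  # labels discarded

    kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
    labels = kmeans.fit_predict(X)  # patterns inferred without any ground truth
    print(kmeans.cluster_centers_)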


Reinforcement learning

This type of learning is based on control theory and tries to imitate how we as humans learn. The genre itself is divided into two main categories, Markov processes and evolutionary processes. The main concept underlying this learning genre is that the AI system is left to explore a digital twin of the problem space and is only provided with positive or negative feedback depending on its decisions. The AI gradually calibrates its behaviour so that its decisions maximise the rewards it collects, until it finds the solution to the problem. In this genre we find algorithms such as Q-learning, which tries to generate a navigation policy for a Markov Decision Process that maximises the total accumulated reward, given a table of rewards for the transitions between Markov states. Deep Reinforcement Learning also belongs to this genre; it combines Reinforcement Learning and Deep Learning to create an agent that is able to navigate an unstructured space, meaning there is no need for predefined rewards on state-space transitions.
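
The following is a minimal tabular Q-learning sketch on a toy five-state chain, a stand-in for the digital twin described above: the agent receives only a reward signal and gradually calibrates a policy that maximises it. The environment, reward table, and hyperparameters are all illustrative assumptions.

    # Minimal tabular Q-learning sketch: a 5-state chain where reaching the
    # last state pays +1 and everything else pays 0.
    import random

    n_states, n_actions = 5, 2      # actions: 0 = left, 1 = right
    goal = n_states - 1             # reaching the last state is rewarded
    Q = [[0.0] * n_actions for _ in range(n_states)]
    alpha, gamma, epsilon = 0.1, 0.9, 0.2

    for _ in range(2000):           # training episodes
        s = 0
        while s != goal:
            # epsilon-greedy: mostly exploit the current estimates, sometimes explore
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda act: Q[s][act])
            s_next = max(0, s - 1) if a == 0 else min(goal, s + 1)
            r = 1.0 if s_next == goal else 0.0
            # Q-learning update: nudge towards reward + discounted best future value
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next

    # Greedy policy learned per state (1 = move right towards the goal).
    print([max(range(n_actions), key=lambda act: Q[s][act]) for s in range(n_states)])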


Ensemble structures

A collection of AI cores working together and contributing their decisions to solve a problem is called an ensemble structure. Ensembles tend to produce better results than unitary AI algorithms, especially if there is enough diversification within the structure. These structures can be wrapped under a collective algorithm such as xgboost, which replicates the same base algorithm many times. They can also be heterogeneous collections of models working together under a governance structure, a methodology that is called stacking. The governance structure itself can be as simple as a voting system or as complex as an AI overseeing all the AI cores and grading their contributions. Other techniques that can combine models into ensembles include, amongst others, bagging, boosting, Bayesian model averaging, and bucketing.
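
A minimal stacking sketch follows, assuming scikit-learn: two deliberately diverse cores contribute their decisions, and a simple meta-model plays the part of the governance structure that weighs them. The specific base models and data are illustrative choices.

    # Minimal stacking sketch: heterogeneous cores combined by a meta-learner.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    stack = StackingClassifier(
        estimators=[("tree", DecisionTreeClassifier()), ("nb", GaussianNB())],
        final_estimator=LogisticRegression(),  # the "governance" layer
    )
    stack.fit(X_train, y_train)
    print(stack.score(X_test, y_test))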

Learning optimisations

The choice of a training, validating, and testing methodology is a step that is implemented very close to the AI core. Essentially, the AI engineer needs to decide on a combined strategy covering these three steps of learning. The result of the process should be an AI core that delivers a good trade-off between variance and bias. The process involves dividing the data into appropriate subsets for training and testing, as well as deciding on the learning metrics used to evaluate the AI cores, the training-stoppage criteria, and the hyperparameter optimisation processes. Sometimes, to prevent bias while these methodologies are being chosen, part of the data is kept completely out of sight of the team during AI core implementation. Several issues may arise during the process, with overfitting and underfitting being the most prominent ones.

The process of optimising the AI cores during learning can be a very delicate balance of choices, and a very time-consuming one. As mentioned, multiple hyperparameters, the structural parameters of the algorithm that can greatly affect AI core performance, need to be optimised. The list is rather large and can also be specific to the AI core; some of the well-known ones include the learning rate, which is the size of the steps the algorithm takes when exploring the parameter space, and the number of epochs, which is how many times the algorithm will pass over the data. There are a few common strategies that a Machine Learning Engineer can deploy during hyperparameter optimisation within the modelling and evaluation phase, ranging from simple ones such as grid search, where the AI explores multiple fixed ranges of values in the hyperparameter space, to more stochastic ones such as evolutionary genetic algorithms.
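
The sketch below ties these pieces together, assuming scikit-learn: a test set is split off and kept out of sight while a grid search explores fixed ranges of two hyperparameters via cross-validation. The model, the grid values, and the split sizes are illustrative assumptions, not recommended settings.

    # Minimal hyperparameter-search sketch: grid search with cross-validation,
    # keeping a held-out test set untouched until the very end.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    grid = GridSearchCV(
        MLPClassifier(max_iter=500, random_state=42),
        param_grid={
            "learning_rate_init": [0.001, 0.01],   # step size of the updates
            "hidden_layer_sizes": [(10,), (50,)],  # network structure
        },
        cv=5,  # validation happens inside the cross-validation folds
    )
    grid.fit(X_train, y_train)  # the test data plays no part in the search
    print(grid.best_params_, grid.score(X_test, y_test))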