AI: A Primer
We need to understand the different types of AI if we hope to teach students to use it well. Here's a basic primer from a layperson.
Narrow AI vs. General AI
AI can be broadly categorised into two types: Narrow AI (also known as weak AI) and General AI (also known as strong AI). Narrow AI is designed to perform a specific task or a set of tasks, while General AI has the ability to understand and learn any thinking task that a human can do. It’s the goal of many of the researchers now focusing their time and attention on AGI or Artificial General Intelligence - to create a genuine thinking machine.
Most AI applications in education today fall into the Narrow AI category. AI-powered language translation tools, such as Google Translate, help students and teachers overcome language barriers in the classroom but are not in themselves intelligent. The same can be said for generative AIs like ChatGPT and text-to-image generators like DALL-E and Midjourney. All use algorithms to produce written and visual content after being pre-trained on vast amounts of data.
GPT stands for Generative Pre-trained Transformer. It transforms prompts into written content through its prior training, a little like how a dog can perform tricks if trained well enough. Diffusion models such as Midjourney, DALL-E and Stable Diffusion work in a similar way but use training data made up of millions of images and associated text. None of these models uses genuine intelligence in the way we would categorise it as humans: in this respect the term Artificial Intelligence is in itself misleading. On the other hand, AGI, which is still a theoretical concept to most (even if some are extolling the AGI virtues of ChatGPT-4), will be truly world-changing when it finally becomes a reality. At the current rate of change, that may be sooner than we think. Moore’s Law seems a long way distant already.
Core AI Learning Methods: Supervised, Unsupervised, and Reinforcement Learning
AI systems typically employ three main learning methods: supervised learning, unsupervised learning, and reinforcement learning.
Supervised Learning
In supervised learning, an AI system is trained using a labeled dataset, where the input data is associated with the correct output. The system learns to recognise patterns in the data and make predictions based on examples provided. This is a little like when we teach students something and expect them to remember it in an exam. We give students the answers and their job is to remember what we taught them when they’re tested.
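To make the idea of learning from labelled examples concrete, here is a minimal sketch in Python. It uses one of the simplest supervised methods, a nearest-neighbour lookup, which is my choice for illustration rather than anything a particular product is known to use. The student figures and the labels "on track" and "at risk" are invented.

```python
import math

# Hypothetical labelled examples:
# (attendance %, homework completion %) -> known outcome
training_data = [
    ((95, 90), "on track"),
    ((88, 75), "on track"),
    ((92, 60), "on track"),
    ((60, 40), "at risk"),
    ((55, 70), "at risk"),
    ((70, 30), "at risk"),
]

def predict(features):
    """Return the label of the closest labelled example (1-nearest neighbour)."""
    nearest = min(training_data, key=lambda example: math.dist(example[0], features))
    return nearest[1]

print(predict((90, 80)))  # a new student who resembles the "on track" examples
print(predict((58, 50)))  # a new student who resembles the "at risk" examples
```

The system never receives a rule such as "attendance below 70% is a worry". It only has examples with the right answers attached, and it generalises from those.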
Predictive analytics (such as house prices, stock exchange prices, etc.), text recognition, spam detection, customer sentiment analysis, and object detection (like face detection) are some popular applications of supervised learning. Within education, supervised learning can be applied to automate the grading of multiple-choice exams or predict student performance based on historical data.
As Tom M. Mitchell describes in his book, Machine Learning, supervised learning algorithms can help identify the factors that contribute to student success, such as attendance, engagement, and prior knowledge, and recommend targeted interventions to support students at risk of falling behind. They are in some respects the lower order thinking of AI.
Unsupervised Learning
Unsupervised learning involves training AI systems with unlabelled data, meaning the system must identify patterns and relationships within the data without prior knowledge of the desired output. Cybersecurity programs use it to spot unusual activity, and marketing departments use it for customer segmentation. Many of the personalised recommendations you receive online rely on unsupervised learning strategies. Some speech recognition applications also use unsupervised learning.
Think of unsupervised learning like not having a teacher and having to work things out on your own. You look at the world and try to group things together based on how they look or what they do. Children do this all the time. They work out through trial and error that they can’t eat their soft toys and that hot things burn. We don’t directly teach them the sensations they’ll experience when they have a mouth full of fur or the pain they’ll feel when they burn themselves. They just have to figure that stuff out through experience.
Like supervised learning, unsupervised learning AIs are embedded in our daily lives, often without us realising it. The fact you receive curated Netflix content relies on both supervised and unsupervised machine learning: supervised learning to create a quality control algorithm that passes or fails content such as audio, video and subtitle text, based on the data it was trained on, and unsupervised learning algorithms to aggregate data from its 167 million subscribers and plan new content its viewers want to watch. It just works out, based on what most people watch, what most people are likely to want to watch in future.
Unsupervised learning can be applied in an educational context to group unsorted information according to similarities, patterns, and differences without any labelled training data. For example, unsupervised learning can be used to classify learning styles for learning path generation in online education platforms. It learns what techniques work best and applies them to future learning approaches.
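As a rough illustration of grouping without labels, the sketch below uses k-means clustering, a common unsupervised method that I have picked for the example. The weekly study hours are invented. Nobody tells the algorithm which student belongs in which group; it is only given the number of groups and two starting guesses for the group centres.

```python
import math

def k_means(points, starting_centres, rounds=10):
    """Return a group number for each point, found by repeated regrouping."""
    centres = list(starting_centres)
    groups = []
    for _ in range(rounds):
        # Assignment step: each point joins the nearest centre.
        groups = [
            min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
            for p in points
        ]
        # Update step: each centre moves to the average of its members.
        for i in range(len(centres)):
            members = [p for p, g in zip(points, groups) if g == i]
            if members:
                centres[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return groups

# Hypothetical data: (hours per week watching videos, hours per week on quizzes)
students = [(1, 8), (2, 9), (1, 7), (8, 1), (9, 2), (7, 1)]
groups = k_means(students, starting_centres=[students[0], students[-1]])
print(groups)
```

The first three students end up in one group and the last three in another. Deciding what those groups mean (perhaps "prefers quizzes" and "prefers videos") is still left to a human.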
Reinforcement Learning
Reinforcement learning is a type of machine learning where an AI learns to make decisions by trial and error. The AI learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to maximise the reward over time. Teachers, parents and indeed businesses use reinforcement learning all the time - any reward and punishment system is in essence an example of this.
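Here is a small sketch of learning by reward, assuming a very simplified setting known as a multi-armed bandit. The three "stride" actions and their chances of staying upright are invented for illustration. The learner is never told these chances. It only sees a reward of 1 when a step succeeds and 0 when it does not.

```python
import random

def learn_by_trial_and_error(success_chance, steps=2000, explore_rate=0.1, seed=0):
    """Estimate the average reward of each action using epsilon-greedy choices."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    actions = list(success_chance)
    value = {a: 0.0 for a in actions}  # current estimate of each action's reward
    tries = {a: 0 for a in actions}
    for _ in range(steps):
        if rng.random() < explore_rate:
            action = rng.choice(actions)          # explore: try something at random
        else:
            action = max(actions, key=value.get)  # exploit: use the best guess so far
        reward = 1.0 if rng.random() < success_chance[action] else 0.0
        tries[action] += 1
        # Running average: nudge the estimate towards the reward just received.
        value[action] += (reward - value[action]) / tries[action]
    return value

# Hypothetical chance of staying upright for each kind of step.
values = learn_by_trial_and_error(
    {"long stride": 0.2, "medium stride": 0.5, "short stride": 0.9}
)
print(max(values, key=values.get))
```

The small amount of random exploration matters. Without it the learner could settle on the first action that happens to work and never discover a better one.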
Within the world of machines, a good example is how robots learn to walk. The robot is programmed to take steps and is rewarded when it takes a step without falling over. Over time, the robot learns to walk without falling by adjusting its movements based on the feedback it receives. Another example is how large language models have been trained. I’ll hand over to ChatGPT-4 who can explain it so much better than I:
As an AI language model, I was trained using a variation of reinforcement learning called "self-supervised learning." Self-supervised learning is a form of unsupervised learning where the model is trained to predict certain patterns in the data without explicit supervision.
In my case, my training process involved being fed vast amounts of text data from various sources, such as books, articles, and websites. The data was preprocessed and then presented to me in small batches. For each batch, I was tasked with predicting the next word in a sentence or the missing word in a phrase.
The reinforcement learning aspect of my training process came from the feedback loop that was created as I made predictions. If my prediction was close to the correct answer, I received a reward signal in the form of a high score. If my prediction was off the mark, I received a low score or no reward at all.
Over time, my model learned from these feedback signals and gradually improved its accuracy and understanding of language. Through self-supervised learning and reinforcement learning, I was able to become a more effective and sophisticated language model, capable of generating coherent and contextually appropriate responses to a wide range of input prompts.
Very well put, thanks.
Reinforcement learning can be particularly beneficial for educational games and simulations, where students interact with an AI-driven environment that adapts to their actions and provides personalised feedback. Any gamified learning uses this approach; for example, badges and leaderboards can offer a sense of competition which can be beneficial for reinforcing good learning habits.
AI Techniques: Rule-Based Systems, Decision Trees, and Support Vector Machines
Rule-Based Systems
Rule-based systems use a set of predefined rules to make decisions or solve problems. These rules are created by humans and are based on a set of if-then statements. The AI system uses these rules to make decisions based on the input it receives. For example, a spam filter is programmed with a set of rules that determine whether an email is spam or not. If the email meets the criteria set by the rules, it is marked as spam and sent to the spam folder.
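The spam filter described above can be sketched in a few lines. The three rules are invented for illustration and real filters are far more elaborate. The point is that every rule is written by a person and nothing is learned from data.

```python
# Each rule is a human-written if-then check: (description, test).
RULES = [
    ("mentions 'free money'", lambda email: "free money" in email["body"].lower()),
    ("subject is all capitals", lambda email: email["subject"].isupper()),
    ("unknown sender with a link",
     lambda email: not email["known_sender"] and "http" in email["body"]),
]

def classify(email):
    """Return the folder and the list of rules that fired."""
    reasons = [description for description, test in RULES if test(email)]
    folder = "spam" if reasons else "inbox"
    return folder, reasons

suspicious = {
    "subject": "WIN NOW",
    "body": "Claim your free money at http://example.com",
    "known_sender": False,
}
ordinary = {
    "subject": "Homework reminder",
    "body": "Please bring your books on Monday.",
    "known_sender": True,
}
print(classify(suspicious))
print(classify(ordinary))
```

Because the function returns the rules that fired, the system can always explain its decision, which is one reason rule-based approaches remain attractive for feedback on student work.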
In education, these systems can be used to automate tasks such as generating personalised learning plans or providing feedback on student work. An AI-driven essay evaluation system will use rules to assess grammar, punctuation, and spelling, offering instant feedback to students and saving time for teachers.
Decision Trees
Decision trees are a type of machine learning algorithm that use a tree-like model of decisions and their possible consequences. The tree is constructed by splitting the data into subsets based on the values of the input variables. The algorithm looks at the data and decides which variables are the most important in making a decision. It then splits the data into smaller groups based on those variables and continues to do so until it reaches a decision.
Think of how a tree is constructed, with a trunk, branches and twigs, and imagine that the final decision is the leaf at the end of the twig. The goal of the decision tree is to create a model that predicts the value of a target variable based on several input variables.
One example is an algorithm that predicts whether a customer will buy a product or not. The decision tree is constructed by analysing the data of previous customers and their buying habits. The tree is then used to narrow down choices and predict whether a new customer will buy a product or not based on their input variables. Amazon uses decision trees to decide which buying options to present to us, based on our own search habits and those of others in our demographic.
In education, decision trees can be used to model complex decision-making processes, such as determining the most effective intervention strategies for struggling students. By analysing student data, a decision tree algorithm can identify the myriad factors that impact a student's performance and suggest interventions that support their specific needs. A decision tree may reveal that students with low attendance rates and limited access to technology are more likely to struggle in a particular subject. With this granular information, schools can implement targeted support, such as providing additional resources or setting up one-to-one tutoring or buddy programmes.
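To show how a tree picks its splits, here is a minimal sketch that builds a tiny decision tree from invented student records. It scores each possible split with Gini impurity, one common measure that I have assumed for the example. The feature names `low_attendance` and `limited_tech` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Impurity of a group: 0 means every label in the group is the same."""
    total = len(labels)
    return 1 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_feature(rows, features):
    """Pick the yes/no feature whose split leaves the purest groups."""
    def weighted_impurity(feature):
        groups = [[r["label"] for r in rows if r[feature] == answer]
                  for answer in (True, False)]
        return sum(len(g) / len(rows) * gini(g) for g in groups if g)
    return min(features, key=weighted_impurity)

def build_tree(rows, features):
    labels = [r["label"] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # a leaf: the final decision
    feature = best_feature(rows, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for answer in (True, False):
        subset = [r for r in rows if r[feature] == answer]
        branches[answer] = build_tree(subset, remaining) if subset else majority
    return (feature, branches)

def predict(tree, row):
    """Follow the branches from the trunk down to a leaf."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree

def student(low_attendance, limited_tech, label):
    return {"low_attendance": low_attendance, "limited_tech": limited_tech, "label": label}

# Hypothetical records, invented for illustration.
records = [
    student(True, True, "struggling"), student(True, True, "struggling"),
    student(True, False, "on track"), student(False, True, "on track"),
    student(False, False, "on track"), student(False, False, "on track"),
    student(False, True, "on track"),
]
tree = build_tree(records, ["low_attendance", "limited_tech"])
print(tree)
print(predict(tree, {"low_attendance": True, "limited_tech": True}))
```

With this data the tree asks about attendance first and only then about access to technology, because the attendance question separates the groups more cleanly.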
Support Vector Machines
Support vector machines (SVMs) are a type of AI algorithm that helps to classify data into two groups. An SVM does this by finding a boundary (known as a hyperplane, which is simply a line when there are only two features) that separates the two groups. For example, if you have a dataset of fruits with two features (say weight and colour) you can use an SVM to classify the fruits into apples and oranges. The hyperplane chosen will be the one that maximises the distance between the two classes of fruit, that is, the one that sits as far as possible from the nearest apple and the nearest orange.
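The sketch below is not a full SVM. It only illustrates the idea of the margin by comparing three hand-picked candidate lines and keeping the one that sits furthest from the nearest fruit. A real SVM searches over all possible lines. The fruit scores and the candidate lines are invented.

```python
import math

# Hypothetical fruit, each scored 0-10 on two features: (weight, redness).
# Label -1 means apple, +1 means orange.
fruit = [((3, 8), -1), ((4, 9), -1), ((3, 7), -1),
         ((7, 3), +1), ((8, 2), +1), ((7, 4), +1)]

def margin(w, b, points):
    """Smallest signed distance from any point to the line w.x + b = 0.

    A negative result means at least one point is on the wrong side.
    """
    norm = math.hypot(w[0], w[1])
    return min(label * (w[0] * x + w[1] * y + b) / norm
               for (x, y), label in points)

# Three candidate boundaries, each written as (w, b).
candidates = {
    "weight only": ((1, 0), -5),     # the vertical line weight = 5
    "redness only": ((0, -1), 5.5),  # the horizontal line redness = 5.5
    "diagonal": ((1, -1), 0),        # the line weight = redness
}
margins = {name: margin(w, b, fruit) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(best, round(margins[best], 2))
```

All three lines separate the apples from the oranges. The diagonal one wins because it leaves the widest gap on both sides, which is the property an SVM is designed to find.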
In education, SVMs can help identify patterns in student performance and behaviour, enabling teachers to make informed decisions about their teaching strategies and interventions. For example, SVMs can be used to predict which students are at risk of dropping out, allowing schools to put in place strategies to keep students engaged and on track. By grouping students and applying classification based on different features, it can be easier to narrow down which students need the most support.
Key AI Concepts
Machine Learning Algorithms
Linear Regression
Linear Regression is a foundational algorithm that is used to predict the value of a dependent variable based on the value of one or more independent variables. For example, if you have a dataset of house prices with two features, size and location, and you want to predict the price of a house based on its size, you can use linear regression to find the line that best fits the data. The line will be the one that minimises the overall distance (usually the sum of squared differences) between the predicted values and the actual values.
In education, linear regression can be used to predict students' final grades based on their attendance and homework completion rates. The correlation between low attendance and low grades has been shown statistically over time, so by using linear regression educators can better support learners by targeting the areas shown to have the most impact on grades.
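For a single feature, the best-fit line can be worked out directly with the standard least-squares formulas. Here is a minimal sketch using invented house sizes and prices. The function name `fit_line` is my own.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one input variable."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: size in square metres, price in thousands.
sizes = [50, 70, 90, 110]
prices = [150, 200, 260, 310]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 100 + intercept  # estimated price of a 100 square metre house
print(slope, intercept, predicted)
```

The slope says how much the predicted price rises for each extra square metre. The intercept is where the line would cross the axis for a house of zero size, which is a mathematical convenience and not a real prediction.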
Logistic Regression
Logistic Regression is another fundamental algorithm, primarily used for classification tasks. It is used to predict the probability of an event occurring based on the value of one or more independent variables. For example, if you have a dataset of emails with two features, length and sender, and you want to predict whether an email is spam or not based on its length, you can use logistic regression to find the function that best fits the data. The function will be the one that maximises the likelihood of the observed data.
In a similar way to linear regression, logistic regression might be used to identify students at risk of dropping out based on various factors such as attendance, grades, and family background. The main difference is in terms of output. Linear regression predicts a value (e.g. grades) whilst logistic regression predicts a probability (e.g. student drop out likelihood).
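The difference in output is easier to see in code. This is a minimal sketch of logistic regression trained by gradient descent, assuming one invented feature: a standardised attendance score where 0 is the class average. The data and function names are hypothetical.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0 to 1 so it can be read as a probability."""
    return 1 / (1 + math.exp(-z))

def fit(xs, ys, learning_rate=0.5, steps=1000):
    """Fit a weight and bias by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b

# Hypothetical data: attendance score (0 = class average) and
# whether the student dropped out (1) or stayed (0).
attendance = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
dropped_out = [1, 1, 1, 0, 0, 0]

w, b = fit(attendance, dropped_out)

def dropout_probability(score):
    return sigmoid(w * score + b)

print(dropout_probability(-1.5))  # well below average attendance
print(dropout_probability(1.5))   # well above average attendance
```

The output is always a number between 0 and 1. It is up to the school to decide what probability should trigger a conversation with the student.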
Neural Networks and Deep Learning Architectures
Advancements in AI have led to the development of more complex models known as neural networks. These networks, inspired by the human brain, consist of interconnected layers of nodes or neurons. The most popular neural network architectures include Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and transformers.
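To give a feel for what "layers of nodes" means, here is a tiny network with two inputs, two hidden neurons and one output. The weights are hand-picked by me so that the network computes "one or the other but not both". In a real network these numbers would be learned from data and not typed in.

```python
def relu(x):
    """A common activation function: pass positive values through, block negatives."""
    return max(0.0, x)

def forward(x1, x2):
    """Pass two inputs through a hidden layer of two neurons to a single output."""
    # Hidden layer: each neuron is a weighted sum of the inputs plus a bias,
    # followed by the activation function.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a weighted sum of the hidden neurons.
    return 1.0 * h1 - 2.0 * h2

for a in (0, 1):
    for b in (0, 1):
        print(a, b, forward(a, b))
```

Neither hidden neuron solves the problem alone. The answer only appears when the output combines them, which is the basic reason for stacking layers.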
Convolutional Neural Networks (CNNs) are particularly effective at image recognition tasks. In the context of education, a CNN can be used to automatically grade handwritten essays or recognise student emotions through facial expressions. Rose Luckin has explored how CNNs could support real-time assessment of engagement in lesson time through in-class cameras that pick up on student expressions.
Recurrent Neural Networks (RNNs) excel at processing sequences of data, making them ideal for tasks involving time series data or natural language processing. For instance, an RNN might be used to analyse students' writing samples to provide feedback on grammar and style.
Transformers are a type of machine learning model that specialises in processing and interpreting sequential data. We will now explore two areas where we primarily find transformers: natural language processing and computer vision.
Natural Language Processing (NLP)
NLP is a subfield of AI that focuses on enabling computers to understand, interpret, and generate human language. Some key NLP techniques include tokenisation and sentiment analysis.
Tokenisation is the process of breaking down text into smaller units, allowing AI models to process and analyse the text more effectively. Understanding tokenisation is important when using LLMs: each model has a ‘token limit’, which refers to the number of tokens it can hold in its memory. For GPT-3.5 it’s about 4,000 tokens, and double that for GPT-4. An even larger GPT-4 model is being rolled out with a token limit of 32,000. For OpenAI’s GPT models, a token is roughly 4 characters or 0.75 of a word. What this means in practice is that, once you reach 4,000 tokens (or roughly 3,000 words), the AI can no longer remember what was inputted before. For GPT-4 it’s about 6,000 words. This is critical to know: in order to work with a GPT model on a longer project (such as planning a book or writing complex code) you need to think of methods to remind the model from time to time, to avoid it wandering off in the wrong direction.
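The arithmetic above can be turned into a rough budgeting tool. This sketch uses only the rules of thumb quoted in the text (about 4 characters or 0.75 of a word per token). It does not use a real tokeniser, so treat the numbers as estimates. The function names are my own.

```python
import math

CHARS_PER_TOKEN = 4     # rule of thumb from the text; real tokenisers vary
WORDS_PER_TOKEN = 0.75  # likewise an approximation

def estimate_tokens(text):
    """Rough token count based on the number of characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def approx_words(token_limit):
    """Roughly how many words fit inside a given token limit."""
    return token_limit * WORDS_PER_TOKEN

def fit_to_limit(messages, token_limit):
    """Keep only the most recent messages that fit inside the token limit."""
    kept, used = [], 0
    for message in reversed(messages):  # start from the newest message
        cost = estimate_tokens(message)
        if used + cost > token_limit:
            break                       # older messages are "forgotten"
        kept.append(message)
        used += cost
    return list(reversed(kept))         # restore the original order

print(approx_words(4000))
conversation = ["a" * 40, "b" * 40, "c" * 40]  # three messages of about 10 tokens each
print(len(fit_to_limit(conversation, token_limit=20)))
```

In the example only the two newest messages fit, so the oldest drops out. That is the forgetting effect described above, and it is why it helps to restate key instructions from time to time in a long project.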
In an educational setting, tokenisation could be used to help identify common themes and patterns in students' essays. The very structured nature of LLMs lends itself well to assessing student work, as it can clearly map their work against the rules that have been inputted.
Sentiment Analysis involves determining the sentiment or emotion expressed in a piece of text. Sentiment analysis is used extensively by market researchers, who analyse social media comments for how a market segment might be feeling about a certain news item or product. For teachers, sentiment analysis might be used to gauge students' feelings about a particular topic or assignment based on their written feedback.
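The simplest form of sentiment analysis just counts positive and negative words. The sketch below takes that approach with two short word lists that I have invented. Modern tools use trained models and cope far better with sarcasm and context.

```python
# Hypothetical word lists, invented for illustration.
POSITIVE = {"enjoyed", "fun", "clear", "helpful", "interesting"}
NEGATIVE = {"confusing", "boring", "hard", "stressful", "rushed"}

def sentiment(text):
    """Label text by counting positive words minus negative words."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I enjoyed the project, it was fun!"))
print(sentiment("The instructions were confusing and the deadline felt rushed."))
print(sentiment("We covered fractions today."))
```

A student who writes "not boring at all" would be scored as negative by this version, which shows the limits of simple word counting.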
Computer Vision
Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. It uses deep learning models to identify and classify objects in images and videos, and to recognise patterns and relationships between them. Computer vision has a wide range of applications, including self-driving cars, facial recognition, and medical imaging.
Because of the visual nature of teaching, computer vision can have a wide range of applications in education. For example, computer vision can help teachers evaluate student participation rates in both physical and interactive classrooms, and analyse user behaviour, eye movement, and posture to assess engagement levels.
The above is a primer and nothing more: it doesn't profess to be either the most comprehensive or the most up-to-date analysis of where we currently are with AI. It was important to me to learn the basics, and I hope that this is also helpful to the community. We also need to ensure we teach students these ‘nuts and bolts’ so that they can best understand how to use AI. They need to understand how AIs are already shaping their daily lives so they can be more discerning consumers, and how the computerised processes actually work, so that they can realise that ChatGPT isn’t necessarily intelligent in the way we define it as humans.