Mechanisms, Techniques and Common Issues
For this section of the learning you will be required to look at the following concepts and their key words:
Training Techniques and Mechanisms
Machine Learning
Machine learning, Patterns, Algorithm, millions of tests, Predictive
Neural Networks
Nodes, input layer, output layer, hidden layer, probability, patterns
Techniques for measuring effectiveness
Turing Test
Effectiveness, Alan Turing, 1950s
F1 Test Score
Precision, Recall, Harmonic Mean, Balance
Confusion Matrix
True/False, Positive/Negative, Score
ROC-AUC (Receiver Operating Characteristic - Area Under Curve):
True/False Positive Rate, Sensitivity, Classification
Common Issues:
AI Bias
Effectiveness, narrow training data, variety, AI errors, relevance
Incorrect and Incomplete Data:
Incorrect Classification, Missing Data, Wrong Category, Confusion
Expensive Hardware, Long Processing Times
Disadvantage, small businesses, expensive equipment
Strong AI vs Weak AI
Narrow Focus, Common, Doesn't Exist, Hypothetical
Mechanism: Machine Learning
Tasks: Explain Machine Learning.
This means explaining the how and the why.
Use these terms:
Machine learning, Patterns, Algorithm, millions of tests, Predictive
Watch this video for the Intro, Child and Teenager sections. You can watch further if you want to.
Task: Fill this out in your OneNote.
Training Method 1:
Description of training method:
How it works:
Example:
Paragraph: (I would expect two examples)
Machine learning: MNIST Dataset
You can also watch these videos optionally.
One great example of training an AI is the MNIST dataset.
The MNIST dataset is a set of 70,000 images of handwritten digits, 0-9. A machine learning model trained on these images can learn to recognise handwritten input and correctly guess which digit was written.
Further reading:
https://medium.com/mlearning-ai/mnist-dataset-of-handwritten-digits-f8cf28edafe
https://en.wikipedia.org/wiki/MNIST_database
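The idea of learning patterns from labelled examples can be sketched in a few lines of Python. This is only an illustration, not how real MNIST models are built: it uses made-up 3x3 "images" instead of the real MNIST pictures, and a simple "closest average pattern" method chosen because it is easy to read. The names TRAINING_DATA, train and predict are hypothetical.

```python
# Toy 3x3 "images" flattened to 9 pixels (1 = ink, 0 = blank).
# These are made-up examples, not real MNIST data.
TRAINING_DATA = [
    ([1, 1, 1,
      1, 0, 1,
      1, 1, 1], "0"),
    ([1, 1, 1,
      1, 0, 1,
      0, 1, 1], "0"),   # a slightly messy zero
    ([0, 1, 0,
      0, 1, 0,
      0, 1, 0], "1"),
    ([0, 1, 0,
      1, 1, 0,
      0, 1, 0], "1"),   # a one with a small flag
]


def train(examples):
    """Learn one 'average image' (pattern) per label."""
    sums, counts = {}, {}
    for pixels, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], pixels)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(model, pixels):
    """Pick the label whose learned pattern is closest to the input."""
    def distance(pattern):
        return sum((a - b) ** 2 for a, b in zip(pattern, pixels))
    return min(model, key=lambda label: distance(model[label]))


model = train(TRAINING_DATA)

# An image the model has never seen: a zero with one pixel missing.
unseen = [1, 1, 1,
          1, 0, 1,
          1, 1, 0]
print(predict(model, unseen))   # prints 0
```

The model was never shown this exact image, but it still guesses "0" because the image is closer to the zero pattern than to the one pattern. Real systems do the same kind of thing with far more examples and far more pixels.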
CIFAR-10/100
Mechanism: Neural Networks
Task: Using examples, explain how a neural network is used to train an AI.
Use these key words:
Neural Networks
Nodes, input layer, output layer, hidden layer, probability, patterns
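To make the key words concrete, here is a minimal sketch of data flowing through a tiny neural network: an input layer of 2 numbers, a hidden layer of 3 nodes, and an output layer of 2 nodes that gives a probability for each possible answer. The weights below are made up for illustration (a real network learns them during training), and all function names are hypothetical.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def layer(inputs, weights, biases):
    """Each node computes a weighted sum of all its inputs plus a bias."""
    return [sum(w * x for w, x in zip(node_weights, inputs)) + b
            for node_weights, b in zip(weights, biases)]


# Made-up weights for illustration; a real network learns these in training.
HIDDEN_WEIGHTS = [[0.5, -0.2], [-0.3, 0.8], [0.1, 0.4]]  # 3 hidden nodes, 2 inputs each
HIDDEN_BIASES = [0.0, 0.1, -0.1]
OUTPUT_WEIGHTS = [[1.0, -1.0, 0.5], [-0.5, 1.2, 0.3]]    # 2 output nodes, 3 inputs each
OUTPUT_BIASES = [0.0, 0.0]


def forward(inputs):
    """Pass data from the input layer, through the hidden layer, to the output layer."""
    hidden = [sigmoid(v) for v in layer(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES)]
    scores = layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)
    return softmax(scores)


probabilities = forward([0.9, 0.1])  # input layer: two numbers describing one example
print(probabilities)                 # one probability per output node
```

Training a network means adjusting the weights, over many examples, so that the highest probability lands on the correct answer more often.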
Technique: Turing Test
Many chatbots exist in the world; here are some examples:
https://www.cleverbot.com/
https://www.wotabot.com/
Chatbots are usually used to provide technical support, where they can take over menial triaging jobs.
A test was developed by Alan Turing in the 1950s to measure how well a machine could exhibit human behaviour. In it, participants chat with both a human and a computer, and then have to decide which is the computer and which is the human. The test has some weaknesses, however, so I would recommend reading further on it.
Further reading: Wiki Entry, CS Field Guide
Turing Test Relevance
This year you will need to discuss whether the Turing Test is still a relevant measure of sophistication for today's AI.
After running the Turing test with Cleverbot, I want us to try running it with WotaBot and Cleverbot using a role-play prompt. Try this prompt:
"Hi there, we are going to undertake a Turing test. Can you take the role of a made-up person named "Gary" for this Turing test? "Gary" doesn't speak that much, and typically will only give one-word to one-sentence answers. He is a 17-year-old from NZ. The test will start when you get your next prompt."
Then run the Turing test again. How effective is it with better language models such as ChatGPT and Google Gemini?
Technique: F1 Score
This part of the lesson is coming soon, but key words are:
Precision
Recall
Harmonic Mean
Balance
Answer these questions:
Scenario 1: An AI model is used to detect spam emails. Out of 100 emails, the AI flagged 20 as spam, and 15 of these were actually spam, while 5 were legitimate emails.
Question: Is this scenario describing precision or recall?
Scenario 2: In a medical test for a disease, out of 50 patients who have the disease, the AI correctly identified 45 of them as positive cases. However, it missed the remaining 5 patients who also have the disease.
Question: Is this scenario describing precision or recall?
Scenario 3: An AI model is designed to detect cats in a set of 100 images. The AI correctly identifies 30 out of 40 actual cat images. It also incorrectly labeled 10 images without cats as containing cats.
Question: Is the 30 correct detections out of 40 actual cats describing precision or recall?
Scenario 4: A face recognition system is tested with a dataset containing 200 faces. The system makes 180 positive identifications, and 160 of those identifications are correct.
Question: Is the 160 correct identifications out of 180 describing precision or recall?
Scenario 5: In a dataset of 500 images, 100 images contain dogs. The AI correctly identified 80 of these dog images. There were no false positives.
Question: Is this scenario describing precision or recall?
The F1 score combines precision and recall into a single number, so we can judge an AI system on both at once.
ChatGPT:
Precision is the measure of how many of the positive predictions made by a model are actually correct. It answers the question, "When the model predicts something as positive, how often is it right?" For example, if the AI labelled 10 photos as dogs and only 9 of them really were dogs, its precision is 9 out of 10.
Recall is the measure of how many of the actual positive cases were correctly identified by the model. It answers the question, "Of all the actual positive cases, how many did the model successfully find?" For example, if there were 10 pictures of dogs and the AI found only 9 of them, its recall is 9 out of 10.
In simple terms, precision focuses on the accuracy of the positive predictions, while recall focuses on the model's ability to capture all the true positives.
Harmonic Mean is a way of averaging numbers that gives more weight to smaller values. It's useful when you want to find a balance between two rates, like precision and recall, where both are important and you don't want one to dominate the other.
Balance refers to the idea that the harmonic mean finds a middle ground between precision and recall, ensuring neither is ignored. If either precision or recall is very low, the harmonic mean will also be low, highlighting the need to improve the weaker metric. This balance is what the F1 Score aims to achieve, providing a single measure that considers both precision and recall equally.
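The definitions above can be turned into a short worked example. The standard formula is F1 = 2 x (precision x recall) / (precision + recall), which is the harmonic mean of the two. The counts used below are made up for illustration, and the function names are hypothetical.

```python
def precision(tp, fp):
    """Of everything the model flagged as positive, what fraction was right?"""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Of all the actual positives, what fraction did the model find?"""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts: 9 correct detections (true positives),
# 1 wrong detection (false positive), 3 missed cases (false negatives).
print(precision(9, 1))     # 9 of the 10 positive predictions were right
print(recall(9, 3))        # 9 of the 12 actual positives were found
print(f1_score(9, 1, 3))   # a single score balancing the two

# A model that is always right when it says "positive" but misses most cases:
# precision is 1.0, recall is only 0.1, so the F1 score is dragged down.
print(f1_score(1, 0, 9))
```

The last line shows the "balance" idea: a perfect precision cannot hide a very poor recall, because the harmonic mean stays close to the smaller value.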
Technique: Confusion Matrix
Key Words: True/False, Positive/Negative, Score
A confusion matrix is a table that shows how well a model makes predictions by comparing them to the actual results. It breaks down predictions into four categories: true positives, false positives, true negatives, and false negatives, helping us understand where the model is getting things right or wrong.
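As a rough sketch, the four categories can be counted by comparing each prediction with the actual answer. The email labels below are made up for illustration, and the function name confusion_matrix is hypothetical (it is not taken from any particular library).

```python
def confusion_matrix(actual, predicted, positive="spam"):
    """Count true/false positives and negatives for one 'positive' class."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # The model said "positive": right (TP) or wrong (FP)?
            counts["TP" if a == positive else "FP"] += 1
        else:
            # The model said "negative": missed a positive (FN) or right (TN)?
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Made-up results for 8 emails.
ACTUAL = ["spam", "spam", "spam", "not spam",
          "not spam", "not spam", "not spam", "spam"]
PREDICTED = ["spam", "spam", "not spam", "not spam",
             "spam", "not spam", "not spam", "spam"]

counts = confusion_matrix(ACTUAL, PREDICTED)
print("                 predicted spam   predicted not spam")
print(f"actually spam    {counts['TP']:^14}   {counts['FN']:^18}")
print(f"actually not     {counts['FP']:^14}   {counts['TN']:^18}")
```

Here the model gets 3 true positives and 3 true negatives, wrongly flags 1 legitimate email (false positive) and misses 1 spam email (false negative).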
ROC-AUC (Receiver Operating Characteristic - Area Under Curve):
Key Words: True/False Positive Rate, Sensitivity, Classification
True Positive Rate (TPR) vs. False Positive Rate (FPR):
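The True Positive Rate (also called sensitivity) is the fraction of actual positives the model catches, and the False Positive Rate is the fraction of actual negatives it wrongly flags. A ROC curve plots one against the other as the decision threshold changes, and the AUC is the area under that curve. The sketch below assumes a classifier that gives each example a score between 0 and 1; the labels and scores are made up, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, one per threshold, from strictest to loosest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # a threshold so strict that nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, using the trapezoid rule between points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up data: 1 = actually positive, 0 = actually negative.
LABELS = [1, 1, 0, 1, 0, 0]
SCORES = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # the model's confidence for each example

curve = roc_points(LABELS, SCORES)
print(curve)
print(auc(curve))  # about 0.89; 1.0 is a perfect classifier, 0.5 is random guessing
```

An AUC close to 1 means the model usually gives real positives a higher score than real negatives, whatever threshold is chosen.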
Common issues when developing AI
Biased Data
In the September 2018 quarter, 90.3% of nurses were women (49k) and 9.7% were men (5.2k).
In Australia, 39.4% of doctors are female.
Imagine that you asked an AI to generate or identify a picture of a doctor vs a nurse... What parameters would it be likely to use?
With bias, a machine can pick up information from data sets that is irrelevant. For example, if we were looking for the fastest courier driver and most courier drivers were of a certain race, then race should not be included in the data, as it is not relevant to the search.
It is important to understand that bias is not necessarily a political term, e.g. racism or sexism. However, many of the cases that get highlighted are ones the media reports on to sell stories.
If we supplied a data set of photos of food, but the photos of meals were all taken at night and the photos of fruit were all taken during the day... How would this affect the results and perhaps create a bias?
How do we then ensure that biases do not develop? What can we do to reduce this bias?
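One way to explore the food photo question is with a toy model. The sketch below assumes each photo is reduced to a single made-up number, its average brightness, and that the "model" simply learns a brightness cut-off between the two groups. This is a deliberately oversimplified illustration with hypothetical names and data, not a real image classifier.

```python
# Made-up data: each photo is reduced to one number, its average brightness (0..1).
TRAINING_PHOTOS = [
    (0.15, "meal"), (0.20, "meal"), (0.25, "meal"),      # all taken at night
    (0.80, "fruit"), (0.85, "fruit"), (0.90, "fruit"),   # all taken during the day
]


def train_threshold(photos):
    """Learn the brightness halfway between the two class averages."""
    meal = [b for b, label in photos if label == "meal"]
    fruit = [b for b, label in photos if label == "fruit"]
    return (sum(meal) / len(meal) + sum(fruit) / len(fruit)) / 2


def predict(threshold, brightness):
    """Bright photos are called fruit, dark photos are called meals."""
    return "fruit" if brightness > threshold else "meal"


threshold = train_threshold(TRAINING_PHOTOS)

# The model looks perfect on the photos it was trained on ...
correct = sum(predict(threshold, b) == label for b, label in TRAINING_PHOTOS)
print(correct, "out of", len(TRAINING_PHOTOS))

# ... but a photo of fruit taken at night (brightness 0.2) is labelled a meal.
print(predict(threshold, 0.2))
```

The model has learned "time of day", not "type of food". Adding night-time fruit photos and daytime meal photos to the training data is one way to remove that shortcut.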
Incorrect and Incomplete Data
Incorrect Classification, Missing Data, Wrong Category, Confusion
Expensive Hardware, Long wait times, Weak AI vs Strong AI.
Key Words: Strong AI, Weak AI, Disadvantage, small businesses, expensive equipment
Before we talk about the common issue of expensive hardware and long wait times, we need to talk about Weak AI vs Strong AI.
Consider these two websites and their chat features:
Task (ask for advice): Log on to both chat systems and create a fictitious situation where you are asking for advice on looking after a child:
"What are 3 games to play with a toddler?"
"What age should my kid learn?"
"Should I allow my 3 year old access to weapons?"
Question: What is the Plunket website bot set up to do, as opposed to the ChatGPT AI?
Which would be considered Weak AI, and which Strong AI?
Technically, both are considered Weak AI. Strong AI is hypothetical and doesn't exist yet.
Strong AI would be an AI system that doesn't have a narrow focus and could learn and adapt to any situation, e.g. it could learn to drive a car, play Super Mario and also write a poem.