If you have been on a desert island for the last 18 months without internet access or contact with the outside world, it’s possible you have never heard of GenAI, GPTs or ChatGPT.
But even if you haven’t, it might still be worth a quick recap to bring you up to speed with some of the more recent developments and key terms before we go any further.
A GPT (Generative Pre-trained Transformer) is a type of AI designed to understand and produce human-like text. When you ask a GPT something using a “prompt”, it predicts what comes next in a sentence based on what it has learned. It doesn’t actually “think” like a human but uses patterns from its training to predict responses that often seem intelligent and relevant. A prompt is simply the text used to communicate with the GPT. It’s like a recipe, setting out the ingredients and how to combine them to get the desired result.
You input your question into what are called GPT models. These are, to a certain extent, defined by the companies that build them: ChatGPT is owned by OpenAI, Copilot by Microsoft, Gemini by Google, and Claude by Anthropic.
A whole industry has grown up around OpenAI’s ChatGPT. Custom GPTs are a feature of ChatGPT that lets users tailor the GPT to specific tasks or domains, typically by giving it instructions and reference material. For example, a “Legal Assistant GPT” might draw on legal documents, case law, and regulations to help lawyers with research and drafting, while a “Math Tutor GPT” might specialise in explaining mathematical concepts and solving maths problems step by step. At the time of writing there were over 100,000 custom GPTs, and the number is growing. In many ways it’s like the app market, where independent companies develop what they believe will be desirable GPTs, hoping to cash in at a later date if their GPT grows in popularity.
But which is the best?
This is a little like asking which is the best car or computer: it really depends on what you want to use it for. And even if we narrow it down and ask which is the best GPT for learning, it’s still not that easy. Think of each GPT model as a different type of teacher. Some are good at breaking down complex topics and making them easy to understand, others have a way with numbers or with telling motivational stories. You get the idea.
But we must start somewhere, so I have chosen four popular models: ChatGPT, Copilot, Gemini, and Claude. The plan is to put each of them to the test using a series of standardised prompts designed to mimic real-world learning and exam scenarios. By comparing their responses across various tasks, we should gain some insights into their strengths, weaknesses, and potential application in learning.
The testing focussed on professional accounting examinations, using the free versions of the GPT models available at the time (August 2024). Each GPT was required to complete seven different learning-based tasks: 1. Write practice questions, 2. Explain a particular concept, 3. Solve a numerical problem, 4. Summarise technical content, 5. Answer questions, 6. Provide feedback on an answer, and 7. Create flashcards. The criteria used were Clarity, Accuracy, Relevance, and Teaching skills. This last category related to how well the GPT’s answers incorporated teaching skills such as step-by-step instruction, examples, motivation, and coaching.
Results
The winner was ChatGPT with 86%, followed by Claude with 77%, with Copilot and Gemini joint third on 65%. They all scored well on explaining concepts, summarising content, and giving feedback. But Copilot and Gemini were not always accurate, especially when dealing with numbers, which brought their scores down. Although a link was provided to the respective syllabi, none of the GPTs were particularly good at producing exam-standard questions. The questions were all too simplistic and lacked the required level of difficulty. A more detailed prompt, with additional guidance and examples of past questions, would however help improve this.
Conclusions
Based on these results, ChatGPT is the clear winner. This is consistent with the findings of others and suggests that it is probably the best overall model at the moment. However, it’s possible that with more sophisticated prompts some of the problems that Copilot and Gemini had with calculations could be resolved.
More generally, having now used these models for a few real-world learning tasks, I find them hugely impressive, but they should be used with caution. They make mistakes and, even worse, their answers will appear convincing to the non-expert. That said, as long as you know this, they can be of real value if used properly. Here are a few tips to help.
• Use at least two GPTs – Save two GPTs to your taskbar, and when in doubt as to the accuracy or relevance of an answer, simply input the same question into both and let each validate the other. Based on these findings, the best models to use would be ChatGPT and Claude. If they give different answers, use a third.
• Use GPTs to summarise content, provide feedback and explain complex topics
– Summarising a large block of text, or reducing a complex topic to easier-to-digest chunks of information, not only saves time but also reduces cognitive load.
– Feedback is, as they say, the “breakfast of champions,” which makes it an invaluable learning tool, and all the GPTs were good at this. Put your answer into the GPT and ask how it might be improved or how to score more marks. If you provide a marking guide, the feedback will be more accurate.
– Explanations and examples are essential for learning. Simply ask the GPT to explain a concept in, say, “100 words” or to “a novice” who has never come across the topic before. Asking the GPT to include an example or go through it “step by step” can really help develop a much better understanding. And for difficult concepts, ask it to use an analogy or explain by way of a story.
• Use GPTs to help produce study timetables – but less so for flashcards and mind maps, which tend to be little more than words listed together. In fairness, the problem with flashcards has more to do with the quality of the questions and, in some instances, the answers.
• Be careful when asking technical questions – For these, stick to the texts and answers produced by the experts.
The bottom line – Think of your GPT as a young teacher who sometimes gets overconfident and makes mistakes but is very, very smart and wants to help you do well.
And if used properly it will!
