As a boutique consulting company specializing in conversational AI, we spend a lot of time auditing chatbots. We’ve pretty much got it down to a science at this point. Using our audit methodology and checklist, we’ll walk you through how to prep, test and score your bot in less than 20 minutes.
First and foremost, use a checklist to help you audit your chatbot. Why? Because unfortunately, our minds aren’t as sharp at 25+ as they were at 12… we forget things. Not only does the checklist help you remember everything, it also keeps you focused and instantly highlights opportunities for improvement.
Don’t have an audit checklist? We’ve got you covered! Our audit checklist is a robust one-pager that covers the four most important pillars that make up a strong chatbot:
Scroll down to the end of this post to download your own copy 👇
We recommend prepping 15 questions that you can ask your bot. We always start from a generic list of 10 questions and then adapt those based on the target audience using the bot, the placement of the bot and the company. If you’re lacking inspiration, you can always check out our base list of questions in this article.
Now it’s time to test your bot! Ask the bot your 15 questions and note how it responds: true positive, false positive, false negative or true negative. Then run through the rest of the checklist, testing the different criteria and recording the results. Once you reach the end, it’s time to move on to the next step.
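If you log your results in a script rather than by hand, the tally could look something like the sketch below. The four outcome labels come from the step above; the sample questions and the `tally_outcomes` helper are made up for illustration and are not part of our checklist.

```python
from collections import Counter

# The four outcome labels used when noting how the bot responds.
OUTCOMES = ("true positive", "false positive", "false negative", "true negative")


def tally_outcomes(results):
    """Count how many test questions landed in each outcome category."""
    counts = Counter()
    for question, outcome in results:
        if outcome not in OUTCOMES:
            raise ValueError(f"Unknown outcome for {question!r}: {outcome!r}")
        counts[outcome] += 1
    # Report every category, including those that never occurred.
    return {outcome: counts[outcome] for outcome in OUTCOMES}


# Hypothetical test log: (question asked, outcome noted by the tester).
results = [
    ("What are your opening hours?", "true positive"),
    ("Can I change my delivery address?", "false positive"),
    ("Do you ship to Mars?", "true negative"),
    ("Where is my order?", "false negative"),
    ("How do I reset my password?", "true positive"),
]

summary = tally_outcomes(results)
for outcome, count in summary.items():
    print(f"{outcome}: {count}")
```

Rejecting unknown labels up front catches typos in the log before they quietly skew the counts.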
This is the easy part… Add up the scores for each section and then calculate the total score out of 50. You’ll immediately see in which areas the bot performs well and which areas need improvement. This will help you focus your optimization efforts.
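For anyone who prefers a script to mental arithmetic, here is a minimal sketch of the scoring step. Only the total of 50 comes from the checklist; the section names ("Pillar A" to "Pillar D"), the points available per section and the sample scores are placeholders we invented for this example.

```python
def score_audit(scores, max_points):
    """Return the total score, the maximum possible, and sections ranked weakest first."""
    if scores.keys() != max_points.keys():
        raise ValueError("scores and max_points must cover the same sections")
    for name, points in scores.items():
        if not 0 <= points <= max_points[name]:
            raise ValueError(f"Score for {name!r} is outside 0..{max_points[name]}")
    total = sum(scores.values())
    total_max = sum(max_points.values())
    # Rank by the share of available points earned, lowest share first.
    ranked = sorted(scores, key=lambda name: scores[name] / max_points[name])
    return total, total_max, ranked


# Placeholder sections and point splits (assumed); they add up to 50.
max_points = {"Pillar A": 15, "Pillar B": 15, "Pillar C": 10, "Pillar D": 10}
# Placeholder scores from one imaginary audit.
scores = {"Pillar A": 12, "Pillar B": 9, "Pillar C": 8, "Pillar D": 4}

total, total_max, ranked = score_audit(scores, max_points)
print(f"Total: {total}/{total_max}")
print(f"Focus first on: {ranked[0]}")
```

Ranking by share of points rather than raw points keeps a small section with a poor score from hiding behind a larger one.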
After testing 100+ bots, we’ve perfected the bot testing setup. Here are a few habits that we use to make sure our tests run smoothly.
After a lot of trial and error, we found that the best way to efficiently test a chatbot is to use two screens. We like to open the bot on our laptop and have the audit checklist and questions open on our second screen.
This setup makes it easy to copy and paste the questions into the bot instead of losing time typing them out.
This is crucial. As you’re moving quickly, ideas come and go just as quickly. I promise you won’t regret this. Plus, it makes referring back to a test a few hours, days or months later so much easier.
Color allows us to see quickly how a bot performed on NLP quality. We highlight true positives in green, false positives in red, false negatives in orange and leave true negatives as is.
You can use any color combination that you like. The idea is to pick a set of colors that stand out from one another to make counting at the end quick and easy.
This is a simple one, but make sure you have a tablet and phone next to you so you don’t have to get up partway through testing and try to find them.
Screen recordings are a lifesaver. AI is like a black box and weird stuff happens. It just does. After experiencing weird things and then trying to recreate them with different results, we resorted to recording each testing session. This allows us to revisit a test later on as well as show our developers and conversational AI consultants what went wrong so they can look into the conversation history and pinpoint the error.
Fill out your info on the right to receive a download link for an editable version of our Bot Audit Checklist!
If you want to discuss AI in more detail, then reach out to Alexis.
He's ready to chat in French, English and Greek.