Blog Post

Testing a custom AI chatbot - immediate misinformation

By: Marco Campana
April 17, 2023

You have probably noticed that, since the ChatGPT API was released, a marketplace of websites and tools has popped up that quickly gives people access to build their own chatbot, and things like that. In this video, I try out an Artificial Intelligence (AI) chatbot powered by ChatGPT. I saw someone use it in another forum and thought I would play with it.

It went off the rails after my second question, contradicting itself and the information it was trained on.

One of the things about ChatGPT that everyone is talking about, and that is so interesting, is how incredibly conversational it is. It feels like you're talking to a person, to something that's thinking. So when it gets something wrong, people say it's just "hallucinating." That's not right. It has simply been badly programmed, or coded. It's an AI ethics issue that needs to be addressed.

In this helpful (for me) LinkedIn post, AI and Data Ethicist Ravit Dotan describes how we need to "use 'misinformation' instead of 'hallucinations'. It's the term we use for other cases of false or inaccurate information...

The term 'misinformation' comes with a host of connotations about how the bad information can harm society. It's important to keep these risks in mind in the context of chatbots. 

We can add a modifier for the AI case. For example:
Training-failure misinformation  
Coding-failure misinformation
 
Terms like these emphasize the societal harm and the responsibility of the people behind it."

I completely agree with her. I think this needs to be a top-of-mind conversation as we develop, choose, and implement generative AI in our work.

Watch the video:

You can view the full chats I reference in the video here:

Original chat

Follow-up chat, where I ended up asking some additional questions after I made the video

Machine-Generated Transcript

What follows is an AI-generated transcript of the main explainer portion of the video, created using Otter.ai. The transcript text has not been edited. It may contain errors and odd sentence breaks and is not a substitute for listening to the audio or watching the video.

0:00
Hi everyone, in today's video, I'm gonna go through a chat GPT AI or sorry, an AI chatbot powered by chat GPT there's just a marketplace where you probably have noticed since chat GPT released their API that there's this marketplace of websites and tools that have popped up that are quickly giving people access to build your own chat bot, and things like that. So this is one of them. I saw someone use this in a, in another in another forum and thought I would I would play with it, they have since decided that it's fairly useless. And it's interesting, though, to see what some of these things we're doing and what we need to think about when we come to the so because a lot of these things are offering, you know, some have free versions, some are like 10 bucks a month, $50 a month, etc, etc. So what they what this site has done, is it's actually given you some demo bots that you can kind of play with. So I started playing with one. And what I found was that it actually went off the rails after my second question, I'm going to walk you through that. And we're going to go through it again and see if we can actually work a second time, but this is the conversation I had. And so you can see here that one knowledge base bot, a bot trained on Crohn's Help Center knowledge base, so I asked it, what is Cron? And an answer is this calendar app. And as I was curious, I thought, because one of the critiques of chat, GBT is that it doesn't give sources of information, right? So I thought, well, what's your source for this information? I thought was a pretty, you know, nebulous, mindless question. And suddenly it got confused. And it talked about cron, which is, which is this, this technical term, there is something called a cron job. I know only from my website, where you can run things in the background, for example, running Moodle, and it essentially schedules scripts to run automatically at specific times. 
And that's sometimes where you get your notifications from, and things like that. So suddenly, it went and told me, actually, I was wrong. It's this thing, and it gives me a Wikipedia link that takes me to, you know, a decent example of exactly what it's what's what it's being described, and what my understanding is of cron. So So I asked him, okay, but in your makers, actually, just back over here, I suggested, what is cron as one of the first questions, so I asked it, why do you think they would prompt us to ask a question you would get wrong. And then it talks about, I apologize for the confusion. And, and I was talking about this with someone earlier. And one of the things with chatty Beatty and others that I that you'll know this already, because everyone's talking about it, that is so interesting is that it's incredibly conversational. So it feels like you're talking to some sort to a person, it feels like you're talking to something that's thinking, and as we'll see, and the point of this video is that it's not right. It's just been badly programmed, or coded, for example. So I'm sorry for the confusion. It seems it was a miscommunication between me and my makers, right? So again, it's like, oh, this thing is thinking they might have suggested to for what is it current as a prompt to provide me but I mistakenly provide information about the product? Well, in fact, they were right, because this is what it's being trained on, allegedly, which is a product and next generation calendar for professionals and teams that don't ask me why these people chose this name. It's utterly confusing, but they did. That's what techies do. Right? So somehow, though, it got confused. I apologize. So then I said, but it looks like you were actually trained on this template, which is what it says up here trained on the Crown Center template. So why which answer is correct. And now it comes right back? Oh, you're right. Based on my training on that template? 
The correct answer is what it's been trained on, basically. So the question is, why is something that's been trained on something so specifically, right? It's been trained on this Help Center template? Why does it even have access to the web? Why is it going outside of its mandate? And so for me, and so a lot of times what people would say is, oh, this is a, a, a hallucination, right? And I've been, I've been following and reading some AI ethics, people who talk about we need to not talk about it in that way. So if you're not familiar with Roe v dotun. She's an AI and data ethicist and I follow her and we're connected on LinkedIn. And she wrote a post just a few days ago that explains this and was really useful for me to see. And it's a conversation we need to have, right. So we need to stop using the term chat GPT hallucinations. And I'm gonna go through it all. But basically, this is the problem. You notice that the definition shifts for the blame. It's as if it's the AIS fault, but it is, in fact, the company or the coders fault. So companies make technology responsible for what their technology does, there is no hallucination. And she suggests that we use the more appropriate term of misinformation, because it's the term we use for other cases of false or inaccurate information. And this is super important. Because what she's talking about is we can modify, we can be very specific, this is a training failure misinformation. So in this case, if it's been trained on something, okay, it appears to access that information, or is this a coding failure misinformation? So in this case, while it's answering and thinking, the question is, why is it connected to the rest of the Internet to provide sources like wiki idea, for example, which in this case clearly confused about what Cron is, because if it's going to search for cron on the internet, it's going to come up not with this new product. 
But it's going to come up with multiple millions of files and websites related to cron, the Unix, the Unix socket, the the time based scheduler, as it talks about here. So that would seem to be a coding failure, where the coders of the of the of the bot have given it access to more information than it actually needs in order to do the job that it's doing, which is very specific. So this is a chat bot that in theory, you can I can pay $15 a month for, and it would access my website or a series of documents or things like that. But what's interesting is that it really raised for me the the idea of this conversation we need to be having around hallucination versus misinformation. And I think it's really important that we start having that as a conversation. Okay, as I mentioned, I'm going to go back into the tool and see what happens if interesting, remembering what I wrote, What happens if I do this again? So what is Cron? And I'll go look at the answer in a second. But let me I'm just going to pull in each question and see how it answers it differently. Chronos, a calendar app. So that is pretty much the same, slightly different text, you can see that the line is longer, which means it's played with the text a little bit. But let's see. Okay, so it's answering. It's answering it in the same way. So let's, let's just keep going and see how it two replies to this to the same series of questions. So it's not learning. I mean, does AI learn every time you use it as part of the conversation and part of the question that we need to be focused on right, because so it hasn't learned since the last time and it may be that high powered AI might do that. But we also know that there's a lot of intervention that needs to happen with people to make things better. And that's something that is certainly really interesting. So let's see, which led to the incorrect response. 
I rely on information provided me with my training data and algorithms here provided me in the context, so it's answering it slightly differently. I'd have been misinterpretation, I strive to provide accurate answers, I will continue to learn improve based on feedback. So that's interesting. It's, that's it's added some of that sort of stuff. So let's see what it says. You'll get so wrong. I apologize for the confusion. It seems there was a miscommunication in my programming, oh, and miscommunication between me and my makers. So that's interesting. It's because it's being less familiar more formal, which is very interesting. The context provided did not refer to the job schedule, but to a different application. On a program to provide an I provided incorrect information, okay, so it's answering it differently already, which is just really interesting. So I'm just gonna ask this final question. To see what what it suggests now. Based on the context, blah, blah, blah. Okay, so it's adding a little bit more information. But it's, it's a it's a fairly similar way of answering it, although it changes slightly, which is which is quite interesting. So here you go.
