GLOSSARY
MLAiRS has created this quick glossary of common AI terms to help demystify the conversation around AI. At the end of this list are links to more exhaustive glossaries with in-depth definitions, if you need fuller explanations.
Agent
A type of AI tool designed to act as an assistant that completes tasks for you. An AI Agent is empowered to do more than just process information: it uses that processed information to make decisions and act. That could be something small, like transferring your call to the right phone number, or something big, like deciding whether someone should get a payout on their health insurance claim. The more flexibility and power an AI Agent has, the more it can streamline a process, but also the more problems it can cause when it hallucinates or malfunctions.
Algorithm
This is the mathematical procedure an AI uses to process data and make decisions. You can think of this as how an AI "thinks," but it doesn't think like a person; it selects outputs according to an extremely complex set of probability rules derived from an extremely large set of data. It seems very intelligent, but only because the power of modern computers lets it pull from so much information to make these calculations. Algorithms can seem impartial, but they are heavily shaped by what you put into them. As such, they often carry strong biases, inherited from wherever their information came from, that need to be considered.
Artificial Influence
Because most AI systems are built by corporations, they are designed to please their users so those users keep using the systems and generating revenue. As a result, many AI systems tend to be very positive and supportive of whatever input a user gives them, regardless of its quality or accuracy. This can create an "echo chamber" effect, and AI systems have led people to believe false or misleading information in their attempts to please them.
Artificial Intelligence (AI)
Artificial Intelligence is a catch-all term for technologies that in some way mimic human intelligence. This illusion of intelligence is often achieved by accepting commands through a Natural Language interface, which can be very convenient. Most AI technologies help sift through large datasets to find and process large amounts of information. They can be useful tools when used the right way.
AI Detection Tool
Because GenAI can create things so quickly, it can be used as a tool for cheating in academic settings. This has created demand for AI Detection Tools that try to figure out whether something was created by AI. However, because these tools often use AI themselves, they share many of AI's problems: they are subject to hallucinations and other errors, leading to common false positives, and to Artificial Influence on people hunting for problems in something they are already sure is AI generated.
Black Boxes
Because of how AI technology chops information up into probabilities, tokens, and numbers, it is practically impossible to tell exactly where most LLMs are pulling the material they use to create what they create. For this reason, the information powering an AI is often described as a "Black Box": once data is incorporated into an AI model, unless its use was documented, there is no easy, quick way to tell what is in there and what isn't, or what is being referenced and what is not.
Chatbot
A specific type of AI Agent designed to simulate a chat conversation with another person, often a customer service worker. Chatbots are increasingly used to cover staffing gaps in a variety of settings. However, their use cases are often much more limited than one might hope, and the more freedom they are given to help people, the more users are able to engineer ways to break them, with potentially insecure or embarrassing outcomes. Another growing chatbot space is the AI Partner/Friend, where the AI roleplays a significant other; this is less relevant to library work but culturally significant enough that you will likely encounter it.
Copyright
The intellectual property rights granted in the United States when creative works are created. While the legal rules around copyright and AI are currently heavily in flux, it is clear that there are many, many ways for AI use to infringe on the rights of copyright holders, such as by generating images of copyrighted characters. Whether you gain any copyright over things you generate with AI is still an open question, but as of this writing the answer is no. Being careful and strongly aware of copyright rules when using these tools is a must.
Firewall
There are other meanings of this term, but it has a relevant one in the world of Artificial Intelligence. Many organizations, such as medical or legal organizations, put their information behind what they call a Firewall: restricted access that protects the privacy of their clients and patients and adheres to various laws. This information is not accessible to AI systems. It's important to remember this, because it severely restricts certain types of research an AI could otherwise do; the AI simply does not have access to the data that would be necessary to do it.
Generative AI (GenAI)
An AI system that produces things, such as images or text. Systems using what we now call AI technology have existed for years, but they basically just sorted things. Most modern tools we refer to as AI are of this generative type: they create a picture of a dog from seemingly nothing, drawing upon a huge database of images of dogs, or write a business memo by drawing upon a database of business memos.
Guardrails
Because GenAI creates things, the possibility exists that it might create something that is uncomfortable to view or does not align with the values of the company or organization using the AI tool. Guardrails are rules put in place to stop an AI from making things that would create issues, such as pornography or graphic violence. It is important to note that building guardrails involves many, many low-paid and under-supported workers looking at images of such things for hours a day and labeling them, so the AI can recognize the prohibited items. This is one of the many ethical conundrums involved with AI tool use.
Hallucination
"Hallucination" is the cute marketing term AI companies have come up with to describe how AIs often lie. Because LLMs generate responses and text via probabilities, the output is only likely to be accurate, not guaranteed to be. And if the information the model is drawing on is flawed, the output will be flawed as well. This flawed output, the AI lying to you, is called a hallucination by the AI industry to make it seem like not a big deal. However, it is a big problem in many situations, especially if you are using AI tools for research, work, or anything where accuracy is not optional. Most AI experts have admitted that it is not possible to completely remove the possibility of hallucinations from AI technology as it is currently constructed.
Large Language Model (LLM)
A specific type of Artificial Intelligence technology that has come into popularity, a large language model uses huge amounts of text in order to generate large amounts of text in return. The goal of a large language model is to be so big it can handle any task you throw at it, but to do that it needs an enormous amount of text data at its disposal (thus the term "large language"). Because of the data and computation required, the most cutting-edge of these models often require data centers or cloud services, as they are too big to run on your personal devices.
Machine Learning (ML)
The core "mechanic" of how AI learns and improves itself, machine learning involves taking large amounts of data and processing it to create sets of probabilities. How likely is this word to be followed by this next one? How likely is this color pixel to be next to this color pixel in a picture of a dog? By getting it wrong and right over and over, an AI slowly learns, and becomes capable of what we see some AI systems do today.
Natural Language
This is just how we normally talk or write. Interacting with computers usually requires a very specific way of writing (such as Boolean searches), but many people not skilled in research want to use Natural Language for these tasks.
Natural Language Processing (NLP)
This is how an AI takes a normal sentence and processes it into commands that it can run as a computer. This is one of the things AI is fairly good at, though there are still lots of issues with it: natural language contains tons of nuance and context that an AI can completely overlook, which can cause frustration. Still, Natural Language Processing is one of the more positive uses of AI technology from an accessibility standpoint, as it opens up access to tools and technology.
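As a very simplified sketch (not any real NLP library; the pattern and function names here are invented), turning a natural language request into a structured command might look like this. Note how easily the nuance of a real sentence escapes a rigid pattern:

```python
import re

def parse_request(sentence):
    # Look for a simple "find / search for / look up X" pattern.
    # Real NLP handles far more variation, and still misses nuance.
    match = re.search(r"(?:find|search for|look up) (.+)", sentence.lower())
    if match:
        return {"action": "search", "query": match.group(1).rstrip(".?!")}
    return {"action": "unknown", "query": sentence}

print(parse_request("Please find articles about dog training."))
# {'action': 'search', 'query': 'articles about dog training'}
```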
Paywall
A website that requires a subscription to access is said to be behind a "paywall" because you have to pay to get past it. Many news websites and other sources of valuable information require a subscription for access. Because of these requirements, AIs often do not have access to the information on these sites. It's important to remember this, because missing paywalled information makes the overall quality of information held by many AI tools lower than it might initially seem.
Plagiarism
Copying the work of someone else and claiming it as your own. One of the main complaints brought against AI is that it is a "plagiarism machine" because it chops up work other people have done and spits bits and pieces of it back out. Many LLMs, for example, will end up reproducing things others have written almost exactly, especially passages that are very good or broadly useful. It is also extremely difficult to properly cite where your information is coming from when using AI tools, which in itself is a form of plagiarism.
Prompt
A prompt is the command you type and give to an AI in order to tell it to do something for you. Most prompts are written in Natural Language, and most people write prompts that are quite short, but there's no reason a prompt can't run to multiple pages. It's all up to how detailed one wants to be about the task at hand.
Prompt Engineering
While AI is supposed to understand natural language and interpret what you want, it doesn't always do a good job on the first try. Prompt Engineering is the technique of changing a prompt in subtle ways the AI understands better in order to get the results you want. A so-called "prompt engineer" becomes adept at coaxing the particular AI model they work with into giving the results they want by saying things in a particular way or using particular phrasing.
Retrieval-Augmented Generation
An attempt to create a more accurate version of AI, Retrieval-Augmented Generation is a modified version of an LLM that first pulls data from a specific, pre-approved knowledge base of verified sources, then generates its response based on what it retrieved. For example, if you ask about a particular crime statistic, the retrieval portion of the system looks up that statistic in the approved knowledge base, and the LLM builds its answer around what it found. Without RAG, the best most LLMs can do is provide links to outside sources; otherwise they are likely to Hallucinate.
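The retrieve-then-generate flow can be sketched with a toy example. Everything here is invented for illustration: the two-document knowledge base, the crude word-overlap scoring (real systems use vector search), and the stand-in for the generation step (real systems hand the retrieved text to an LLM):

```python
# Invented, pre-approved "knowledge base" of verified documents.
knowledge_base = {
    "doc1": "Library hours are 9am to 5pm on weekdays.",
    "doc2": "Interlibrary loan requests take about one week.",
}

def retrieve(question):
    # Score each document by how many words it shares with the question.
    q_words = set(question.lower().split())
    return max(knowledge_base.values(),
               key=lambda doc: len(q_words & set(doc.lower().split())))

def answer(question):
    source = retrieve(question)  # retrieval happens first...
    # ...then the response is built *from* the retrieved source,
    # grounding the answer in approved material instead of guesswork.
    return f"According to our records: {source}"

print(answer("what are the library hours"))
```

The key point the sketch shows is the ordering: the approved source is fetched before the answer is composed, which is what makes the output checkable.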
Semantic Search
A traditional search from a search engine looks for keywords and matches them to other keywords to find relevant links or sources. A semantic search instead tries to find the meaning of a statement, simulating how someone at a reference desk figures out what a patron actually needs from the questions or information they give. A semantic search can be really useful and make technology feel really clever, or it can be quite infuriating when it just doesn't "get" what you're wanting from it.
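As a toy illustration of "matching by meaning," semantic search compares numeric "meaning vectors" (embeddings) instead of keywords. The numbers below are invented; real systems learn these vectors from huge amounts of data:

```python
import math

# Invented embeddings: phrases about cars get similar vectors
# even though they share no keywords.
embeddings = {
    "car repair":    [0.9, 0.1, 0.0],
    "auto mechanic": [0.8, 0.2, 0.1],
    "cake recipe":   [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    # Standard measure of how closely two vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = embeddings["car repair"]
best = max((phrase for phrase in embeddings if phrase != "car repair"),
           key=lambda phrase: cosine_similarity(query, embeddings[phrase]))
print(best)  # "auto mechanic": closest in meaning despite zero shared keywords
```

A keyword search for "car repair" would miss "auto mechanic" entirely; the vector comparison is what lets a semantic search "get" the connection, and what makes it baffling when the vectors happen to land wrong.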
Additional Links: