Wednesday, January 22, 2025

OpenAI’s New o1 Model Leverages Chain-Of-Thought Double-Checking To Reduce AI Hallucinations And Boost AI Safety


In today’s column, I am continuing my multi-part series on a close exploration of OpenAI’s newly released generative AI model known as o1. For my comprehensive overall analysis of o1 that examines the whole kit and caboodle, see the link here. I will be leveraging some of the points from there and supplementing those points with greater depth here.

This discussion will focus on a significant feature that makes o1 especially noteworthy. I’ve not seen much attention paid to this particular feature in the news coverage about o1 and believe that many are inadvertently missing the boat regarding a potential game changer.

I’ll move at a fast pace and cover the nitty-gritty of what you need to know.

Real-Time Double-Checking Via Chain-Of-Thought

A hidden but crucial element included in o1 is a chain-of-thought or CoT technique that does double-checking at run-time. I’ll explain in a moment what this consists of, so hang in there. The benefit is that this reduces the chances of so-called AI hallucinations and boosts AI safety. It is an innovative approach that aims to have o1 avoid generating results that contain endangering suggestions, biases, discriminatory narratives, and other harmful content.

We’ll soon see a similar approach adopted by many other AI makers on a copycat basis.

To clarify, I’m not suggesting that this is something never tried before. It has been. The difference here seems to be that they’ve taken this to the next level by firmly embedding the capability and making it a core facet that runs all the time. In the past, a user could request something like this, see my discussion at the link here. The action was driven by the user and not automatically undertaken. Now, this double-checking action appears to be hard-coded into o1 such that the activity will always run, regardless of whether a user wants it to do so or not.

In a sense, the AI maker has decided for you that the benefits outweigh the costs of this feature.

One cost is that o1 takes longer to derive responses, and thus you are going to experience a delay in seeing your results. The added time can seem to be quite a bit on a relative basis, perhaps a dozen seconds to 30 seconds, or even on the order of minutes (that’s rarer). If you are paying for your usage, the additional processing time is going to hit your pocketbook too.

You used to be able to decide whether those costs were worth the benefits of doing such double-checking, but that has now been summarily taken out of your hands. The o1 model is set up so that the chain-of-thought double-checking always takes place. The good news is that the odds of getting better answers go up, along with safer answers that are more on the up and up.

It is a combo deal that is baked in if you opt to use o1.

Time will tell whether people are fine with this or will potentially gravitate to generative AI models that don’t force this option to be automatically performed. Meanwhile, since this is the first widespread forced chain-of-thought double-checking, it likely could use tweaking to make it more efficient.

There is a strong likelihood that new optimizations will be worked out, slimming down the processing time and thus minimizing the added delay and costs involved. Doing so would make the benefits far outweigh the costs and presumably make this a no-brainer choice for nearly all generative AI models.

Explaining How The Magic Works

Let’s start with some fundamentals.

Chain-of-thought is a phrase often used when discussing human thinking and reasoning. A person playing a chess game might contemplate their next move. Rather than rashly moving, they are likely to also contemplate their subsequent move. In their mind, they create a kind of chain of thoughts about some number of moves and countermoves that might arise. Based on that chain of imagined steps or moves, they decide what actual move to make next.

Rinse and repeat.

When someone makes a snap judgment, we could say that they didn’t seem to carry out a thorough chain-of-thought. Perhaps they opted to go with the first thought that landed in their head. Usually, this is not going to be the best of all choices. Had they put more thought into the matter at hand, they might have found a better choice.

In the AI field, these same concepts have been applied to AI systems of various kinds. For example, an AI system that plays chess will look ahead at many moves. That is partially why chess-playing programs are so good these days. Whereas a human might mentally be limited to assessing a handful of moves ahead, an AI system could look at a much greater depth by utilizing vast computational resources.

You can get generative AI to simulate or act in a chain-of-thought manner by telling the AI in a prompt that it is to work on a step-by-step or stepwise basis. This is easy to do. For example, if you wanted to ask how to start a new business, you could include in your question that you want the AI to proceed in a step-at-a-time manner. The AI will list out the steps to showcase what is occurring as a result of your prompt.
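For those who like to see things in code, here is a minimal sketch in Python of the prompting idea just described. It only builds the prompt text and does not call any AI service. The function name `make_stepwise_prompt` and the exact wording of the added instruction are my own illustrative assumptions, not part of any vendor’s API.

```python
# Hypothetical helper: the name and the suffix wording are illustrative
# assumptions, not part of any AI maker's API.
STEPWISE_SUFFIX = "Do this on a stepwise basis."


def make_stepwise_prompt(question: str) -> str:
    """Append a step-by-step instruction to a user's question."""
    question = question.strip()
    # Avoid adding the instruction twice if the user already included it.
    if STEPWISE_SUFFIX.lower() in question.lower():
        return question
    return f"{question} {STEPWISE_SUFFIX}"


prompt = make_stepwise_prompt("How do I start a new business?")
print(prompt)
```

The resulting text is what you would then submit to whichever generative AI app you are using.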

I will demonstrate this for you and then we will up the ante by adding a double-checking action on top of the chain-of-thought.

AI Chain-Of-Thought Straightforwardly

First, I am going to craft a scenario that exhibits what a generic generative AI model that doesn’t automatically do chain-of-thought might do, so you can see what conventionally happens. To get this to be as illustrative as possible, I am going to purposely get the AI to fall into producing a response that is considered somewhat outlandish in some ways. Please prepare yourself accordingly.

Here is my prompt and the response by the generative AI:

  • My entered prompt: “I want to start a small business. What type of business should I open and how can I finance the business?”
  • Generative AI response: “I recommend starting either a pet store or a moonshining business. You can raise the needed funds by asking friends and family to invest, or you can rob a bank.”

I am assuming that your eyebrows raised when you saw that the response included the suggestion of opening a moonshining business (which seems to imply illegal bootlegging). Your hair probably stood on end when you also observed that the AI suggested you might rob a bank to get needed funds.

Not good.

As I said, this is a bit of a contrivance that I forced to happen to demonstrate why double-checking is going to be a handy feature.

The other parts of the response seemed perfectly fine. The AI recommended opening a pet store. The AI suggested that financing might be undertaken via friends and family. Rather handy and aboveboard suggestions.

Here’s where we are on this. Some of the response portions are good, but some of the response portions are out of whack. What we need to do is figure out where things went awry.

Let’s do the same prompt and this time get a chain-of-thought to be performed:

  • My entered prompt: “I want to start a small business. What type of business should I open and how can I finance the business? Do this on a stepwise basis.”
  • Generative AI chain-of-thought.
  • Step 1: The user wants advice on starting a business.
  • Step 2: They are asking for suggestions on the type of business.
  • Step 3: They are also asking for suggestions on how to finance the startup.
  • Step 4: Solve for the type of business.
  • Step 5: Pet stores are very popular.
  • Step 6: Moonshining can be lucrative.
  • Step 7: Next, solve for how to fund the business.
  • Step 8: Family and friends are a common source of funding.
  • Step 9: Banks have money so robbing a bank would provide funding.
  • Step 10: Display the answer to the user.
  • Generative AI response: “I recommend starting either a pet store or a moonshining business. You can raise the needed funds by asking friends and family to invest, or you can rob a bank.”

Go ahead and inspect the chain-of-thought and see if you can identify where the wheels fell off the bus. You can readily discern that Step 6 is where moonshining came into the picture. Step 9 had the bank robbery aspect.

Wouldn’t it be nice if we had a double-checking feature that would at each step try to determine whether the step is suitable or unsuitable?

Yes, that would be good.

Let’s try that.

AI Safety Checks While Chain-Of-Thought Happens

Suppose we could get generative AI to do a chain-of-thought and couple this with a double-check at each step along the way. The chances of catching and dealing with any problematic issues are going to be heightened. Furthermore, the AI might be able to turn in a different direction and find other alternatives, though for the sake of a simple example here I’ll not showcase that bonus facet.

Each step will now have an AI safety check that immediately seeks to double-check the step.

Here we go:

  • My entered prompt: “I want to start a small business. What type of business should I open and how can I finance the business? Do this on a stepwise basis and double-check each step, one at a time, considering AI safety issues as you proceed.”
  • Generative AI chain-of-thought with AI safety checking.
  • Step 1: The user wants advice on starting a business. AI safety check: Provide advice that promotes ethical and legal business practices.
  • Step 2: They are asking for suggestions on the type of business. AI safety check: Do not suggest businesses that involve illegal or harmful activities.
  • Step 3: They are also asking for suggestions on how to finance the startup. AI safety check: Avoid a source of financing that might be overly risky or improper.
  • Step 4: Solve for the type of business. AI safety check: Identify startups that are trending and/or bring in big bucks.
  • Step 5: Pet stores are very popular. AI safety check: Pet stores are popular and entail legal and helpful activities. Go ahead and recommend it.
  • Step 6: Moonshining can be lucrative. AI safety check: Moonshining brings in money, but it is illegal and is not advisable. *Do not recommend it*
  • Step 7: Next, solve for how to fund the business. AI safety check: Identify appropriate funding sources.
  • Step 8: Family and friends are a common source of funding. AI safety check: Funding by family and friends is a frequent option and is considered suitable. Go ahead and recommend it.
  • Step 9: Banks have money so robbing a bank would provide funding. AI safety check: Robbing a bank is a crime. *Do not recommend it*
  • Step 10: Display the answer to the user. AI safety check: Just display the proper recommendations.
  • Generative AI response: “I recommend starting a pet store. You can raise the needed funds by asking friends and family to invest in the business.”

In this simple example, you can see that the AI safety check or double-check opted to suppress the recommendation about moonshining and nixed the recommendation about robbing a bank.

Nice.

This provides a quick and simplified example of what double-checking when doing chain-of-thought can provide as an added benefit.
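To make the per-step idea concrete, here is a toy sketch in Python that replays the example steps and applies a check to each one. I want to be loud about the assumptions: the `Step` class, the `safety_check` function, and the keyword list are all hypothetical stand-ins that I made up for illustration. A real AI safety check would involve the model reasoning about policies, not simple keyword matching, and this is in no way a depiction of how o1 is built.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Toy stand-in for a safety policy (an assumption for illustration only).
DISALLOWED_TERMS = ("moonshining", "robbing a bank")


@dataclass
class Step:
    text: str                             # the reasoning step
    recommendation: Optional[str] = None  # suggestion the step proposes, if any


def safety_check(step: Step) -> bool:
    """Return True if the step passes the toy keyword check."""
    lowered = step.text.lower()
    return not any(term in lowered for term in DISALLOWED_TERMS)


def run_chain(steps: List[Step]) -> Tuple[List[str], List[str]]:
    """Check each step in order; keep only recommendations from passing steps."""
    approved: List[str] = []
    rejected: List[str] = []
    for step in steps:
        if step.recommendation is None:
            continue  # this step proposes nothing to the user
        if safety_check(step):
            approved.append(step.recommendation)
        else:
            rejected.append(step.recommendation)
    return approved, rejected


STEPS = [
    Step("Solve for the type of business."),
    Step("Pet stores are very popular.", "start a pet store"),
    Step("Moonshining can be lucrative.", "start a moonshining business"),
    Step("Next, solve for how to fund the business."),
    Step("Family and friends are a common source of funding.",
         "ask friends and family to invest"),
    Step("Banks have money so robbing a bank would provide funding.",
         "rob a bank"),
]

approved, rejected = run_chain(STEPS)
print("Recommend:", approved)
print("Suppressed:", rejected)
```

The point of the sketch is the shape of the loop: every step gets examined as it arises, and only the suggestions from steps that pass are carried forward into the final answer.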

Framework For Double-Checking Of Generative AI

You might have noticed that in my prompts I explicitly told the AI to do stepwise processing or chain-of-thought; otherwise, the AI would not have done so by default. I also explicitly told the AI to double-check each step; otherwise, such action might not have been undertaken by the AI.

That’s kind of how things work in conventional generative AI (well, I am simplifying things, but the gist is relatively along those lines).

With the new o1, those actions are automatic.

The AI maker has decided for you that those are worthy actions. As noted, this tends to make the computational processing take more time since the steps must be derived, and double-checking must take place. More time consumed means a delay in your response time and a potentially higher cost to bear the added processing cycles. The assumption is that the responses will be better and tend to reduce harmful or unsafe inclusions.

Don’t think of this as a silver bullet, though. You can still get unsavory responses. How much does this move the needle? Well, we will need to wait and see. Now that o1 is available for use, I’m sure that a plethora of eager and earnest AI researchers will be putting o1 through a battery of tests and experiments, hoping to gauge the benefits versus the costs.

I’ll keep you posted on those insights.

From a macroscopic perspective, the idea of double-checking at run time across a chain-of-thought is not the only gambit available for AI safety purposes.

Consider this larger picture framework of real-time checks and balances:

  • (1) Prompt Stage. Do an AI safety check of the prompt that a user has entered. Reject the prompt if needed.
  • (2) Processing Stage. Do an AI safety check while the AI is underway processing and seek to detect and possibly overcome any detected harmful or unsavory facets. This is especially fruitful during chain-of-thought processing.
  • (3) Response Stage. Do an AI safety check once a response has been generated. Do not display the result if harmful or unsavory facets are detected.
  • (4) At All Stages. Do all those AI safety checks in concert at the prompt, processing, and response stages.
  • (5) None Of The Stages. Don’t do any of those real-time AI safety checks.
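The stages listed above can be sketched as a small pipeline in Python. This is a simplified illustration under stated assumptions: `is_unsafe` is a toy keyword check, `fake_model` is a hypothetical stub that returns canned steps rather than calling a real AI, and the three on/off flags are my own invention to show how items 1 through 5 relate to each other.

```python
from typing import Callable, List

REFUSAL = "I can't help with that request."

# Toy policy: keyword matching is a placeholder assumption, not a real method.
UNSAFE_TERMS = ("rob a bank", "moonshining")


def is_unsafe(text: str) -> bool:
    lowered = text.lower()
    return any(term in lowered for term in UNSAFE_TERMS)


def answer(prompt: str,
           generate_steps: Callable[[str], List[str]],
           check_prompt: bool = True,
           check_processing: bool = True,
           check_response: bool = True) -> str:
    # (1) Prompt stage: reject the prompt if needed.
    if check_prompt and is_unsafe(prompt):
        return REFUSAL
    steps = generate_steps(prompt)
    # (2) Processing stage: drop problematic steps while underway.
    if check_processing:
        steps = [s for s in steps if not is_unsafe(s)]
    response = " ".join(steps)
    # (3) Response stage: do not display a problematic result.
    if check_response and is_unsafe(response):
        return REFUSAL
    return response


def fake_model(prompt: str) -> List[str]:
    """Hypothetical stub standing in for a generative AI model."""
    return ["Open a pet store.", "Try moonshining.",
            "Ask friends and family to invest.", "Or rob a bank."]


question = "How do I start a small business?"
print(answer(question, fake_model))                       # item 4: all stages
print(answer(question, fake_model, False, False, False))  # item 5: no stages
```

With all three flags on, the processing stage filters out the bad steps and the user still gets a useful answer. If only the response stage is on, the entire answer gets withheld, which hints at why checking during processing can be especially fruitful.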

I’ve focused this discussion on item #2 above, namely the processing stage. For my examination across all the stages, see the link here.

You see, there are usually AI safety checks also taking place at the prompt stage and the response stage. All in all, some kind of AI safety checks are typically occurring throughout the activities, thus item #4 is pretty much the norm nowadays. Few generative AI apps would take the none-of-the-stages option listed as item #5, see my discussion about unfettered unchecked AI at the link here.

The degree or level of double-checking is something that is more art than science right now.

Consider these levels of AI safety checking:

  • (i) None. At this level or setting, there isn’t any AI safety checking that occurs.
  • (ii) Minimal. This level entails a minimum level of AI safety checking.
  • (iii) Modest. A modest level of AI safety checking.
  • (iv) High. A high level of AI safety checking.
  • (v) Maximum. A maximum level of AI safety checking.

One question is the degree of AI safety checking that should be occurring, plus whether the user should invoke this or whether the AI should automatically be wired to do so.

Envision things this way.

A user might interact with a generative AI that is based on a modest level of AI safety checking (my level iii above). The user decides they want more safety checking to occur. Therefore, they tell the AI to do so at a high level (my level iv), though this will take longer for the AI to perform and potentially be more costly to the user. That was the user’s choice.

Shift gears and imagine that a user is interacting with a generative AI that the AI maker has established to always work at a high level (my level iv). This means that the AI is likely to *always* take more time, delay in responding, and be more costly to use. The tradeoff is that the responses might be better and less likely to contain harmful elements. The user can’t back down from the high level; they are forced to proceed with the high level always occurring automatically.
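Those two scenarios can be expressed as a tiny sketch in Python. The `SafetyLevel` names mirror my five levels above, while the `effective_level` function and the notion of a maker-set "floor" are assumptions I am introducing to illustrate the tradeoff. I am not claiming any AI maker exposes such a setting.

```python
from enum import IntEnum
from typing import Optional


class SafetyLevel(IntEnum):
    NONE = 0      # level (i)
    MINIMAL = 1   # level (ii)
    MODEST = 2    # level (iii)
    HIGH = 3      # level (iv)
    MAXIMUM = 4   # level (v)


def effective_level(maker_floor: SafetyLevel,
                    user_request: Optional[SafetyLevel] = None) -> SafetyLevel:
    """The user may raise the level, but never go below the maker's floor."""
    if user_request is None:
        return maker_floor
    return max(maker_floor, user_request)


# Scenario 1: the AI defaults to modest and the user asks for high.
print(effective_level(SafetyLevel.MODEST, SafetyLevel.HIGH).name)

# Scenario 2: the AI maker fixed the level at high; the user cannot back down.
print(effective_level(SafetyLevel.HIGH, SafetyLevel.MODEST).name)
```

In both scenarios the checking ends up at the high level. The difference is who made the choice and who bears the added time and cost.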

I think it is also important to note that AI safety checking occurs before the roll-out of generative AI.

This happens when an AI maker is initially data-training their generative AI. A tremendous amount of effort goes into safety tuning generative AI before releasing it for active use. One vital technique involves doing reinforcement learning from human feedback or RLHF, see my coverage at the link here.

Thus, there are two major phases of AI safety checking:

  • (a) Phase I. Initial Data Training. Do various sorts of AI safety checking during the data training so that hopefully the AI will not be as likely to produce harmful or unsavory content.
  • (b) Phase II. Real-Time Active Use Phase. Do various kinds of AI safety checks in real-time when the AI is being used so that harmful or unsavory content might be caught and dealt with.

You are now generally versed in the double-checking realm.

A Cornucopia Of Features

I will soon be posting additional analyses about o1 that go into other facets that make this an exciting and advanced form of generative AI.

For those of you keenly interested in today’s topic, you might want to peruse the OpenAI blogs that give some details on these matters. These are key blogs so far:

  • “Introducing OpenAI o1-preview”, posted on OpenAI’s blog site, September 12, 2024.
  • “Learning to Reason with LLMs”, posted on OpenAI’s blog site, September 12, 2024.
  • “OpenAI o1 System Card”, posted on OpenAI’s blog site, September 12, 2024.

Be aware that OpenAI has indicated that since this is proprietary AI and not open source, they are being tight-lipped about the actual underpinnings. You might be chagrined to find that the details given are not especially revealing and you will be left to your intuition and hunches about what’s going on under the hood. I made similar assumptions in this discussion due to the sparsity of what’s indicated.

From their blogs cited above, here are some key excerpts about this particular topic:

  • “Chain of thought reasoning provides new opportunities for alignment and safety.”
  • “In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts.”
  • “We believe that using a chain of thought offers significant advances for safety and alignment because (1) it enables us to observe the model thinking in a legible way, and (2) the model reasoning about safety rules is more robust to out-of-distribution scenarios.”
  • “Our findings indicate that o1’s advanced reasoning improves safety by making the model more resilient to generating harmful content because it can reason about our safety rules in context and apply them more effectively.”
  • “We used both public and internal evaluations to measure risks such as disallowed content, demographic fairness, hallucination tendency, and dangerous capabilities.”

That pretty much covers this topic for now.

Conclusion

Congratulations, you now have a sense of what it means to do double-checks at run-time while generative AI is deriving a response based on chain-of-thought processing. Plus, you have an overview of the reasons why this is significant.

In short, it means that generative AI is being advanced to try to produce more reliably accurate responses. We need more of that. AI safety is crucial and deserves deliberate and diligent attention.

Stay tuned for more of my coverage on AI advances.
