
My ChatGPT Business Plan Was A Cautionary Tale

2025-09-15 · Tiernan Ray · 6 minute read
Tags: ChatGPT, AI, Business


By now, most of us know that generative AI can be inconsistent. Whether you're using it to brainstorm a novel or edit an image, the AI can introduce strange inconsistencies or completely lose track of its instructions.

Sometimes this flakiness is a worthwhile trade-off if it helps you iterate on an idea. But for detailed processes like building a financial forecast, be prepared to spend significant time double-checking and correcting the AI's work, or risk being led far astray. Creating a business plan is an excellent test for any generative AI, and after I spent weeks working with ChatGPT on exactly that task, the results were both helpful and riddled with errors. The biggest lesson: the longer the chat session runs, the more errors sneak in, and the more infuriating the experience becomes.

The Grand Experiment: Crafting a Business Plan with AI

Using the latest GPT model, I initiated a chat to create a business plan for growing a newsletter by acquiring subscribers through advertising. The plan required creating and continuously updating spreadsheet tables for subscribers, revenue, ad spending, and cash flow. ChatGPT was able to create these tables from scratch in Excel and allowed me to experiment with assumptions, like the rate of subscriber growth.

The process started with a simple prompt: "What's a good, simple business plan outline for growing a subscription business over three years from 250 subscribers to 10,000, where churn per year is assumed at 16%?"

We went back and forth, adding new variables like a $30 monthly subscription fee and iterating on tables and charts. I could adjust metrics like the cost to acquire a customer (CAC) and immediately see the impact on the profit and loss statement. In hindsight, this was the golden age of the project, a time of easy collaboration before the confusion set in.
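To make the arithmetic concrete, here is a minimal sketch of the kind of monthly model we were iterating on. The 250 starting subscribers, 16% annual churn, and $30 monthly fee come from the prompts above; the growth rate, CAC, and fixed-cost figures are placeholders I've picked for illustration, not the values from our session.

```python
# Minimal monthly subscription model.
# From the article: 250 starting subscribers, $30/month fee, 16% annual churn.
# Placeholders (not from the session): 12% monthly gross adds, $45 CAC, $4,000 fixed costs.

MONTHLY_CHURN = 1 - (1 - 0.16) ** (1 / 12)  # 16% annual churn, converted to monthly
PRICE = 30.0        # monthly subscription fee
GROWTH = 0.12       # assumed gross subscriber adds per month (placeholder)
CAC = 45.0          # assumed cost to acquire a customer (placeholder)
FIXED_COSTS = 4000  # assumed monthly overhead (placeholder)

subs = 250.0
cumulative_cash = 0.0
for month in range(1, 61):
    new_subs = subs * GROWTH
    churned = subs * MONTHLY_CHURN
    subs += new_subs - churned
    revenue = subs * PRICE
    cash_flow = revenue - new_subs * CAC - FIXED_COSTS
    cumulative_cash += cash_flow
    print(f"M{month:02d}  subs={subs:7.0f}  revenue=${revenue:10,.0f}  cum_cash=${cumulative_cash:12,.0f}")
```

Keeping even a crude version of the model outside the chat, in a spreadsheet or a few lines of code, turns out to matter: it gives you an independent way to spot-check the AI's arithmetic.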

The First Cracks Appear: When AI Forgets the Facts

The first error appeared about a third of the way into our session. A table of profit and loss clearly showed the business becoming profitable in month 10. However, ChatGPT's text summary confidently stated, "By Month ~43–45, cumulative cash flow turns positive." I challenged this, pointing out its own table showed the breakeven point at month 10. ChatGPT conceded the error with a chipper "and that's on me" and offered a graph, which also confirmed the month 10 timeline.
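The breakeven claim is trivially checkable against the table: it is just the first month where cumulative cash flow crosses zero. A few lines along these lines settle the dispute instantly; the monthly figures here are made up for illustration, not the real numbers from the session.

```python
# Find the breakeven month: the first month where cumulative cash flow turns positive.
# The monthly cash-flow figures below are illustrative placeholders.
from itertools import accumulate

monthly_cash_flow = [-100, -80, -60, -40, -20, 0, 30, 60, 100, 150, 200, 260]
breakeven = next(
    (month for month, total in enumerate(accumulate(monthly_cash_flow), start=1) if total > 0),
    None,
)
print(breakeven)  # -> 10 with these illustrative numbers
```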

Several turns later, a similar mistake occurred. When I pointed out another discrepancy, ChatGPT admitted it had forgotten a fundamental assumption we started with: the business began with 250 subscribers, not zero. It became clear that one of us needed a reliable memory for the project's key details, and it wasn't going to be the AI.

Playing Whack-a-Mole with AI Errors

As ChatGPT generated new tables and graphs, strange little errors kept surfacing. When calculating the business's "terminal value," I asked it to use the total subscriber count in month 60. The precise value it offered was 9,200 subscribers. This was wrong. Just moments earlier, a table it generated listed the figure as 10,228. Once again, ChatGPT admitted its mistake without explanation, and I realized my role had shifted to that of a full-time, anal-retentive fact-checker.
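The post doesn't spell out the terminal-value formula we used, but a common back-of-envelope version values the business as a multiple of annual recurring revenue at the horizon. Assuming that approach (the 5x multiple is my placeholder, not a figure from the session), the subscriber count feeds in directly, which is why 9,200 versus 10,228 matters.

```python
# Back-of-envelope terminal value as a multiple of annual recurring revenue (ARR).
# The multiple-of-ARR approach and the 5x multiple are illustrative assumptions;
# the subscriber counts and $30 price come from the article.

PRICE = 30.0      # monthly subscription fee
MULTIPLE = 5.0    # assumed exit multiple on ARR (placeholder)

def terminal_value(subscribers: float) -> float:
    arr = subscribers * PRICE * 12  # annual recurring revenue
    return arr * MULTIPLE

print(f"${terminal_value(10_228):,.0f}")  # table's month-60 count -> $18,410,400
print(f"${terminal_value(9_200):,.0f}")   # ChatGPT's wrong count -> $16,560,000
```

Under these illustrative assumptions, a thousand phantom subscribers is nearly a $2 million swing in valuation, which is why this class of error is not a rounding quibble.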

This is the ultimate game of whack-a-mole: if you aren't paying close attention, you'll miss an incorrect assumption that throws everything off later. One of the most frustrating parts of working with any AI bot is the lack of explanation. It will say "my bad," but what you really want to know is how it confidently cited a number that directly contradicted the data it just produced.

The list of errors grew from there:

  • It used the wrong subscription price, leading to incorrect revenue calculations—twice.
  • It generated a chart with numbers that varied wildly from the accompanying table.
  • It produced an erroneous figure for free cash flow by mixing unrelated assumptions.
  • It forgot the agreed-upon "discount rate" and substituted another.
  • It simply miscalculated specific equations.

Each error required more coffee and deep-breathing exercises as my role shifted from ideation to tedious error-checking. OpenAI's response was that current language models are strongest in "short-turn conversations" and that the company is working on improving reliability in longer ones.

What's Happening Under the Hood?

Critics of modern AI, like scholar Gary Marcus, have noted that large language models lack true logical reasoning. Instead of maintaining a firm grip on agreed-upon variables, they produce confident-sounding statements while letting basic facts slip away.

This speaks to an issue with memory. Every LLM has a "context window," which is its short-term memory of the conversation. In this case, ChatGPT was either failing to recall information correctly from its context window or recalling an earlier, uncorrected error. The burden of maintaining factual consistency fell entirely on me.

The Productivity Paradox

Ultimately, working with ChatGPT is a mixed bag. The program can supply useful equations and background information instantly, saving you from tedious research. It can also follow a thread of discussion and incorporate new information, a major leap from the chatbots of just a few years ago. It helps you get from a blank page to a substantial piece of work quickly.

However, it also inserts erroneous assumptions, forgets key details, and makes obvious calculation mistakes that can be maddening. My personal productivity calculus is one part euphoria, one part lamentation. The project took half the time it would have on my own, but half of that time was spent correcting mistakes the AI shouldn't have made. Not factored into this equation is the stress of having to watch everything like a hawk, constantly waiting for the next "gotcha" to emerge.

The Takeaway: Proceed with Cautious Optimism

The core technical issue is that an LLM acts like a sloppy database. It holds onto data but can also corrupt or delete it without warning. For enterprises, technical solutions like Retrieval-Augmented Generation (RAG) can help by storing key variables in a separate, stable database for the AI to retrieve.
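As a sketch of the idea (a toy pattern, not any particular vendor's API): the agreed-upon assumptions live in a store the model can't silently rewrite, and every prompt re-injects them so the model never has to "remember" anything.

```python
# Toy version of the RAG idea: keep the agreed-upon assumptions in a stable
# store outside the conversation, and prepend them to every prompt.
# Illustrative only; real RAG systems use a retriever over a document or
# vector store rather than a hard-coded dict.

ASSUMPTIONS = {
    "starting_subscribers": 250,
    "target_subscribers": 10_000,
    "annual_churn": 0.16,
    "monthly_price_usd": 30,
}

def build_prompt(question: str) -> str:
    facts = "\n".join(f"- {key}: {value}" for key, value in ASSUMPTIONS.items())
    return (
        "Use ONLY these agreed-upon assumptions; do not substitute others:\n"
        f"{facts}\n\nQuestion: {question}"
    )

print(build_prompt("What month does cumulative cash flow turn positive?"))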

Most individuals don't have that kind of infrastructure. For us, the best approach is simple vigilance. Check every detail to ensure an error hasn't crept in. So be warned: watch the model like a hawk, and keep a fresh pot of coffee ready.
