OpenAI Lists Professions Its AI Is Ready To Tackle
OpenAI Launches GDPval to Measure Real-World AI Performance
OpenAI, the company behind ChatGPT, has introduced a new evaluation framework called GDPval. According to an official company blog post, this benchmark is designed to measure how effectively its AI models can handle "economically valuable, real-world tasks across 44 occupations." The goal is to ground the conversation about AI's societal impact in tangible evidence of its current capabilities, moving beyond speculation.
The move reads as a direct attempt by OpenAI to demonstrate the financial viability of its technology, especially amid growing skepticism that the AI boom might be a technological dead end. It also serves as a data-driven response to criticism that the company's marketing is overly boastful, for example its claims of having developed AI with "PhD-level" intelligence.
Which AI Models and Professions Made the Cut
The initial findings from GDPval suggest that today's top AI models are already producing work that approaches the quality of work done by human industry experts. The evaluation identified 44 occupations where AI could have the most significant impact on productivity.
This list includes a wide array of professions, such as:
- Real Estate Sales Agents
- Software Developers
- Lawyers
- Registered Nurses
- Customer Service Representatives
- Financial Advisors
- Private Detectives
Specific examples of tasks tested, as detailed in the research paper, include producing a competitor analysis, as a financial analyst would, and designing a sales brochure, as a real estate agent would. In a surprising twist, the evaluation found that Claude Opus 4.1, from competitor Anthropic, was the top performer across 220 tasks. An advanced version of GPT-5 produced work rated as good as or better than that of human experts just over 40% of the time, a significant jump from the 13.7% scored by the year-old GPT-4o.
The Cautious Language of AI Replacing Jobs
OpenAI is carefully navigating the topic of job replacement. The company's official messaging emphasizes that AI is meant to "support people in the work they do every day," avoiding any direct mention of making human roles obsolete. This framing is understandable given the negative public perception of technology-driven job loss.
However, this cautious language is at odds with the broader industry narrative, where some AI executives have openly boasted about replacing human labor to cut costs. In some cases, these strategies are already beginning to backfire for companies that have moved too quickly to implement AI.
A Reality Check on AI's Workplace Readiness
Despite the promising results of the GDPval evaluation, there are strong reasons to remain cautious. The real-world application of AI has already created significant challenges for professionals like software developers and lawyers, often increasing the need for human oversight rather than reducing it.
The persistent problem of AI "hallucinations," in which models generate false information, remains a major obstacle. Users have to spend valuable time verifying AI-generated output, which undercuts the potential productivity gains. OpenAI itself acknowledges these limitations, stating that most jobs are far more complex than a collection of well-defined tasks.
"Early GDPval results show that models can already take on some repetitive, well-specified tasks faster and at lower cost than experts," the company wrote. "However, most jobs are more than just a collection of tasks that can be written down."