Smarter AI: Neurosymbolic Solutions For Unreliable LLMs
The Troubling Rise of LLM Hallucinations and Biases
A significant challenge in big tech's artificial intelligence endeavors isn't the fear of AI surpassing humanity, but rather the persistent inaccuracies of large language models (LLMs) such as OpenAI's ChatGPT, Google's Gemini, and Meta's Llama. Their tendency to keep making errors appears to be a deeply rooted problem.
These errors, commonly called 'hallucinations,' were notably highlighted when US law professor Jonathan Turley was falsely implicated in sexual harassment by ChatGPT in 2023.
OpenAI's apparent fix involved programming ChatGPT to avoid answering questions about Turley, a solution that is neither fair nor adequate. Addressing hallucinations reactively on a case-by-case basis is not a sustainable approach.
Similarly, LLMs have been found to reinforce stereotypes and provide answers with a Western bias. A significant issue is the lack of accountability for this widespread misinformation, as tracing how an LLM arrives at a specific conclusion is often challenging.
Why Current Solutions for LLM Flaws Fall Short
When GPT-4, then OpenAI's most advanced LLM, was released in 2023, a vigorous debate erupted over these problems. The discussion may have somewhat subsided since, but the underlying issues remain unaddressed.
For instance, the EU swiftly passed its AI Act in 2024, aiming to lead globally in AI oversight. However, the act largely depends on AI companies to self-regulate and doesn't thoroughly tackle the core problems. This has not stopped tech companies from deploying LLMs to hundreds of millions of users worldwide and gathering their data without sufficient oversight.
Meanwhile, recent evaluations demonstrate that even the most advanced LLMs continue to be unreliable. Despite this, leading AI firms still shy away from accepting responsibility for these errors.
Unfortunately, the tendencies of LLMs to spread misinformation and replicate biases cannot be resolved through gradual improvements over time. With the emergence of agentic AI, where users will soon delegate tasks like booking holidays or managing monthly bill payments to LLMs, the potential for complications is set to increase significantly.
Neurosymbolic AI: A Path to More Reliable AI
The emerging field of neurosymbolic AI could address these challenges, while also diminishing the massive data volumes needed for training LLMs. So, what exactly is neurosymbolic AI, and how does it operate?
Under the Hood: How LLMs Function and Where They Stumble
LLMs operate using a technique called deep learning. They are fed vast quantities of text data and employ advanced statistics to identify patterns that determine the next word or phrase in any given response. Each model, along with all the patterns it has learned, is a neural network that runs on arrays of powerful computers housed in large data centers.
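To make that concrete, here is a deliberately tiny Python sketch of next-word prediction from frequency statistics. The word counts are invented for illustration; a real LLM learns such statistics implicitly across billions of neural-network parameters, not in a lookup table.

```python
import random

# Toy illustration of next-word prediction: counts of which word follows
# "rain" in a made-up corpus become probabilities, and the continuation is
# sampled from them.
follow_counts = {
    "rain": {"falls": 6, "stops": 3, "hallucinates": 1},
}

def next_word(word: str) -> str:
    counts = follow_counts[word]
    # The answer is always a probability-weighted guess, never a checked fact.
    return random.choices(list(counts), weights=list(counts.values()), k=1)[0]

print(next_word("rain"))  # usually "falls", occasionally something implausible
```

The point of the sketch is the last comment: whatever comes out is statistically plausible, which is not the same as being true.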
LLMs can simulate reasoning through a method called chain-of-thought, generating multi-step answers that mirror how humans might logically reach a conclusion, based on patterns observed in their training data.
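As an illustration (not a real API call), a chain-of-thought prompt might look like the following; the question and the imagined continuation are assumptions made purely for the sketch.

```python
# Illustrative chain-of-thought prompt. The "Let's think step by step"
# instruction nudges the model to generate intermediate steps that look like
# reasoning, but each step is still produced by next-word prediction.
prompt = (
    "Q: A train leaves at 9:40 and the journey takes 85 minutes. "
    "When does it arrive?\n"
    "A: Let's think step by step."
)

# A typical generated continuation might read:
#   "85 minutes is 1 hour 25 minutes. 9:40 + 1 hour = 10:40.
#    10:40 + 25 minutes = 11:05. The train arrives at 11:05."
# The steps mirror human working, but nothing in the model checks them.
```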
Undoubtedly, LLMs represent a significant engineering feat. They excel at summarizing text and translation and might enhance the productivity of users who are diligent and knowledgeable enough to identify their mistakes. Nevertheless, they possess a great capacity to mislead because their conclusions are always rooted in probabilities, not genuine understanding.
Misinformation in, misinformation out. (Source: Collagery)
A common workaround is termed “human-in-the-loop”: making sure that humans using AIs still take the ultimate decisions. However, shifting responsibility onto those humans doesn't resolve the problem, because they will often still be misled by misinformation.
LLMs now require such extensive training data for advancement that we are resorting to feeding them synthetic data—data created by other LLMs. This synthetic data can replicate and amplify existing errors from its source, leading new models to inherit the weaknesses of their predecessors. Consequently, the expense of programming AIs to be more accurate after their initial training, known as “post-hoc model alignment,” is rapidly increasing.
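A toy back-of-the-envelope simulation makes that compounding worry concrete. Every number here (the starting error rate, the share of errors inherited through synthetic data, the rate of newly introduced errors) is an assumption invented for illustration, not a measurement.

```python
# Toy model of error compounding across model generations trained partly on
# synthetic data: each generation inherits most of its predecessor's errors
# and adds some of its own.
def next_generation_error(prev_error: float,
                          inherited_share: float = 0.9,
                          new_error_rate: float = 0.04) -> float:
    # Errors carried over from synthetic training text, plus freshly made ones.
    return min(1.0, inherited_share * prev_error + new_error_rate)

error = 0.10  # assumed error rate of the first-generation model
for generation in range(1, 6):
    error = next_generation_error(error)
    print(f"generation {generation}: error rate ~ {error:.1%}")
```

Under these made-up numbers the error rate drifts steadily upward rather than washing out, which is the worry in miniature.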
It also becomes progressively harder for programmers to identify what is going wrong because the number of steps in the model’s reasoning process grows ever larger, making error correction increasingly difficult.
Bridging Neural Learning and Symbolic Reasoning
Neurosymbolic AI merges the predictive learning capabilities of neural networks with the teaching of formal rules that humans use for more reliable deliberation. These rules include logical principles like “if a then b” (e.g., “if it’s raining then everything outside is normally wet”), mathematical rules such as “if a = b and b = c then a = c,” and the agreed-upon meanings of words, diagrams, and symbols. Some of these rules are directly inputted into the AI system, while others are deduced by the AI itself through analyzing its training data and performing “knowledge extraction.”
This approach aims to create an AI that will not hallucinate and will learn more rapidly and intelligently by organizing its knowledge into distinct, reusable components. For instance, if the AI possesses a rule about things getting wet outside during rain, it doesn’t need to store every example of wet objects—the rule can be applied to any new object, even one it has never encountered.
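A minimal sketch of that symbolic side, using the article's own rain example: the rule is written once and applies to any object that is outside, including one (a hypothetical garden gnome) that never appeared in any stored example.

```python
# Facts plus one "if a then b" rule. Because the rule is stored once, it
# generalizes to objects never seen before.
facts = {("raining",), ("outside", "bicycle"), ("outside", "garden gnome")}

def apply_rain_rule(facts: set) -> set:
    """Rule: if it is raining, everything outside is (normally) wet."""
    derived = set(facts)
    if ("raining",) in facts:
        for fact in facts:
            if fact[0] == "outside":
                derived.add(("wet", fact[1]))
    return derived

print(apply_rain_rule(facts))
# Includes ("wet", "bicycle") and ("wet", "garden gnome"), without ever
# having stored an example of a wet garden gnome.
```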
Key Advantages of the Neurosymbolic AI Model
During model development, neurosymbolic AI also integrates learning and formal reasoning through a process known as the “neurosymbolic cycle.” This involves a partially trained AI extracting rules from its training data, then instilling this consolidated knowledge back into the network before further training with more data.
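As a rough illustration of the extraction half of that cycle, the toy code below stands a simple co-occurrence counter in for the neural network and promotes any sufficiently strong statistical pattern to an explicit rule. The data, threshold, and function names are assumptions; a full cycle would also feed the extracted rules back into further training, which is omitted here.

```python
from collections import Counter

# Made-up observations standing in for training data: (condition, outcome).
observations = [("raining", "wet"), ("raining", "wet"), ("raining", "wet"),
                ("raining", "dry"), ("sunny", "dry"), ("sunny", "dry")]

def extract_rules(counts: Counter, threshold: float = 0.7) -> dict:
    """Knowledge extraction: turn strong statistical patterns into rules."""
    rules = {}
    for condition in {cond for cond, _ in counts}:
        total = sum(n for (c, _), n in counts.items() if c == condition)
        for (c, outcome), n in counts.items():
            if c == condition and n / total >= threshold:
                rules[condition] = outcome   # e.g. raining -> wet
    return rules

counts = Counter(observations)               # the "learning" phase, in miniature
print(extract_rules(counts))                 # {'raining': 'wet', 'sunny': 'dry'} on this toy data
```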
This method is more energy-efficient because the AI doesn't need to store as much data. Additionally, the AI becomes more accountable because it's easier for a user to monitor how it arrives at specific conclusions and improves over time. It is also fairer because it can be programmed to adhere to pre-existing rules, such as: “For any decision made by the AI, the outcome must not depend on a person’s race or gender.”
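What “programmed to adhere to pre-existing rules” could look like in practice is sketched below: a symbolic constraint that vetoes any decision whose explanation depends on a protected attribute. The attribute names, the example decision, and the veto logic are illustrative assumptions, not a description of a deployed system.

```python
# A symbolic fairness constraint layered on top of a model's decision.
PROTECTED_ATTRIBUTES = {"race", "gender"}

def violates_constraint(explanation: dict) -> bool:
    """Rule: the outcome must not depend on a person's race or gender."""
    return bool(set(explanation.get("features_used", [])) & PROTECTED_ATTRIBUTES)

decision = {"outcome": "loan denied", "features_used": ["income", "gender"]}
if violates_constraint(decision):
    print("Decision rejected: it depends on a protected attribute.")
```

Because the constraint is an explicit rule rather than a learned tendency, a user can inspect it directly, which is the accountability gain described above.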
Neurosymbolic AI: The Next Evolution in Artificial Intelligence
The first wave of AI in the 1980s, known as symbolic AI, was actually based on teaching computers formal rules that they could then apply to new information. Deep learning followed as the second wave in the 2010s, and many now view neurosymbolic AI as the third wave.
It's simplest to apply neurosymbolic principles to AI in specialized areas because the rules can be clearly defined. Thus, it's unsurprising that we've seen it first appear in Google’s AlphaFold, which predicts protein structures to aid drug discovery, and AlphaGeometry, which solves complex geometry problems.
Realizing the Potential: Challenges and Outlook for Neurosymbolic AI
For more general-purpose AIs, China’s DeepSeek employs a learning technique called “distillation”, which is a step in this direction. However, to make neurosymbolic AI fully viable for general models, further research is necessary to refine their ability to discern general rules and perform knowledge extraction effectively.
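For readers unfamiliar with the term, distillation generally means training a smaller “student” model to match the output distribution of a larger “teacher”. The sketch below shows the core quantity such training drives down; the logits and temperature are made-up numbers, and this is not a description of DeepSeek's actual pipeline.

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z = z - z.max()                 # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

teacher_logits = np.array([4.0, 1.5, 0.5])   # teacher's scores for 3 candidate tokens
student_logits = np.array([2.0, 2.0, 1.0])   # student's current scores

temperature = 2.0                             # higher temperature softens both distributions
p_teacher = softmax(teacher_logits, temperature)
p_student = softmax(student_logits, temperature)

# Cross-entropy of the student against the teacher's soft targets: the
# quantity a distillation loss minimizes during training.
loss = -np.sum(p_teacher * np.log(p_student))
print(f"distillation loss: {loss:.3f}")
```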
It remains unclear to what extent LLM developers are already working on this. While they express intentions toward teaching their models to think more intelligently, they also seem committed to scaling up with ever-larger datasets.
The reality is that for AI to continue advancing, we will need systems that can adapt to new information from just a few examples, verify their understanding, multitask, reuse knowledge to improve data efficiency, and reason reliably in sophisticated ways.
In this manner, well-designed digital technology could potentially offer an alternative to regulation, as checks and balances would be integrated into the AI's architecture and perhaps standardized across the industry. There's a long road ahead, but at least a viable path is emerging.