Imagine a future where an AI-powered chatbot could draft the perfect response to a manager’s unreasonable email – but also generate an arsenal of hateful content to cyberbully them into submission.
That’s not possible with today’s most popular proprietary tools, like OpenAI's ChatGPT or Google's Gemini. Users cannot access the powerful technology at their core, and must instead engage through an interface that safeguards (albeit imperfectly) against the creation of fake news, hate speech, deepfake pornography, or other content generally understood to be harmful.
A new class of open-source large language models (LLMs), however, is changing that paradigm. Users can alter the models’ core programming, allowing malicious developers to create harmful content at scale. Unfortunately, the EU’s new landmark Artificial Intelligence Act, all but finalised in Brussels earlier this year, does little to mitigate this risk.
The concept of open-source LLMs still lacks a widely accepted definition, but the term commonly refers to models that can be modified via an openly accessible architecture. In principle, this allows any organisation to develop its own AI product such as a chatbot without needing to share valuable proprietary information with companies like Google or OpenAI.
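To give a concrete sense of how low that barrier is, the short Python sketch below uses the openly available Hugging Face transformers library to download an openly licensed model and generate text entirely on local hardware, outside any provider-controlled interface. It is an illustration rather than a recipe: the model identifier is just an example, and any openly licensed model would work the same way.

    # Minimal sketch, assuming the Hugging Face "transformers" library is installed.
    # The model identifier is illustrative; any openly licensed model would work.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative open-weight model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Once downloaded, the model runs entirely on local hardware:
    # no external interface, and no provider-imposed safeguards.
    prompt = "Draft a polite reply to a manager's unreasonable email."
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Nothing in this workflow requires sharing data with, or seeking approval from, the model’s original developer.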
This openness has also greatly expanded access to top-shelf AI capabilities: many open-source models now rival the closed LLMs underlying today’s most popular chatbots.
But it’s precisely this ease of accessibility and modifiability, without the necessary safeguards, that poses significant potential for misuse.
We recently tested three popular open-source models – all publicly accessible on the Hugging Face platform – to see whether they would generate harmful content, including hate speech and misinformation.
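To illustrate what such probing involves in practice, the sketch below shows how a set of models could be run against a list of test prompts and their responses recorded for review. It is a simplified, hypothetical illustration rather than our exact setup: the model names and the prompts file are placeholders, and only the Hugging Face transformers library is assumed.

    # Hypothetical probing loop: model names and prompts.txt are placeholders,
    # not our actual test configuration. Assumes the "transformers" library.
    from transformers import pipeline

    model_names = ["model-a", "model-b", "model-c"]  # illustrative open models

    with open("prompts.txt") as f:
        prompts = [line.strip() for line in f if line.strip()]

    for name in model_names:
        generator = pipeline("text-generation", model=name)
        for prompt in prompts:
            result = generator(prompt, max_new_tokens=200)[0]["generated_text"]
            # Record each output for manual review: did the model refuse or comply?
            print(f"--- {name} ---\n{result}\n")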
Open vs closed AI
When asked to produce texts ranging from racial epithets about immigrants in the United States to conspiracies about chemtrails and vaccinations, chatbots utilising open-source LLMs obliged more often than not, frequently demonstrating an alarming level of credibility and creativity.
During public deliberations surrounding the specifics of the EU’s landmark AI Act, voices advocating for open-source AI, including Hugging Face and GitHub, emphasised the payoffs in research findings, reproducibility, and transparency that open-source AI development offers. Those organisations also underscored the technology’s role in levelling the field for smaller players and new enterprises.
These arguments have merit, but it’s important to put them in context. As the technologist and former Meta employee David E. Harris notes, the choice between open and closed AI development need not be black and white – especially when openness comes at the expense of democratic discourse.
At this stage, regulation plays a pivotal role in balancing market success against consumer safety. An innovative approach that embraces such ethical standards is necessary to secure a first-mover advantage while mitigating the risk of open-source misuse.
How (not) to regulate open-source AI
At the outset of the legislative process for the EU’s new AI Act, many had high expectations that lawmakers would strike this balance between the merits and potential harms of open-source AI development. But with the Act now all but finalised, its confusing take on the issue leaves those expectations deflated.
As of now, large language models fuelling proprietary tools such as ChatGPT or Gemini will fall under the new regulations. LLMs released under a free and open licence and not further used for monetisation, however, enjoy broad exemptions. In most such cases, the AI Act demands only information on the content used for model training and compliance with EU copyright law.
Only if open models are classified as high-risk AI – due to their significant potential to harm health, safety, fundamental rights, the environment, democracy and the rule of law – will they be subject to the same strict obligations as their closed counterparts. In all other cases, EU regulators simply encourage developers of open-source models to comply with common transparency and documentation obligations, such as model cards.
The AI Act seems to embrace the desires of the open-source community wholesale. In practice, this “creates a strong incentive for actors seeking to avoid even the most basic transparency and documentation obligations to use open licenses while violating their spirit,” according to Paul Keller of Open Future.
The EU should close this foreseeable loophole before open-source AI becomes a victim of those who want to spread content detrimental to democratic society.