The Open Source Initiative recently released the first version of its Open Source AI Definition (OSAID). While not legally enforceable, the idea is to provide the industry with a concrete standard for what it really means to be “open source” in artificial intelligence.
The term “open source” is bandied around a lot, and some tech companies that claim to be open source actually aren’t. As more companies utilise AI, standards and governance will become increasingly important to business practice.
Meta is just one of the companies that has branded its Llama large language models as open source, despite not meeting the new criteria.
And this is important.
For businesses looking to adopt AI, OSAID isn’t just about semantics – it’s being positioned as a safeguard for transparency, security and accessibility.
What is open source AI and why does this definition matter?
Open source AI refers to AI systems that are fully accessible to the public in their code, data and design, allowing anyone to use, modify and share the model without restrictions. This openness ensures AI systems can be studied, improved and applied transparently, benefiting a wide community of developers and users.
In an industry where terms like “AI” and “open source” are thrown around loosely, the release of OSAID brings some long-overdue clarity.
With policymakers worldwide drafting AI regulations – including here in Australia – a definition like OSAID gives everyone a common standard and helps regulators identify which models should receive open source benefits, such as favourable compliance treatment.
OSI’s executive director Stefano Maffulli notes the European Commission, for example, has been closely watching the open source AI space. This definition, Maffulli says, could be a guidepost for what “open” means as laws start to catch up with AI.
Open source AI standards
So what does it actually take for a model to qualify as open source AI under OSAID? The definition has some very specific requirements, or “four freedoms”, that go beyond just free access.
To meet OSAID’s standard, an AI system must allow anyone to:
- Use the system for any purpose and without having to ask for permission;
- Study how the system works and inspect its components;
- Modify the system for any purpose, including to change its output; and
- Share the system for others to use with or without modifications, for any purpose.
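For readers who want to see how those criteria could be put to work, here is a minimal, purely illustrative sketch (in Python) of how a business might turn the four freedoms into an internal review checklist. The ModelReview class, its field names and the example values are assumptions for illustration only – they are not part of OSAID or any official OSI tooling.

```python
# Hypothetical sketch: a rough internal checklist a business might use when
# reviewing whether a model's licence and documentation grant the OSAID
# "four freedoms". Field names and values are illustrative assumptions,
# not an official OSI tool.
from dataclasses import dataclass


@dataclass
class ModelReview:
    name: str
    free_to_use_for_any_purpose: bool        # no permission needed, no use restrictions
    components_inspectable: bool             # code, weights and data information can be studied
    modifiable_for_any_purpose: bool         # internals and outputs may be changed
    shareable_with_or_without_changes: bool  # redistribution allowed, modified or not

    def meets_four_freedoms(self) -> bool:
        """True only if every one of the four freedoms is granted."""
        return all([
            self.free_to_use_for_any_purpose,
            self.components_inspectable,
            self.modifiable_for_any_purpose,
            self.shareable_with_or_without_changes,
        ])


# Example: a model whose licence restricts certain commercial uses would fail.
review = ModelReview(
    name="example-model",
    free_to_use_for_any_purpose=False,  # licence bars some uses
    components_inspectable=True,
    modifiable_for_any_purpose=True,
    shareable_with_or_without_changes=True,
)
print(review.meets_four_freedoms())  # False
```

The point of the sketch is simply that the freedoms are all-or-nothing: a single restriction, such as a licence clause limiting commercial use, is enough to keep a model outside the definition.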
These criteria create a framework for what transparency and accountability should look like in open source AI.
For businesses looking to bring AI into their operations, this definition aims to ensure they’re working with tools that can be understood, audited and trusted.
The OSAID standard can also help businesses spot models that are “open source” in name only. Truly open source models are transparent, letting companies see what data the model was trained on and how it processes information. This visibility is essential for ensuring data compliance, detecting bias and strengthening security.
In an era where the legal and ethical implications of AI are under scrutiny, using AI tools that don’t align with OSAID could result in compliance issues.
Of course, some argue the OSAID doesn’t go far enough in specifying licensing and data transparency requirements, which are essential for businesses that depend on stable, well-documented AI tools.
OSI has acknowledged that future updates to its OSAID will likely address these nuances, especially as questions around intellectual property and licensing for training data evolve.
As a result, OSI has set up a committee to monitor the standard’s real-world application and recommend subsequent updates.
Open source AI is complicated
The new standard is certainly going to shine a light on companies claiming the open source label without delivering on the details.
In the case of Meta, its Llama models are labelled as open source but fall short of the OSAID standard because of their restrictive licensing.
Llama’s licence limits some commercial uses and prohibits activities that could be harmful or illegal. Meta has also been reluctant to disclose details about Llama’s training data.
Meta argues these restrictions serve as “guardrails” against potential misuse, which remains a significant concern in AI development. Without controls, powerful AI models could be repurposed for harmful applications like deepfakes, misinformation or other unethical uses.
These are real risks, and Meta’s position reflects a broader ethical debate within the open source community: how to balance transparency and unrestricted use with responsibility and safety. For Meta, keeping certain restrictions in place allows it to reduce misuse risks without sacrificing model accessibility.
The OSI also acknowledges the need for privacy in some situations. It is transparent about the exclusion of some types of training data, such as in the medical field, due to privacy laws.
“We want Open Source AI to exist also in fields where data cannot be legally shared,” the FAQs read.
Interestingly, Meta, along with other big tech companies such as Amazon, Google, Microsoft, Cisco, Intel and Salesforce, actually helps support the OSI’s work.
On the one hand, this does show an investment in open source software. But it also raises questions about potential conflicts of interest when a company supports the OSI financially yet doesn’t adhere to its standards.
“While the OSI is very grateful for their support, the OSI does not endorse these or any other companies,” the website reads.
This dynamic reveals the complexities at play when the very organisations that fund open source initiatives also have a vested interest in shaping what “open source” means in AI.
This is all the more reason for international and independent standards to exist – to ensure labels such as “open source” don’t just become a marketing tool to peddle AI models.