THE ETHICS OF SOFTWARE IN THE AGE OF ARTIFICIAL INTELLIGENCE
by Fabio Gnassi
interview with Stefano Maffulli
The growing concerns surrounding the use of centralised and proprietary artificial intelligence find a possible solution in the open source universe — a horizon of hope for building more transparent, ethical, and community-driven tools. But what lies behind the expression “open source artificial intelligence models”, and who are the actors responsible for regulating and governing the growth of these innovative tools?
The Open Source Initiative is not the only entity committed to safeguarding the democratisation of software. The Free Software Foundation also exists. Could you explain the main differences between the two institutions?
Before the 1980s, software was often developed and shared freely among researchers, universities, and companies, without particular restrictions. Over the course of that decade, however, this free circulation of source code diminished as copyright law was extended to cover software: companies began to protect their work through copyright, limiting access to and modification of it.
It was in this context that Richard Stallman, a researcher at MIT, devised a legal mechanism, later known as copyleft, capable of preserving the free circulation of software by using copyright law in a way that ran counter to its original purpose. From this insight emerged a political manifesto, the GNU Manifesto, a declaration of intent that attracted those who believed in the ideal of software being accessible and modifiable by all. Over time, this community became structured and gave birth to the Free Software Foundation (FSF), an organisation that, beyond protecting the free sharing of source code, also played a key role in shaping the very concept of software.
It’s worth noting that until then, the idea of software was very different from today’s. Before the 1970s–80s, software was not covered by copyright law — this conception developed later, alongside the evolution of the concept of free or open source software.
This evolution occurred in tandem with technological advances: the more technology progressed, the more it became possible to test these ideals through the development of new software. The Free Software Foundation was the first organisation to formalise these ideas. In 1998, following the explosion of the internet and the emergence of a wave of new software projects, a new organisation was born. While it presented itself as an offshoot of the Free Software Foundation, it diverged in its interpretation of the founding principles. This organisation is the Open Source Initiative (OSI).
The two can be seen as sister organisations pursuing the same goal through different approaches. The Free Software Foundation is driven by an ethical objective based on a moral imperative: anyone using software must have access to its source code. Software must therefore be free, a definition rooted in the recognition of four core freedoms: to use, study, share, and modify it. The Open Source Initiative defends the same principles, but from a practical standpoint rather than a moral one.
Open source also concerns AI models — a topic that has returned to the spotlight with the release of the DeepSeek-R1 model. Could you provide an overview of this scenario?
Free and open source software is software without restrictions: it gives users all the tools needed to control the technology they are using, to understand how it works and how it was built, and to modify it and share it with others.
When we compare conventional software with machine learning models, we must keep in mind that the former consists of lines of code written by humans, while the latter are not directly programmed: they are systems that make predictions, producing outputs that are not deterministic but based on statistical calculations.
The main difference is that, since they are not directly programmed, machine learning models have no source code, and access to source code is the fundamental requirement for exercising the four freedoms upon which the definition of free software is based.
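To make the contrast concrete, here is a deliberately tiny Python sketch (an invented illustration, not an example from the interview): a conventional function whose behaviour is fully specified by its source code, next to a toy "model" that learns the same behaviour from data, so that what it does ends up encoded in parameters rather than in code.

```python
import random

# Conventional software: the behaviour is written down by a human.
# Reading the source tells you exactly what the program does.
def fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32

# A toy "machine learning model" for the same task: its behaviour is not
# written anywhere. It lives in two learned numbers, a weight and a bias,
# fitted to example data by stochastic gradient descent.
random.seed(0)                                       # for reproducibility
data = [(c, c * 9 / 5 + 32) for c in range(0, 11)]   # (input, target) pairs

w, b = random.random(), random.random()              # the model's "weights"
lr = 0.01                                            # learning rate
for _ in range(50_000):
    c, target = random.choice(data)
    error = (w * c + b) - target
    w -= lr * error * c   # gradient of the squared error w.r.t. w
    b -= lr * error       # gradient of the squared error w.r.t. b

print(fahrenheit(100))    # 212.0, guaranteed by the source code
print(w * 100 + b)        # ~212, but only because of the data it was trained on
```

The learned values of w and b play the role of a released model's weights: on their own they explain nothing, and studying or reproducing them requires the training code and the data, which is exactly what the four elements listed below address.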
Our research group spent nearly two years trying to answer the question this raises: what must be available for an AI system to be genuinely studied and modified? We concluded that, for modern artificial intelligence based on machine learning and deep learning techniques, access to four key elements is essential:
- The model weights — the parameters that determine the behaviour of the neural network.
- The complete training code — which defines the model’s learning process.
- The complete code used to build the training dataset — crucial for understanding how the data was selected and prepared.
- The full list of original data making up the dataset — to ensure transparency and reproducibility.
These elements represent the essential requirements for an AI system to comply with the principles of free software and to guarantee users the four fundamental freedoms.
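Schematically, these requirements behave like a strict checklist. The short Python sketch below is purely illustrative (the class, field names, and function are invented for this article, not OSI tooling), but it captures the rule that all four elements must be present at once:

```python
from dataclasses import dataclass

@dataclass
class AIRelease:
    """The four elements identified by OSI (field names are illustrative)."""
    weights: bool             # the model parameters
    training_code: bool       # the complete code defining the learning process
    data_pipeline_code: bool  # the code used to build the training dataset
    data_list: bool           # the full list of original training data

def is_open_source_ai(release: AIRelease) -> bool:
    # All four elements are required; publishing weights alone is not enough.
    return all([release.weights, release.training_code,
                release.data_pipeline_code, release.data_list])

# A weights-only release, like those criticised in the interview:
weights_only = AIRelease(weights=True, training_code=False,
                         data_pipeline_code=False, data_list=False)
print(is_open_source_ai(weights_only))  # False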
When a company claims to have released a model as “open source”, is it truly adhering to your vision, or merely offering its own interpretation of the concept?
Companies have commercial interests and often exploit the open source label as a marketing tool — sometimes abusing it solely for economic gain.
A prime example is the LLaMA model, developed by Meta, which was marketed as open source but distributed with restrictions incompatible with the unrestricted access required by the definition. Furthermore, LLaMA lacks transparency regarding the development and training process, the data used, and the training code.
The same applies to DeepSeek-R1 and other models that, although released in a relatively open manner, do not provide access to training code, datasets, or source code used in their creation.
As the body responsible for defining and recognising the concept of open source, we often find ourselves at the centre of controversy. Criticism comes from both sides — from those who believe access to training datasets should always be guaranteed, and from companies that consider their sharing practices sufficient, such as distributing model weights and publishing research papers and technical reports.
The debate remains open and ever-evolving. It's no coincidence that our current definition, the Open Source AI Definition, is marked as version "1.0": a deliberate signal that we expect it to develop further, in parallel with technological progress and its applications.
It’s important to remember that the definitions of open source and free software were consolidated over decades, at a time when there were already numerous software projects, vast code archives, and various licences to analyse. As such, those definitions emerged as generalisations of what already existed. In the field of artificial intelligence, however, we are still in the early stages.
Unlike a registered trademark, the term “open source” does not benefit from exclusive legal protection — which is why many AI companies label themselves as open source even when they do not meet the fundamental requirements.
Nevertheless, in both the United States and Europe, the definition of open source has acquired legal value through certain rulings, which establish that software described as such must guarantee the four fundamental freedoms: use, study, modification, and distribution.
Returning to the example of LLaMA — this model does not meet these criteria and, as a result, cannot be considered open source.
What is your opinion on the opposite phenomenon — that of states, institutions, and companies developing closed and proprietary models?
The risks posed by these systems stem primarily from their opacity. By nature, AI systems already struggle to be seen as trustworthy, as there is no exact science capable of explaining and justifying why a model generates a specific output from a given input.
We still lack the means to understand, for example, why a model consistently gives the same incorrect answers to certain questions.
Without a clear scientific explanation of the results, the lack of access to source data or information about the training process exposes these systems to bias and systemic distortions.
This risk becomes even more critical as the model’s opacity increases.
Moreover, we are witnessing a problem already seen in the software world: the tendency to reinvent the wheel each time. Because closed models cannot be studied or built upon, every new entrant must recreate similar systems from scratch.
The fact that only one or two companies hold the expertise, data, and hardware infrastructure to develop these systems creates a major vulnerability for the entire sector.
It’s a situation reminiscent of the 1990s and 2000s, when 98% of computers worldwide used the same browser — Internet Explorer — making the Internet dependent on Microsoft’s decisions.
The result was widespread vulnerabilities and security risks: everything had been reduced to a monoculture.
STEFANO MAFFULLI
Stefano Maffulli is the Executive Director of the Open Source Initiative (OSI), which he joined in 2021 after decades spent promoting open source.
From 2001 to 2007, he co-founded and led the Italian chapter of the Free Software Foundation Europe. He later structured the developer community at the OpenStack Foundation and led open source marketing teams at several international companies.
A passionate user of open source software, Maffulli has contributed documentation patches and translations and has supported projects such as GNU, QGIS, OpenStreetMap, and WordPress.
His appointment as Executive Director marked a fundamental step in transforming the OSI into a professionally managed organisation.