Scalability and Power: The Boundaries of Deep Learning

edited by Fabio Gnassi
interview with Marco Canini

As is now well known, the recent rise of artificial intelligence stems from the convergence of two key factors: the vast amounts of data generated by contemporary digital life—readily available online—and the ongoing advancement of computational power.
While the accumulation of data is an intuitive phenomenon, rooted in our daily experience, the evolution of computing capabilities follows more complex and less tangible dynamics. Yet these are crucial to fully understanding the nature and limitations of AI.


The deep learning process relies on three factors: computational power, the number of parameters, and dataset size. Could you explain what these are and why they matter?

Deep learning models are trained through an iterative process in which their capabilities gradually improve over repeated update cycles. The model starts with randomly initialised parameters, so at first it is of little use. It is then exposed to data tailored to the specific task at hand. This data is typically labelled with information that acts as the “solutions”, helping the model learn how to perform the task.
Naturally, performing the computations required during training becomes increasingly demanding as the model grows. A larger number of parameters increases the computational load, but it also enables the model to capture more complex correlations and finer details.
The size of the training dataset is equally critical: larger datasets enhance the model’s learning ability, resulting in more accurate predictions and higher overall quality.
In general, we can say that in deep learning, improvements in computational power, parameter count, and data volume lead to significant performance gains.
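
To make these three ingredients concrete, the following is a minimal, self-contained PyTorch sketch (illustrative only, not code from the interview): a small model starts from randomly initialised parameters and is trained over repeated passes through synthetic labelled data. The model shape, dataset size and hyperparameters are arbitrary assumptions chosen for the example; the point is simply that parameter count, dataset size and the number of update cycles determine how much computation training consumes.

    # Illustrative training loop: parameter count, dataset size and repeated
    # update cycles are the three factors discussed above. All sizes are
    # arbitrary choices for the example.
    import torch
    import torch.nn as nn

    # Synthetic labelled data standing in for a task-specific dataset.
    NUM_SAMPLES, NUM_FEATURES, NUM_CLASSES = 10_000, 64, 10
    X = torch.randn(NUM_SAMPLES, NUM_FEATURES)
    y = torch.randint(0, NUM_CLASSES, (NUM_SAMPLES,))

    # The model starts from randomly initialised parameters.
    model = nn.Sequential(
        nn.Linear(NUM_FEATURES, 256), nn.ReLU(),
        nn.Linear(256, NUM_CLASSES),
    )
    print("parameter count:", sum(p.numel() for p in model.parameters()))

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(X, y), batch_size=128, shuffle=True)

    # Each epoch is one pass over the data: more data and more parameters
    # mean more arithmetic per pass, i.e. more computational power needed.
    for epoch in range(5):
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)  # compare predictions with the labelled "solutions"
            loss.backward()                # compute gradients
            optimizer.step()               # update the parameters
        print(f"epoch {epoch}: loss {loss.item():.3f}")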

What is Moore’s Law, and how has it influenced the development and scalability of deep learning?

Moore’s Law is named after Gordon Moore, co-founder and CEO of Intel, who observed that the number of transistors in digital circuits doubled approximately every two years. From this historical trend emerged a predictive “law” suggesting that technological advances would allow the semiconductor industry to continue boosting circuit performance by increasing transistor density.
This law has had a major impact on the development and scalability of deep learning. Although the origins of deep learning date back to the 1970s with early neural networks, recent breakthroughs have only become feasible thanks to access to vast datasets and rising computational power—advancements made possible in large part due to Moore’s Law.
In the past, performing complex calculations was prohibitively expensive. Today, thanks to GPUs, we can achieve computational throughput measured in teraflops (one teraflop is one trillion floating-point operations per second). This has dramatically transformed the field, enabling tasks that were once unimaginable.


One solution to the scalability challenge is “Distributed Deep Learning”. What are its characteristics?

Despite the expectations set by Moore’s Law, today’s growth in computing power is driven largely by parallelisation across many processors and accelerators rather than by increases in the speed of individual processors.
However, one of the slowing trends concerns memory capacity and bandwidth. For example, the memory available in modern GPUs is not increasing at the same pace as computational power. This presents a bottleneck, as deep learning models demand ever-larger parameter sets to function effectively.
When a single GPU no longer offers sufficient memory for training, the process becomes distributed, using multiple GPUs in parallel to share the computational workload.
This approach is known as distributed deep learning. It addresses scalability by distributing training tasks across several GPUs. In theory, this allows computational capacity and training speed to grow linearly with the number of GPUs used, reducing training time accordingly. In practice, however, scaling is not perfectly linear, largely because of the communication and synchronisation overhead between GPUs.
Despite these challenges, substantial progress has been made in making parallel computation more efficient. The emergence of Transformer-based models, used in Large Language Models (LLMs), has accelerated this trend. These models contain tens or even hundreds of billions of parameters, making it necessary to train them using hundreds or thousands of GPUs in parallel.
While great strides have been made, research into improving these processes is ongoing, driven by efforts from companies, universities, and researchers worldwide.
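
As a rough sketch of the data-parallel training described above (an illustration, not the setup of any specific system mentioned in the interview), the snippet below uses PyTorch’s DistributedDataParallel with the CPU-only “gloo” backend so that four processes stand in for four GPUs. Each worker trains a replica of the same model on its own shard of data; during the backward pass the gradients are averaged across workers, so every replica applies the same update. The worker count, port and model are assumptions chosen for the example.

    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    def worker(rank, world_size):
        # One process per worker; the CPU "gloo" backend stands in for GPUs.
        os.environ["MASTER_ADDR"] = "127.0.0.1"
        os.environ["MASTER_PORT"] = "29500"   # arbitrary free port for the example
        dist.init_process_group("gloo", rank=rank, world_size=world_size)

        model = DDP(nn.Linear(32, 10))        # DDP averages gradients across workers
        opt = torch.optim.SGD(model.parameters(), lr=0.1)
        loss_fn = nn.CrossEntropyLoss()

        # Each worker sees its own shard of (synthetic) data.
        torch.manual_seed(rank)
        x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))

        for _ in range(10):
            opt.zero_grad()
            loss_fn(model(x), y).backward()   # backward() triggers the gradient all-reduce
            opt.step()                        # every replica applies the same averaged update
        if rank == 0:
            print("finished training with", world_size, "workers")
        dist.destroy_process_group()

    if __name__ == "__main__":
        world_size = 4                        # four processes standing in for four GPUs
        mp.spawn(worker, args=(world_size,), nprocs=world_size)

In practice, the all-reduce communication performed at every step is precisely the kind of overhead that keeps scaling from being perfectly linear.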

Recent studies suggest that neural network-based artificial intelligence has reached a stage where further efficiency gains are increasingly difficult. Is that true?

Yes, particularly in the case of LLMs, there is a well-documented relationship between computational power, parameter count, and dataset size. Improvements in model quality come from increasing all three, but the returns diminish steeply: each further, modest quality gain requires a multiplicative increase in computing resources, parameters, and data. As a result, making models more efficient has become increasingly difficult, and marginal improvements now come at enormous cost.
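
One common way to picture these diminishing returns (an illustration, not a formula quoted in the interview) is the power-law fit reported in empirical scaling-law studies, in which training loss falls roughly as a power of the compute budget. The toy calculation below assumes a small exponent purely for the example and shows how even a modest reduction in loss translates into a multiplicative increase in compute.

    # Toy power-law scaling curve: loss ~ k * compute ** (-ALPHA).
    # The exponent is an assumption chosen for illustration.
    ALPHA = 0.05

    def compute_multiplier(loss_reduction: float, alpha: float = ALPHA) -> float:
        """Factor by which compute must grow to cut the loss by `loss_reduction`
        (e.g. 0.10 for a 10% lower loss), if loss = k * C ** (-alpha)."""
        return (1.0 - loss_reduction) ** (-1.0 / alpha)

    for r in (0.05, 0.10, 0.20):
        print(f"{r:.0%} lower loss -> roughly {compute_multiplier(r):.0f}x more compute")

Under this assumed exponent, a 10% lower loss costs roughly eight times the compute, and a 20% lower loss closer to ninety times, which is why marginal improvements are so expensive.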
Training large models today can require tens of millions of dollars, limiting the ability to explore new ideas or substantially refine existing methods. This often leads to a preference for replicating established approaches rather than innovating, due to high costs and complexity.
This environment makes it more challenging to introduce novel techniques, despite vibrant research and innovation. Much of this innovation is driven by large corporations that have the necessary resources, while academia, despite its talent pool, is constrained by high costs.
Moreover, much of the knowledge gained remains proprietary, driven by commercial interests. However, open-source initiatives—such as Meta’s LLaMA—play a vital role by contributing to the wider community and encouraging knowledge sharing.

MARCO CANINI

Marco Canini is Associate Professor of Computer Science at KAUST. He received his PhD in computer science and engineering from the University of Genoa in 2009 after spending the last year as a visiting student at the University of Cambridge. He was a postdoctoral researcher at EPFL and a senior researcher at Deutsche Telekom Innovation Labs and TU Berlin. Before joining KAUST, he was an associate professor at UCLouvain. He has also held positions at Intel, Microsoft and Google.
