30 May 14:00 
The emergence of concepts in shallow neural networks
Elena Agliari (Sapienza)
In the first part of the seminar I will introduce shallow neural networks from a statistical-mechanics perspective, focusing on simple cases and on a naive scenario where the information to be learnt is structureless. Then, inspired by biological information processing, I will enrich the framework and make the network able to handle structured datasets successfully and cheaply. The results presented are both analytical and numerical.

30 May 14:30 
The Exponential Capacity of Dense Associative Memories
Marc Mézard (Bocconi)
Recent generalizations of the Hopfield model of associative memories can store a number of random patterns that grows exponentially with the number of artificial neurons. Besides their huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism that is part of the Transformer architectures widely applied in deep learning. This talk will describe these new developments and show how one can compute the storage and error-correction capacity in memory retrieval, using methods from the statistical physics of random systems.
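As a toy illustration of the mechanism behind these models (a minimal numpy sketch with hypothetical sizes and parameters, not code from the talk): one retrieval step of a dense associative memory re-weights the stored patterns by a softmax over their overlaps with the current state, which is formally close to Transformer attention.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 64, 8                                  # neurons, stored patterns
xi = rng.choice([-1.0, 1.0], size=(P, N))     # random binary patterns

def dense_update(x, beta=2.0):
    """One retrieval step of a dense associative memory: overlaps with
    the stored patterns are re-weighted by a softmax over exp(beta*N*m),
    mirroring the attention mechanism in Transformers."""
    m = xi @ x / N                            # overlap with each pattern
    w = np.exp(beta * N * (m - m.max()))      # stable softmax weights
    w /= w.sum()
    return np.sign(w @ xi)

# corrupt one stored pattern by flipping 20% of its spins, then retrieve
x = xi[0].copy()
flip = rng.choice(N, size=N // 5, replace=False)
x[flip] *= -1
retrieved = dense_update(x)
print(np.mean(retrieved == xi[0]))  # overlap with the original pattern
```

With an exponential separation function the correct pattern dominates the softmax even from a heavily corrupted initial state, which is the intuition behind the exponential storage capacity discussed in the talk.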

30 May 15:00 
Geometry of the learning landscape in non-convex neural networks
Riccardo Zecchina (Bocconi)

30 May 15:30 
Statistical complexity of quantum learning
Leonardo Banchi (Università di Firenze)
In recent years there have been an increasing number of results where quantum physics has been combined with machine learning, for different reasons. On the one hand, quantum computers promise to significantly speed up some of the computational techniques used in machine learning and, on the other hand, "classical" machine learning methods can help us with the verification and classification of complex quantum systems. Moreover, the rich mathematical structure of quantum mechanics can help define new models and learning paradigms. In this talk, we will introduce quantum machine learning in all of these flavors, and then discuss how to bound the accuracy and generalization errors via entropic quantities. These bounds establish a link between the compression of information into quantum states and the ability to learn, and allow us to understand how difficult it is, namely how many samples are needed in the worst-case scenario, to learn a quantum classification problem from examples. Different applications will be considered, such as the classification of complex phases of matter, entanglement classification, and the optimization of quantum embeddings of classical data.

30 May 16:00-16:30 
Coffee break

30 May 16:30 
Deep Boltzmann Machines
Pierluigi Contucci (Università di Bologna)
In this talk I will review some results obtained for non-convex mean-field spin-glass models when the disorder is not permutation invariant. The emphasis will be on Deep Boltzmann Machines, both on and off the Nishimori line.
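For readers unfamiliar with the model: a two-layer Boltzmann machine over ±1 spins can be sampled by alternating Gibbs sweeps, since the bipartite couplings make each layer conditionally independent given the other. The following minimal numpy sketch (with hypothetical sizes and random Gaussian couplings, not taken from the talk) illustrates one such sweep.

```python
import numpy as np

rng = np.random.default_rng(2)
Nv, Nh = 6, 4
J = rng.normal(scale=1 / np.sqrt(Nv), size=(Nv, Nh))  # random couplings

def gibbs_sweep(v):
    """One alternating Gibbs sweep in a two-layer Boltzmann machine with
    +/-1 spins: the bipartite structure lets each layer be resampled in
    parallel given the other layer."""
    p_h = 1 / (1 + np.exp(-2 * (v @ J)))     # P(h_j = +1 | v)
    h = np.where(rng.random(Nh) < p_h, 1.0, -1.0)
    p_v = 1 / (1 + np.exp(-2 * (J @ h)))     # P(v_i = +1 | h)
    v = np.where(rng.random(Nv) < p_v, 1.0, -1.0)
    return v, h

v = rng.choice([-1.0, 1.0], size=Nv)
for _ in range(100):
    v, h = gibbs_sweep(v)
```

In a deep Boltzmann machine the same conditional-independence trick applies layer by layer; the statistical-physics questions in the talk concern the typical behavior of such systems when the couplings are drawn from non-permutation-invariant disorder.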

30 May 17:00 
Optimal mass transport: a new approach to quantum machine learning
Giacomo De Palma (Università di Bologna)
We propose a generalization of the Hamming distance to qubits and a generalization of the Lipschitz constant to quantum observables. We apply the quantum Hamming distance as the cost function of the quantum version of Generative Adversarial Networks (GANs). Quantum GANs provide an algorithm to learn an unknown target quantum state and constitute one of the most promising applications of near-term quantum computers. The quantum Hamming distance makes learning more stable and efficient, and the proposed quantum GANs can learn a broad class of quantum data with remarkable improvements over previous proposals. Furthermore, we apply the quantum Hamming distance to prove extremely tight limitation bounds for variational quantum algorithms for combinatorial optimization problems, which consist of finding the optimal element within a finite set of possible choices and have an extremely wide application spectrum. Our bounds prove that many such problems cannot be solved by quantum circuits of constant depth.

30 May 17:30 
Convergence and optimality of neural networks for reinforcement learning
Andrea Agazzi (Università di Pisa)
Recent groundbreaking results have established a convergence theory for wide neural networks in the supervised learning setting. Under an appropriate scaling of the parameters at initialization, the (stochastic) gradient descent dynamics of these models converge towards a so-called "mean-field" limit, identified as a Wasserstein gradient flow. In this talk, we extend some of these recent results to prototypical algorithms in reinforcement learning: Temporal-Difference learning and Policy Gradients. In the first case, we prove convergence and optimality of the training dynamics of wide neural networks, bypassing the lack of a gradient-flow structure in this context by leveraging sufficient expressivity of the activation function. We further show that similar optimality results hold for wide single-layer neural networks trained by entropy-regularized softmax Policy Gradients, despite the nonlinear and non-convex nature of the risk function.
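The mean-field scaling mentioned in the abstract normalizes the output of a wide one-hidden-layer network by the number of hidden units, so the output is an average over i.i.d. units and concentrates as the width grows. A minimal numpy sketch of this law-of-large-numbers effect (illustrative only; the nonlinearity and parameter distribution are hypothetical choices, not taken from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_field_net(x, N):
    """Output of a one-hidden-layer network in the mean-field scaling:
    an average (1/N factor) over N hidden units with i.i.d. random
    parameters, so the output concentrates as the width N grows."""
    w = rng.normal(size=N)          # input weights
    a = rng.normal(size=N)          # output weights
    return np.mean(a * np.tanh(w * x))

# output fluctuations across random initializations shrink like 1/sqrt(N)
stds = []
for N in (10, 1000, 100000):
    samples = [mean_field_net(0.5, N) for _ in range(20)]
    stds.append(np.std(samples))
    print(N, stds[-1])
```

It is this concentration that lets the training dynamics be described by an evolution of the distribution of unit parameters, i.e. the Wasserstein gradient flow referred to above.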
