Master's degree study plan

First year courses

  • Laboratory of Computational Physics
    • MOD. A: high level programming in Python
    • MOD. B: modern tools for classifying data and machine learning techniques
  • Management and Analysis of Physics Datasets
    • MOD. A: hands-on laboratory of data management with FPGA (Xilinx Artix-7)
    • MOD. B: data management and data processing, with a focus on parallel processing and distributed computing
  • Machine Learning
  • Advanced Statistics for Physics Analysis (R programming)
  • General Relativity
  • Statistical Mechanics of Complex Systems

Second year courses

  • Information Theory and Inference
  • Neural Networks and Deep Learning
  • Human Data Analytics
  • Vision and Cognitive Systems
  • Network Science
  • Physical Models of Living Systems

More information on the courses, such as the unit contents and the examination methods, can be found here.

Additional experience

Mentee at LeadTheFuture

Lead The Future Mentorship
Sep 2022 - Present

Among the few Italian students selected as mentees for LeadTheFuture, a leading mentorship non-profit for students in STEM, with an acceptance rate below 20%. LeadTheFuture empowers top-performing students to achieve their goals and give back to their communities through one-on-one guidance from high-impact mentors based in the world's leading STEM innovation hubs, such as Silicon Valley and CERN.

Summer University on Effective High-Performance Computing and Data Analytics

Swiss National Supercomputing Centre (CSCS) and Università della Svizzera Italiana
Jul 2023 · 1 week

This course focused on the effective exploitation of state-of-the-art hybrid High-Performance Computing (HPC) systems with a special focus on GPU programming. The following topics were covered:

  • GPU architectures
  • GPU programming (CUDA)
    • Programming model
    • Memory management
    • Performance optimization and scientific libraries
  • Python HPC libraries (NumPy, SciPy, CuPy, Numba, Dask)
    • NumPy-like libraries for both CPU and GPU computing
    • Just-in-time compilation from Python code
    • Distributed workloads on HPC clusters

Certificate of participation

Modern Machine Learning Summer School 2023

Machine Learning Genoa Center (MaLGa)
Jun 2023 · 1 week

Summer School on the latest developments in Machine Learning organized by the Machine Learning Genoa Center (MaLGa).

After a boot camp on the first day (on statistical learning, machine learning models and optimization for machine learning), the rest of the week was dedicated to introducing modern topics, including: implicit regularization, sketching, reinforcement learning, machine learning for inverse problems, optimal transport for machine learning, fairness in machine learning.

Certificate of participation

Oxford Machine Learning Summer School 2022

University of Oxford
Aug 2022 · 2 weeks
OxML 2022 covered some of the most important topics in ML/DL that are attracting growing interest in the field (e.g., statistical/probabilistic ML, representation learning, reinforcement learning, causal inference, vision & NLP, geometric DL) and their applications to the Sustainable Development Goals (SDGs, or Global Goals).

The school program consisted of two tracks, Machine Learning for Health and Machine Learning for Finance, plus an optional module on Machine Learning Fundamentals. In addition to the lectures, the school included two workshops:

  • "AI frameworks from research to production, illustrated by PyTorch journey" by Vincent Moens
  • "Introduction to Quantum Machine Learning" by James Cruise and Julian van Velzen

Certificate of participation

Member of the Mission Analysis team

Alba CubeSat Unipd
Oct 2020 - Jun 2022 · 1 yr 9 mos

Alba CubeSat is a team of students who, supervised by professors from the University of Padua, aim to design, build and send a CubeSat into orbit as part of the "Fly Your Satellite" Program promoted by the ESA Education Office.

Guided by the needs of the other subgroups (Payloads, EPS, TTC&OBC, ...), my team's task was to study the orbit and the space environment, using tools such as MATLAB and ESA's SPENVIS, MASTER and DRAMA. In more detail, our tasks included determining the orbital parameters, estimating the mission lifetime, and performing risk assessments, both for radiation and debris in orbit and for atmospheric re-entry. On top of this, we also dealt with the definition of the Mission Phases and the design of the Ground Segment. To coordinate this work, we held weekly alignment meetings, both within the individual teams and among the team leaders.

Math and Physics Laboratory

University of Trento
Jun 2017 · 2 weeks

A two-week group project, under constant revision by a university professor, aimed at exploring the challenges of writing a mathematical text. It consisted of attending seminars and writing an article in LaTeX on the topics covered.

Private Tutor (math and physics)

2017 - 2022

Since high school I have worked as a private tutor, mainly for high school students, in order to challenge myself and gain some financial independence during my studies.

Recent projects

Data structure-oriented learning strategies for 3D Objects classification

Nowadays, the identification and understanding of 3D objects in real-world environments have a wide range of applications, including robotics and human-computer interaction. This is typically addressed with Deep Learning techniques that handle 3D volumetric data, as they generally outperform standard Machine Learning tools.

In this work, we experimented with several architectures based on Convolutional Neural Networks with the aim of classifying 3D objects. We ran our tests on the ModelNet40 dataset, one of the most popular benchmarks for 3D object recognition. First, we compared the effectiveness of point clouds and voxel grids, inspecting the pros and cons of each representation. We saw, for instance, that the more accurate representation obtained via point clouds does not lead to better performance with CNNs unless very large memory capacities are available. Then, we built an autoencoder to retrieve a high-dimensional embedding of the input data. We showed that applying simple ML techniques, such as an SVM, to these intermediate representations can reach state-of-the-art performance, and that the codewords could also be used for compression. Finally, we provided a visual representation of the encoded features through t-SNE.
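
The embed-then-classify idea can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the random vectors below stand in for autoencoder codewords, which in the real pipeline would come from the encoder applied to ModelNet40 shapes.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for autoencoder codewords: two classes of 128-d embeddings
# separated along every dimension (real codewords would come from the
# trained encoder, one vector per 3D object).
n_per_class = 200
class0 = rng.normal(loc=-1.0, size=(n_per_class, 128))
class1 = rng.normal(loc=+1.0, size=(n_per_class, 128))
X = np.vstack([class0, class1])
y = np.array([0] * n_per_class + [1] * n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# A simple ML model (linear SVM) on the fixed-length embeddings
clf = SVC(kernel="linear").fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

The point of the sketch is the division of labor: the (here simulated) encoder turns variable 3D data into fixed-length vectors, and a cheap classifier does the rest.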

Effective processing pipeline and advanced Neural Network architectures for Small-footprint Keyword Spotting

The keyword spotting (KWS) task consists of identifying a relatively small set of keywords in a stream of user utterances. It is preferably addressed with small-footprint inference models that can be deployed even on performance-limited and/or low-power devices: in this setting, model accuracy is not the only relevant aspect, and the model footprint plays an equally crucial role. In this work, we first defined a modern CNN model that outperforms our baseline and used it to study the impact of different pre-processing, regularization and feature-extraction techniques. We saw, for instance, that log Mel-filterbank energy features lead to the best performance, and we found that adding background noise to the training set, with an optimal noise-reduction coefficient of 0.5, helps the model learn. Then, we explored different architectures, such as ResNets, RNNs, attention-based RNNs and Conformers, in order to reach an optimal trade-off between accuracy and footprint. We found that these architectures offer a 30-40% improvement in accuracy over the baseline while reducing the number of parameters by up to 10×. We ran our tests on the Google Speech Commands dataset, one of the most popular datasets in the KWS context.
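
The background-noise augmentation can be sketched as a simple waveform mix. The helper below is hypothetical (not the project's actual code); it only shows the mechanism, with the noise scaled by the coefficient before being added to the speech signal.

```python
import numpy as np

def add_background_noise(signal, noise, noise_coeff=0.5):
    """Mix a background-noise clip into a speech waveform.

    noise_coeff scales the noise before mixing; 0.5 is the value that
    worked best in our experiments. Hypothetical helper for illustration.
    """
    noise = noise[: len(signal)]       # crop the noise to the signal length
    return signal + noise_coeff * noise

rng = np.random.default_rng(0)
sr = 16000
speech = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)   # 1 s dummy tone
noise = rng.normal(scale=0.1, size=sr)                  # dummy background noise
augmented = add_background_noise(speech, noise, noise_coeff=0.5)
```

In training, one such mix would be applied on the fly to each utterance before feature extraction, so the model never sees a perfectly clean signal twice.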

Finally, we built a demo application that can be run as a Python script. It lets the user select the model to use and, once started, detects in real time the commands of the Speech Commands dataset through the microphone (or any chosen input device).

Köln Traffic Regulator with Parallel Computing

Our work started from a project jointly developed by IBM and the German city of Köln, intended as a first step towards traffic regulation and an efficient use of transport resources. In particular, we analyzed a set of mobility data emulated with SUMO, consisting of 394 million records and 20 GB in size. To reach our goals, we set up a cluster on CloudVeneto made of 5 virtual machines (4 cores and 8 GB RAM each) and created a volume shared across the instances via NFS. Moreover, we used Dask to parallelize the tasks.

First, we computed some interesting metrics, comparing the performance of groupby with Dask DataFrame against foldby with Dask Bag. Then, starting from an already processed dataset containing the number of connections to each base station at each time instant, we built an interactive dashboard that lets the user visualize the number of connections to each base station in a selected time window. With the same dataframe, we also simulated a data stream with streamz, emitting the records of successive time instants one after another and updating some plots on the fly. Finally, we ran benchmarks on a pruned version of the dataset (~1 GB), trying different values of the main parameters of our setup, such as the number of files used to import the data, the number of workers per machine, the number of threads per worker, and the block size used to read the data.
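
The per-key fold/combine pattern that `dask.bag.foldby` parallelizes can be sketched in pure Python (record layout and names are illustrative, not the project's actual schema): each partition is folded into partial per-station counts, and the partial results are then merged.

```python
from collections import defaultdict
from functools import reduce

# Each record: (station_id, timestamp). Partial aggregates are computed
# per partition, then merged -- the same shape of computation that
# dask.bag.foldby distributes across workers.
def fold_partition(records):
    counts = defaultdict(int)
    for station, _ts in records:
        counts[station] += 1
    return counts

def combine(a, b):
    merged = defaultdict(int, a)
    for key, value in b.items():
        merged[key] += value
    return merged

partitions = [
    [("A", 0), ("B", 0), ("A", 1)],   # records held by worker 1
    [("B", 1), ("B", 2)],             # records held by worker 2
]
totals = reduce(combine, (fold_partition(p) for p in partitions))
# totals["A"] == 2, totals["B"] == 3
```

Because only the small partial dictionaries travel between workers, this pattern avoids the shuffle that a naive groupby would trigger on the full 394-million-record dataset.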

Explaining microbial scaling laws using Bayesian inference

In this project, we combined methods from Statistical Physics and Bayesian Data Analysis to elucidate the principles behind cellular growth and division. We studied various classes of individual-based growth-division models and inferred individual-level processes (model structures and likely ranges of the associated parameters) from single-cell observations.

In the Bayesian framework, we formalized our understanding of the process through the form of different rate functions, expressing the dependence of the growth and division rates on variables characterizing the cell's state (such as size and protein content). We calculated the Bayesian posteriors for the parameters of these functions and performed a model comparison to determine which model was most consistent with the experimental observations.
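
A toy version of the posterior calculation, under a deliberately simplified assumption: division times follow an exponential model with a single constant rate, inferred on a grid with a flat prior (the project's actual rate functions depended on the cell's state).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "division time" data from an exponential model, true rate 2.0
true_rate = 2.0
data = rng.exponential(scale=1.0 / true_rate, size=500)

# Grid posterior: flat prior over the rate, exponential likelihood
# log L(r) = n * log(r) - r * sum(x_i)
rates = np.linspace(0.1, 5.0, 1000)
log_lik = len(data) * np.log(rates) - rates * data.sum()
log_post = log_lik - log_lik.max()   # flat prior; shift for numerical stability
posterior = np.exp(log_post)
posterior /= posterior.sum()         # normalize on the grid

map_rate = rates[np.argmax(posterior)]   # posterior mode, near the true rate
```

The same grid evidence (the unnormalized posterior mass) can be computed under each candidate rate function, which is the quantity a Bayesian model comparison ranks.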

Licenses & Certifications

B2 First - Score 168

Cambridge University Press & Assessment English
Issued Jul 2017 · No Expiration Date

ECDL Full Standard

ICDL Certification