Teachers
Brendan Murphy, University College Dublin, IE
Antonio D’Ambrosio, University of Naples Federico II, IT
Valeria Vitelli, University of Oslo, NO
Antonella Plaia, University of Palermo, IT
Mariangela Sciandra, University of Palermo, IT
Alessandro Albano, University of Palermo, IT
Cristina Mollica, University of Rome “Sapienza”, IT
Maurizio Romano, University of Cagliari, IT
Claudio Conversano, University of Cagliari, IT
More info on the selected teachers is available at the bottom of this page
Lectures
During the Summer School, each teacher will give one or more lectures in their area of expertise. A short introduction to each lecture is given below.
Basic preference learning: introduction to the topic and rank aggregation
(Maurizio Romano, Claudio Conversano, Antonio D’Ambrosio)
Preference rankings are ubiquitous in data analysis, as human beings are unable to avoid ranking things. Preference learning concerns learning from observed preference information; it is, in fact, a modern term that recalls the “ancient” analysis of preference rankings. It is a combination of “ancient” results and “modern” techniques.
The ingredients of preference learning are a set of items to be evaluated and a pool of judges to evaluate them. The nature of items and judges is evolving over time: preference detection is not always explicit, and rankings may result from automatic procedures. Why preference learning? Because preferences are everywhere!
An introduction to the topic will be presented. It will give a historical overview of how preference data have been approached over the centuries (starting with de Borda, 1781), define the geometric space of preference rankings, provide tools to visualize and summarize preferences, and work in detail on the rank aggregation problem: given a pool of m judges ranking a set of n items, which ranking best synthesizes the consensus opinion? A practical introduction with R on real cases will also be provided.
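As a rough illustration of the rank aggregation problem, the sketch below aggregates a small set of full rankings in two ways: a Borda-style count (sum of ranks) and an exhaustive search for the ranking with the fewest total pairwise disagreements with the judges. It is written in Python rather than R purely for illustration. The data and function names are hypothetical, the rankings are assumed to be complete and without ties, and the exhaustive search is only feasible for a handful of items; it does not reproduce the algorithms in the literature below.

```python
from itertools import combinations, permutations


def discordant_pairs(r1, r2):
    """Count item pairs that two full rankings (rank vectors) order differently.

    For full rankings without ties this count is proportional to the
    Kemeny distance.
    """
    return sum(
        1
        for i, j in combinations(range(len(r1)), 2)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def borda_ranking(rankings):
    """Rank items by their total rank across judges (lower total is better).

    Ties in the totals are broken by item index (an arbitrary choice).
    """
    n = len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    order = sorted(range(n), key=lambda i: (totals[i], i))
    ranks = [0] * n
    for position, item in enumerate(order, start=1):
        ranks[item] = position
    return tuple(ranks)


def median_ranking(rankings):
    """Brute-force search over all n! full rankings for one that minimises
    the total number of pairwise disagreements with the judges."""
    n = len(rankings[0])
    return min(
        permutations(range(1, n + 1)),
        key=lambda c: sum(discordant_pairs(c, r) for r in rankings),
    )


# Hypothetical data: m = 5 judges, n = 4 items; entry i is the rank given to item i.
judges = [
    (1, 2, 3, 4),
    (1, 2, 3, 4),
    (2, 1, 3, 4),
    (1, 3, 2, 4),
    (2, 1, 4, 3),
]

print("Borda-style ranking:", borda_ranking(judges))
print("Median ranking:     ", median_ranking(judges))
```

On this toy data the two approaches agree; in general they need not.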
Relevant literature:
- Marden, J. I. (1996). Analyzing and modeling rank data. CRC press.
- Kemeny, J. G., & Snell, J. L. (1962). Mathematical models in the social sciences. Blaisdell Publishing Company.
- Emond, E. J., & Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi‐Criteria Decision Analysis, 11(1), 17-28.
- Amodio, S., D’Ambrosio, A., & Siciliano, R. (2016). Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach. European Journal of Operational Research, 249(2), 667-676.
- Badal, P. S., & Das, A. (2018). Efficient algorithms using subiterative convergence for Kemeny ranking problem. Computers & Operations Research, 98, 198-210.
- D’Ambrosio, A., Mazzeo, G., Iorio, C., & Siciliano, R. (2017). A differential evolution algorithm for finding the median ranking under the Kemeny axiomatic approach. Computers & Operations Research, 82, 126-138.
Basic preference learning: modeling preferences
(Valeria Vitelli, Cristina Mollica)
What does modeling preferences mean? Various modeling approaches are introduced, together with ways to handle the different types of preference data. Specifically, both distance-based and score-based models will be introduced, examining their respective advantages and disadvantages in different modeling situations. A practical introduction with R on possible inferential approaches for the model parameters will also be provided.
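To make the two model families concrete, the sketch below evaluates ranking probabilities under one example of each: the Mallows model with Kendall distance (distance-based) and the Plackett-Luce model (score-based). Choosing these two as representatives is an assumption based on the literature listed below, not a statement of what the lectures cover. It is written in Python rather than R purely for illustration, the function names and parameter values are hypothetical, and the Mallows normalising constant is computed by enumeration, which only works for very few items.

```python
import math
from itertools import combinations, permutations


def kendall_distance(r1, r2):
    """Number of item pairs ordered differently by two rank vectors."""
    return sum(
        1
        for i, j in combinations(range(len(r1)), 2)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def mallows_pmf(consensus, theta):
    """Mallows probabilities of all full rankings (rank vectors) of n items.

    P(r) is proportional to exp(-theta * d(r, consensus)), with d the
    Kendall distance; theta >= 0 controls concentration around the consensus.
    """
    n = len(consensus)
    weights = {
        r: math.exp(-theta * kendall_distance(r, consensus))
        for r in permutations(range(1, n + 1))
    }
    z = sum(weights.values())  # normalising constant, by brute force
    return {r: w / z for r, w in weights.items()}


def plackett_luce_prob(ordering, scores):
    """Plackett-Luce probability of an ordering (item indices, best first).

    Items are picked one at a time, each with probability proportional to
    its positive score among the items not yet picked.
    """
    prob = 1.0
    remaining = sum(scores[i] for i in ordering)
    for item in ordering:
        prob *= scores[item] / remaining
        remaining -= scores[item]
    return prob


# Hypothetical parameter values, for illustration only.
pmf = mallows_pmf(consensus=(1, 2, 3), theta=1.0)
print("Most probable Mallows ranking:", max(pmf, key=pmf.get))
print("P(item 0 > item 1 > item 2):", plackett_luce_prob((0, 1, 2), [3.0, 2.0, 1.0]))
```

Note the different data representations: the Mallows sketch works on rank vectors, while the Plackett-Luce sketch works on orderings, which is one practical aspect of handling different types of preference data.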
Relevant literature:
- Liu, Q., Crispino, M., Scheel, I., Vitelli, V., & Frigessi, A. (2019). Model-based learning from preference data. Annual review of statistics and its application, 6(1), 329-354
- Marden, J. I. (1996). Analyzing and modeling rank data. CRC press
- Mollica, C., & Tardella, L. (2017). Bayesian Plackett–Luce mixture models for partially ranked data. Psychometrika, 82(2), 442-458
- Vitelli, V., Sørensen, Ø., Crispino, M., Frigessi, A., & Arjas, E. (2018). Probabilistic preference learning with the Mallows rank model. Journal of Machine Learning Research, 18(158), 1-49.
Preference learning data analysis, part 1: Decision trees and ensemble methods
(Antonella Plaia, Mariangela Sciandra, Alessandro Albano)
Preference rankings can be considered indicators of individual behaviors; therefore, when respondent-specific characteristics are available, an important task is to identify the profiles of respondents (or judges) who give the same or similar rankings. Decision trees and ensemble methods for preference data will be introduced. A practical example with R will also be provided.
Relevant literature:
- Albano, A., Sciandra, M., & Plaia, A. (2023). A weighted distance-based approach with boosted decision trees for label ranking. Expert Systems with Applications, 213, 119000.
- Plaia, A., & Sciandra, M. (2019). Weighted distance-based trees for ranking data. Advances in Data Analysis and Classification, 13, 427-444.
- Sciandra, M., Plaia, A., & Capursi, V. (2017). Classification trees for multivariate ordinal response: an application to student evaluation teaching. Quality & Quantity. Springer.
- D’Ambrosio, A., & Heiser, W. J. (2016). A recursive partitioning method for the prediction of preference rankings based upon Kemeny distances. Psychometrika, 81(3), 774-794.
- Plaia, A., Buscemi, S., Fürnkranz, J., & Mencía, E. L. (2022). Comparing boosting and bagging for decision trees of rankings. Journal of Classification, 39(1), 78-99.
Preference learning data analysis, part 2: Clustering
(Brendan Murphy)
How do we account for distinct subgroups of individuals who have similar preferences? Cluster analysis is a tool for finding subgroups in populations. We will examine different approaches for clustering preference data, including algorithmic and statistical model-based approaches. These lectures will give a brief history of clustering and an overview of algorithmic and model-based clustering in general. Particular attention will be given to model-based clustering of preference data using mixture models. The application of clustering methods to real preference data examples will be discussed. A practical R session applying clustering methods to preference data will also be provided.
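As a small illustration of the algorithmic side, the sketch below groups full rankings with a k-medoids-style procedure based on the Kendall distance. The choice of k-medoids is an assumption made for the example, not a statement of which algorithms the lectures cover, and it is not a model-based method. It is written in Python rather than R purely for illustration; the data, function names, and initial medoids are hypothetical.

```python
from itertools import combinations


def kendall_distance(r1, r2):
    """Number of item pairs ordered differently by two rank vectors."""
    return sum(
        1
        for i, j in combinations(range(len(r1)), 2)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def k_medoids(rankings, initial_medoids, max_iter=100):
    """Alternate assignment and medoid-update steps until the medoids stop changing.

    `initial_medoids` holds indices into `rankings`. Returns the cluster
    label of each ranking and the indices of the final medoids.
    """
    medoids = list(initial_medoids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each ranking joins the cluster of its nearest medoid.
        labels = [
            min(
                range(len(medoids)),
                key=lambda k: kendall_distance(r, rankings[medoids[k]]),
            )
            for r in rankings
        ]
        # Update step: the new medoid is the member with the smallest total
        # distance to the other members of its cluster.
        new_medoids = []
        for k in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == k]
            if not members:  # keep the old medoid if a cluster is empty
                new_medoids.append(medoids[k])
                continue
            best = min(
                members,
                key=lambda i: sum(
                    kendall_distance(rankings[i], rankings[j]) for j in members
                ),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids


# Hypothetical data: six judges ranking four items, forming two opposed groups.
rankings = [
    (1, 2, 3, 4),
    (1, 2, 4, 3),
    (2, 1, 3, 4),
    (4, 3, 2, 1),
    (3, 4, 2, 1),
    (4, 3, 1, 2),
]
labels, medoids = k_medoids(rankings, initial_medoids=[0, 3])
print("Cluster labels:", labels)
print("Medoid rankings:", [rankings[i] for i in medoids])
```

A mixture model replaces the hard assignment step with posterior membership probabilities and the medoids with fitted component parameters.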
Relevant literature:
- Bouveyron, C., Celeux, G., Murphy, T.B. and Raftery, A.E. (2019) Model-based Clustering and Classification for Data Science, Cambridge University Press. (https://math.univ-cotedazur.fr/~cbouveyr/MBCbook/)
- Critchlow, D. (1985) Metric Methods for Analyzing Partially Ranked Data. Springer
- Gormley, I.C. and Murphy, T.B. (2008) Exploring voting blocs within the Irish electorate: A mixture modeling approach. Journal of the American Statistical Association, 103(483), 1014-1027.
- Irurozki, E., Calvo, B., Lozano, J. A. (2016) PerMallows: An R Package for Mallows and Generalized Mallows Models. Journal of Statistical Software, 71(12).
- Mollica, C. and Tardella, L. (2017) Bayesian Plackett–Luce mixture models for partially ranked data. Psychometrika, 82(2), 442-458.
- Murphy, T.B. and Martin, D. (2003) Mixtures of distance-based models for ranking data. Computational Statistics & Data Analysis. 41 (3-4), 645-655.
- Vitelli, V., Sørensen, Ø., Crispino, M., Frigessi, A. and Arjas, E. (2018) Probabilistic preference learning with the Mallows rank model. Journal of Machine Learning Research, 18(158), 1-49.
Selected teachers’ short bio
Brendan Murphy
Brendan Murphy is Full Professor of Statistics at University College Dublin, Ireland. He works on statistical modeling in a wide range of domains, including the biomedical and social sciences. He has recently served as the head of the School of Mathematics & Statistics at University College Dublin. He was a fellow of the Collegium — Institut d’Études Avancées de Lyon for 2021/22. He is a council member of the Royal Statistical Society and Editor for Social Sciences and Government for the Annals of Applied Statistics.
Antonio D’Ambrosio
Antonio D’Ambrosio is Full Professor in Statistics at the Department of Economics and Statistics of the University of Naples Federico II. He obtained his Ph.D. in Statistics at the University of Naples Federico II with a thesis titled “Tree-based methods for Data Editing and Preference Rankings”, and he also spent a period at Charles University in Prague and Leiden University. He was a visiting researcher at the Department of Psychology, Section Methods and Statistics, of Leiden University (The Netherlands) and at the Department of Statistics and Operations Research of the University of Granada (Spain). He is Associate Editor of the Journal of Classification.
His main research interests are classification and clustering. Within these frameworks, he finds it particularly fascinating to deal with preference rankings.
Valeria Vitelli
Valeria Vitelli has been Associate Professor at the Biostatistics Department of the University of Oslo, Norway, since 2018. She obtained her PhD in Statistics from Politecnico di Milano (Italy) in 2012, and she also spent a period at Ecole Centrale Paris (France) as a Postdoctoral Fellow. Her research interests include Preference Learning and models for Rank Data, Clustering & Mixture Modeling, Bayesian Methods, Functional Data Analysis, and High-dimensional Statistics. She is now also a PI in the Norwegian Centre for Knowledge-driven Machine Learning (Integreat).
Antonella Plaia
Antonella Plaia is Full Professor of Statistics at the Department of Economics, Business and Statistics of the University of Palermo. She has been the coordinator of the bachelor’s and master’s degrees in Statistics at the University of Palermo. Her main research fields are Multivariate Data Analysis, Ensemble methods for Classification, Data Mining of large-scale data, and Functional Data Analysis. Currently, she works especially on Textual Data Analysis and Preference data. She has supervised many Ph.D. theses at the Doctorate in Economics and Statistics and has tutored postdoctoral researchers at the University of Palermo. She has been a referee of Ph.D. and master’s theses at Italian and foreign universities.
Mariangela Sciandra
Mariangela Sciandra is an Associate Professor of Statistics at the Department of Economics, Business, and Statistics at the University of Palermo. Her main research areas include Generalized Linear Mixed Effects Models (GLMMs), with a focus on inferential and diagnostic tools. She has contributed to defining R² measures for GLMMs and exploring flexible modeling of autocorrelation structures for clustered data. She is also involved in applying statistical methods to complex ecological systems, such as monitoring the health status of the seagrass Posidonia oceanica. Her research extends to Classification Trees for ordinal responses, investigating methods based on the most appropriate distance for multivariate ordinal response data. Currently, she is actively engaged in Textual Data Analysis and Preference data. Additionally, she has supervised doctoral theses at the Doctorate in Economics and Statistics.
Alessandro Albano
Alessandro Albano is Assistant Professor at the Department of Economics, Business and Statistics at the University of Palermo. He obtained his PhD in Statistics from the University of Palermo in 2022. He was a visiting scholar at the University of Valladolid in 2021. His research interests include Ranking data, Label Ranking, Topic Models, Text Mining, and Machine Learning.
Maurizio Romano
Born in 1992, Maurizio Romano is Researcher in Statistics at the University of Cagliari, where he teaches Statistical Learning in a Master’s Degree course. He obtained a Ph.D. in Statistics with honors, defending a thesis that proposes a new framework for analyzing natural language textual data, labeled and unlabeled, classifying them as “positive” or “negative”.
His research interests focus on Natural Language Processing, Classification, Sentiment Analysis, Big Data, Tourism Analytics, Preference data, and Explainable AI. Since 2019, he has produced 15 scientific publications and presented at international (12) and national (3) conferences. He also participated in the research project “Designing an ICT platform to support the tourism sector” (Ref. Prof. F. Mola), in which he measured customer satisfaction in the Sardinian tourism sector.
He was Technical Chair and a member of the local organizing committee for the 2021 annual conference of the Italian Statistical Society (SIS). In June 2018, he received an award from SIS at Stats Under the Stars 4, a competition focused on classification tasks. In 2023, he was selected as one of the 10 Young Speakers by Be Travel Online (BTO), the main Italian event of international relevance for networking between operators and companies in digital tourism.
Claudio Conversano
Claudio Conversano has been Full Professor of Statistics at the Department of Business and Economics at the University of Cagliari since February 2022. He graduated in Economics at the University of Naples Federico II in 1996, and received a Ph.D. in Computational Statistics and completed a post-doc in Multivariate Statistics at the same university in 2000. He worked as Researcher in Statistics at the University of Cassino and Lazio Meridionale from 2002 to 2005. In November 2005, he moved to the University of Cagliari, where he obtained an Associate Professor position in December 2014 and the Full Professorship in 2022. Since 2022, he has been the Coordinator of the Master Program in Data Science, Business Analytics and Innovation at the University of Cagliari.
