Machine Learning
Machine Learning is the field in Artificial Intelligence dealing with models and algorithms to develop computer programs able to improve their performance during execution.
This research field includes several learning paradigms (e.g., supervised learning, unsupervised learning, and reinforcement learning) and techniques (e.g., neural networks, genetic algorithms, and decision trees). Recently, these techniques have been applied to Data Mining, which can be thought of as the natural meeting point of Statistics, Artificial Intelligence, and Databases. Its main goal is to develop algorithms and techniques to extract knowledge from huge data repositories.
Ongoing Projects
Batch Learning for Poker http://airwiki.elet.polimi.it/mediawiki/index.php/Batch_Learning_for_Poker
Project Proposals
Wiki Page: Automatic generation of domain ontologies
Title: Automatic generation of domain ontologies
Wiki Page: Behavior recognition from visual data
Title: Behavior recognition from visual data
Material:
Expected outcome:
Required skills or skills to be acquired:
Tutor: MatteoMatteucci, AndreaBonarini
Wiki Page: Combinatorial optimization based on stochastic relaxation
Title: Combinatorial optimization based on stochastic relaxation
The focus will be on the implementation of a new algorithm able to combine different approaches (estimation and sampling on one side, and exploitation of prior knowledge about the structure of the problem on the other), together with a comparison of the results with existing techniques that historically appear in different (and often separate) communities.
Good coding (C/C++) abilities are required. Since the approach will be based on statistical models, the student is expected to be comfortable with notions from probability and statistics courses. The project could require some extra effort to build and consolidate some background in math, especially in Bayesian statistics and MCMC techniques, such as Gibbs and Metropolis samplers (4).
The project can be extended to a master thesis, according to interesting and novel directions of research that emerge in the first part of the work. Possible ideas may concern the proposal of new algorithms able to learn existing dependencies among the variables in the function to be optimized, and to exploit them in order to increase the probability of converging to the global optimum.
Picture taken from http://www.ra.cs.uni-tuebingen.de/
Bibliography
Tutor: MatteoMatteucci, LuigiMalago
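As a rough illustration of the Metropolis sampler mentioned in the stochastic relaxation proposal above, the following Python sketch samples binary vectors from a Boltzmann-type distribution and keeps the lowest-energy state it visits. It is only a minimal sketch under stated assumptions: the objective `chain_energy`, the function name `metropolis_minimize`, and all parameter values are hypothetical and do not come from the project description, which asks for C/C++ and for a new algorithm rather than this textbook baseline.

```python
import math
import random


def chain_energy(x):
    """Hypothetical toy objective: each adjacent pair of ones lowers the energy by 1."""
    return -sum(x[i] * x[i + 1] for i in range(len(x) - 1))


def metropolis_minimize(energy, n_bits, steps=5000, temperature=0.5, seed=0):
    """Sample binary vectors from p(x) proportional to exp(-energy(x) / temperature)
    using single-bit-flip Metropolis moves; return the lowest-energy state seen."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    e = energy(x)
    best_x, best_e = list(x), e
    for _ in range(steps):
        i = rng.randrange(n_bits)
        x[i] ^= 1  # propose flipping one bit
        e_new = energy(x)
        # Accept with probability min(1, exp(-(e_new - e) / temperature)).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / temperature):
            e = e_new
            if e < best_e:
                best_x, best_e = list(x), e
        else:
            x[i] ^= 1  # reject: undo the flip
    return best_x, best_e
```

Calling `metropolis_minimize(chain_energy, 10)` returns the best state found and its energy. For this toy objective the unique minimizer is the all-ones vector, with energy -9, so it is easy to check whether the chain reached the global optimum.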
Wiki Page: Combining Estimation of Distribution Algorithms and other Evolutionary techniques for combinatorial optimization
Title: Combining Estimation of Distribution Algorithms and other Evolutionary techniques for combinatorial optimization
The focus will be on the implementation of new hybrid algorithms able to combine estimation of distribution algorithms with different approaches available in the evolutionary computation literature, such as genetic algorithms and evolutionary strategies, together with other local search techniques.
Good coding (C/C++) abilities are required. Some background in combinatorial optimization from the "Fondamenti di Ricerca Operativa" course is desirable. The project could require some effort to build and consolidate some background in MCMC techniques, such as Gibbs and Metropolis samplers (4).
The project could be extended to a master thesis, according to interesting and novel directions of research that emerge in the first part of the work.
Computer vision provides a large number of optimization problems, such as new-view synthesis, image segmentation, panorama stitching, and texture restoration, among others (6). One common approach in this context is based on the use of binary Markov Random Fields and on the formalization of the optimization problem as the minimization of an energy function expressed as a square-free polynomial (5). We are interested in the proposal, comparison, and evaluation of different Estimation of Distribution Algorithms for solving real-world problems that appear in computer vision.
Pictures taken from http://www.genetic-programming.org and (6)
Bibliography
Tutor: MatteoMatteucci, LuigiMalago
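To make the idea in the proposal above concrete, here is a minimal Python sketch of one of the simplest Estimation of Distribution Algorithms, the univariate marginal distribution algorithm (UMDA), minimizing a square-free polynomial energy over binary variables. Everything specific in it is an assumption made for illustration: the coefficients in `TOY_COEFFS`, the helper names, and the population settings are hypothetical, and a univariate model ignores exactly the variable dependencies that more advanced EDAs try to learn.

```python
import random

# Hypothetical toy energy on 8 binary variables, stored as {monomial: coefficient}.
# Each monomial is a tuple of distinct indices, so the polynomial is square-free.
TOY_COEFFS = {}
for i in range(8):
    TOY_COEFFS[(i,)] = -3.0 if i % 2 == 0 else 1.0  # linear terms
for i in range(7):
    TOY_COEFFS[(i, i + 1)] = 1.0  # pairwise penalty for adjacent ones


def make_energy(coeffs):
    """Return E(x) = sum of coeff over monomials whose variables are all 1."""
    def energy(x):
        return sum(c for idx, c in coeffs.items() if all(x[i] for i in idx))
    return energy


def umda_minimize(energy, n_bits, pop_size=100, n_select=30, generations=40, seed=0):
    """UMDA: sample from independent Bernoulli marginals, keep the best
    individuals, and re-estimate the marginals from them."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits
    best_x, best_e = None, float("inf")
    for _ in range(generations):
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        population.sort(key=energy)
        selected = population[:n_select]
        e0 = energy(selected[0])
        if e0 < best_e:
            best_x, best_e = list(selected[0]), e0
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            # Clamp the marginals so that no bit value becomes impossible.
            probs[i] = min(max(freq, 0.05), 0.95)
    return best_x, best_e, probs
```

A call such as `umda_minimize(make_energy(TOY_COEFFS), 8)` returns the best vector found, its energy, and the final marginals. In this toy instance every term except the even-indexed linear ones is non-negative, so the global minimum is -12 at the alternating pattern `[1, 0, 1, 0, 1, 0, 1, 0]`.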
Wiki Page: Information geometry and machine learning
Title: Information geometry and machine learning
Information Geometry has recently been applied in different fields, both to provide a geometrical interpretation of existing algorithms and, in some contexts, to propose new techniques that generalize or improve existing approaches.
Once the student is familiar with the theory of Information Geometry, the aim of the project is to apply these notions to existing machine learning algorithms. Possible ideas are the study of a particular model from the point of view of Information Geometry, such as Hidden Markov Models, Dynamic Bayesian Networks, or Gaussian Processes, to understand whether Information Geometry can give useful insights into such models. Other possible directions of research include the use of notions and ideas from Information Geometry, such as the mixed parametrization based on natural and expectation parameters (3) and/or families of divergence functions (2), in order to study model selection from a geometric perspective, for example by exploiting projections and other geometric quantities with "statistical meaning" in a statistical manifold in order to choose or build the model to use for inference purposes.
Since the project has a theoretical flavor, mathematically inclined students are encouraged to apply. The project requires some extra effort to build and consolidate some background in math, partially in differential geometry, and especially in probability and statistics.
Bibliography
Tutor: MatteoMatteucci, LuigiMalago
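For readers new to the terms used in the Information Geometry proposal above, the standard definitions for an exponential family can be summarized as follows. The notation (sufficient statistics T_i, log-partition function psi, a fixed base measure) is an assumption chosen for this summary and is not taken from the project page.

```latex
\begin{align*}
% Exponential family in natural parameters \theta (density w.r.t. a fixed base measure)
p(x;\theta) &= \exp\Big(\sum_{i=1}^{n} \theta_i T_i(x) - \psi(\theta)\Big) \\
% Expectation parameters are the gradient of the log-partition function
\eta_i &= \mathbb{E}_{\theta}\big[T_i(x)\big] = \frac{\partial \psi(\theta)}{\partial \theta_i} \\
% A mixed parametrization takes the first k coordinates from \eta and the rest from \theta
\xi &= (\eta_1,\dots,\eta_k;\ \theta_{k+1},\dots,\theta_n) \\
% The Kullback-Leibler divergence is a Bregman divergence of \psi
D_{\mathrm{KL}}\big(p_{\theta}\,\|\,p_{\theta'}\big) &= \psi(\theta') - \psi(\theta) - \sum_{i=1}^{n} (\theta'_i - \theta_i)\,\eta_i(\theta) \\
% Worked example: Bernoulli distribution with success probability p
\theta &= \log\frac{p}{1-p}, \qquad \psi(\theta) = \log\big(1 + e^{\theta}\big), \qquad \eta = p
\end{align*}
```

In the Bernoulli example the natural parameter is the log-odds and the expectation parameter is the success probability itself, which shows how the two coordinate systems describe the same model in different ways.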
Wiki Page: LARS and LASSO in non Euclidean Spaces
Title: LARS and LASSO in non Euclidean Spaces
One of the common hypotheses in regression analysis is that the noise introduced in order to model the linear relationship between regressors and dependent variable has a Gaussian distribution. A generalization of this hypothesis leads to a more general framework, where the geometry of the regression task is no longer Euclidean. In this context, different estimation criteria, such as maximum likelihood estimation and other canonical divergence functions, no longer coincide. The target of the project is to compare the different solutions associated with different criteria, for example in terms of robustness, and to propose generalizations of LASSO and LARS in non-Euclidean contexts.
The project will focus on the understanding of the problem and on the implementation of different algorithms, so coding (C/C++, Matlab, or R) will be required. Since the project also has a theoretical flavor, mathematically inclined students are encouraged to apply. The project may require some extra effort to build and consolidate some background in math, especially in probability and statistics.
Picture taken from (2)
Bibliography
Tutor: MatteoMatteucci, LuigiMalago
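As a point of reference for the proposal above, the following Python sketch solves the ordinary (Euclidean, Gaussian-noise) LASSO problem by cyclic coordinate descent with soft-thresholding. It covers only the standard baseline, not the non-Euclidean generalization the project aims at. The objective scaling (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 and the function names are assumptions chosen for this illustration, and Python is used here only for brevity even though the project lists C/C++, Matlab, or R.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly zero inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.
    X is a list of rows, y a list of targets."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    for _ in range(n_iters):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes beta[j].
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_beta
    return beta
```

With `lam = 0` the procedure reduces to least squares by coordinate descent. Increasing `lam` drives the coefficients of weakly correlated regressors exactly to zero, which is the sparsity property that a non-Euclidean generalization would presumably need to preserve.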
Wiki Page: Statistical inference for phylogenetic trees
Title: Statistical inference for phylogenetic trees
The project will focus on the understanding of the problem and on the implementation of different algorithms, so coding (C/C++, Matlab, or R) will be required. Since the approach will be based on statistical models, the student is expected to be comfortable with notions from probability and statistics courses.
The project is designed to be extended to a master thesis, according to interesting and novel directions of research that emerge in the first part of the work. Possible ideas may concern the proposal and implementation of new algorithms, based on recent approaches to phylogenetic inference available in the literature, as in (3) and (4). In this case the thesis requires some extra effort to build and consolidate some background in math in order to understand some recent literature, especially in (mathematical) statistics and, for example, in the emerging field of algebraic statistics (5).
Other possible novel applications of phylogenetic trees have been proposed in contexts different from biology, as in (6). Malware (malicious software) is software designed to infiltrate a computer without the owner's informed consent. Malware samples are often related to previous programs through evolutionary relationships, i.e., new malware appears as small mutations of previous software. We are interested in the use of techniques from phylogenetic trees to create a taxonomy of real-world malware.
Picture taken from http://www.tolweb.org/tree/ and http://www.blogscienze.com
Bibliography
Tutor: MatteoMatteucci, LuigiMalago, StefanoZanero
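To give a feel for how a tree can be built from pairwise dissimilarities, as in the taxonomy idea of the proposal above, here is a small Python sketch of UPGMA, a classical distance-based clustering method. UPGMA is deliberately swapped in as the simplest possible example: it is a heuristic, not the statistical (likelihood-based or Bayesian) inference the project is about. The function name, the nested-tuple tree format, and the idea of feeding it hypothetical distances between malware samples are assumptions made for illustration.

```python
def upgma(names, dist):
    """Build a rooted tree (nested tuples of names) from a symmetric distance
    matrix by repeatedly merging the two closest clusters (UPGMA)."""
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Size-weighted average keeps distances equal to the mean over leaf pairs.
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involves one of the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree
```

Given four samples where A and B are close, C and D are close, and the two pairs are far apart, the function groups A with B and C with D before joining the two groups at the root. For real data the choice of distance between programs would matter far more than the clustering step itself.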
People
- AndreaBonarini
- CarloDEramo
- DavideTateo
- LuigiMalago
- MarcelloRestelli
- MatteoMatteucci
- MirzaRamicic
- PierLucaLanzi