Difference between revisions of "Master Level Theses"

Line 197: Line 197:
 
|description=The project will focus on the study, implementation, comparison and analysis of different statistical inference techniques for phylogenetic trees. Phylogenetic trees [1, 2] are evolutionary trees used to represent the relationships between different species with a common ancestor. Typical inference tasks concern the construction of a tree starting from DNA sequences, involving both the choice of the topology of the tree (i.e., model selection) and the values of the parameters (i.e., model fitting). The focus will be a probabilistic description of the tree, given by the introduction of stochastic variables associated with both internal nodes and leaves of the tree.

|description=The project will focus on the study, implementation, comparison and analysis of different statistical inference techniques for phylogenetic trees. Phylogenetic trees [1, 2] are evolutionary trees used to represent the relationships between different species with a common ancestor. Typical inference tasks concern the construction of a tree starting from DNA sequences, involving both the choice of the topology of the tree (i.e., model selection) and the values of the parameters (i.e., model fitting). The focus will be a probabilistic description of the tree, given by the introduction of stochastic variables associated with both internal nodes and leaves of the tree.
  
The project will focus on the understanding of the problem and on the implementation of different algorithms, so (C/C++ or Matlab or R) coding will be required. Since the approach will be based on statistical model, the student is supposed to be comfortable with notions that comes from probability and statistics courses.
+
The project will focus on the understanding of the problem and on the implementation of different algorithms, so (C/C++ or Matlab or R) coding will be required. Since the approach will be based on statistical models, the student is supposed to be comfortable with notions that come from probability and statistics courses.
  
 
The project is designed to be extended into a master thesis, according to the interesting and novel directions of research that emerge in the first part of the work. Possible ideas may concern the proposal and implementation of new algorithms, based on recent approaches to phylogenetic inference available in the literature, as in [3] and [4]. In this case the thesis requires some extra effort to build and consolidate some background in math in order to understand some recent literature, especially in (mathematical) statistics and, for example, in the emerging field of algebraic statistics [5].

The project is designed to be extended into a master thesis, according to the interesting and novel directions of research that emerge in the first part of the work. Possible ideas may concern the proposal and implementation of new algorithms, based on recent approaches to phylogenetic inference available in the literature, as in [3] and [4]. In this case the thesis requires some extra effort to build and consolidate some background in math in order to understand some recent literature, especially in (mathematical) statistics and, for example, in the emerging field of algebraic statistics [5].
Line 253: Line 253:
 
|image=Robowii_robot.jpg}}
 
|image=Robowii_robot.jpg}}
 
<!--==== Soft Computing ====-->
 
<!--==== Soft Computing ====-->
 +
 +
 +
==== Evolutionary Optimization and Stochastic Optimization ====
 +
 +
 +
{{Project template
 +
|title=Combinatorial optimization based on stochastic relaxation
 +
|tutor=[[User:MatteoMatteucci|Matteo Matteucci]] ([mailto:matteucc-AT-elet-DOT-polimi-DOT-it email]), [[User:LuigiMalago|Luigi Malagò]] ([mailto:malago-AT-elet-DOT-polimi-DOT-it email])
 +
|description=The project will focus on the study, implementation, comparison and analysis of different algorithms for the optimization of pseudo-Boolean functions, i.e., functions defined over binary variables with values in R. These functions have been studied extensively in the mathematical programming literature, and different algorithms have been proposed [1]. More recently, the same problems have been addressed in evolutionary computation, with the use of genetic algorithms, and in particular estimation of distribution algorithms [2,3]. Estimation of distribution algorithms are a recent meta-heuristic, where the classical crossover and mutation operators used in genetic algorithms are replaced with operators that come from statistics, such as sampling and estimation.
 +
 +
The focus will be on the implementation of a new algorithm able to combine different approaches (estimation and sampling, on one side, and exploitation of prior knowledge about the structure of the problem, on the other), together with the comparison of the results with existing techniques that historically appear in different (and often separate) communities. Good coding (C/C++) abilities are required. Since the approach will be based on statistical models, the student is supposed to be comfortable with notions that come from probability and statistics courses. The project could require some extra effort to build and consolidate some background in math, especially in Bayesian statistics and MCMC techniques, such as Gibbs and Metropolis samplers [4].
 +
 +
The project can be extended into a master thesis, according to the interesting and novel directions of research that emerge in the first part of the work. Possible ideas may concern the proposal of new algorithms able to learn existing dependencies among the variables in the function to be optimized, and exploit them in order to increase the probability of converging to the global optimum.
 +
 +
Picture taken from http://www.ra.cs.uni-tuebingen.de/
 +
 +
;Bibliography
 +
*[1] Boros, Endre and Hammer, Peter L. (2002) Pseudo-Boolean optimization. Discrete Applied Mathematics.
 +
*[2] Pelikan, Martin; Goldberg, David; Lobo, Fernando (1999), A Survey of Optimization by Building and Using Probabilistic Models, Illinois: Illinois Genetic Algorithms Laboratory (IlliGAL), University of Illinois at Urbana-Champaign.
 +
*[3] Larrañaga, Pedro; Lozano, Jose A. (Eds.). Estimation of distribution algorithms: A new tool for evolutionary computation. Kluwer Academic Publishers, Boston, 2002.
 +
*[4] Image Analysis, Random Fields and Markov Chain Monte Carlo Methods
 +
 +
 +
|start=Anytime
 +
|number=1-2
 +
|cfu=20-40
 +
|image=stochastic.jpg}}

Revision as of 16:56, 18 March 2009

Here you can find proposals for master theses (20 CFU for each student).

BioSignal Analysis

Analysis of the Olfactory Signal


Title: Computational Intelligence techniques to analyse the olfactory signal acquired by an electronic nose for cancer diagnosis
Description: The electronic nose is an instrument able to detect and recognize odors, that is, the volatile substances in the atmosphere or emitted by the analyzed substance. This device reacts to a gaseous substance by providing signals that can be analyzed to classify the input. It is composed of a sensor array (MOS sensors, in our case) and a pattern classification system based on machine learning techniques. Each sensor reacts in a different way to the analyzed substance, providing multidimensional data that can be considered a unique olfactory blueprint of the analyzed substance. We have already tested the use of the electronic nose as a diagnostic tool for lung cancer; encouraged by the very satisfactory results of these analyses, we want to investigate the possibility of diagnosing other types of cancer and to improve the current computational intelligence techniques.

The project is done in collaboration with the Istituto dei Tumori, Milano.

Tools and instruments
Matlab
Bibliography 
BLATT R., BONARINI A., CALABRÒ E., DELLA TORRE M., MATTEUCCI M., PASTORINO U. (2008). Pattern Classification Techniques for Early Lung Cancer Diagnosis using an Electronic Nose. In: Frontiers in Artificial Intelligence and Applications. European Conference on Artificial Intelligence - Prestigious Applications of Intelligent Systems. Patras, Greece. 21-25 July 2008. (vol. 178, pp. 693-697). ISBN/ISSN: 978-1-58603-891-5. IOS Press. File:PAIS.pdf
Tutor: Andrea Bonarini (email), Matteo Matteucci (email), Rossella Blatt (email)
Start: Anytime (a new acquisition phase will start in March)
Number of students: 1-2
CFU: 20


Sleep Staging


Title: Development of a computer-assisted CAP (Sleep cyclic alternating pattern) scoring method
CAP Sleep Staging.jpg
Description: In 1985, Terzano described the Cyclic Alternating Pattern [1] during sleep for the first time and, nowadays, CAP is widely accepted by the medical community as a basic analysis of sleep. CAP evaluation is of fundamental importance, since it represents the mechanism developed through brain evolution to monitor the inner and outer world and to ensure survival during sleep. However, visual detection of CAP in polysomnography (i.e., the standard procedure) is a slow and time-consuming process. This limiting factor creates the need for new computer-assisted scoring methods for fast CAP evaluation. This thesis deals with the development of a Decision Support System for CAP scoring based on feature extraction at multi-system level (by statistical and signal analysis) and Pattern Recognition or Machine Learning approaches. This may allow the automatic detection of CAP sleep and could be integrated, through reinforcement learning techniques, with the corrections given by physicians.
Tools and instruments
Matlab, C/C++
Bibliography
Mario Terzano, Liborio Parrino. Atlas, rules, and recording techniques for the scoring of cyclic alternating pattern (CAP) in human sleep, Sleep Medicine 2 (2001) 537–553. [2]
Tutor: Matteo Matteucci (email), Martin Mendez (email), Anna Maria Bianchi (email), Mario Terzano (Ospedale di Parma)
Start: Anytime
Number of students: 1-2
CFU: 20


Brain-Computer Interface


Title: Recognition of the user's focusing on the stimulation matrix
B p300 speller.jpg
Description: A P300-based BCI stimulates the user continuously, and the detection of a P300 designates the choice of the user. When the user is not paying attention to the interface, false positives are likely. The objective of this work is to avoid this problem; the analysis of the electroencephalogram (EEG) over the visual cortex (and possibly an analysis of P300s or of other biosignals) should tell when the user is looking at the interface.
Tools and instruments
Matlab, BCI2000, C++
EEG system
Bibliography
E. Donchin, K.M. Spencer, R. Wijesinghe. The Mental Prosthesis: Assessing the Speed of a P300-Based Brain-Computer Interface [3]
Tutor: Matteo Matteucci (email), Bernardo Dal Seno (email)
Start: Anytime
Number of students: 1-2
CFU: 20




Title: Creation of new EEG training by introduction of noise
Bci arch.png
Description: A BCI must be trained on the individual user in order to be effective. This training phase requires recording data in long sessions, which is time consuming and boring for the user. The aim of this project is to develop an algorithm that creates new training EEG (electroencephalography) data from existing data, so as to speed up the training phase.
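As a rough illustration of the idea, the sketch below creates extra training epochs by adding Gaussian noise to recorded ones. It is only one possible reading of "introduction of noise": the function name `augment_with_noise`, the nested-list data layout (epochs, channels, samples) and the default `noise_std` value are assumptions made for this example, not part of the project.

```python
import random

def augment_with_noise(epochs, labels, copies=3, noise_std=0.5, seed=0):
    """Return the original epochs plus `copies` noisy variants of each one.

    epochs: list of epochs; each epoch is a list of channels; each channel
            is a list of samples (hypothetical layout for this sketch).
    labels: one class label per epoch; noisy copies inherit the label.
    noise_std: standard deviation of the added Gaussian noise, in the same
               units as the signal (an assumed value, to be tuned).
    """
    rng = random.Random(seed)
    new_epochs, new_labels = list(epochs), list(labels)
    for epoch, label in zip(epochs, labels):
        for _ in range(copies):
            # Perturb every sample of every channel independently.
            noisy = [[s + rng.gauss(0.0, noise_std) for s in channel]
                     for channel in epoch]
            new_epochs.append(noisy)
            new_labels.append(label)
    return new_epochs, new_labels

# One recorded epoch with two channels and three samples per channel.
epochs = [[[0.0, 1.0, 2.0], [1.0, 0.0, -1.0]]]
labels = [1]
aug_epochs, aug_labels = augment_with_noise(epochs, labels)
print(len(aug_epochs), aug_labels)
```

Whether such synthetic epochs actually shorten training without hurting classification accuracy is exactly what the project would have to evaluate.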
Tools and instruments
Matlab, BCI2000
Knowledge of C++ may be useful
EEG system
Bibliography
J.R. Wolpaw et al. Brain-computer interfaces for communication and control [4]
Tutor: Matteo Matteucci (email), Bernardo Dal Seno (email)
Start: Anytime
Number of students: 1
CFU: 20



Title: Real-time removal of ocular artifact from EEG
B bci.jpg
Description: In a BCI based on electroencephalogram (EEG), one of the most important sources of noise is related to ocular movements. Algorithms have been devised to cancel the effect of such artifacts. The project consists in the real-time implementation of an existing algorithm (or of a newly developed one) in order to improve the performance of a BCI.
Tools and instruments
Matlab, BCI2000, C++
EEG system
Bibliography
J.R. Wolpaw et al. Brain-computer interfaces for communication and control [5]
R.J. Croft, R.J. Barry. Removal of ocular artifact from the EEG: a review [6]
Tutor: Matteo Matteucci (email), Bernardo Dal Seno (email)
Start: Anytime
Number of students: 1
CFU: 10-20



Title: Aperiodic visual stimulation in a VEP-based BCI
Bci arch.png
Description: Visual-evoked potentials (VEPs) are a possible way to drive a BCI. This project aims at maximizing the discrimination between different stimuli.
Tools and instruments
Matlab, BCI2000, C++
EEG system
Bibliography
J.R. Wolpaw et al. Brain-computer interfaces for communication and control [7]
Tutor: Matteo Matteucci (email), Bernardo Dal Seno (email)
Start: Anytime
Number of students: 1
CFU: 20



Title: Driving an autonomous wheelchair with a P300-based BCI
LURCH wheelchair.jpg
Description: This project pulls together different Airlab projects with the aim of driving an autonomous wheelchair (LURCH) with a BCI, through the development of key software modules. The work will be validated with live experiments.
Tools and instruments
C++, C, BCI2000, Matlab
Linux
EEG system
Lurch wheelchair
Bibliography
R. Blatt et al. Brain Control of a Smart Wheelchair [8]
Tutor: Matteo Matteucci (email), Bernardo Dal Seno (email)
Start: November 2008
Number of students: 1
CFU: 5-20



Title: Online automatic tuning of the number of repetitions in a P300-based BCI
B p300 speller.jpg
Description: In a P300-based BCI, (visual) stimuli are presented to the user, and the intention of the user is recognized when a P300 potential is detected in response to the desired stimulus. In order to improve accuracy, many stimulation rounds are usually performed before making a decision. The exact number of repetitions depends on the user and on the quality of the classifier, but it is usually fixed a priori. The aim of this project is to adapt the number of repetitions to changing conditions, so as to achieve the maximum accuracy in the minimum time.

The work will be validated with live experiments.
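One simple way to picture an adaptive number of repetitions is an early-stopping rule on accumulated classifier scores, sketched below. This is an illustrative assumption, not the method the project prescribes: the function name, the `margin` and `min_rounds` parameters and the example scores are all hypothetical, and the sketch assumes at least two stimuli and at least one round.

```python
def select_with_early_stopping(score_rounds, margin=2.0, min_rounds=2):
    """Pick a stimulus, stopping as soon as the decision looks clear.

    score_rounds: iterable of per-round score lists, one classifier score
                  per stimulus (higher means more P300-like).
    margin: required gap between the best and second-best accumulated
            scores before stopping (assumed threshold, to be tuned).
    Returns (index of the chosen stimulus, number of rounds used).
    """
    totals = None
    rounds_used = 0
    for scores in score_rounds:
        if totals is None:
            totals = [0.0] * len(scores)
        # Accumulate the evidence for each stimulus across rounds.
        totals = [t + s for t, s in zip(totals, scores)]
        rounds_used += 1
        ranked = sorted(totals, reverse=True)
        if rounds_used >= min_rounds and ranked[0] - ranked[1] >= margin:
            break  # the leader is far enough ahead: decide now
    best = max(range(len(totals)), key=totals.__getitem__)
    return best, rounds_used

# Hypothetical scores for three stimuli over four stimulation rounds.
rounds = [[0.2, 1.1, 0.1], [0.3, 0.9, 0.0], [0.1, 1.2, 0.2], [0.0, 1.0, 0.1]]
best, used = select_with_early_stopping(rounds)
print(best, used)
```

With these example scores the rule stops after the third round, since the second stimulus then leads by more than the margin; a noisier session would simply use more rounds.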

Tools and instruments
C++, BCI2000, Matlab
EEG system
Bibliography
E. Donchin, K.M. Spencer, R. Wijesinghe. The Mental Prosthesis: Assessing the Speed of a P300-Based Brain-Computer Interface [9]
Tutor: Matteo Matteucci (email), Bernardo Dal Seno (email)
Start: Anytime
Number of students: 1
CFU: 5-20



Machine Learning


Title: Reinforcement Learning in Poker
PokerPRLT.png
Description: In recent years, Artificial Intelligence research has shifted its attention from fully observable environments such as Chess to more challenging partially observable ones such as Poker.

So far, research on this kind of environment, which can be formalized as a Partially Observable Stochastic Game, has mostly taken a game-theoretic point of view, thus focusing on the pursuit of optimality and equilibrium, with no attention to payoff maximization, which may be more interesting in many real-world contexts.

On the other hand, Reinforcement Learning techniques have proven successful in solving both fully observable problems, single and multi-agent, and single-agent partially observable ones, while they have not yet been applied to the partially observable multi-agent framework.

This research aims at studying the solution of Partially Observable Stochastic Games, analyzing the possibility of combining the Opponent Modeling concept with well-proven Reinforcement Learning solution techniques to solve problems in this framework, adopting Poker as a testbed.

Tutor: Marcello Restelli (restelli-AT-elet-DOT-polimi-DOT-it)
Start: Anytime
Number of students: 1-2
CFU: 20-40




Title: EyeBot
TORCS2.jpg
Description: TORCS is a state-of-the-art open source racing simulator that represents an ideal benchmark for machine learning techniques. We have already organized two successful competitions based on TORCS, where competitors were asked to develop a controller using their preferred machine learning techniques. So far, the controllers developed for TORCS have used as input only information extracted directly from the state of the game. The goal of this project is to extend the existing controller API (see here) to use visual information (e.g., screenshots of the game) as input to the controllers. A successful project will include both the development of the API and some basic image preprocessing to extract information from the images.
Tutor: Daniele Loiacono (loiacono-AT-elet-DOT-polimi-DOT-it), Alessandro Giusti (giusti-AT-elet-DOT-polimi-DOT-it), and Pierluigi Taddei (taddei-AT-elet-DOT-polimi-DOT-it)
Start: Anytime
Number of students: 1 to 2
CFU: 20



Title: SmarTrack
TORCS3.jpg
Description: The generation of customized game content for each player is an attractive direction for improving the game experience in next-generation computer games. In this scenario, Machine Learning could play an important role in automatically providing such customized game content.

The goal of this project is to apply machine learning techniques to the generation of customized tracks in TORCS, a state-of-the-art open source racing simulator. The project includes different activities: the automatic generation of tracks, the selection of relevant features to characterize a track and the analysis of an interest measure.

Tutor: Daniele Loiacono (loiacono-AT-elet-DOT-polimi-DOT-it)
Start: Anytime
Number of students: 1 to 2
CFU: 20



Title: Automatic generation of domain ontologies
Description: This thesis is to be developed together with Noustat S.r.l., which is carrying out research activities directed toward the optimization of knowledge management services, in collaboration with another company operating in this field. The project is aimed at removing the ontology building bottleneck, a long and expensive activity that usually requires the direct collaboration of a domain expert. The possibility of automatically building the ontology, starting from a set of textual documents related to a specific domain, is expected to improve the ability to provide the knowledge management service, both by reducing the time-to-application and by increasing the number of domains that can be covered. For this project, unsupervised learning methods will be applied in sequence, exploiting the topological properties of the ultra-metric spaces that emerge from the taxonomic structure of the concepts present in the texts, and associative methods will extend the concept network to lateral, non-hierarchical relationships.
Tutor: Matteo Matteucci (email), Andrea Bonarini (email)
Start: before November 30th
Number of students: 1-2
CFU: 20




Title: Statistical inference for phylogenetic trees
Toloverview.jpg
Description: The project will focus on the study, implementation, comparison and analysis of different statistical inference techniques for phylogenetic trees. Phylogenetic trees [1, 2] are evolutionary trees used to represent the relationships between different species with a common ancestor. Typical inference tasks concern the construction of a tree starting from DNA sequences, involving both the choice of the topology of the tree (i.e., model selection) and the values of the parameters (i.e., model fitting). The focus will be a probabilistic description of the tree, given by the introduction of stochastic variables associated with both internal nodes and leaves of the tree.
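To make the probabilistic description concrete, the sketch below computes the likelihood of one DNA site on a fixed tree by summing over the unobserved bases at the internal nodes (Felsenstein's pruning recursion). The Jukes-Cantor substitution model, the uniform root distribution and the nested-list tree encoding are choices made for this illustration only; the project text does not commit to any of them.

```python
import math

BASES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor probability of base a becoming base b along a branch
    of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def partial(node):
    """Likelihood of the observed leaves below `node`, for each possible
    base at `node`.

    A leaf is a one-letter string (the observed base); an internal node is
    a list of (child, branch_length) pairs (hypothetical encoding).
    """
    if isinstance(node, str):
        return {b: 1.0 if b == node else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, t in node:
        child_l = partial(child)
        for b in BASES:
            # Sum over the unobserved base at the child end of the branch.
            result[b] *= sum(jc_prob(b, c, t) * child_l[c] for c in BASES)
    return result

def site_likelihood(tree):
    """Likelihood of one site, with a uniform distribution at the root."""
    root = partial(tree)
    return sum(0.25 * root[b] for b in BASES)

# Three leaves: A, and a cherry (A, G) attached through an internal node.
tree = [("A", 0.1), ([("A", 0.2), ("G", 0.2)], 0.1)]
print(site_likelihood(tree))
```

Model fitting would then amount to maximizing this quantity (multiplied over sites) with respect to the branch lengths, and model selection to comparing different topologies.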

The project will focus on the understanding of the problem and on the implementation of different algorithms, so (C/C++ or Matlab or R) coding will be required. Since the approach will be based on statistical models, the student is supposed to be comfortable with notions that come from probability and statistics courses.

The project is designed to be extended into a master thesis, according to the interesting and novel directions of research that emerge in the first part of the work. Possible ideas may concern the proposal and implementation of new algorithms, based on recent approaches to phylogenetic inference available in the literature, as in [3] and [4]. In this case the thesis requires some extra effort to build and consolidate some background in math in order to understand some recent literature, especially in (mathematical) statistics and, for example, in the emerging field of algebraic statistics [5].

Picture taken from http://www.tolweb.org/tree/


Bibliography
  • [1] Felsenstein 2003: Inferring Phylogenies
  • [2] Semple and Steel 2003: Phylogenetics: The mathematics of phylogenetics
  • [3] Louis J. Billera, Susan P. Holmes and Karen Vogtmann. Geometry of the space of phylogenetic trees. Advances in Applied Math 27, 733-767 (2001)
  • [4] Evans, S.N. and Speed, T.P. (1993). Invariants of some probability models used in phylogenetic inference. Annals of Statistics 21, 355-377.
  • [5] Lior Pachter, Bernd Sturmfels 2005, Algebraic Statistics for Computational Biology.
Tutor: Matteo Matteucci (email), Luigi Malagò (email)
Start: Anytime
Number of students: 1-2
CFU: 20-40



Affective Computing

Ontologies and Semantic Web


Title: Automatic generation of domain ontologies
OntologyFromText.jpg
Description: This thesis is to be developed together with Noustat S.r.l., which is carrying out research activities directed toward the optimization of knowledge management services, in collaboration with another company operating in this field. The project is aimed at removing the ontology building bottleneck, a long and expensive activity that usually requires the direct collaboration of a domain expert. The possibility of automatically building the ontology, starting from a set of textual documents related to a specific domain, is expected to improve the ability to provide the knowledge management service, both by reducing the time-to-application and by increasing the number of domains that can be covered. For this project, unsupervised learning methods will be applied in sequence, exploiting the topological properties of the ultra-metric spaces that emerge from the taxonomic structure of the concepts present in the texts, and associative methods will extend the concept network to lateral, non-hierarchical relationships.
Tutor: Matteo Matteucci (email), Andrea Bonarini (email)
Start: before November 30th
Number of students: 1-2
CFU: 20



Robotics


Title: Robot games
Robowii robot.jpg
Description: The goal of this activity is to develop an interactive game with robots using commercial devices such as the WII Mote (see the Robogames page)

Projects are available in different areas:

  • Design and implementation of the game on one of the available robots and extension of the robot functionalities
  • Design and implementation of the game and a new suitable robot
  • Evaluation of the game with users (in collaboration with Franca Garzotto)

These projects allow students to experiment with real mobile robots and real interaction devices.

Parts of these projects can be considered as course projects. These projects can also be extended to cover course projects.

Tutor: Andrea Bonarini (bonarini-AT-elet-DOT-polimi-DOT-it)
Start: Anytime
Number of students: 1-2
CFU: 7.5-20



Evolutionary Optimization and Stochastic Optimization


Title: Combinatorial optimization based on stochastic relaxation
Stochastic.jpg
Description: The project will focus on the study, implementation, comparison and analysis of different algorithms for the optimization of pseudo-Boolean functions, i.e., functions defined over binary variables with values in R. These functions have been studied extensively in the mathematical programming literature, and different algorithms have been proposed [1]. More recently, the same problems have been addressed in evolutionary computation, with the use of genetic algorithms, and in particular estimation of distribution algorithms [2,3]. Estimation of distribution algorithms are a recent meta-heuristic, where the classical crossover and mutation operators used in genetic algorithms are replaced with operators that come from statistics, such as sampling and estimation.
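The sampling and estimation loop of an estimation of distribution algorithm can be sketched with the simplest member of the family, a univariate model in which every bit has its own independent probability (in the style of UMDA). This is a textbook baseline chosen for illustration, not the new algorithm the project aims at; the function names, parameter values and the OneMax test function are assumptions of this example.

```python
import random

def umda(f, n_bits, pop_size=100, n_select=30, generations=50, seed=0):
    """Maximize a pseudo-Boolean function f with a univariate EDA.

    Each generation samples a population from independent Bernoulli
    marginals, keeps the best `n_select` points, and re-estimates the
    marginals from them (no crossover or mutation operators).
    """
    rng = random.Random(seed)
    p = [0.5] * n_bits  # initial model: uniform over {0,1}^n
    best_x, best_val = None, float("-inf")
    for _ in range(generations):
        # Sampling step.
        population = [[1 if rng.random() < pi else 0 for pi in p]
                      for _ in range(pop_size)]
        population.sort(key=f, reverse=True)
        if f(population[0]) > best_val:
            best_x, best_val = population[0], f(population[0])
        # Estimation step: marginal frequencies of the selected points,
        # clamped away from 0 and 1 so that no bit value becomes impossible.
        selected = population[:n_select]
        p = [min(max(sum(x[i] for x in selected) / n_select, 0.05), 0.95)
             for i in range(n_bits)]
    return best_x, best_val

def onemax(x):
    """Toy pseudo-Boolean function: the number of ones in x."""
    return sum(x)

best_x, best_val = umda(onemax, n_bits=20)
print(best_x, best_val)
```

Because the model treats the bits as independent, it cannot represent interactions among variables, which is the limitation that motivates learning dependencies among the variables.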

The focus will be on the implementation of a new algorithm able to combine different approaches (estimation and sampling, on one side, and exploitation of prior knowledge about the structure of the problem, on the other), together with the comparison of the results with existing techniques that historically appear in different (and often separate) communities. Good coding (C/C++) abilities are required. Since the approach will be based on statistical models, the student is supposed to be comfortable with notions that come from probability and statistics courses. The project could require some extra effort to build and consolidate some background in math, especially in Bayesian statistics and MCMC techniques, such as Gibbs and Metropolis samplers [4].
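For readers unfamiliar with the samplers mentioned above, here is a minimal sketch of a Metropolis sampler over binary strings, targeting a distribution proportional to exp(f(x)/T) with single-bit-flip proposals. The target distribution, the `temperature` parameter and the function name are assumptions made for the example; the project does not specify how the sampler would be used.

```python
import math
import random

def metropolis_sample(f, n_bits, temperature=1.0, steps=5000, seed=0):
    """Run a Metropolis chain on {0,1}^n with target p(x) ~ exp(f(x)/T).

    Proposal: flip one uniformly chosen bit (a symmetric proposal), then
    accept with probability min(1, exp((f(y) - f(x)) / T)).
    Returns the final state of the chain and its function value.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    fx = f(x)
    for _ in range(steps):
        i = rng.randrange(n_bits)
        x[i] ^= 1  # propose y by flipping bit i in place
        fy = f(x)
        if fy >= fx or rng.random() < math.exp((fy - fx) / temperature):
            fx = fy    # accept the move
        else:
            x[i] ^= 1  # reject: undo the flip
    return x, fx

# At low temperature the chain concentrates on high values of f.
state, value = metropolis_sample(sum, n_bits=10, temperature=0.2)
print(state, value)
```

Lowering the temperature makes the target distribution concentrate on the maximizers of f, which is one way to relate sampling to optimization.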

The project can be extended into a master thesis, according to the interesting and novel directions of research that emerge in the first part of the work. Possible ideas may concern the proposal of new algorithms able to learn existing dependencies among the variables in the function to be optimized, and exploit them in order to increase the probability of converging to the global optimum.

Picture taken from http://www.ra.cs.uni-tuebingen.de/

Bibliography
  • [1] Boros, Endre and Hammer, Peter L. (2002) Pseudo-Boolean optimization. Discrete Applied Mathematics.
  • [2] Pelikan, Martin; Goldberg, David; Lobo, Fernando (1999), A Survey of Optimization by Building and Using Probabilistic Models, Illinois: Illinois Genetic Algorithms Laboratory (IlliGAL), University of Illinois at Urbana-Champaign.
  • [3] Larrañaga, Pedro; Lozano, Jose A. (Eds.). Estimation of distribution algorithms: A new tool for evolutionary computation. Kluwer Academic Publishers, Boston, 2002.
  • [4] Image Analysis, Random Fields and Markov Chain Monte Carlo Methods
Tutor: Matteo Matteucci (email), Luigi Malagò (email)
Start: Anytime
Number of students: 1-2
CFU: 20-40