<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://airwiki.deib.polimi.it/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=JacopoFarina</id>
		<title>AIRWiki - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://airwiki.deib.polimi.it/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=JacopoFarina"/>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php/Special:Contributions/JacopoFarina"/>
		<updated>2026-04-05T21:53:56Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.25.6</generator>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=13505</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=13505"/>
				<updated>2011-08-31T13:05:26Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: WikipediaCategoryGraph moved to Wikipedia Category Graph: most of other pages use this naming convention&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Automatically assigning Wikipedia articles to macrocategories&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis; Semantic Tagging;&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Closed&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Thesis&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database, mainly in order to extract information useful for assigning macrocategories to articles.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree: a category may be contained in a second one, which is contained in a third one, which is in turn contained in the first, generating a cyclic reference. A category can also contain itself, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just some selections of it.&lt;br /&gt;
We use a selection that contains the category structure and the category memberships of the articles.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and perform general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
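&lt;br /&gt;
As an illustration of the export step, here is a minimal Python sketch that writes a list of directed edges in the Pajek .net layout (a vertex list followed by an arc list). It is not the project's Java code: the function name to_pajek and the sample edge are assumptions, and labels containing double quotes are not handled.&lt;br /&gt;

```python
def to_pajek(edges):
    """Serialize directed (source, target) label pairs as Pajek .net text."""
    ids = {}
    for source, target in edges:
        for label in (source, target):
            if label not in ids:
                ids[label] = len(ids) + 1  # Pajek vertex ids start at 1
    lines = ["*Vertices %d" % len(ids)]
    for label, number in ids.items():
        lines.append('%d "%s"' % (number, label))
    lines.append("*Arcs")
    for source, target in edges:
        lines.append("%d %d" % (ids[source], ids[target]))
    return "\n".join(lines) + "\n"


# Hypothetical sample edge: an article pointing to one of its categories.
sample = to_pajek([("BMW M20", "BMW engines")])
```

The resulting text can be saved with a .net extension and loaded by tools that read Pajek files.&lt;br /&gt;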
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the shortest path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals that contains no articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
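&lt;br /&gt;
The density definition can be sketched in a few lines of Python. This assumes a directed graph without self-loops, so the number of possible edges is n*(n-1); the function name and the small numbers in the usage lines are illustrative and are not taken from the project data.&lt;br /&gt;

```python
def graph_density(num_nodes, num_edges, directed=True):
    """Ratio of existing edges to possible edges (self-loops excluded)."""
    if num_nodes in (0, 1):
        return 0.0  # no edge is possible
    possible = num_nodes * (num_nodes - 1)
    if not directed:
        possible //= 2  # each undirected pair counts once
    return num_edges / possible


# Illustrative values only: 4 nodes and 6 directed edges.
example = graph_density(4, 6)
```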
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
Statistically, 93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same subject, such as ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png|400px]]&lt;br /&gt;
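&lt;br /&gt;
For readers unfamiliar with the algorithm, the following is a minimal Python sketch of Tarjan's strongly connected components search, followed by a filter that keeps only the components containing a cycle. It is not the project's Java implementation. The toy graph, the edge direction (category to parent) and all function names are assumptions, and the recursive form is only suitable for small graphs.&lt;br /&gt;

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps each node to a list of successors."""
    index = {}        # discovery order of each visited node
    low = {}          # smallest discovery index reachable from the node
    stack = []        # nodes whose component is not yet complete
    on_stack = set()
    components = []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                low[node] = min(low[node], low[succ])
            elif succ in on_stack:
                low[node] = min(low[node], index[succ])
        if low[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


def cyclic_components(graph):
    """Keep components with several nodes, or one node that links to itself."""
    result = []
    for comp in strongly_connected_components(graph):
        if len(comp) > 1 or comp[0] in graph.get(comp[0], []):
            result.append(sorted(comp))
    return result


# Toy data (hypothetical edges): each category points to the ones containing it.
toy = {
    "History of the Germanic peoples": ["Ancient Germanic peoples"],
    "Ancient Germanic peoples": ["History of the Germanic peoples", "History"],
    "History": [],
    "Loop category": ["Loop category"],
}
cycles = cyclic_components(toy)
```

In this toy graph the two Germanic categories form one component and the self-containing category forms another, while ''History'' is discarded because it lies on no cycle.&lt;br /&gt;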
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms to choose the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png|400px]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
*Slides of the thesis (in Italian) [[Media:Presentazione wikipedia category graph.pdf|here]]&lt;br /&gt;
*The thesis (in Italian) [[Media:Tesi wikipedia category graph.pdf|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
*[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=WikipediaCategoryGraph&amp;diff=13506</id>
		<title>WikipediaCategoryGraph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=WikipediaCategoryGraph&amp;diff=13506"/>
				<updated>2011-08-31T13:05:26Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: WikipediaCategoryGraph moved to Wikipedia Category Graph: most of other pages use this naming convention&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[Wikipedia Category Graph]]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12885</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12885"/>
				<updated>2011-01-18T15:40:00Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Automatically assigning Wikipedia articles to macrocategories&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis; Semantic Tagging;&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Closed&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Thesis&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database, mainly in order to extract information useful for assigning macrocategories to articles.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree: a category may be contained in a second one, which is contained in a third one, which is in turn contained in the first, generating a cyclic reference. A category can also contain itself, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just some selections of it.&lt;br /&gt;
We use a selection that contains the category structure and the category memberships of the articles.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and perform general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the shortest path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals that contains no articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
Statistically, 93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same subject, such as ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png|400px]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms to choose the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png|400px]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
*Slides of the thesis (in Italian) [[Media:Presentazione wikipedia category graph.pdf|here]]&lt;br /&gt;
*The thesis (in Italian) [[Media:Tesi wikipedia category graph.pdf|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
*[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12880</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12880"/>
				<updated>2011-01-17T20:58:39Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Automatically assigning Wikipedia articles to macrocategories&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis; Semantic Tagging;&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Closed&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Thesis&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database, mainly in order to extract information useful for assigning macrocategories to articles.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree: a category may be contained in a second one, which is contained in a third one, which is in turn contained in the first, generating a cyclic reference. In addition, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just some selections of it.&lt;br /&gt;
We use a selection that contains the category structure and the category memberships of the articles.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and perform general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the shortest path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals that contains no articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
Statistically, 93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same subject, such as ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png|400px]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms to choose the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png|400px]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
*Slides of the thesis (in Italian) [[Media:Presentazione wikipedia category graph.pdf|here]]&lt;br /&gt;
*The thesis (in Italian) [[Media:Tesi wikipedia category graph.pdf|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
*[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12879</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12879"/>
				<updated>2011-01-17T20:55:36Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: clarified the purpose of the work&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Automatically assigning Wikipedia articles to macrocategories&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis; Semantic Tagging;&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Closed&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Thesis&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database, in order to extract further information, mainly about macrocategories.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree: a category may be contained in a second one, which is contained in a third one, which is in turn contained in the first, generating a cyclic reference. In addition, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just some selections of it.&lt;br /&gt;
We use a selection that contains the category structure and the category memberships of the articles.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and perform general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the shortest path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals that contains no articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
Statistically, 93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same subject, such as ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png|400px]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms to choose the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png|400px]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
*Slides of the thesis (in Italian) [[Media:Presentazione wikipedia category graph.pdf|here]]&lt;br /&gt;
*The thesis (in Italian) [[Media:Tesi wikipedia category graph.pdf|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
*[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=User:JacopoFarina&amp;diff=12521</id>
		<title>User:JacopoFarina</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=User:JacopoFarina&amp;diff=12521"/>
				<updated>2010-10-06T22:29:20Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: added photo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Student&lt;br /&gt;
|category=Student&lt;br /&gt;
|firstname=Jacopo&lt;br /&gt;
|lastname=Farina&lt;br /&gt;
|photo=Jacopo Farina face.jpg&lt;br /&gt;
|email=jacopo1.farina@mail.polimi.it&lt;br /&gt;
|advisor=RiccardoTasso; DavidLaniado;&lt;br /&gt;
|status=inactive&lt;br /&gt;
}}&lt;br /&gt;
I'm currently working on a third-year thesis about [[WikipediaCategoryGraph|analyzing the Wikipedia graph of categories and pages]].&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Jacopo_Farina_face.jpg&amp;diff=12520</id>
		<title>File:Jacopo Farina face.jpg</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Jacopo_Farina_face.jpg&amp;diff=12520"/>
				<updated>2010-10-06T22:28:32Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: Face of Jacopo Farina&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Face of Jacopo Farina&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12359</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12359"/>
				<updated>2010-09-20T16:27:33Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Download */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree: a category may be contained in a second one, which is contained in a third one, which is in turn contained in the first, generating a cyclic reference. In addition, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just some selections of it.&lt;br /&gt;
We use a selection that contains the category list and the category memberships of the articles.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and perform general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the shortest path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals that contains no articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
Statistically, 93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same subject, such as ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png|400px]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms to choose the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png|400px]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
*Slides of the thesis (in Italian) [[Media:Presentazione wikipedia category graph.pdf|here]]&lt;br /&gt;
*The thesis (in Italian) [[Media:Tesi wikipedia category graph.pdf|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
*[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Tesi_wikipedia_category_graph.pdf&amp;diff=12358</id>
		<title>File:Tesi wikipedia category graph.pdf</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Tesi_wikipedia_category_graph.pdf&amp;diff=12358"/>
				<updated>2010-09-20T16:27:18Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: The thesis &amp;quot;Assegnamento automatico di macrocategorie degli articoli di Wikipedia&amp;quot; (Automatic assignment of macrocategories to Wikipedia articles), presented at the graduation session of 21 September 2010&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The thesis &amp;quot;Assegnamento automatico di macrocategorie degli articoli di Wikipedia&amp;quot; (Automatic assignment of macrocategories to Wikipedia articles), presented at the graduation session of 21 September 2010&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Presentazione_wikipedia_category_graph.pdf&amp;diff=12357</id>
		<title>File:Presentazione wikipedia category graph.pdf</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Presentazione_wikipedia_category_graph.pdf&amp;diff=12357"/>
				<updated>2010-09-20T16:25:12Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: The presentation of the thesis &amp;quot;Assegnamento automatico di macrocategorie degli articoli di Wikipedia&amp;quot; (Automatic assignment of macrocategories to Wikipedia articles), presented at the graduation session of 21 September 2010&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The presentation of the thesis &amp;quot;Assegnamento automatico di macrocategorie degli articoli di Wikipedia&amp;quot; (Automatic assignment of macrocategories to Wikipedia articles), presented at the graduation session of 21 September 2010&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12339</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12339"/>
				<updated>2010-09-16T19:10:52Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Previous Work */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to first save it in a file, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the minimum path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals, without articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
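&lt;br /&gt;
As a rough illustration of the density definition given here, the following minimal Python sketch computes the ratio for a toy graph. It assumes a directed graph without self-loops, so that n nodes allow n*(n-1) possible edges; this convention and the name graph_density are assumptions made for the example and are not taken from the project sources.&lt;br /&gt;

```python
def graph_density(num_nodes, num_edges, directed=True):
    """Return the ratio of existing edges to possible edges.

    Assumption: self-loops are not counted among the possible edges.
    """
    if num_nodes in (0, 1):
        # No edge is possible, so the density is defined here as zero.
        return 0.0
    possible = num_nodes * (num_nodes - 1)
    if not directed:
        # Each unordered pair of nodes counts only once.
        possible //= 2
    return num_edges / possible


# Toy graph (not the Wikipedia data): 4 nodes and 3 directed edges.
toy_density = graph_density(4, 3)
print(toy_density)
```

A value as small as the one reported means that almost none of the possible edges are present, that is, the graph is very sparse.&lt;br /&gt;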
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same thing, like ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png|400px]]&lt;br /&gt;
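&lt;br /&gt;
The idea behind this step can be sketched in a few lines of Python. This is a textbook recursive version of Tarjan's algorithm run on a toy graph, not the Java code of the project; the node names A, B and C and the function name are hypothetical. The recursion would be too deep for the full Wikipedia graph, where an iterative variant would be needed.&lt;br /&gt;

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm on a dict mapping each node to its successors."""
    index = {}        # discovery order of each visited node
    lowlink = {}      # smallest index reachable from the node's subtree
    stack = []        # nodes of the components still being built
    on_stack = set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index:
            visit(node)
    return components


# Toy graph: A and B contain each other, C is only reachable from B.
toy = {"A": ["B"], "B": ["A", "C"], "C": []}
# Components with more than one node always contain a cycle
# (single nodes with a self-loop are ignored by this simple filter).
cyclic = [sorted(c) for c in strongly_connected_components(toy) if len(c) != 1]
print(cyclic)
```

In the toy graph only A and B end up in the same component, which mirrors the case of two categories that contain each other.&lt;br /&gt;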
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms for choosing the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png|400px]]&lt;br /&gt;
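&lt;br /&gt;
One possible reading of direction-dependent weights is sketched below in Python. It is only an illustration under stated assumptions, not the algorithm of the project: it assumes a cheapest-path search from the article, where moving from a category to its parent costs up_cost and moving from a category to one of its children costs down_cost. The cost values, the function name and the toy category names are all hypothetical.&lt;br /&gt;

```python
import heapq


def cheapest_macrocategory(parents, start, macrocategories,
                           up_cost=1.0, down_cost=3.0):
    """Return (macrocategory, cost) for the cheapest path from start.

    parents maps each node to the categories that contain it.
    The two costs are illustrative values, not the project's.
    """
    # Build the reverse relation so that edges can be walked downwards.
    children = {}
    for child, containers in parents.items():
        for container in containers:
            children.setdefault(container, []).append(child)

    done = set()
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node in done:
            continue
        done.add(node)
        if node in macrocategories:
            # The first macrocategory popped is the cheapest one.
            return node, cost
        for parent in parents.get(node, ()):
            if parent not in done:
                heapq.heappush(queue, (cost + up_cost, parent))
        for child in children.get(node, ()):
            if child not in done:
                heapq.heappush(queue, (cost + down_cost, child))
    return None, float("inf")


# Toy hierarchy: the article sits in c1; c1 is in Science; c2 is in c1 and Arts.
toy_parents = {"article": ["c1"], "c1": ["Science"], "c2": ["c1", "Arts"]}
best = cheapest_macrocategory(toy_parents, "article", {"Science", "Arts"})
print(best)
```

In this toy case Science is reached by going upwards only, while Arts requires a more expensive downward step through c2, so Science is chosen.&lt;br /&gt;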
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
*[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12338</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12338"/>
				<updated>2010-09-16T19:10:10Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: image sizes&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to first save it in a file, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the minimum path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals, without articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same thing, like ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png|400px]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms for choosing the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png|400px]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12337</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12337"/>
				<updated>2010-09-16T18:57:16Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Results of the analysis */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to first save it in a file, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the minimum path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals, without articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
By analysing the graph with small Java programs written for this task, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
&lt;br /&gt;
93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same thing, like ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms for choosing the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12336</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12336"/>
				<updated>2010-09-16T18:49:53Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Results of the analysis */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to first save it in a file, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is '''32'''. This is the maximum distance (number of nodes in the minimum path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals, without articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is '''5.5568'''.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is '''6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;'''. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
By analyzing the graph with small Java programs written ad hoc, we calculated that the average number of categories per article is '''2.68'''.&lt;br /&gt;
Also, 93% of articles have fewer than 7 categories, and 64% have fewer than 3.&lt;br /&gt;
The article with the most categories is [http://en.wikipedia.org/wiki/Winston_Churchill Winston Churchill], with 70 categories. [http://en.wikipedia.org/wiki/Albert_einstein Albert Einstein] also has a notable 56 categories.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same thing, like ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms for choosing the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12333</id>
		<title>File:Wikipedia category Graph-sources.zip</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12333"/>
				<updated>2010-09-13T10:27:18Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: uploaded a new version of &amp;quot;Image:Wikipedia category Graph-sources.zip&amp;quot;: The sources of the Wikipedia Category Graph project&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Java sources of Wikipedia category graph project&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12331</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12331"/>
				<updated>2010-09-13T10:18:48Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Results of the analysis */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to first save it in a file, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Results of the analysis==&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is 32. This is the maximum distance (number of nodes in the minimum path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals, without articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is 5.5568.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is 6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
===Strongly connected components===&lt;br /&gt;
&lt;br /&gt;
By applying Tarjan's strongly connected components algorithm to the graph, it is possible to find 93 structures of up to 2 nodes. Each of them contains at least one cycle. Most of them are composed of two categories about the same thing, like ''History of the Germanic peoples'' and ''Ancient Germanic peoples'', but there are also more curious cases like this one:&lt;br /&gt;
[[Image:Struttura fortemente connessa wikipedia.png]]&lt;br /&gt;
&lt;br /&gt;
=== Tested algorithms ===&lt;br /&gt;
We tried 9 algorithms for choosing the category that best fits an article. After comparing the results of the automatic procedure with human-made assignments, the best algorithm turned out to be the one that assigns a different weight to each edge depending on the traversal direction.&lt;br /&gt;
Sizes of the macrocategories determined this way are:&lt;br /&gt;
[[Image:Dimensioni macrocategorie costi.png]]&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Dimensioni_macrocategorie_costi.png&amp;diff=12330</id>
		<title>File:Dimensioni macrocategorie costi.png</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Dimensioni_macrocategorie_costi.png&amp;diff=12330"/>
				<updated>2010-09-13T10:18:33Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: Size of the macrocategories chosen with differentiated traversal costs.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Size of the macrocategories chosen with differentiated traversal costs.&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Struttura_fortemente_connessa_wikipedia.png&amp;diff=12328</id>
		<title>File:Struttura fortemente connessa wikipedia.png</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Struttura_fortemente_connessa_wikipedia.png&amp;diff=12328"/>
				<updated>2010-09-13T10:11:18Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: A strongly connected structure within en.Wikipedia&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A strongly connected structure within en.Wikipedia&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12318</id>
		<title>File:Wikipedia category Graph-sources.zip</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12318"/>
				<updated>2010-09-12T13:00:51Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: uploaded a new version of &amp;quot;Image:Wikipedia category Graph-sources.zip&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Java sources of Wikipedia category graph project&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12306</id>
		<title>File:Wikipedia category Graph-sources.zip</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12306"/>
				<updated>2010-09-05T14:26:15Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: uploaded a new version of &amp;quot;Image:Wikipedia category Graph-sources.zip&amp;quot;: Java sources of Wikipedia category graph project&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Java sources of Wikipedia category graph project&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12305</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12305"/>
				<updated>2010-09-05T14:18:45Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: updated after filtering out non-semantic categories&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to first save it in a file, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
===Results of the analysis===&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here]&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is 32. This is the maximum distance (number of nodes in the minimum path) between two nodes. The two nodes at this distance are ''Prehistoric life sorted by geography'' (a category about prehistoric animals, without articles) and ''BMW M20'' (a car).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is 5.5568.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is 6.28*10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*Sources in Java can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12304</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12304"/>
				<updated>2010-09-05T14:15:39Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: added link to download the sources&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree: a category may be contained in another one, which is in turn contained in a third that is contained in the first, creating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to representing the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or only selected parts of it.&lt;br /&gt;
We use a selection that contains the list of categories and the articles' memberships in them.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to export it as a Pajek file (.net) for general analysis, as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
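&lt;br /&gt;
The Pajek export step can be sketched as follows. This is only an illustration under stated assumptions: the edges are given as pairs of labels in memory rather than read from Neo4j, the function name write_pajek and the sample labels are hypothetical, and labels are assumed not to contain double quotes.&lt;br /&gt;

```python
import io


def write_pajek(edges, out):
    """Write directed edges (pairs of labels) in Pajek .net format."""
    ids = {}
    for source, target in edges:
        for label in (source, target):
            # Pajek vertex numbers start at 1; keep the first id assigned.
            ids.setdefault(label, len(ids) + 1)
    out.write("*Vertices %d\n" % len(ids))
    for label, number in ids.items():
        out.write('%d "%s"\n' % (number, label))
    # "*Arcs" introduces directed edges; "*Edges" would be undirected ones.
    out.write("*Arcs\n")
    for source, target in edges:
        out.write("%d %d\n" % (ids[source], ids[target]))


# Hypothetical sample data (an assumption, not taken from the project).
buffer = io.StringIO()
write_pajek([("Cars", "Vehicles"), ("BMW", "Cars")], buffer)
pajek_text = buffer.getvalue()
```

Writing to any file-like object keeps the sketch testable; passing an opened .net file instead of the in-memory buffer produces a file on disk.&lt;br /&gt;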
&lt;br /&gt;
===Results of the analysis===&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here].&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is 19: the largest distance (number of nodes on the shortest path) between any two nodes. The two nodes at this distance are ''Bowers Hill'' (a Virginia community) and ''m,n,k-game'' (a board game).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is 4.781262.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is 1.408986 × 10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
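&lt;br /&gt;
The density definition above can be written as a short function. This is a sketch, not the project's code: the function name is hypothetical, self-loops are ignored, and whether the category graph was treated as directed is an assumption exposed as a parameter.&lt;br /&gt;

```python
def graph_density(node_count, edge_count, directed=True):
    """Ratio of existing edges to possible edges, ignoring self-loops."""
    # A directed graph on n nodes has n * (n - 1) possible edges;
    # an undirected one has half as many.
    possible = node_count * (node_count - 1)
    if not directed:
        possible //= 2
    return edge_count / possible if possible else 0.0


# Hypothetical example: 4 nodes and 3 edges.
directed_density = graph_density(4, 3, directed=True)
undirected_density = graph_density(4, 3, directed=False)
```

With millions of nodes and comparatively few edges per node, the denominator grows quadratically, which is why a sparse graph like this one has a density many orders of magnitude below 1.&lt;br /&gt;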
&lt;br /&gt;
==Download==&lt;br /&gt;
*The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
*The Java sources can be found [[Media:Wikipedia category Graph-sources.zip|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12303</id>
		<title>File:Wikipedia category Graph-sources.zip</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Wikipedia_category_Graph-sources.zip&amp;diff=12303"/>
				<updated>2010-09-05T14:14:50Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: Java sources of Wikipedia category graph project&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Java sources of Wikipedia category graph project&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12302</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=12302"/>
				<updated>2010-09-05T13:56:11Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: pubblico pdf relazione&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree: a category may be contained in another one, which is in turn contained in a third that is contained in the first, creating a cyclic reference, and many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to representing the structure.&lt;br /&gt;
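&lt;br /&gt;
To make the two properties above concrete (cyclic containment and multiple roots), here is a minimal sketch that detects a containment cycle with an iterative depth-first search and lists the root categories. The mapping contained_in, the function names and the sample categories are hypothetical and are not taken from the project sources.&lt;br /&gt;

```python
def find_cycle(contained_in):
    """Return one containment cycle as a list of categories, or None.

    contained_in maps each category to the categories that contain it.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}
    for start in contained_in:
        if colour.get(start, WHITE) != WHITE:
            continue
        path = [start]
        iterators = [iter(contained_in.get(start, ()))]
        colour[start] = GREY
        while iterators:
            parent = next(iterators[-1], None)
            if parent is None:
                # All containers explored: leave this category.
                colour[path.pop()] = BLACK
                iterators.pop()
            elif colour.get(parent, WHITE) == GREY:
                # Reached a category already on the path: a cycle.
                return path[path.index(parent):] + [parent]
            elif colour.get(parent, WHITE) == WHITE:
                colour[parent] = GREY
                path.append(parent)
                iterators.append(iter(contained_in.get(parent, ())))
    return None


def root_categories(contained_in):
    """Categories that are not contained in any other category."""
    return sorted(c for c, parents in contained_in.items() if not parents)


# Hypothetical sample: A is in B, B is in C, C is in A; R1 and R2 are roots.
sample = {"A": ["B"], "B": ["C"], "C": ["A"], "R1": [], "R2": []}
cycle = find_cycle(sample)
roots = root_categories(sample)
```

A tree would give no cycle and exactly one root; the sample gives a cycle and two roots, which is the situation described above.&lt;br /&gt;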
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or only selected parts of it.&lt;br /&gt;
We use a selection that contains the list of categories and the articles' memberships in them.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database, which allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
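&lt;br /&gt;
Reading the intermediate file one line at a time can be sketched as below. The file layout is an assumption (one tab-separated pair of member and containing category per line), the names are hypothetical, and the sketch fills an in-memory dictionary instead of a Neo4j store.&lt;br /&gt;

```python
import io


def load_edges(lines):
    """Yield (member, category) pairs from tab-separated lines, lazily."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        member, category = line.split("\t", 1)
        yield member, category


def build_adjacency(edge_iter):
    """Map every node to the list of categories that contain it."""
    adjacency = {}
    for member, category in edge_iter:
        adjacency.setdefault(member, []).append(category)
        adjacency.setdefault(category, [])
    return adjacency


# Hypothetical two-line input standing in for the intermediate file.
sample_file = io.StringIO("Page X\tCategory Y\nCategory Y\tCategory Z\n")
adjacency = build_adjacency(load_edges(sample_file))
```

Because load_edges is a generator, only one line is held in memory at a time, which matters when the input is a full Wikipedia dump; an opened file object can be passed in place of the in-memory sample.&lt;br /&gt;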
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to export it as a Pajek file (.net) for general analysis, as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
===Results of the analysis===&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here].&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is 19: the largest distance (number of nodes on the shortest path) between any two nodes. The two nodes at this distance are ''Bowers Hill'' (a Virginia community) and ''m,n,k-game'' (a board game).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is 4.781262.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is 1.408986 × 10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
The project report (in Italian) can be found [[Media:Relazione progetto WikipediaCatGraph.pdf|here]]&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=File:Relazione_progetto_WikipediaCatGraph.pdf&amp;diff=12301</id>
		<title>File:Relazione progetto WikipediaCatGraph.pdf</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=File:Relazione_progetto_WikipediaCatGraph.pdf&amp;diff=12301"/>
				<updated>2010-09-05T13:54:25Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: Relazione del progetto Wikipedia Category Graph sulla rappresentazione delle categorie e delle pagine di en.Wikipedia tramite un grafo per effettuare successive analisi.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Relazione del progetto Wikipedia Category Graph sulla rappresentazione delle categorie e delle pagine di en.Wikipedia tramite un grafo per effettuare successive analisi.&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=User:JacopoFarina&amp;diff=12287</id>
		<title>User:JacopoFarina</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=User:JacopoFarina&amp;diff=12287"/>
				<updated>2010-09-01T20:40:30Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Student&lt;br /&gt;
|category=Student&lt;br /&gt;
|firstname=Jacopo&lt;br /&gt;
|lastname=Farina&lt;br /&gt;
|email=jacopo1.farina@mail.polimi.it&lt;br /&gt;
|advisor=RiccardoTasso; DavidLaniado;&lt;br /&gt;
|status=active&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
I'm currently working on a third-year thesis about [[WikipediaCategoryGraph|analyzing the Wikipedia graph of categories and pages]].&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Tesi&amp;diff=12118</id>
		<title>Tesi</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Tesi&amp;diff=12118"/>
				<updated>2010-08-10T15:47:53Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: spazio&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;''A simple handbook for theses...''&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;If you are here looking for a thesis... too bad, you are in the wrong place!&amp;lt;/u&amp;gt; The theses available at the Artificial Intelligence and Robotics laboratory are collected [[Project Proposals|in the dedicated section of airwiki]]. Of course, if you have ideas you would like to develop as a thesis, you are invited to contact the faculty associated with the laboratory and propose them... '''initiative is welcome!''' In any case, we suggest you keep reading this page, since sooner or later you will need it ;-).&lt;br /&gt;
&lt;br /&gt;
What follows is a collection of information useful for writing a manuscript worthy of being called a thesis, and it can fairly be considered a ''mini guide'' to writing the final thesis report. &amp;lt;u&amp;gt;It does not claim to be the definitive guide&amp;lt;/u&amp;gt;: it is simply a list of the things we keep correcting in every thesis and that it would be better to &amp;quot;prevent rather than cure&amp;quot; :-P.&lt;br /&gt;
&lt;br /&gt;
Special thanks go to the people who donated precious time to collecting the material that follows (in strict alphabetical order!): Francesco Amigoni, Andrea Bonarini, Marco Colombetti, Nicola Gatti, Matteo Matteucci, Marcello Restelli, Marco Somalvico... and if we have forgotten anyone, let us know: we will be happy to add their name!&lt;br /&gt;
&lt;br /&gt;
==(Very) General Aspects==&lt;br /&gt;
&lt;br /&gt;
What follows may seem superfluous, but it is worth making clear from the start, so that it guides the writing of every single page of your manuscript...&lt;br /&gt;
&lt;br /&gt;
* Try to understand the rules of the game right away: for this reason we recommend reading the thesis regulations for the Laurea Specialistica, which you can find [http://www.polimi.it/cms/file/3001/regolamento_LS_131207.pdf here].&lt;br /&gt;
&lt;br /&gt;
* Do not put everything you know into the report, only what is strictly related to the topic and functional to its treatment.&lt;br /&gt;
&lt;br /&gt;
* Organize the concepts in a rational structure. Do not leave important concepts implicit. Do not keep repeating concepts unless clarity explicitly requires it.&lt;br /&gt;
&lt;br /&gt;
* Reread what you write over and over. Have it read by other people who are not familiar with the topic (we suggest rereading on paper rather than on screen). To make the task easier for them (and for your advisor, who will read the whole thesis at least once), use the &amp;lt;u&amp;gt;spell checker&amp;lt;/u&amp;gt; BEFORE handing it over.&lt;br /&gt;
&lt;br /&gt;
The duration of a thesis is not predetermined. In practice a thesis can be considered finished when certain requirements are met (obviously to a greater or lesser degree):&lt;br /&gt;
&lt;br /&gt;
*  when there are results to report in the thesis that involve some degree of scientific, or even just technological, innovation&lt;br /&gt;
&lt;br /&gt;
*  when it has been written and reread&lt;br /&gt;
&lt;br /&gt;
*  when the documentation is complete&lt;br /&gt;
&lt;br /&gt;
*  when a short page has been written on airwiki in English with the description of the thesis, a demo, and all the material needed to run it or to understand its content&lt;br /&gt;
	&lt;br /&gt;
==Interacting with the advisor/co-advisor==&lt;br /&gt;
&lt;br /&gt;
The advisor (or co-advisor) is the person responsible for your work as a thesis student. Their task is to follow you while you carry it out and to make sure the final result is worth reading in both form and content. Remember that in the end it will be up to them to evaluate the thesis and award the corresponding credits (this used to be done by filling in the form you can find [[Media:SchedaValutazioneTesi.pdf|here]]... ''just for you to know''...). Here are some good rules for interacting with your advisor/co-advisor:&lt;br /&gt;
&lt;br /&gt;
* The advisor/co-advisor has to reread the thesis and must therefore have a complete version of it in hand &amp;lt;u&amp;gt;at least one month before the deadline&amp;lt;/u&amp;gt; for submission. Obviously it is a draft and not the final version, but it must already have the structure and contents of the final version. Better still, hand the chapters to your advisor/co-advisor while you are writing, so that they can read them one at a time.&lt;br /&gt;
&lt;br /&gt;
* Do not give the advisor/co-advisor a manuscript you have never reread; they must be able to concentrate on the contents and on the structure of the exposition, not on the quality of the Italian (or of the English, if you write in English, in which case you had better use a grammar checker in addition to the spell checker).&lt;br /&gt;
	&lt;br /&gt;
* The advisor/co-advisor must follow you during the thesis work, giving directions and suggestions whenever you get stuck... but it goes without saying that they are not the one who has to do the thesis ;-)&lt;br /&gt;
&lt;br /&gt;
* Interact frequently with advisor and co-advisor(s). This can be done in various ways; we suggest a meeting at least every two weeks and one email a week describing recent activities. If you have done nothing new, an email saying so is perfectly fine, just so we know that you exist and that you have not forgotten us poor advisors/co-advisors.&lt;br /&gt;
&lt;br /&gt;
* At the end of the thesis you must hand a copy to the advisor in paper and in digital format (on a suitable CD, with the thesis title and authors written on the CD!). It is a ''nice'' gesture to give a paper copy of the thesis to the co-advisors as well!&lt;br /&gt;
	&lt;br /&gt;
==Writing tools==&lt;br /&gt;
&lt;br /&gt;
Let us start by saying that the quality of a thesis does not depend on the tool used to write it. Based on our past experience and that of the colleagues who preceded you, we suggest using '''LaTeX''', and to this end we provide a &amp;quot;package&amp;quot; with a style that already follows all the formatting guidelines given below, together with instructions on how to use it. You can download it from the following [[Media:SchemaTesi.tgz|link]]. (NB: we know very well that this link should be placed somewhere much more visible on the page, but the intention is to force you to read a good part of its content :-P).&lt;br /&gt;
&lt;br /&gt;
* '''LaTeX:''' a typesetting system that lets you write documents in plain text and then compile them into a &amp;lt;u&amp;gt;PostScript&amp;lt;/u&amp;gt; or &amp;lt;u&amp;gt;PDF&amp;lt;/u&amp;gt; document. It is normally already installed on Linux systems (tetex package), while for Windows it can be downloaded freely from the MikTeX project site: http://www.miktex.org . A well-known guide to writing LaTeX documents is [http://www.ctan.org/tex-archive/info/lshort/english/lshort.pdf this one], and several others can be found online.&lt;br /&gt;
 &lt;br /&gt;
* '''Editor:''' since LaTeX is a tag-based formatting language, a good editor that supports writing the thesis is essential. On Linux we suggest ''Kile'', ''Emacs'', ''lyx'' or ''gedit''. On Windows we suggest ''WinEdt'' (downloadable from http://www.winedt.com) or TeXnicCenter (downloadable from http://www.toolscenter.org). &amp;lt;u&amp;gt;We strongly advise against&amp;lt;/u&amp;gt; Scientific Workplace, because its output is not standard LaTeX and is therefore neither portable nor readable!&lt;br /&gt;
&lt;br /&gt;
* '''Plots:''' for plotting numerical data in 2D and 3D charts we suggest ''gnuplot'' (http://www.gnuplot.info); it exists for both Linux and Windows, and there are excellent tutorials online on how to use it. If you use plots made in Matlab or Excel, keep in mind that every axis must be labelled with what it represents and that these tools often &amp;quot;autoscale&amp;quot; the plot, which can make two plots that are meant to be compared not directly comparable.&lt;br /&gt;
&lt;br /&gt;
* '''Vector images:''' for drawing vector images such as block diagrams we suggest DIA, freely downloadable from http://www.gnome.org/projects/dia/. The tool is not as powerful as ''xfig'' (http://www.xfig.org), but it is definitely more user friendly and exists for both Linux and Windows!&lt;br /&gt;
&lt;br /&gt;
* '''Bitmap images:''' ''Gimp'' (http://www.gimp.org) is definitely the best free photo-retouching and bitmap graphics software there is. Suffice it to say that the goal of the project is to develop a Photoshop clone! It exists for both Linux and Windows.&lt;br /&gt;
&lt;br /&gt;
* '''Citations and BibTeX:''' to manage the thesis citation file we suggest ''jabref'' (http://jabref.sourceforge.net). It is written in Java and is therefore cross-platform.&lt;br /&gt;
&lt;br /&gt;
* '''Image format conversion:''' ''convert'' on Linux is a must!&lt;br /&gt;
&lt;br /&gt;
Of course we do not &amp;quot;forbid&amp;quot; you to use Word and the Microsoft Office suite to write the thesis, but the time spent learning LaTeX will be amply repaid by the quality of the final product, by the acquisition of a new skill, and by the total absence of text formatting problems (although we warn you that taming the automatic placement of figures takes a little effort). To prove our &amp;quot;good faith&amp;quot; in this respect we also provide a Word style for thesis writing, downloadable from the following [[Media:SchemaTesi.doc|link]]. In any case, if you really want to use Microsoft Word to write the thesis, we suggest writing each chapter in a separate file to limit the size of the object being handled, especially when you have figures and plots with many points.&lt;br /&gt;
&lt;br /&gt;
==Structure and content==&lt;br /&gt;
 &lt;br /&gt;
The structure of a thesis must follow a few simple rules, so that reading is eased by a logical thread that keeps the main theme of the thesis at the centre of every part. Therefore:&lt;br /&gt;
&lt;br /&gt;
* The state of the art must be motivated, and must therefore be functional to the content of the thesis.&lt;br /&gt;
&lt;br /&gt;
* A line of reasoning should already be apparent from the table of contents (and therefore from the titles of chapters and sections).&lt;br /&gt;
&lt;br /&gt;
* The overall length of the thesis should be 100-150 pages + appendices (max 100 pages).&lt;br /&gt;
&lt;br /&gt;
* The average length of a sentence is about 3-5 lines. Never exceed 10 :-)&lt;br /&gt;
&lt;br /&gt;
* Include examples to make the exposition clearer.&lt;br /&gt;
&lt;br /&gt;
* Prefer the first person plural (we...) to the first person singular (I...), which is too presumptuous, and to the impersonal form (it was done...), which makes it hard to identify your contribution and distinguish it from that of others, besides weakening the argument.&lt;br /&gt;
&lt;br /&gt;
Thesis examples and templates are available at this [[Media:SchemaTesi.tgz|link]]. Do not treat the structure as untouchable; on the contrary, if you do not modify it, it probably means that you are not clear about how the thesis should be structured. In that case, reread the advice above and talk to your advisor.&lt;br /&gt;
&lt;br /&gt;
==Elements of style==&lt;br /&gt;
&lt;br /&gt;
* Do not put a comma between the subject and the predicate.&lt;br /&gt;
** Wrong: ''This behaviour, has led...''&lt;br /&gt;
** Correct: ''This behaviour has led...''&lt;br /&gt;
&lt;br /&gt;
* When describing the work done, write in the past tense, not in the present or, worse, the future. In practice the future tense is never used except in the &amp;quot;Conclusions and future work&amp;quot; chapter!&lt;br /&gt;
** Wrong: ''... in this thesis we want to build a system...''&lt;br /&gt;
** Correct: ''... in this thesis a system has been built...''&lt;br /&gt;
&lt;br /&gt;
* Write in the positive, not in the negative.&lt;br /&gt;
** Wrong: ''... our system is not intended as a solution to problem A...''&lt;br /&gt;
** Correct: ''... our system is intended as a solution to problem B...''&lt;br /&gt;
&lt;br /&gt;
* Do not write the way you speak.&lt;br /&gt;
** Wrong: ''... the idea sounds roughly like this...''&lt;br /&gt;
** Correct: ''... the idea can be expressed as follows:...''&lt;br /&gt;
** Wrong: ''... and then go up with the parameter values...''&lt;br /&gt;
** Correct: ''... and then increase the parameter values...''&lt;br /&gt;
** Wrong: ''... this result is not to be taken as a failure...''&lt;br /&gt;
** Correct: ''... this result is not to be considered a failure...''&lt;br /&gt;
&lt;br /&gt;
* Write in Italian, trying to eliminate English terms (as far as possible: for example, hardware, software and file cannot be translated).&lt;br /&gt;
** Wrong: ''... multiagent...''&lt;br /&gt;
** Correct: ''... multiagente (or: molti agenti)...''&lt;br /&gt;
** Wrong: ''... game theory...''&lt;br /&gt;
** Correct: ''... teoria dei giochi...''&lt;br /&gt;
** Wrong: ''... self-interested...''&lt;br /&gt;
** Correct: ''... egoista...''&lt;br /&gt;
&lt;br /&gt;
* If you use English words in Italian text, do not add the &amp;quot;s&amp;quot; for the plural.&lt;br /&gt;
** Wrong: ''... i robots...''&lt;br /&gt;
** Correct: ''... i robot...''&lt;br /&gt;
** Wrong: ''... i pixels...''&lt;br /&gt;
** Correct: ''... i pixel...''&lt;br /&gt;
&lt;br /&gt;
* Be precise and specific.&lt;br /&gt;
** Wrong: ''The techniques we will describe can be used for all types of reconstruction...''&lt;br /&gt;
** Correct: ''The techniques we will describe in this thesis can be used for all types of reconstruction...''&lt;br /&gt;
** Correct: ''The techniques we will describe in Chapter 4 can be used for all types of reconstruction...''&lt;br /&gt;
&lt;br /&gt;
* Translate &amp;quot;return&amp;quot; as &amp;quot;restituisce&amp;quot; and not as &amp;quot;ritorna&amp;quot;:&lt;br /&gt;
** Wrong: ''... la funzione ritorna un valore...''&lt;br /&gt;
** Correct: ''... la funzione restituisce un valore...''&lt;br /&gt;
&lt;br /&gt;
* Keep the manuscript consistent. If at some point you talk about &amp;quot;virtual reconstruction&amp;quot;, do not use &amp;quot;virtual reassembly&amp;quot; in later sections to denote the same concept; keep using &amp;quot;virtual reconstruction&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
* Do not use &amp;quot;calcolatore&amp;quot;; use &amp;quot;elaboratore&amp;quot; or the English term &amp;quot;computer&amp;quot;, which is now common in Italian as well.&lt;br /&gt;
&lt;br /&gt;
* In Italian, do not start a sentence with &amp;quot;ma&amp;quot; or &amp;quot;però&amp;quot;. Also, put a comma before &amp;quot;ma&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==Elements of format==&lt;br /&gt;
&lt;br /&gt;
* Do not insert spaces before a comma, semicolon, colon, period, exclamation mark or question mark; no spaces between an article and the apostrophe, or between the apostrophe and the following word. Also, do not put spaces after an opening parenthesis or before a closing one.&lt;br /&gt;
** Wrong: ''... dell' interazione...''&lt;br /&gt;
** Correct: ''... dell'interazione...''&lt;br /&gt;
** Wrong: ''...( rules or sets of rules ) ...''&lt;br /&gt;
** Correct: ''... (rules or sets of rules)...''&lt;br /&gt;
** Wrong: ''... very simple ;...''&lt;br /&gt;
** Correct: ''... very simple;...''&lt;br /&gt;
&lt;br /&gt;
* Do not use quotation marks unless strictly necessary.&lt;br /&gt;
** Wrong: ''... I focused on some &amp;quot;design choices&amp;quot;...''&lt;br /&gt;
** Correct: ''... I focused on some design choices...''&lt;br /&gt;
&lt;br /&gt;
* Use initial capitals only for proper names of people, not for technical terms.&lt;br /&gt;
** Wrong: ''... Cooperation...''&lt;br /&gt;
** Correct: ''... cooperation...''&lt;br /&gt;
&lt;br /&gt;
* Use capitals when referring in the text to figures, chapters or sections by their number. Do not use capitals when referring to figures, chapters or sections without giving their number.&lt;br /&gt;
** Wrong: ''... in figure 2.1...''&lt;br /&gt;
** Correct: ''... in Figure 2.1...''&lt;br /&gt;
** Wrong: ''... as illustrated in chapter 3...''&lt;br /&gt;
** Correct: ''... as illustrated in Chapter 3...''&lt;br /&gt;
** Wrong: ''... the algorithm described in the previous Section...''&lt;br /&gt;
** Correct: ''... the algorithm described in the previous section...''&lt;br /&gt;
** Wrong: ''In this Chapter we introduce the concept...''&lt;br /&gt;
** Correct: ''In this chapter we introduce the concept...''&lt;br /&gt;
&lt;br /&gt;
* Adjective, subject and verb must agree in gender and number.&lt;br /&gt;
** Wrong: ''... il nostri algoritmo...''&lt;br /&gt;
** Correct: ''... il nostro algoritmo...''&lt;br /&gt;
** Wrong: ''... noi ha implementato...''&lt;br /&gt;
** Correct: ''... noi abbiamo implementato...''&lt;br /&gt;
&lt;br /&gt;
* Use accented letters, with the correct accent.&lt;br /&gt;
** Wrong: ''... ne questo ne quello...''&lt;br /&gt;
** Wrong: ''... nè questo nè quello...''&lt;br /&gt;
** Correct: ''... né questo né quello...''&lt;br /&gt;
** Wrong: ''... poiche...''&lt;br /&gt;
** Wrong: ''... poichè...''&lt;br /&gt;
** Correct: ''... poiché...''&lt;br /&gt;
** Wrong: ''... perche...''&lt;br /&gt;
** Wrong: ''... perchè...''&lt;br /&gt;
** Correct: ''... perché...''&lt;br /&gt;
&lt;br /&gt;
* The euphonic consonant 'd' is added after a conjunction only if the following word begins with the same vowel (the expression &amp;quot;ad esempio&amp;quot; is an exception).&lt;br /&gt;
** Wrong: ''... ad un certo punto...''&lt;br /&gt;
** Correct: ''... a un certo punto...''&lt;br /&gt;
** Wrong: ''... ed altri ancora...''&lt;br /&gt;
** Correct: ''... e altri ancora...''&lt;br /&gt;
** Wrong: ''... e eventualmente...''&lt;br /&gt;
** Correct: ''... ed eventualmente...''&lt;br /&gt;
** Wrong: ''... a esempio...''&lt;br /&gt;
** Correct: ''... ad esempio...''&lt;br /&gt;
&lt;br /&gt;
* To abbreviate &amp;quot;ad esempio&amp;quot; you can use &amp;quot;e.g.,&amp;quot;; to abbreviate &amp;quot;cioè&amp;quot; you can use &amp;quot;i.e.,&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
* Use italics only the first time a term is introduced or when it is defined.&lt;br /&gt;
&lt;br /&gt;
* Start a new paragraph (or section) with a capital letter. The first paragraph of a section is not indented; the following ones are.&lt;br /&gt;
&lt;br /&gt;
* End every sentence with a period.&lt;br /&gt;
&lt;br /&gt;
* Every item of a bulleted or numbered list must end with a comma, a semicolon or a period (make a choice and apply it consistently throughout the manuscript). The last item of the list must end with a period.&lt;br /&gt;
&lt;br /&gt;
* All figures (tables) must be referenced in the text. References must be uniform in style: for example, choose between &amp;quot;Fig. X.Y&amp;quot; (&amp;quot;Tab. X.Y&amp;quot;) and &amp;quot;Figura X.Y&amp;quot; (&amp;quot;Tabella X.Y&amp;quot;) and always keep the same format.&lt;br /&gt;
&lt;br /&gt;
* The format of figure and table captions must be uniform. Either all start with a capital letter or all start with a lowercase letter, either all end with a period or all end without one, and so on.&lt;br /&gt;
&lt;br /&gt;
* An ellipsis always consists of exactly three dots: &amp;quot;...&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
* Write names uniformly. If you have denoted an object as &amp;quot;LonMark&amp;quot;, always write it that way and not, for example, &amp;quot;Lonmark&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
* Images must be aligned to the top or bottom of the page and not &amp;quot;drowned&amp;quot; in the text. Moreover, as far as possible, they should be placed right after they are first referenced.&lt;br /&gt;
&lt;br /&gt;
* Table of contents, chapters, bibliography, etc. always start on a right-hand page.&lt;br /&gt;
	&lt;br /&gt;
==General information on the format==&lt;br /&gt;
&lt;br /&gt;
With the LaTeX style provided, format and page layout are already correct. If you want to use a different editing tool, we recommend following these simple formatting rules. In any case, the LaTeX formatting package mentioned above includes an example PDF file.&lt;br /&gt;
&lt;br /&gt;
* Sections and appendices are numbered according to the following scheme:&lt;br /&gt;
** Sections:&lt;br /&gt;
 1 TITLE &lt;br /&gt;
 1.1 Subsection &lt;br /&gt;
 1.1.1 Subsubsection &lt;br /&gt;
 ... &lt;br /&gt;
** Appendices:&lt;br /&gt;
 A TITLE &lt;br /&gt;
 A.1 Appendix paragraph &lt;br /&gt;
 A.1.1 Appendix subparagraph &lt;br /&gt;
 ... &lt;br /&gt;
&lt;br /&gt;
* Figures and tables: N.N, where the first N is the section number or appendix letter and the second N is a progressive number within the section.&lt;br /&gt;
&lt;br /&gt;
* We recommend an underlined header on all pages of a section except the first, containing the section number and title. We also recommend an underlined footer with the page number.&lt;br /&gt;
&lt;br /&gt;
* Citations in the text must be a number in square brackets [X]; the bibliography must list the same number in brackets followed by the authors, the title, the reference (journal, thesis, internal report, collection of articles), the year of publication, the publisher and, where applicable, the page number.&lt;br /&gt;
&lt;br /&gt;
* You may use any font, preferably TrueType; we recommend a size of 11 or 12 and a line spacing of your choice (single, one and a half, or double). The final thesis must be printed double-sided.&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11956</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11956"/>
				<updated>2010-07-06T13:03:56Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Results of the analysis */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference; moreover, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
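&lt;br /&gt;
The two properties mentioned above (cyclic containment and multiple roots) can be illustrated with a minimal sketch. The category names and the &amp;quot;parents&amp;quot; dictionary are hypothetical toy data, not taken from the Wikipedia dump, and the code only assumes that every category is listed as a key.&lt;br /&gt;

```python
# Toy category graph (hypothetical names): each category maps to the
# categories that contain it, i.e. its parents.
parents = {
    "Science": [],
    "Arts": [],
    "Physics": ["Science"],
    "Optics": ["Physics", "Lasers"],
    "Lasers": ["Optics"],
}


def root_categories(parents):
    """Return the categories that are not contained in any other one."""
    return sorted(c for c, ps in parents.items() if not ps)


def has_cycle(parents):
    """Detect a cyclic containment chain with a depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, done
    colour = {c: WHITE for c in parents}

    def visit(cat):
        colour[cat] = GREY
        for p in parents[cat]:
            if colour[p] == GREY:  # reached a category already on the path
                return True
            if colour[p] == WHITE and visit(p):
                return True
        colour[cat] = BLACK
        return False

    return any(colour[c] == WHITE and visit(c) for c in parents)


print(root_categories(parents))  # two roots, so this is not a single tree
print(has_cycle(parents))        # Optics and Lasers contain each other
```

A tree would have exactly one root and no cycle; this toy graph violates both conditions, which is why a general graph model is needed.&lt;br /&gt;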
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
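&lt;br /&gt;
As a rough illustration of what a Pajek (.net) file looks like, here is a minimal sketch that serialises a list of directed (child, parent) pairs. It starts from a plain in-memory edge list rather than from Neo4j, and the function name &amp;quot;to_pajek&amp;quot; and the sample categories are hypothetical; the actual export used in the project may differ.&lt;br /&gt;

```python
def to_pajek(edges):
    """Serialise directed (child, parent) pairs as Pajek .net text."""
    # Pajek numbers vertices from 1; collect labels in first-seen order.
    ids = {}
    for child, parent in edges:
        for label in (child, parent):
            ids.setdefault(label, len(ids) + 1)

    lines = ["*Vertices %d" % len(ids)]
    lines += ['%d "%s"' % (i, label) for label, i in ids.items()]
    lines.append("*Arcs")  # *Arcs introduces directed edges in Pajek
    lines += ["%d %d" % (ids[c], ids[p]) for c, p in edges]
    return "\n".join(lines) + "\n"


# Hypothetical sample: Physics is in Science, Optics is in Physics.
edges = [("Physics", "Science"), ("Optics", "Physics")]
print(to_pajek(edges), end="")
```

The resulting text can be written to a file with a .net extension and loaded by tools that read the Pajek format.&lt;br /&gt;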
&lt;br /&gt;
===Results of the analysis===&lt;br /&gt;
We can use the tools described [http://igraph.sourceforge.net/doc/R/00Index.html here].&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is 19. This is the maximum distance (length of the shortest path) between any two nodes. The two nodes at this distance are ''Bowers Hill'' (a Virginia community) and ''m,n,k-game'' (a board game).&lt;br /&gt;
&lt;br /&gt;
The '''average distance''' between two nodes is 4.781262.&lt;br /&gt;
&lt;br /&gt;
The '''graph density''' is 1.408986 × 10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;. This is the ratio of the number of edges to the number of possible edges.&lt;br /&gt;
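&lt;br /&gt;
To make the three measures concrete, the following sketch computes them on a tiny toy graph using only the Python standard library. It is not the igraph computation that produced the figures above: the graph is hypothetical, it is treated as undirected, and distances are counted in edges, all of which are assumptions of this example.&lt;br /&gt;

```python
from collections import deque
from itertools import combinations

# Toy undirected graph (hypothetical): a path a-b-c-d plus a shortcut b-d.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def distances_from(source):
    """Breadth-first search: edges on the shortest path to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


all_dist = {n: distances_from(n) for n in adj}
# Shortest-path lengths over every unordered pair of connected nodes.
pair_lengths = [all_dist[u][v] for u, v in combinations(sorted(adj), 2)
                if v in all_dist[u]]

n, m = len(adj), len(edges)
diameter = max(pair_lengths)                 # longest shortest path
average_distance = sum(pair_lengths) / len(pair_lengths)
density = m / (n * (n - 1) / 2)              # undirected: n(n-1)/2 possible edges

print(diameter, average_distance, density)
```

On a graph with millions of nodes an all-pairs search like this is impractical, which is one reason to rely on a dedicated library such as igraph.&lt;br /&gt;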
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11955</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11955"/>
				<updated>2010-07-06T13:02:48Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Creation of the database */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference; moreover, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation and further analysis of the database with igraph==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
To transfer the database into Neo4j format, it is better to save it to a file first, which is then read one line at a time.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file (.net) from it and run general analyses as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
===Results of the analysis===&lt;br /&gt;
&lt;br /&gt;
The '''diameter''' of the graph is 19. This is the maximum distance (length of the shortest path) between any two nodes. The two nodes at this distance are ''Bowers Hill'' (a Virginia community) and ''m,n,k-game'' (a board game).&lt;br /&gt;
&lt;br /&gt;
The average distance between two nodes is 4.781262.&lt;br /&gt;
&lt;br /&gt;
The graph density is 1.408986 × 10&amp;lt;sup&amp;gt;-7&amp;lt;/sup&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11878</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11878"/>
				<updated>2010-06-28T21:53:06Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: added part about neo4j&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference; moreover, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation of the database==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;br /&gt;
&lt;br /&gt;
After transferring the structure into a Neo4j graph, it is possible to create a Pajek file for general analysis, as described [[Social_Network_Analysis_With_Igraph_Package_Using_R|here]].&lt;br /&gt;
&lt;br /&gt;
==Previous Work==&lt;br /&gt;
&lt;br /&gt;
[http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf What's in Wikipedia? Mapping Topics and Conflict Using Socially]&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11828</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11828"/>
				<updated>2010-06-16T10:43:36Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: /* Creation of the database */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference; moreover, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation of the database==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	<entry>
		<id>https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11823</id>
		<title>Wikipedia Category Graph</title>
		<link rel="alternate" type="text/html" href="https://airwiki.deib.polimi.it/index.php?title=Wikipedia_Category_Graph&amp;diff=11823"/>
				<updated>2010-06-16T10:12:14Z</updated>
		
		<summary type="html">&lt;p&gt;JacopoFarina: start to write&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Project&lt;br /&gt;
|title=Wikipedia Category Graph&lt;br /&gt;
|short_descr=Represent Wikipedia Categories with a model based on graphs to further analyze it.&lt;br /&gt;
|coordinator=MarcoColombetti&lt;br /&gt;
|tutor=DavidLaniado;RiccardoTasso&lt;br /&gt;
|students=JacopoFarina;&lt;br /&gt;
|resarea=Social Software and Semantic Web&lt;br /&gt;
|restopic=Graph Mining and Analysis&lt;br /&gt;
|start=2010/06/10&lt;br /&gt;
|end=2010/10/01&lt;br /&gt;
|status=Active&lt;br /&gt;
|level=Bs&lt;br /&gt;
|type=Course&lt;br /&gt;
}}&lt;br /&gt;
The goal of the project is to analyze Wikipedia categories by representing them in a graph-based database.&lt;br /&gt;
&lt;br /&gt;
Wikipedia categories do not form a tree-based structure: a category may be contained in another one, which is contained in another one, which is in turn contained in the first, generating a cyclic reference; moreover, many categories may be root categories (not contained in any other).&lt;br /&gt;
&lt;br /&gt;
For these reasons, a graph database is better suited to represent the structure.&lt;br /&gt;
==Creation of the database==&lt;br /&gt;
Wikipedia lets users download the entire site database (with all versions of all articles) or just selected parts of it.&lt;br /&gt;
We use a selection that contains the category list and the articles' memberships in those categories. &lt;br /&gt;
[http://neo4j.org/ Neo4j] is a graph-based database that allows a program to create and manipulate graph structures such as nodes and relationships.&lt;/div&gt;</summary>
		<author><name>JacopoFarina</name></author>	</entry>

	</feed>