BigDat2017: certificate of attendance

I recently attended BigDat2017, a winter school on Big Data held in Bari (IT). Big Data is currently attracting great attention in research, since it makes it possible to provide data-driven solutions in several contexts.

As part of my postgraduate research, I decided to attend in order to follow the new developments in this field.

Here follows the proof of my attendance.

Certificate of Attendance

I also wrote a review of the winter school. If you are interested, please read it here: BigDat2017.

Here is the link to the news item I wrote for my department website.

BigDat2017: a review

Prof. Ullman was about to start his lecture

This week I attended the 3rd edition of the Big Data winter school, BigDat2017. It was held on my former campus, at the University of Bari (IT). It was a really nice feeling to be back for a while, sitting on those benches and following courses once again.
Big Data has recently gained a lot of interest in research, and many believe that it will keep playing a leading role for many years. Nowadays we live in a world in which all information seems to be available: we are surrounded by data-driven applications (Google, Facebook, Twitter, Spotify, just to name a few) that gather data and try to provide tailor-made solutions for their users. For this reason, an event like BigDat2017, with its clear mission of introducing new researchers to this fast-advancing research area and keeping them up to date, is really important.

Department Research Seminar: Early Detection of Research Topics

On 8th February I delivered a seminar to my department (KMi @ OU) in which I described the work I have been doing over the last two years for my postgraduate research.

I started with a brief introduction about science, and then discussed the technologies currently available for keeping track of how research areas develop. I showed that these technologies are not satisfactory if we want to detect research topics early. Reviewing some of the state of the art (including The Structure of Scientific Revolutions by Kuhn) allowed me to state my main hypothesis: research areas go through an embryonic stage, and it is possible to detect their emergence during this stage.

A Visual Introduction to Machine Learning: Italian Translation

The R2D3 team developed a visual introduction to Machine Learning. This introduction uses data visualization technologies to show a workflow for creating a machine learning model able to make accurate predictions. Lately, many people have volunteered to translate this introduction into different languages. I took care of the Italian version: Una introduzione visuale al machine learning.
English Version:
Italian Version:


Ontology Forecasting in Scientific Literature: Semantic Concepts Prediction based on Innovation-Adoption Priors

Semantic Innovation Forecasting Model

“Ontology Forecasting in Scientific Literature: Semantic Concepts Prediction based on Innovation-Adoption Priors” is a peer-reviewed paper presented on Tuesday 22nd November 2016 at the “Entity detection, matching and evolution” session of the 20th International Conference on Knowledge Engineering and Knowledge Management, Bologna, Italy.


Amparo Elizabeth Cano-Basave, Francesco Osborne and Angelo Antonio Salatino


The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, represents instead a new challenge. In this paper, we want to give a contribution to this novel endeavour by focusing on the task of forecasting semantic concepts in the research domain. Indeed, ontologies representing scientific disciplines contain only research topics that are already popular enough to be selected by human experts or automatic algorithms. They are thus unfit to support tasks which require the ability of describing and exploring the forefront of research, such as trend detection and horizon scanning. We address this issue by introducing the Semantic Innovation Forecast (SIF) model, which predicts new concepts of an ontology at time t + 1, using only data available at time t. Our approach relies on lexical innovation and adoption information extracted from historical data. We evaluated the SIF model on a very large dataset consisting of over one million scientific papers belonging to the Computer Science domain: the outcomes show that the proposed approach offers a competitive boost in mean average precision-at-ten compared to the baselines when forecasting over 5 years.

Clique Percolation Method in R: a fast implementation


Clique Percolation Method (CPM) is an algorithm for finding overlapping communities within networks, introduced by Palla et al. (2005, see references). This implementation in R first detects the cliques of size k, then creates a clique graph. Each connected component of the clique graph represents a community.


The algorithm performs the following steps:

1- first find all cliques of size k in the graph
2- then create a graph where nodes are cliques of size k
3- add an edge between two nodes (cliques) if they share k-1 common nodes
4- each connected component is a community
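
As a rough illustration of the four steps above, here is a minimal sketch in Python. It is not the R implementation described in this post: the function name `clique_percolation` and the edge-list input are assumptions made for the example, and the k-cliques are enumerated by brute force, so it is only meant for small graphs.

```python
from itertools import combinations


def clique_percolation(edges, k):
    """Illustrative sketch of the four CPM steps (hypothetical helper)."""
    if k < 2:
        raise ValueError("k must be at least 2")

    # Build an undirected adjacency map from the edge list, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Step 1: find all cliques of size k (brute force over k-subsets of nodes).
    cliques = [
        frozenset(subset)
        for subset in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(subset, 2))
    ]

    # Steps 2-3: clique graph, with an edge when two cliques share k-1 nodes.
    clique_adj = {i: set() for i in range(len(cliques))}
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            clique_adj[i].add(j)
            clique_adj[j].add(i)

    # Step 4: each connected component of the clique graph is a community.
    seen = set()
    communities = []
    for start in clique_adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = set()
        while stack:
            i = stack.pop()
            members |= cliques[i]
            for j in clique_adj[i]:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        communities.append(sorted(members))
    return sorted(communities)


# Two triangles sharing an edge, plus a third triangle attached at node 4.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
print(clique_percolation(edges, 3))  # [[1, 2, 3, 4], [4, 5, 6]]
```

In this toy graph, node 4 belongs to both communities, which is the kind of overlap CPM is designed to find.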


XMLResearcherProfile is a technology for tracking and presenting academics’ profiles. In its early stage, it allows recording all the published papers (in proceedings, journals, posters and PhD theses) within a single XML file and then formatting them as a list in an HTML page. Using specific XML tags it is possible to describe all the meaningful information about each published paper, such as title, authors, year of publication, book title and web sources. Initially, all these tags were specified using a DTD file (rpDTD.dtd), which was later replaced by an XML schema (rpSchema.xml). The new schema contains a set of rules describing all the tags that the XML file can contain. An instance of a published paper with its related tags is presented below.