Early detection of research trends and forecasting of their future impact

Please note that at the moment I am writing my dissertation, which keeps me busy all the time. I do hope to come back here soon with new updates about my doctoral work.

This post aims to act as a hub for all the relevant information about my doctoral work. It will be constantly updated with new sources and developments.

Abstract

The ability to promptly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. While the literature describes several approaches which aim to identify the emergence of new research topics early in their lifecycle, these rely on the assumption that the topic in question is already associated with a number of publications and consistently referred to by a community of researchers. Hence, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this paper, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the ‘parents’ of the new topic. These initial findings (i) confirm our hypothesis that it is possible in principle to detect the emergence of a new topic at the embryonic stage, (ii) provide new empirical evidence supporting relevant theories in Philosophy of Science, and also (iii) suggest that new topics tend to emerge in an environment in which weakly interconnected research areas begin to cross-fertilise. (Please note that this abstract comes from one of my latest papers.)

Read More

2100 AI: Reflections on the mechanisation of scientific discovery

“2100 AI: Reflections on the mechanisation of scientific discovery” is a paper submitted to the RE-CODING BLACK MIRROR Workshop co-located with the International Semantic Web Conference (ISWC) 2017, 21-25 October 2017, Vienna, Austria.

Authors

Andrea Mannocci, Angelo Salatino, Francesco Osborne and Enrico Motta

Abstract

The pace of nowadays research is hectic. Datasets and papers are produced and made available on the Web at a rate so unprecedented that digesting the information conveyed by such a “data deluge” stretches far beyond human analytical capabilities. Data science, artificial intelligence, machine learning and big data analytics are providing researchers with new methodologies capable of coping and getting insight in an automated fashion from the overload of information conveyed. Nonetheless major advances in AI solutions for knowledge discovery risk to exacerbate some negative phenomena, which are already observable on a global scale and disrupt irremediably the way of doing science as we know it.

Download paper (from ORO): link

Download paper (from CEUR-WS): link

Smart Book Recommender: A Semantic Recommendation Engine for Editorial Products

“Smart Book Recommender: A Semantic Recommendation Engine for Editorial Products” is a poster paper that will be presented at the International Semantic Web Conference (ISWC) 2017, 21-25 October 2017, Vienna, Austria.

Authors

Francesco Osborne, Thiviyan Thanapalasingam, Angelo Salatino, Aliaksandr Birukou and Enrico Motta

Abstract

Academic publishers, such as Springer Nature, need to constantly make informed decisions about how and where to market their editorial products. In the field of Computer Science (CS), it is particularly critical to assess which books will be of interest to the attendees of a conference. Typically, these items are manually chosen by publishing editors, on the basis of their personal experience. To make this process both faster and more robust we have developed the Smart Book Recommender (SBR), a semantic application designed to support the Springer Nature editorial team in promoting their publications at CS venues. SBR takes as input the proceedings of a conference and suggests books, journals, and other conference proceedings which are likely to be relevant to the attendees of the conference in question. It does so by taking advantage of a semantic representation of topics, which builds on a very large ontology of Computer Science topics; characterizing Springer Nature books as distributions of semantic topics; and approaching the problem as one of semantic matching between such distributions of semantic topics.
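The matching step described in the abstract can be pictured as comparing weighted topic vectors. What follows is only a minimal sketch: the abstract does not say which similarity measure SBR uses, so cosine similarity is an assumption here, and the topics, weights, book titles and function names are all made up for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse topic distributions (topic -> weight)."""
    shared = set(a) & set(b)
    dot = sum(a[topic] * b[topic] for topic in shared)
    norm_a = sqrt(sum(weight * weight for weight in a.values()))
    norm_b = sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty distribution matches nothing
    return dot / (norm_a * norm_b)


def rank_books(conference_topics, books):
    """Return (title, score) pairs, most similar book first."""
    scored = [
        (title, cosine_similarity(conference_topics, topics))
        for title, topics in books.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic distributions (assumed values, not real SBR data).
conference = {"semantic web": 0.6, "ontology": 0.3, "machine learning": 0.1}
books = {
    "Book A": {"semantic web": 0.5, "ontology": 0.5},
    "Book B": {"machine learning": 0.8, "databases": 0.2},
}

ranking = rank_books(conference, books)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

In this toy data, the book sharing the conference's dominant topics ranks first. A real system would derive the distributions from an ontology of topics rather than from hand-written dictionaries.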

Download paper (via ORO): link

Book Review: Weapons of Math Destruction by Cathy O’Neil

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.
Everyday activities are increasingly shifting to a digital environment. Digital gadgets such as smartphones and wearable devices are becoming an inseparable part of our lives, promising mostly convenience. New digital technologies have mainly been seen as empowering for their users. FitBit, for example, is claimed to be a motivating device that helps users lead a healthy and active life and achieve their goals by analysing their data [1]. The data collected by this kind of device include sleeping patterns, the number of steps, the amount of time users are engaged in physical activities and so forth. However, these data are available not just to the users but also to companies that can use them for multiple purposes. Health insurance companies, such as Vitality [2], already exploit their customers’ data in exchange for rewards such as free cinema tickets or hot beverages. The potential implications of the collection and manipulation of personal data at a personal and societal level, though, have been downplayed. Just imagine a national health insurance business model that operates by classifying citizens as high- or low-risk based on their data [3]. Citizens profiled as low-risk will be granted lower health contributions, while citizens profiled as high-risk will be paying for expensive and unaffordable plans.

Imagine a society where decisions on public well-being, education and so forth depend on algorithmic predictions. Cathy O’Neil’s book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy explores exactly these societal consequences emerging from the abuse of big data predictions.

O’Neil gives insights into how algorithms can be misused for the sake of convenience and cost efficiency, resulting in practices of discrimination and bias, amplifying inequality and ultimately threatening democracy. Her book is written for the lay public, though it draws upon her academic expertise and her working experience in the financial sector. After earning a PhD in Mathematics at Harvard, O’Neil worked for the D. E. Shaw hedge fund, where she first felt a sense of disillusionment towards mathematics for its part in the 2008 financial crisis in the U.S. The financial sector was relying on algorithmic models based on mathematical formulas that, in her words, “were more to impress than clarify”. It was when similarly incomprehensible models were adopted in other sectors that she started investigating the matter.

Read More

3MT – Early detection of research trends

On 16th May 2017, the STEM Faculty of my university organised a Three Minute Thesis (3MT) event, in which each candidate has a time slot of three minutes to describe their thesis. The speech can be supported by one static slide showing important features of the work.

I wish I had shown the one above, in which it is pretty clear that my work aims to combine different data sources to obtain new information and knowledge: emerging topics.

In the end, I decided to show a more formal one, attached below.

Single slide I used for my 3MT talk

How are topics born? Understanding the research dynamics preceding the emergence of new areas

“How are topics born? Understanding the research dynamics preceding the emergence of new areas” is a peer-reviewed paper submitted to PeerJ Computer Science. The paper was submitted in July 2016 and accepted in May 2017. All the co-authors are thankful to the reviewers and the editor for providing insightful comments and thus improving the manuscript.

Authors:

Angelo Antonio Salatino, Francesco Osborne, Enrico Motta

Abstract:

The ability to recognise promptly new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. While the literature describes several approaches which aim to identify the emergence of new research topics early in their lifecycle, these rely on the assumption that the topic in question is already associated with a number of publications and consistently referred to by a community of researchers. Hence, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this paper, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the ‘parents’ of the new topic. These initial findings i) confirm our hypothesis that it is possible in principle to detect the emergence of a new topic at the embryonic stage, ii) provide new empirical evidence supporting relevant theories in Philosophy of Science, and also iii) suggest that new topics tend to emerge in an environment in which weakly interconnected research areas begin to cross-fertilise.

Read More

Export Graph in R via JSON

This post presents an easy solution for exporting and importing a graph object of the igraph library.
In its previous versions, the library used to have the save and load functions, with which you could respectively export and import the graph object [1]. Although they no longer seem to be in the library, the documentation states:

“Attribute values can be set to any R object, but note that storing the graph in some file formats might result the loss of complex attribute values. All attribute values are preserved if you use save and load to store/retrieve your graphs.”

The library also offers write_graph and read_graph, which support several file formats, including GraphML, for exporting graph objects and importing them back.

However, here I propose my own little solution with almost zero options. It saves the graph and allows you to reload it (in another session as well) simply by saving all the fields and values in a JSON file.
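The post's own code is in R and sits behind the link below. Purely to illustrate the round-trip idea (write every field and value to a JSON file, then read them back in a later session), here is a language-neutral sketch in Python. The dictionary layout, field names and helper names are assumptions made for this example, not the structure used in the original solution.

```python
import json
import os
import tempfile


def save_graph(graph, path):
    """Write all graph fields and values to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(graph, handle, indent=2)


def load_graph(path):
    """Read the graph fields and values back from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical graph: vertices and edges with their attributes.
graph = {
    "directed": False,
    "vertices": [
        {"name": "a", "label": "Semantic Web"},
        {"name": "b", "label": "Machine Learning"},
    ],
    "edges": [{"from": "a", "to": "b", "weight": 3}],
}

# Save to a temporary location, then reload as a new session would.
path = os.path.join(tempfile.mkdtemp(), "graph.json")
save_graph(graph, path)
restored = load_graph(path)
print(restored == graph)
```

The round trip is lossless here only because every value is a plain string, number, boolean, list or dictionary. Attributes holding more complex objects would need an explicit conversion before being written.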

Read More

BigDat2017: certificate of attendance

I recently attended a winter school on Big Data in Bari (IT): BigDat2017. At the moment, Big Data is gaining great attention in research, since it enables data-driven solutions in several contexts.

As part of my postgraduate research, I decided to attend it to follow the new developments in this field.

Here follows the proof of my attendance.

Certificate of Attendance

In addition to this, I also wrote a review of the winter school. If you are interested in it, please read it here: BigDat2017.

Here, instead, is the link to the news item I wrote for my department website.

BigDat2017: a review

This week I have been attending the 3rd edition of the Big Data winter school: BigDat2017. It was held on my former campus, at the University of Bari (IT). It was a really nice feeling to be back for a while, sitting on those benches and following courses once again.
Big Data has recently gained a lot of interest in research, and many believe that it will continue to play a leading role for many years. Nowadays, we live in a world in which all information seems to be available. We are surrounded by data-driven applications (Google, Facebook, Twitter and Spotify, just to name a few), which gather data and try to provide tailor-made solutions for their users. For this reason, an event like BigDat2017, with its clear mission of introducing new researchers to this fast-advancing research area and keeping them up to date, is really important.

Read More