Latest Updates

Early detection of research trends and forecasting of their future impact*

Please note that I am currently writing my dissertation, which keeps me very busy. I hope to return here soon with new updates about my doctoral work.

This post acts as a hub for all the relevant information about my doctoral work. It will be regularly updated with new sources and developments.

Abstract

The ability to promptly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. While the literature describes several approaches which aim to identify the emergence of new research topics early in their lifecycle, these rely on the assumption that the topic in question is already associated with a number of publications and consistently referred to by a community of researchers. Hence, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this paper, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the ‘parents’ of the new topic. These initial findings (i) confirm our hypothesis that it is possible in principle to detect the emergence of a new topic at the embryonic stage, (ii) provide new empirical evidence supporting relevant theories in Philosophy of Science, and also (iii) suggest that new topics tend to emerge in an environment in which weakly interconnected research areas begin to cross-fertilise. (Please note that this abstract comes from one of my latest papers.)
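As a rough illustration of what an "increase in the pace of collaboration" between two research areas could look like in data, here is a minimal Python sketch. It is not the method used in the paper: the paper record format, the function names and the use of a least-squares slope are all assumptions made for this example.

```python
def collaboration_counts(papers, topic_a, topic_b, years):
    """Count, for each year in `years`, the papers tagged with both topics.

    `papers` is a hypothetical list of (year, set_of_topics) pairs.
    """
    counts = []
    for year in years:
        n = sum(1 for y, topics in papers
                if y == year and topic_a in topics and topic_b in topics)
        counts.append(n)
    return counts


def linear_slope(values):
    """Least-squares slope of `values` against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Toy data (invented): topics "a" and "b" co-occur more often each year.
papers = [
    (2000, {"a"}),
    (2001, {"a", "b"}),
    (2002, {"a", "b"}), (2002, {"a", "b", "c"}),
    (2003, {"a", "b"}), (2003, {"a", "b"}), (2003, {"a", "b"}),
]
counts = collaboration_counts(papers, "a", "b", range(2000, 2004))
pace = linear_slope(counts)  # positive slope = the two areas collaborate more over time
```

A positive slope for a pair of areas that were weakly connected before would be the kind of signal the abstract describes; judging whether such an increase is significant is beyond this sketch.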

AUGUR: Forecasting the Emergence of New Research Topics

“AUGUR: Forecasting the Emergence of New Research Topics” is a paper submitted to the 18th ACM/IEEE Joint Conference on Digital Libraries, June 3–7, 2018, Fort Worth, TX, USA.

Authors

Angelo Salatino, Francesco Osborne and Enrico Motta

Abstract

Being able to rapidly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. The literature presents several approaches to identifying the emergence of new research topics, which rely on the assumption that the topic is already exhibiting a certain degree of popularity and is consistently referred to by a community of researchers. However, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. We address this issue by introducing Augur, a novel approach to the early detection of research topics. Augur analyses the diachronic relationships between research areas and is able to detect clusters of topics that exhibit dynamics correlated with the emergence of new research topics. Here we also present the Advanced Clique Percolation Method (ACPM), a new community detection algorithm developed specifically for supporting this task. Augur was evaluated on a gold standard of 1,408 debutant topics in the 2000-2011 interval and outperformed four alternative approaches in terms of both precision and recall.
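For readers less familiar with the two evaluation measures mentioned above, the sketch below shows how precision and recall can be computed against a gold standard. It is only an illustration: the abstract does not say how Augur's output is matched to the gold-standard topics, so exact label matching, the function name and the toy topic labels are all assumptions.

```python
def precision_recall(predicted, gold):
    """Precision and recall of a set of predicted topics against a gold standard.

    Assumes a prediction counts as correct only if its label appears
    verbatim in the gold standard (an assumption for this sketch).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy example with invented labels: 2 of 4 predictions are correct,
# and 2 of the 3 gold-standard topics are found.
p, r = precision_recall({"t1", "t2", "t3", "t4"}, {"t1", "t2", "t5"})
```

Precision penalises forecasting topics that never emerge, while recall penalises missing topics that do, which is why reporting both gives a fuller picture than either alone.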

Download Gold Standard: link

Explore the line charts:

Figure 3: Performance of the Advanced Clique Percolation Method.
Figure 4: Performance of Fast Greedy algorithm.
Need a larger view? Visit here: link

Paper Download

Download paper (from ORO): link

Computer Science Ontology Portal (or simply CSO Portal)

The Computer Science Ontology Portal (also referred to simply as CSO Portal) is a web application that enables users to download, explore, and provide granular feedback on CSO at different levels. This last feature allows us to periodically review the status of the ontology and release new versions according to the feedback received.

A user can explore the ontology interactively by clicking on a topic, viewing the information associated with it, and then following on to the next topic. In addition, users can download the ontology, import it into their own triple store, and then create new applications based on it.

Registered users will be able to see additional menus, where they can provide different kinds of feedback (at the ontology level, topic level and relationship level) and also suggest new relationships.

Finally, there is the editorial board, which aims to produce a new version of the ontology.

We submitted a paper to the Resource Track of the 17th International Semantic Web Conference (ISWC2018), describing both the Computer Science Ontology and the portal described here.

The portal is available at this link: https://cso.kmi.open.ac.uk.

We look forward to receiving your feedback.

SpringerNature Hackday – London

On 29 November 2017, I attended, together with two KMi colleagues (Andrea Mannocci and Thiviyan Thanapalasingam), the second edition of the SpringerNature HackDay in London (at the SpringerNature Campus).

Aliaksandr Birukou, Executive Editor of Computer Science at Springer Nature and collaborator of our research team at the Knowledge Media Institute, also joined our group on the HackDay.

The whole event aimed to bring together the skills and interests of many developers and researchers with SciGraph, in order to advance discovery.

The main web page for the event is here: https://github.com/SN-HackDay/Advancing-discovery-with-research-data (or here in case someone removes it).

As a team, we worked on the venue-centric trends problem. In particular, our project provides editors, conference organizers and many others with a dashboard to understand how knowledge flows across countries and continents, who the main producers and consumers of the research output for a given conference are, whether the conference is open to interdisciplinarity, and much more.

2100 AI: Reflections on the mechanisation of scientific discovery

“2100 AI: Reflections on the mechanisation of scientific discovery” is a paper submitted to the RE-CODING BLACK MIRROR Workshop co-located with the International Semantic Web Conference (ISWC) 2017, 21-25 October 2017, Vienna, Austria.

Authors

Andrea Mannocci, Angelo Salatino, Francesco Osborne and Enrico Motta

Abstract

The pace of research nowadays is hectic. Datasets and papers are produced and made available on the Web at a rate so unprecedented that digesting the information conveyed by such a “data deluge” stretches far beyond human analytical capabilities. Data science, artificial intelligence, machine learning and big data analytics are providing researchers with new methodologies capable of coping with the overload of information and gaining insight from it in an automated fashion. Nonetheless, major advances in AI solutions for knowledge discovery risk exacerbating some negative phenomena, which are already observable on a global scale, and irremediably disrupting the way of doing science as we know it.

Smart Book Recommender: A Semantic Recommendation Engine for Editorial Products

“Smart Book Recommender: A Semantic Recommendation Engine for Editorial Products” is a poster paper that will be presented at the International Semantic Web Conference (ISWC) 2017, 21-25 October 2017, Vienna, Austria.

Authors

Francesco Osborne, Thiviyan Thanapalasingam, Angelo Salatino, Aliaksandr Birukou and Enrico Motta

Abstract

Academic publishers, such as Springer Nature, need to constantly make informed decisions about how and where to market their editorial products. In the field of Computer Science (CS), it is particularly critical to assess which books will be of interest to the attendees of a conference. Typically, these items are manually chosen by publishing editors, on the basis of their personal experience. To make this process both faster and more robust we have developed the Smart Book Recommender (SBR), a semantic application designed to support the Springer Nature editorial team in promoting their publications at CS venues. SBR takes as input the proceedings of a conference and suggests books, journals, and other conference proceedings which are likely to be relevant to the attendees of the conference in question. It does so by taking advantage of a semantic representation of topics, which builds on a very large ontology of Computer Science topics; characterizing Springer Nature books as distributions of semantic topics; and approaching the problem as one of semantic matching between such distributions of semantic topics.
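To give an idea of what "semantic matching between distributions of topics" might look like in practice, here is a minimal Python sketch. It is not the SBR implementation: the abstract does not name the similarity measure, so cosine similarity, the function names, and the toy book titles and topic weights are all assumptions.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic distributions given as dicts."""
    dot = sum(w * q.get(topic, 0.0) for topic, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def recommend(conference, books, top_n=2):
    """Rank book titles by similarity of their topics to the conference's."""
    ranked = sorted(books, key=lambda title: cosine(conference, books[title]),
                    reverse=True)
    return ranked[:top_n]


# Toy data (invented): topic weights for one conference and three books.
conference = {"semantic web": 0.6, "machine learning": 0.4}
books = {
    "Book A": {"semantic web": 1.0},
    "Book B": {"databases": 1.0},
    "Book C": {"machine learning": 0.5, "semantic web": 0.5},
}
suggestions = recommend(conference, books)
```

In this toy example "Book C" ranks first because it covers both of the conference's topics, while "Book B" shares none and is left out of the top two.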

Download paper (via ORO): link

Book Review: Weapons of Math Destruction by Cathy O’Neil

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.
Everyday activities are increasingly shifting to a digital environment. Digital gadgets such as smartphones and wearable devices are becoming an inseparable part of our lives, promising mostly convenience. New digital technologies have mainly been seen as empowering for their users. FitBit, for example, is claimed to be a motivating device for leading a healthy and active life, enabling users to achieve their goals by analysing their data [1]. The data collected by this kind of device include sleeping patterns, the number of steps, the amount of time users are engaged in physical activities and so forth. However, these data are available not just to the users but also to companies that can use them for multiple purposes. Health insurance companies, such as Vitality [2], already exploit their customers’ data in exchange for rewards such as free cinema tickets or hot beverages. The potential implications of the collection and manipulation of personal data on a personal and societal level, though, have been downplayed. Just imagine a national health insurance business model that operates on the basis of classifying citizens as high- or low-risk based on their data [3]. Citizens profiled as low-risk will be granted lower health contributions, while citizens profiled as high-risk will be paying for expensive and unaffordable plans.

Imagine a society where decisions on public well-being, education and so forth depend on algorithmic predictions. Cathy O’Neil’s book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy explores exactly these societal consequences emerging from the abuse of big data predictions.

O’Neil gives insights into how algorithms can be misused for the sake of convenience and cost efficiency, resulting in practices of discrimination and bias, amplifying inequality and ultimately threatening democracy. Her book is written for the lay public, though it draws upon her academic expertise and her working experience in the financial sector. After earning a PhD in Mathematics at Harvard, O’Neil worked for the D. E. Shaw hedge fund, where she first felt a sense of disillusionment towards mathematics for its part in the 2008 financial crisis in the U.S. The financial sector was relying on algorithmic models based on mathematical formulas that, in her words, “were more to impress than clarify”. It was when similarly incomprehensible models were adopted in other sectors that she started investigating the matter.