Bibliographic citations are one of the most important and widely used tools for linking and evaluating research in academic publishing. In this context, understanding an author’s motivations behind a citation act can show how a particular work is perceived by other researchers, and opens interesting scenarios within the scholarly publishing domain. For instance, it would be possible to browse citation networks according to precise intentions, e.g. looking for all the works citing a particular paper with the intention of extending the theories it contains, of critiquing its content, and the like.
I’ve worked quite a lot on the definition of models to characterise citations from rhetorical and factual perspectives. One of the most important outputs of this research is the creation (together with David Shotton) of the Citation Typing Ontology, a.k.a. CiTO, an ontology describing the different kinds of functions that citations may have, which we introduced in detail in our article entitled FaBiO and CiTO: ontologies for describing bibliographic resources and citations.
Starting from this work, my colleagues and I began to think about a mechanism to retrieve these citation functions from text automatically, simply trying to replicate the way in which humans usually perform this task.
To this end, we developed CiTalO, a tool that automatically extracts the nature of a citation starting from the sentence in which the citation act happens. In practice, CiTalO returns the CiTO property that best describes a given citation scenario, for example the property cito:extends for the citation sentence
It extends the research outlined in earlier work X,
where X is the reference to a particular bibliographic citation within a paper.
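To make the input and output of such a tool concrete, here is a minimal Python sketch of the mapping from a citation sentence to a CiTO property. It is not CiTalO's algorithm: it is a toy keyword lookup, and the function name `guess_cito_property` and the cue lists are hypothetical choices made only for illustration. The property names themselves (cito:extends, cito:critiques, cito:refutes, cito:usesMethodIn, cito:cites) are taken from CiTO.

```python
# Toy illustration only: a keyword lookup, NOT the approach used by CiTalO.
# The cue fragments below are hypothetical; the property names come from CiTO.
CUES = [
    ("cito:extends", ("extend", "builds on", "build on")),
    ("cito:critiques", ("critiqu", "criticis", "criticiz")),
    ("cito:refutes", ("refute", "disprove")),
    ("cito:usesMethodIn", ("method",)),
]


def guess_cito_property(sentence: str) -> str:
    """Return the first CiTO property whose cue appears in the sentence.

    Falls back to the generic cito:cites when no cue matches.
    """
    text = sentence.lower()
    for prop, cues in CUES:
        if any(cue in text for cue in cues):
            return prop
    return "cito:cites"


if __name__ == "__main__":
    print(guess_cito_property("It extends the research outlined in earlier work X"))
```

A lookup like this breaks down quickly on real text (negation, paraphrase, several citations in one sentence), which is part of why the task calls for something richer than string matching.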
Today I presented a paper about CiTalO (co-written with Angelo Di Iorio and Andrea Giovanni Nuzzolese) at SePublica 2013, co-located with ESWC 2013. Some information about the paper follows:
Towards the automatic identification of the nature of citations
Abstract. The reasons why an author cites other publications are varied: an author can cite previous works to gain assistance of some sort in the form of background information, ideas, methods, or to review, critique or refute previous works. The problem is that the best possible way to retrieve the nature of citations is very time consuming: one should read article by article to assign a particular characterisation to each citation. In this paper we propose an algorithm, called CiTalO, to infer automatically the function of citations by means of Semantic Web technologies and NLP techniques. We also present some preliminary experiments and discuss some strengths and limitations of this approach.
If you like this work and you are attending ESWC this year, please don’t miss our practical introduction to CiTalO during the Demo Session on Tuesday afternoon (May 28).