Citation graph

A citation graph (or citation network), in information science and bibliometrics, is a directed graph that describes the citations within a collection of documents.

Figure: A directed acyclic graph with five nodes. In this example, document b cites document d, and is cited by document a.

Each vertex in the graph represents a document in the collection, and each edge is directed from one document toward another that it cites (or vice versa, depending on the specific implementation).[1]
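As a minimal sketch of this structure, a citation graph can be stored as two adjacency maps, one for each edge direction. The edge list below is hypothetical: only the pairs a → b and b → d come from the example above, and the function name `build_citation_graph` is an illustrative choice, not part of any particular library.

```python
from collections import defaultdict

# Hypothetical edge list of (citing, cited) pairs. Only ("a", "b") and
# ("b", "d") come from the example; the other pairs are made up.
citations = [("a", "b"), ("b", "d"), ("a", "c"), ("c", "d"), ("d", "e")]


def build_citation_graph(edges):
    """Return (references, cited_by) adjacency maps for (citing, cited) pairs."""
    references = defaultdict(set)  # document -> documents it cites
    cited_by = defaultdict(set)    # document -> documents that cite it
    for citing, cited in edges:
        references[citing].add(cited)
        cited_by[cited].add(citing)
    return references, cited_by


references, cited_by = build_citation_graph(citations)
print(sorted(references["b"]))  # ['d']
print(sorted(cited_by["b"]))    # ['a']
```

Keeping both maps makes either edge convention cheap to query, which matters because implementations differ on whether edges point from the citing document or toward it.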


There is no standard format for the citations in bibliographies, and the record linkage of citations can be a time-consuming and complicated process. Furthermore, citation errors can occur at any stage of the publishing process. However, citation databases, also known as citation indexes, have a long history, so these problems are well documented.

In principle, each document should have a unique publication date and can only refer to earlier documents. This means that an ideal citation graph is not only directed but acyclic; that is, there are no directed cycles in the graph. This is not always the case in practice, since an academic paper goes through several versions in the publishing process. The timing of asynchronous updates to bibliographies may lead to edges that apparently point backward in time. Such "backward" citations seem to constitute less than 1% of the total number of links.[2]
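The two properties discussed here can be checked mechanically. The sketch below flags "backward" citations by comparing publication dates and tests acyclicity with Kahn's topological-sort algorithm. The edge list, the integer years standing in for publication dates, and the function names are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical data: (citing, cited) pairs and publication years.
edges = [("a", "b"), ("b", "d"), ("d", "a")]
year = {"a": 2015, "b": 2014, "d": 2013}


def backward_citations(edges, year):
    """Return edges whose cited document is not older than the citing one."""
    return [(u, v) for u, v in edges if year[v] >= year[u]]


def is_acyclic(edges):
    """Kahn's algorithm: the graph is acyclic iff every node can be removed
    in topological order (repeatedly removing nodes with in-degree zero)."""
    nodes = {n for edge in edges for n in edge}
    successors = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        in_degree[v] += 1
    queue = deque(n for n in nodes if in_degree[n] == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for m in successors[n]:
            in_degree[m] -= 1
            if in_degree[m] == 0:
                queue.append(m)
    return removed == len(nodes)


print(backward_citations(edges, year))  # [('d', 'a')]
print(is_acyclic(edges))                # False
print(is_acyclic(edges[:2]))            # True
```

In this toy data the single backward edge d → a is also what closes the cycle; dropping it restores a directed acyclic graph.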

As citation links are meant to be permanent, the bulk of a citation graph should be static, and only the leading edge of the graph should change. Exceptions might occur when papers are withdrawn from circulation.[2]


Citation graphs are frequently applied to citation analysis in academic research. Information scientist Derek J. de Solla Price described the use of citation networks to characterize patterns in the incidence of citations and references between papers according to factors such as publication date and subject area.[3] They may also be used to calculate measures of scientific impact, such as the h-index, and to study the structure and development of different fields of academic inquiry.
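As an illustration of an impact measure derived from the graph, the sketch below computes an h-index from in-degrees (citation counts). It uses the standard definition: the largest h such that h of an author's papers each have at least h citations. The edge list, paper identifiers, and function name are hypothetical.

```python
from collections import Counter

# Hypothetical (citing, cited) pairs; p1..p3 are one author's papers.
edges = [
    ("x1", "p1"), ("x2", "p1"), ("x3", "p1"),
    ("x1", "p2"), ("x2", "p2"),
    ("x1", "p3"),
]
author_papers = ["p1", "p2", "p3"]


def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# A paper's citation count is its in-degree in the citation graph.
in_degree = Counter(cited for _, cited in edges)
print(h_index(in_degree[p] for p in author_papers))  # 2
```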

Court judgments form citation networks, as judges frequently refer to earlier judgments to support their decisions. Citation analysis in a legal context is, therefore, an important commercial field. Likewise, patents form citation networks, as they must refer to prior art.

Background and history

A citation is a reference to a published or unpublished source (not always the original source). More precisely, a citation is an abbreviated alphanumeric expression embedded in the body of an intellectual work that denotes an entry in the bibliographic references section of the work. Its purpose is to acknowledge the relevance of the works of others to the topic of discussion at the point where the citation appears.

Generally, the combination of the in-body citation and the bibliographic entry constitutes what is commonly thought of as a citation (whereas bibliographic entries by themselves do not).[4] References to single, machine-readable assertions in electronic scientific articles are known as nanopublications, a form of microattribution.

Citation networks are one kind of social network that has been studied quantitatively almost from the moment citation databases first became available. In 1965, Derek J. de Solla Price described the inherent linking characteristic of the Science Citation Index (SCI) in his paper entitled "Networks of Scientific Papers." The links between citing and cited papers became dynamic when the SCI began to be published online. In 1973, Henry Small published his work on co-citation analysis, which became a self-organizing classification system that led to document clustering experiments and eventually what is called "Research Reviews."[5]

Related networks

There are several other types of network graphs that are closely related to citation networks. The co-citation graph has documents as nodes, and two documents are connected if they are both cited by the same third document; in bibliographic coupling, by contrast, two documents are connected if they both cite the same document (see Co-citation and Bibliographic coupling). Other related networks are formed using other information present in the document. For instance, in a collaboration graph, known in this context as a co-authorship network, the nodes are the authors of documents, and two authors are linked if they have co-authored a document. The weight of a link between two authors in a co-authorship network can increase over time as they collaborate further.
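Both derived networks can be sketched as pair counts. The example below assumes the usual reading of co-citation (two documents are linked when some third document cites both) and weights co-authorship links by the number of jointly authored documents. The data and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (citing, cited) pairs and author lists.
edges = [("x", "p"), ("x", "q"), ("y", "p"), ("y", "q"), ("y", "r")]
papers = [["Ada", "Ben"], ["Ada", "Ben", "Cy"], ["Ben", "Cy"]]


def co_citation_counts(edges):
    """For each unordered pair of documents, count the documents citing both."""
    references = defaultdict(set)
    for citing, cited in edges:
        references[citing].add(cited)
    counts = Counter()
    for refs in references.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return counts


def co_authorship_weights(papers):
    """For each unordered pair of authors, count their joint documents."""
    weights = Counter()
    for authors in papers:
        for pair in combinations(sorted(set(authors)), 2):
            weights[pair] += 1
    return weights


print(co_citation_counts(edges)[("p", "q")])         # 2
print(co_authorship_weights(papers)[("Ada", "Ben")])  # 2
```

Sorting each group before pairing gives every unordered pair a single canonical key, so repeated collaborations accumulate on the same link rather than being split across (A, B) and (B, A).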



  1. Egghe, Leo; Rousseau, Ronald (1990). Introduction to Informetrics: quantitative methods in library, documentation and information science. Amsterdam, The Netherlands: Elsevier Science Publishers. p. 228. ISBN 0-444-88493-9.
  2. Clough, James R.; Gollings, Jamie; Loach, Tamar V.; Evans, Tim S. (2015). "Transitive reduction of citation networks". Journal of Complex Networks. 3 (2): 189–203. arXiv:1310.8224. doi:10.1093/comnet/cnu039. S2CID 10228152.
  3. Price, Derek J. de Solla (July 30, 1965). "Networks of Scientific Papers". Science. 149 (3683): 510–515. Bibcode:1965Sci...149..510D. doi:10.1126/science.149.3683.510. PMID 14325149.
  4. Zhao, Dangzhi; Strotmann, Andreas (2015-02-01). Analysis and Visualization of Citation Networks. Morgan & Claypool Publishers. ISBN 978-1-60845-939-1.
  5. Kas, Miray. Structures and Statistics of Citation Networks.
