Network Analytics

How can we reconstruct and analyse historical communication and trading networks?

How can we study novels by reconstructing character networks?

How can we interpret centrality in networks from a humanities point of view?

How can we study the evolution of networks?




Hexagons, Satellites and Semantic Background

D. Rodighiero

Micro Meso Macro, École normale supérieure de Lyon, France, November 15–16, 2018.

The presentation focuses on a visual method that allows for a hexagonal arrangement in network visualization. Hexagonal tiling is a way to make use of the space between nodes in order to enrich the information that a network visualization can convey. What is usually employed as a background is used to show node context and semantic information. This visual method aims to prompt reflection on the visual representation of networks, which needs further development and ideas.


Comparing human and machine performances in transcribing 18th century handwritten Venetian script

S. Ares Oliveira; F. Kaplan

2018-07-26. Digital Humanities Conference, Mexico City, Mexico, June 24-29, 2018.

Automatic transcription of handwritten texts has made important progress in recent years. This increase in performance, essentially due to new architectures combining convolutional neural networks with recurrent neural networks, opens new avenues for searching in large databases of archival and library records. This paper reports on our recent progress in making millions of digitized Venetian documents searchable, focusing on a first subset of 18th century fiscal documents from the Venetian State Archives. For this study, about 23’000 image segments containing 55’000 Venetian names of persons and places were manually transcribed by archivists trained to read this kind of handwritten script. This annotated dataset was used to train and test a deep learning architecture with a performance level (about 10% character error rate) that is satisfactory for search use cases. This paper compares this level of reading performance with the reading capabilities of Italian-speaking transcribers. More than 8500 new human transcriptions were produced, confirming that the amateur transcribers were not as good as the experts. However, on average, the machine outperforms the amateur transcribers in this transcription task.
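The character error rate mentioned in this abstract is conventionally computed as the edit distance between the machine output and the reference transcription, divided by the length of the reference. The following is a minimal sketch of that standard definition; the function names and the sample name pairs are invented for illustration and are not taken from the paper or its dataset.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions separating the two strings."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # skip a reference character
                current[j - 1] + 1,      # skip a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(pairs):
    """CER over (reference, hypothesis) pairs:
    total edits divided by total reference characters."""
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    return total_edits / total_chars


# Invented sample pairs (reference transcription, machine output).
samples = [
    ("Zuanne Contarini", "Zuane Contarini"),
    ("San Polo", "San Pollo"),
]
cer = character_error_rate(samples)
print(f"CER = {cer:.3f}")
```

Pooling edits over all segments, rather than averaging per-segment rates, keeps short names from dominating the score; whether the paper pools in this way is not stated in the abstract.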


Using Networks to Visualize Publications

D. Rodighiero

EUROLIB General Assembly, Joint Research Centre of European Commission - Ispra (VA), Italy, 30 May - 1 June 2018.

Retrieval systems are often shaped as lists organized in pages. However, the majority of users look at the first page and ignore the others. This presentation concerns an alternative way to present the results of a query using network visualizations.
The presentation includes a case study that concerns a school of management. All of its publications are arranged in a network visualization according to their lexical proximity, based on a technique called Term Frequency – Inverse Document Frequency (TF-IDF). These terms are further used to fill the space between the network nodes, creating a sort of semantic background. The case study shows the pros and cons of such a visual representation through practical examples of term extraction and visualization interaction.
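As an illustration of lexical proximity based on TF-IDF, the sketch below weights terms by term frequency times inverse document frequency and compares documents by cosine similarity. It assumes whitespace tokenization and the plain logarithmic IDF; the presentation does not specify its exact weighting or similarity measure, and the sample texts are invented.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dictionaries."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented publication titles standing in for real records.
abstracts = [
    "supply chain risk management",
    "supply chain optimisation models",
    "consumer behaviour in retail markets",
]
vectors = tfidf_vectors(abstracts)
sim_01 = cosine(vectors[0], vectors[1])  # share "supply" and "chain"
sim_02 = cosine(vectors[0], vectors[2])  # share no terms
```

Pairs with a high similarity would be drawn close together in the network, and the highest-weighted terms of each document are natural candidates for the semantic background.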


Mapping Affinities in Academic Organizations

D. Rodighiero; F. Kaplan; B. Beaude

Frontiers in Research Metrics and Analytics. 2018-02-19.

DOI : 10.3389/frma.2018.00004.

Scholarly affinities are one of the most fundamental hidden dynamics that drive scientific development. Some affinities are actual, and consequently can be measured through classical academic metrics such as co-authoring. Other affinities are potential, and therefore do not leave visible traces in information systems; for instance, some peers may share interests without actually knowing it. This article illustrates the development of a map of affinities for academic collectives, designed to be relevant to three audiences: the management, the scholars themselves, and the external public. Our case study involves the School of Architecture, Civil and Environmental Engineering of EPFL, hereinafter ENAC. The school consists of around 1,000 scholars, 70 laboratories, and 3 institutes. The actual affinities are modeled using the data available from the information systems reporting publications, teaching, and advising scholars, whereas the potential affinities are addressed through text mining of the publications. The major challenge for designing such a map is to represent the multi-dimensionality and multi-scale nature of the information. The affinities are not limited to the computation of heterogeneous sources of information; they also apply at different scales. The map, thus, shows local affinities inside a given laboratory, as well as global affinities among laboratories. This article presents a graphical grammar to represent affinities. Its effectiveness is illustrated by two actualizations of the design proposal: an interactive online system in which the map can be parameterized, and a large-scale carpet of 250 square meters. 
In both cases, we discuss how the materiality influences the representation of data, in particular the way key questions could be appropriately addressed considering the three target audiences: the insights gained by the management and their consequences in terms of governance, the understanding of the scholars’ own positioning in the academic group in order to foster opportunities for new collaborations and, eventually, the interpretation of the structure by the general public to evaluate the relevance of the tool for external communication.


The Intellectual Organisation of History

G. Colavizza / F. Kaplan; M. Franceschet (Dir.)

Lausanne, EPFL, 2018.

DOI : 10.5075/epfl-thesis-8537.

A tradition of scholarship discusses the characteristics of different areas of knowledge, in particular after modern academia compartmentalized them into disciplines. The academic approach is often called into question: are there two or more cultures? Is an ever-increasing specialization the only way to cope with information abundance or are holistic approaches helpful too? What is happening with the digital turn? If these questions are well studied for the sciences, our understanding of how the humanities might differ in their own respect is far less advanced. In particular, modern academia might foster specific patterns of specialization in the humanities. Eventually, the recent rise in the application of digital methods to research, known as the digital humanities, might be introducing structural adaptations through the development of shared research technologies and the advent of organizational practices such as the laboratory. It therefore seems timely and urgent to map the intellectual organization of the humanities. This investigation depends on a few traits such as the level of codification, the degree of agreement among scholars, and the level of coordination of their efforts. These characteristics can be studied by measuring their influence on the outcomes of scientific communication. In particular, this thesis focuses on history as a discipline using bibliometric methods. In order to explore history in its complexity, an approach to create collaborative citation indexes in the humanities is proposed, resulting in a new dataset comprising monographs, journal articles and citations to primary sources. Historians' publications were found to organize thematically and chronologically, sharing a limited set of core sources across small communities. Core sources act in two ways with respect to the intellectual organization: locally, by adding connectivity within communities, or globally as weak ties across communities.
Over recent decades, fragmentation is on the rise in the intellectual networks of historians, and a comparison across a variety of specialisms from the human, natural and mathematical sciences revealed the fragility of such networks across the axes of citation and textual similarities. Humanists organize into more, smaller and scattered topical communities than scientists. A characterisation of history is eventually proposed. Historians produce new historiographical knowledge with a focus on evidence or interpretation. The former aims at providing the community with an agreed-upon factual resource. Interpretive work is instead mainly focused on creating novel perspectives. A second axis refers to two modes of exploration of new ideas: in-breadth, where novelty relates to adding new, previously unknown pieces to the mosaic, or in-depth, where novelty comes from improving on previous results. While all combinations are possible, historians tend to focus on in-breadth interpretations, with the immediate consequence that growth accentuates intellectual fragmentation in the absence of further consolidating factors such as theory or technologies. Research on evidence might have a different impact by potentially scaling up in the digital space, and in so doing influence the modes of interpretation in turn. This process is not dissimilar to the gradual rise in importance of research technologies and collaborative competition in the mathematical and natural sciences. This is perhaps the promise of the digital humanities.


Mapping affinities: visualizing academic practice through collaboration

D. Rodighiero / F. Kaplan; B. Beaude (Dir.)

EPFL, 2018.

DOI : 10.5075/epfl-thesis-8242.

Academic affinities are one of the most fundamental hidden dynamics that drive scientific development. Some affinities are actual, and consequently can be measured through classical academic metrics such as co-authoring. Other affinities are potential, and therefore do not have visible traces in information systems; for instance, some peers may share scientific interests without actually knowing it. This thesis illustrates the development of a map of affinities for scientific collectives, which is intended to be relevant to three audiences: the management, the scholars themselves, and the external public. Our case study involves the School of Architecture, Civil and Environmental Engineering of EPFL, which consists of three institutes, seventy laboratories, and around one thousand employees. The actual affinities are modeled using the data available from the academic systems reporting publications, teaching, and advising, whereas the potential affinities are addressed through text mining of the documents registered in the information system. The major challenge for designing such a map is to represent the multi-dimensionality and multi-scale nature of the information. The affinities are not limited to the computation of heterogeneous sources of information; they also apply at different scales. Therefore, the map shows local affinities inside a given laboratory, as well as global affinities among laboratories. The thesis presents a graphical grammar to represent affinities. This graphical system is actualized in several embodiments, among which are a large-scale carpet of 250 square meters and an interactive online system in which the map can be parameterized.
In both cases, we discuss how the actualization influences the representation of data, in particular the way key questions could be appropriately addressed considering the three target audiences: the insights gained by the management and the related decisions, the understanding of the researchers’ own positioning in the academic collective that might reveal opportunities for new synergies, and eventually the interpretation of the structure from an external standpoint that suggests the relevance of the tool for communication.



TimeRank: A dynamic approach to rate scholars using citations

M. Franceschet; G. Colavizza

Journal of Informetrics. 2017.

DOI : 10.1016/j.joi.2017.09.003.

Rating has become a common practice of modern science. No rating system can be considered as final, but instead several approaches can be taken, which magnify different aspects of the fabric of science. We introduce an approach for rating scholars which uses citations in a dynamic fashion, allocating ratings by considering the relative position of two authors at the time of the citation between them. Our main goal is to introduce the notion of citation timing as a complement to the usual suspects of popularity and prestige. We aim to produce a rating able to account for a variety of interesting phenomena, such as positioning rising stars on a more even footing with established researchers. We apply our method to the bibliometrics community using data from the Web of Science from 2000 to 2016, showing how the dynamic method is more effective than alternatives in this respect.
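To make the idea of citation timing concrete, the sketch below processes citations in chronological order and rewards the cited author according to the ratings both authors hold at that moment. The Elo-style update, the constants, and every name here are assumptions chosen for illustration; the abstract does not give the paper's formula, and this should not be read as the TimeRank method itself.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Elo-style expectation that A outranks B (an assumed formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def apply_citations(citations, initial=1000.0, k=16.0):
    """Update ratings from (year, citing_author, cited_author) tuples.

    Citations are replayed in chronological order, so each update uses the
    ratings that held at the time of the citation. Only the cited author's
    rating changes in this simplified variant.
    """
    ratings = {}
    for _, citing, cited in sorted(citations):
        if citing == cited:
            continue  # ignore self-citations
        r_citing = ratings.setdefault(citing, initial)
        r_cited = ratings.setdefault(cited, initial)
        # A citation from a higher-rated author is more "surprising"
        # and therefore earns the cited author a larger gain.
        surprise = 1.0 - expected_score(r_cited, r_citing)
        ratings[cited] = r_cited + k * surprise
    return ratings


# Invented citation events between hypothetical authors A to F.
events = [
    (2001, "B", "A"),
    (2002, "C", "A"),
    (2003, "A", "D"),  # D is cited by the already well-rated A
    (2003, "E", "F"),  # F is cited by the unrated newcomer E
]
ratings = apply_citations(events)
```

In this toy run, D ends above F even though each received a single citation, because D's citation came from an author who was already highly rated at the time.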


The Core Literature of the Historians of Venice

G. Colavizza

Frontiers in Digital Humanities. 2017.

DOI : 10.3389/fdigh.2017.00014.

Over the past decades, the humanities have been accumulating a growing body of literature at an increasing pace. How does this impact their traditional organization into disciplines and fields of research therein? This article considers history, by examining a citation network among recent monographs on the history of Venice. The resulting network is almost connected; clusters of monographs are identifiable according to specific disciplinary areas (history, history of architecture, and history of arts) or periods of time (middle ages, early modern, and modern history); and a map of the recent trends in the field is sketched. Most notably, a set of highly cited works emerges as the core literature of the historians of Venice. This core literature comprises a mix of primary sources, works of reference, and scholarly monographs and is important in keeping the field connected: monographs usually cite a combination of a few core works and a variety of less well-cited works. Core primary sources and works of reference never age, while core scholarly monographs are replaced at a very slow rate by new ones. The reliance of new publications on the core literature is slowly rising over time, as the field becomes increasingly varied.


The structural role of the core literature in history

G. Colavizza

Scientometrics. 2017.

DOI : 10.1007/s11192-017-2550-4.

The intellectual landscapes of the humanities are mostly uncharted territory. Little is known about the ways in which the published research of humanist scholars defines areas of intellectual activity. An open question relates to the structural role of core literature: highly cited sources, naturally playing a disproportionate role in the definition of intellectual landscapes. We introduce four indicators in order to map the structural role played by core sources in connecting different areas of the intellectual landscape of citing publications (i.e. communities in the bibliographic coupling network). All indicators factor out the influence of degree distributions by internalizing a null configuration model. By considering several datasets focused on history, we show that two distinct structural actions are performed by the core literature: a global one, by connecting otherwise separated communities in the landscape, or a local one, by raising connectivity within communities. In our study, the global action is mainly performed by small sets of scholarly monographs, reference works and primary sources, while the rest of the core, and especially most journal articles, acts mostly locally.
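For readers unfamiliar with the bibliographic coupling network mentioned above: two citing publications are linked when they cite at least one source in common, and the link weight is the number of shared sources. The sketch below builds such a network from invented reference lists. It does not reproduce the four indicators or the null configuration model of the article.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Map each pair of citing publications to the number of sources
    that both of them cite. Pairs with nothing in common are omitted."""
    coupling = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared:
            coupling[(a, b)] = shared
    return coupling


# Invented citing publications and the sources they cite.
references = {
    "monograph_1": ["source_A", "source_B", "source_C"],
    "monograph_2": ["source_B", "source_C"],
    "article_1": ["source_C", "source_D"],
    "article_2": ["source_E"],
}
edges = bibliographic_coupling(references)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

In this toy data, source_C is cited by three publications and therefore creates links among all of them, which hints at how a highly cited core source can hold a landscape together.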



Epidemics in Venice: On the Small or Large Nature of the Pre-modern World

G. Colavizza

2016. International Workshop on Computational History and Data-Driven Humanities, Dublin, Ireland, May 25, 2016. p. 33-40.

DOI : 10.1007/978-3-319-46224-0_4.

Marvel et al. [12] recently argued that the pre-modern contact world was physically and, by set inclusion, socially not small-world. Since the Black Death and similar plagues used to spread in well-defined waves, the argument goes, the underlying contact network could not have been small-world. I counter here that small-world contact networks were likely to exist in pre-modern times in a setting of the greatest importance for the outbreak of epidemics: urban environments. I show this by running epidemic diffusion simulations on the transportation network of Venice, verifying how this network becomes small-world when we account for naval transportation. Large epidemic outbreaks might not even have been possible without the catalyst of urban small-worlds.
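One ingredient of the small-world property is a short average path length, which a handful of long-range links can produce in an otherwise local network. The sketch below illustrates this on a toy ring lattice standing in for local contacts, with a few added shortcuts standing in for boat connections. The graph, its size, and the shortcut positions are all invented; this is not the Venice transportation network or the paper's epidemic simulation.

```python
from collections import deque


def ring_lattice(n, k=2):
    """Ring of n nodes, each linked to its k nearest neighbours on each side."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph


def with_shortcuts(graph, edges):
    """Return a copy of the graph with extra undirected edges added."""
    copy = {node: set(neighbours) for node, neighbours in graph.items()}
    for a, b in edges:
        copy[a].add(b)
        copy[b].add(a)
    return copy


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


land_only = ring_lattice(40, k=2)
# Hypothetical long-range links, e.g. boat routes across the city.
with_boats = with_shortcuts(land_only, [(0, 20), (10, 30), (5, 25), (15, 35)])

print(average_path_length(land_only), average_path_length(with_boats))
```

A full small-world test would also compare clustering coefficients against a random graph of the same size, which is omitted here for brevity.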


Visualizing Complex Organizations with Data

D. Rodighiero

IC Research Day, Lausanne, Switzerland, June 30, 2016.

The Affinity Map is a project funded by the ENAC whose aim is to provide an instrument to understand organizations. The photograph shows the unveiling of the first map at the ENAC Research Day. The visualization was presented to the scholars who are displayed in the representation itself.



Character network analysis of Émile Zola’s Les Rougon-Macquart

Y. Rochat

2015. Digital Humanities 2015, Sydney, June 29 - July 3, 2015.

In this work, we use network analysis methods to sketch a typology of novels based on characters and their proximity in the narration. We construct character networks modelling the twenty novels composing Les Rougon-Macquart, written by Émile Zola. To categorise them, we rely on methods that track down major and minor characters relative to the character-systems. To that end, we use centrality measures such as degree and eigenvector centrality. Eventually, with this analysis of a small corpus, we set the stage for a large-scale analysis of novels through their character networks.
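The two centrality measures named in this abstract can be sketched in a few lines. Degree counts a character's direct links, while eigenvector centrality also rewards being linked to well-connected characters; it is computed here by power iteration. The character graph is invented for illustration and does not come from any novel of the cycle.

```python
import math


def degree_centrality(graph):
    """Number of distinct characters each character is linked to."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def eigenvector_centrality(graph, iterations=200, tol=1e-10):
    """Power iteration on (A + I), which has the same eigenvectors as the
    adjacency matrix A but avoids oscillation on bipartite graphs."""
    if not graph:
        return {}
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        updated = {
            node: scores[node] + sum(scores[nb] for nb in graph[node])
            for node in graph
        }
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {node: value / norm for node, value in updated.items()}
        change = max(abs(updated[node] - scores[node]) for node in graph)
        scores = updated
        if change < tol:
            break
    return scores


# Invented undirected character network (adjacency sets).
characters = {
    "protagonist": {"friend", "rival", "servant"},
    "friend": {"protagonist", "rival"},
    "rival": {"protagonist", "friend"},
    "servant": {"protagonist"},
}
degree = degree_centrality(characters)
eigen = eigenvector_centrality(characters)
ranking = sorted(eigen, key=eigen.get, reverse=True)
```

Characters near the top of both rankings are candidates for major roles, while a low-degree character attached only to the protagonist, like the servant here, illustrates a minor one.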


The DHLAB Trajectory

D. Rodighiero; A. Rigal; L. Cellard

IC Research Day 2015, SwissTech Convention Center, EPFL, Lausanne, Switzerland, June 30, 2015.

This visualisation represents the research activity of the Digital Humanities Lab through publications and co-authorship. The vertical arrangement is ordered by time: each layer is a different year of publications, from the lab’s foundation to the present. The layers display the collaboration networks: two researchers are linked if they published together. The vertical trajectories represent the activity of a researcher over time. Each author’s position is fixed in space. As a consequence, the trajectories become a linear representation of the continuity of collaborations. The laboratory is here transformed into a geometrical structure that evolves over time despite the instability of its membership.


Representing the Digital Humanities Community: Unveiling The Social Network Visualization of an International Conference

D. Rodighiero

Parsons Journal of Information Mapping. 2015.

This paper deals with the sense of representing both a new domain such as Digital Humanities and its community. Based on a case study, where a set of visualizations was used to represent the community attending the international Digital Humanities conference of 2014 in Lausanne, Switzerland, the meaning of representing a community is investigated in the light of the theories of three acknowledged authors, namely Charles Sanders Peirce for his notion of the interpretant, Ludwig Wittgenstein for his insights on the use of language, and finally Bruno Latour for his ideas of representing politics. The result is a proposal for designing and interpreting social network visualizations in a more thoughtful way, while remaining aware of the relation between objects in the real world and their visualizations. As this type of work pertains to a wider scope, we propose bringing a theoretical framework to a young domain such as data visualization.



Carlo Helman : merchant, patron and collector in the Antwerp – Venice migrant network

I. di Lenardo

Art and Migration. Netherlandish Artists on the Move, 1400-1750. Leiden: Brill, 2014. p. 325-347.

This contribution is part of the monographic issue of the Netherlands Yearbook for History of Art devoted to a broad overview of “Art and Migration. Netherlandish Artists on the Move, 1400-1750”. In the dynamics of migration, circulation and establishment throughout Europe in the Modern Era, network analysis plays a fundamental role. The essay explores the prominent role played by Antwerp merchants in Venice in forging contacts between artists, patrons and art agents, and in promoting the exchange of goods and ideas within their adopted home. In the course of the 16th century, and more particularly towards the end of that period, the complex network of Netherlandish merchant families, operating on a European level, played a crucial role in the circulation of artists, paintings and other artworks in Italy and beyond. The article deals with Carlo Helman, a Venetian resident of Antwerp origins, a major figure whose importance in this context has been insufficiently studied. Helman’s family firm traded in practically every kind of commodity, ranging from wool and spices to pearls and diamonds, and, indeed, artworks, “in omnibus mundis regnis”, as we read in the commemorative inscription on his monumental tomb in the Venetian church of Santa Maria Formosa. A high-class international trader in Venice, Helman was consul of the “Nattione Fiamenga”. Helman had a conspicuous collection of art, including classics of the “Venetian maniera” like Titian, Veronese and Bassano, but also important pictures by Northern masters. Moreover, his collection contained a remarkable cartographic section. In Venice, Helman had contacts with the Bassano dynasty, Paolo Fiammingo, Dirck de Vries, Lodewijck Toeput (Pozzoserrato) and the Sadeler brothers, artists who, in one way or another, introduced novel themes and typologies to the Italian and, indeed, European market.
The dedication to Helman on a print by Raphael Sadeler, reproducing Bassano’s Parable of the Sower, captures the merchant’s role in the diffusion of Bassanesque themes in the North. Helman’s connections with the Zanfort brothers, dealers in tapestries and commercial agents of Hieronymus Cock, are further indications of the merchant’s exemplary role as collector, merchant and agent of artists in a European network of “art” commerce.


Digital Humanities 2014: representing a controverted definition

D. Rodighiero

IC Research Day 2014, EPFL, Lausanne, Switzerland, June 12, 2014.

The network portrays all keywords used in the Digital Humanities 2014 conference, which will take place in Lausanne, Switzerland. The keywords, represented by nodes, have been freely chosen by each author attending the conference and contributed via their papers and posters. Edges represent keywords appearing together in a contribution. The weight of the edges measures the occurrence of keyword pairs, multiplied by the number of authors creating them. The visualization is meant as a talking point, to foster a debate about the controversial definition of the Digital Humanities domain.
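The edge weighting described above can be read in more than one way. The sketch below takes one plausible reading as an assumption: every contribution adds its number of authors to the weight of each keyword pair it contains. The record layout and the sample contributions are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def keyword_edges(contributions):
    """Weight each keyword pair by the co-occurrences of the pair, with
    each co-occurrence counted once per author of the contribution
    (an assumed interpretation of the weighting)."""
    weights = Counter()
    for contribution in contributions:
        keywords = sorted(set(contribution["keywords"]))
        n_authors = len(contribution["authors"])
        for pair in combinations(keywords, 2):
            weights[pair] += n_authors
    return weights


# Invented contributions with hypothetical authors and keywords.
contributions = [
    {"authors": ["author_1", "author_2"],
     "keywords": ["visualization", "networks", "archives"]},
    {"authors": ["author_3"],
     "keywords": ["networks", "visualization"]},
    {"authors": ["author_4", "author_5", "author_6"],
     "keywords": ["archives", "ocr"]},
]
weights = keyword_edges(contributions)
for pair, weight in weights.most_common():
    print(pair, weight)
```

Sorting the keywords before pairing ensures that the same two keywords always map to the same edge, whatever order the authors listed them in.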


Character Networks and Centrality

Y. Rochat / H. Volken; F. Kaplan (Dir.)

University of Lausanne, 2014.

A character network represents relations between characters from a text; the relations are based on text proximity, shared scenes/events, quoted speech, etc. Our project sketches a theoretical framework for character network analysis, bringing together narratology, both close and distant reading approaches, and social network analysis. It is in line with recent attempts to automatise the extraction of literary social networks (Elson, 2012; Sack, 2013) and other studies stressing the importance of character-systems (Woloch, 2003; Moretti, 2011). The method we use to build the network is direct and simple. First, we extract co-occurrences from a book index, without the need for text analysis. We then describe the narrative roles of the characters, which we deduce from their respective positions in the network, i.e. the discourse. As a case study, we use the autobiographical novel Les Confessions by Jean-Jacques Rousseau. We start by identifying co-occurrences of characters in the book index of our edition (Slatkine, 2012). Subsequently, we compute four types of centrality: degree, closeness, betweenness, eigenvector. We then use these measures to propose a typology of narrative roles for the characters. We show that the two parts of Les Confessions, written years apart, are structured around mirroring central figures that bear similar centrality scores. The first part revolves around the mentor of Rousseau; a figure of openness. The second part centres on a group of schemers, depicting a period of deep paranoia. We also highlight characters with intermediary roles: they provide narrative links between the societies in the life of the author. The method we detail in this complete case study of character network analysis can be applied to any work documented by an index.
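The index-based pipeline described above can be sketched as follows, under the assumption that two characters co-occur when the index lists them on the same page; the thesis may define co-occurrence differently. The index entries are invented placeholders, and only closeness and betweenness (via Brandes' algorithm) are shown.

```python
from collections import defaultdict, deque
from itertools import combinations


def cooccurrence_graph(index):
    """Link two entries when the index lists them on a common page."""
    by_page = defaultdict(set)
    for name, pages in index.items():
        for page in pages:
            by_page[page].add(name)
    graph = {name: set() for name in index}
    for names in by_page.values():
        for a, b in combinations(sorted(names), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def closeness(graph):
    """Reachable nodes divided by the sum of BFS distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Brandes' algorithm for unweighted, undirected graphs."""
    scores = dict.fromkeys(graph, 0.0)
    for source in graph:
        stack, queue = [], deque([source])
        preds = {node: [] for node in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths
        sigma[source] = 1
        dist = {source: 0}
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {node: score / 2 for node, score in scores.items()}


# Invented index: entry name -> pages on which it appears.
index = {"A": [10, 40], "B": [10], "C": [40, 90], "D": [90]}
graph = cooccurrence_graph(index)
close, between = closeness(graph), betweenness(graph)
```

In this toy chain B, A, C, D, the entries A and C have the highest betweenness because every path between the two ends passes through them, which is the kind of intermediary role the abstract describes.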


Analyse des réseaux de personnages dans Les Confessions de Jean-Jacques Rousseau

Y. Rochat; F. Kaplan

Les Cahiers du Numérique. 2014.

DOI : 10.3166/LCN.10.3.109‐133.

This article studies the concept of centrality in the networks of characters appearing in Les Confessions by Jean-Jacques Rousseau. Our aim is thus to characterise certain aspects of the roles of the characters in the narrative on the basis of their co-occurrences in the text. We sketch a theoretical framework for literary network analysis, bringing together narratology, distant reading and social network analysis. We extract co-occurrences from a book index without the need for text analysis and describe the narrative roles of the characters. As a case study, we use the autobiographical novel Les Confessions by Jean-Jacques Rousseau. Finally, we compute four types of centrality (degree, closeness, betweenness, eigenvector) and use these measures to propose a typology of narrative roles for the characters.


A Network Analysis Approach of the Venetian Incanto System

Y. Rochat; M. Fournier; A. Mazzei; F. Kaplan

2014. Digital Humanities 2014, Lausanne, July 7-12, 2014.

The objective of this paper was to perform new analyses about the structure and evolution of the Incanto system. The hypothesis was to go beyond the textual narrative or even cartographic representation thanks to network analysis, which could potentially offer a new perspective to understand this maritime system.


Modeling Venice's maritime network - End 13th to Mid. 15th centuries

M. Fournier; Y. Rochat

2014. International Workshop ERC World Seastems - Maritime Networks in Space and Time, Paris, June 16-18, 2014.


Character networks in Les Confessions from Jean-Jacques Rousseau

Y. Rochat; F. Kaplan

2014. Texas Digital Humanities Conference, Houston, Texas, USA, April 10-12, 2014.



A social network analysis of Rousseau’s autobiography “Les Confessions”

Y. Rochat; F. Kaplan; C. Bornet

2013. Digital Humanities 2013, Lincoln, Nebraska, USA, July 15-19, 2013.

We propose an analysis of the social network composed of the characters appearing in Jean-Jacques Rousseau's autobiographical Les Confessions, in which edges are based on co-occurrences. The work consists of twelve volumes that span over fifty years of his life. Having a single author allows us to consider the book as a coherent work, unlike some of the historical texts from which networks are often extracted, and to compare the evolution of patterns of characters through the books on a common basis. Les Confessions, considered one of the first modern autobiographies, has the originality of letting us compose a social network close to reality, with only the bias introduced by the author, which has to be taken into account during the analysis. Hence, in this paper, we discuss the interpretation of networks based on the content of a book as social networks. We also, in a digital humanities approach, discuss the relevance of this object as a historical source and a narrative tool.


Analyse de réseaux sur les Confessions de Rousseau

Y. Rochat; F. Kaplan

2013. Humanités délivrées, Lausanne, Switzerland, October 1-2, 2013.



The oltramontani Network in Venice: Hans von Aachen in Context

I. di Lenardo

2010. Hans von Aachen in Context, Proceedings of the International Conference, Prague, September 22–25, 2010. p. 28-37.

Thanks to recent archival and historical research, it is now possible to specify the identity of some of the personalities mentioned in Van Mander's Lives who were related to and in close contact with Hans von Aachen. The reconstruction of the Venice and Treviso context in which the artist moved shows a dense network of relationships woven by the Flemish and German communities. The presence of a portrait by Hans von Aachen in the collection of paintings of Francesco Vrients is very valuable information: first, it identifies the painter as an intimate friend of the Vrients family; at the same time, the discovery of the inscription on the drawings of Cephalus and Procris (presented for this exhibition) is an important pointer for profiling the Vrients circle and its relationships with the Flemish jewellers' lobby. Indeed, he is the collector from Maastricht mentioned by Van Mander and one of the most eminent Flemish personalities in the lagoon, around whom intellectuals and artists probably gravitated: it is a fact that the man of letters Pieter Cornelisz de Hooft found hospitality in his house, in Campo Santa Maria Formosa, on the occasion of his trip to Italy in 1599. Additional documents also specify the role of Gaspar Rem in a Venetian and international context: his strong tie to the circle of the “Sadelers” who, especially through a shrewd art dealer like Giusto, played a crucial role in promoting the “Oltramontani” artists, forging friendships with Dirck de Vries, Rottenhammer and Joannes Koenig, to name a few.