Visualization techniques
Visual Analytics for Linguists
ESSLLI 2014
Chris Culy
revised 7 August 2014
A note on programming languages
- Much of the interesting work in data visualization is done on the web, in JavaScript.
This is what I will use for illustrations
- Other languages are possible, especially R, Python, and Java
Sources of JavaScript examples
Charts
- Charts are basic, but important
- Lots of tools to make charts. Some of them are interactive.
- An interesting comparison of chart types with the same data is here
Time-varying data: Time series
Data (typically numeric) that varies over time, e.g. word frequencies in a time-annotated corpus, as in the Google Ngram Viewer
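As a minimal sketch of where such a time series comes from, the following computes the relative frequency of one word per year. The toy corpus, the field names (`year`, `text`), and the whitespace tokenization are all assumptions made for illustration; a real corpus would need proper tokenization.

```javascript
// Hypothetical toy corpus: each document carries a year and some text.
const corpus = [
  { year: 1990, text: "the cat sat on the mat" },
  { year: 1990, text: "a dog barked" },
  { year: 2000, text: "the cat and the other cat" },
];

// Relative frequency of `word` per year: occurrences / total tokens in that year.
function frequencyByYear(docs, word) {
  const counts = new Map(); // year -> { hits, total }
  for (const { year, text } of docs) {
    // Naive tokenization (an assumption): lowercase and split on whitespace.
    const tokens = text.toLowerCase().split(/\s+/).filter(Boolean);
    const entry = counts.get(year) || { hits: 0, total: 0 };
    entry.hits += tokens.filter((t) => t === word).length;
    entry.total += tokens.length;
    counts.set(year, entry);
  }
  // Sort by year so the points are in chart order.
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([year, { hits, total }]) => ({
      year,
      frequency: total ? hits / total : 0,
    }));
}

const series = frequencyByYear(corpus, "cat");
```

The result is an array of `{ year, frequency }` points, one per year, which is the kind of input a line chart typically expects.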
Time-varying data: Timelines
Timelines are typically used for events, rather than numeric data
Maps
Lots of things you can do with them, but they can be difficult to get right.
- Google Maps, Open Street Map, etc.
- GIS systems are an alternative, e.g. CartoDB (there is a free version)
- For DIY, D3 has some mapping capabilities, and Polymaps is more specialized
A note about graphs
Graphs are a very popular construct, especially for (social) networks. However, we can make graphs out of many types of data.
The key is to decide what information will correspond to nodes and what information to edges. Typically:
- Nodes correspond to entities ("things", or "objects")
- Edges (also called links) correspond to relationships between the entities
For example, if we have word co-occurrence information at the sentence level, we would let nodes be the words, and two nodes would be linked if they co-occur in the same sentence.
Graphs are abstract mathematical objects! They can be visualized in a variety of ways.
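The word co-occurrence example can be sketched as follows. The input format (sentences as arrays of tokens), the function name, and the `{ nodes, links }` output shape are assumptions for illustration; the link weight, counting the sentences in which a pair co-occurs, is an addition that the example above does not require.

```javascript
// Hypothetical input: sentences already split into word tokens.
const sentences = [
  ["colorless", "green", "ideas"],
  ["green", "ideas", "sleep"],
];

// Nodes are words; two words are linked if they occur in the same sentence.
function cooccurrenceGraph(sents) {
  const nodes = new Set();
  const weights = new Map(); // "a\tb" -> number of sentences containing both a and b
  for (const sent of sents) {
    // Count each word once per sentence; sorting gives each pair one fixed key.
    const words = [...new Set(sent)].sort();
    words.forEach((w) => nodes.add(w));
    for (let i = 0; i < words.length; i++) {
      for (let j = i + 1; j < words.length; j++) {
        const key = words[i] + "\t" + words[j];
        weights.set(key, (weights.get(key) || 0) + 1);
      }
    }
  }
  const links = [...weights].map(([key, weight]) => {
    const [source, target] = key.split("\t");
    return { source, target, weight };
  });
  return { nodes: [...nodes].map((id) => ({ id })), links };
}

const graph = cooccurrenceGraph(sentences);
```

Note that the result is just data: nothing here says how the graph should be drawn.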
Graphs/Networks (non-hierarchical)
Lots of ways to represent graphs/networks. Some algorithms are slow for large amounts of data, so be careful.
- Demo of 3 different graph visualizations
- Comparison of those same types (in Protovis):
  - Force-directed node-link (Advantage: does clustering automatically)
  - Arc diagram
  - Matrix diagram
- Lots of tools do force-directed layouts, e.g. D3 and theJIT for JavaScript.
Gephi is the Java champion
- Graphviz has several different layout algorithms, including ones for hierarchical graphs. It is harder to integrate
- Dagre: directed graphs in JavaScript (optionally with D3)
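Of the three visualization types, the matrix diagram needs the least layout computation: it is essentially a drawing of the adjacency matrix. A minimal sketch, assuming a small undirected graph given as a list of node names and weighted links (all names here are hypothetical):

```javascript
// Hypothetical undirected graph: node names and weighted links.
const nodeNames = ["a", "b", "c"];
const edgeList = [
  { source: "a", target: "b", weight: 2 },
  { source: "b", target: "c", weight: 1 },
];

// Build the adjacency matrix: cell [i][j] holds the weight of the link
// between node i and node j, or 0 if there is none.
function adjacencyMatrix(names, edges) {
  const index = new Map(names.map((name, i) => [name, i]));
  const matrix = names.map(() => names.map(() => 0));
  for (const { source, target, weight } of edges) {
    const i = index.get(source);
    const j = index.get(target);
    matrix[i][j] += weight;
    matrix[j][i] += weight; // undirected, so keep the matrix symmetric
  }
  return matrix;
}

const matrix = adjacencyMatrix(nodeNames, edgeList);
// matrix is [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
```

Each cell can then be drawn as a colored square. The matrix has one cell per pair of nodes, so its size grows quadratically with the number of nodes.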
Hierarchical data (trees as lines)
Lots and lots of ways to represent hierarchical data. Trees are one type of hierarchical data.
Hierarchical data (trees as area)
Some considerations for LangVis
cf. IBM's Many Eyes as an example of these
- Ability to see the original data
- Ability to navigate backwards and forwards between states (not common)
- Ability to link to a state (not common)
- Ability to annotate the visualization (very rare)
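One common way to make a state linkable is to encode it in the URL fragment. The sketch below is only an illustration under stated assumptions: the state fields (`word`, `zoom`) and the function names are hypothetical, and a real visualization would have its own notion of state.

```javascript
// Hypothetical visualization state: the selected word and the zoom level.
const state = { word: "ideas", zoom: 2 };

// Encode the state as a URL fragment, e.g. for a "link to this view" button.
function stateToHash(s) {
  const params = new URLSearchParams({ word: s.word, zoom: String(s.zoom) });
  return "#" + params.toString();
}

// Decode a fragment back into a state, using defaults for missing keys.
function hashToState(hash) {
  const params = new URLSearchParams(hash.replace(/^#/, ""));
  return {
    word: params.get("word") || "",
    zoom: Number(params.get("zoom")) || 1,
  };
}

const hash = stateToHash(state); // "#word=ideas&zoom=2"
const restored = hashToState(hash); // { word: "ideas", zoom: 2 }
```

In a browser, assigning the result to `location.hash` and listening for the `hashchange` event would also let the browser's back and forward buttons move between states.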
Charts with language / linguistic data
Dictionaries and related
- Visuwords WordNet (there are other similar ones)
- DocuBurst by Chris Collins et al.: documents + WordNet
- Dictionary definitions over time by Theron and Fontanillo
- [Demo] Verweis Viewer: link relations in a terminology database.
C. Culy, E. Chiocchetti, and N. Ralli. "Visualizing conceptual relations in legal terminology." In Proceedings of the 17th International Conference on Information Visualization (IV 2013), July 16-18, 2013, London, UK, pp. 333-338.