Linking Data
Data linking allows for analysis and visualization of multiple related data sets. Each data set can be "linked" to others either via shared identical attributes or in more sophisticated ways. Data sets are never joined or merged; instead, a logical network is built between all linked data sets. Implied (transitive) links are established automatically once user-input links are in place. Once data sets are linked, "selections" made graphically or algorithmically propagate across all open displays. Examples are given below.
Planned improvements: AI-enhanced domain-specific auto-linking.
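To make the idea of implied (transitive) links concrete, here is a minimal conceptual sketch in plain Python. It is not glue's API: the data set names, attribute names, and the `connected_attributes` helper are all hypothetical, and the link network is modeled simply as an undirected graph that is searched breadth-first.

```python
from collections import defaultdict, deque


def connected_attributes(links, start):
    """Return every attribute reachable from `start` through the link network."""
    # Build an undirected graph: each link connects two (dataset, attribute) pairs.
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    # Breadth-first search collects direct and implied (transitive) links.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical user-input links (names are illustrative only).
user_links = [
    (("flights", "lon"), ("map", "x")),
    (("map", "x"), ("weather", "longitude")),
]

# "flights" and "weather" were never linked directly, yet the link is implied.
linked = connected_attributes(user_links, ("flights", "lon"))
print(sorted(linked))
```

In this sketch the data sets themselves are never merged; only the small graph of links is stored, which is the property the paragraph above describes.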
🌎 An Intuitive Example, from Geographic Information Science (GIS)
The glue-based "airplane" demo explains how to link a table containing data on planes' 3D flight paths to maps. Straightforward "identity" linking on coordinates is shown at minute 1:35 of the associated "Learn to fly glue, fast" video. If you're impatient and have great eyesight, see the gif↓.
🪐 An example of high-dimensional data, from Astronomy
Often, astronomers just want to combine tabular data with images (akin to satellite maps), not unlike the "airplane" example above. But some astronomers are also lucky enough to have "3D" data, and sometimes even 3D measurements as a function of time. Thus, a single investigation often uses a variety of formats (tables, 2D images, 3D cubes), sometimes in a variety of coordinate systems. The screenshots you can open below show the many data sets linked together in glue by Bialy et al. (2021) in their discovery of the "Perseus-Taurus Supershell," which is described in the video here.
Identity links (even if variable names don't match)
Some of the "3D" data used in the PerTau study were "position-position-velocity" cubes, so they are linked on two spatial coordinates, and also on velocity, called "vrad" in one data set and "vopt" in the other. The expert user knows that these variable names mean the same thing, and can just manually establish a link, ignoring the different names.
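As a rough illustration of an identity link between differently named variables, the sketch below uses plain Python rather than glue's own API. The tables, their values, and the `select_range` helper are made up for the example; only the names "vrad" and "vopt" come from the text above.

```python
# Two hypothetical data sets whose velocity columns have different names.
cube_a = {"vrad": [-5.0, 0.0, 5.0, 10.0]}
cube_b = {"vopt": [10.0, -5.0, 2.5, 5.0]}
datasets = {"cube_a": cube_a, "cube_b": cube_b}

# The user declares that the two names describe the same quantity.
identity_link = {("cube_a", "vrad"): ("cube_b", "vopt")}


def select_range(values, lo, hi):
    """Return a boolean mask marking the values that fall inside [lo, hi]."""
    return [lo <= v <= hi for v in values]


# A selection defined on cube_a's "vrad" ...
lo, hi = 0.0, 6.0
mask_a = select_range(cube_a["vrad"], lo, hi)

# ... is re-evaluated on cube_b by following the link to "vopt".
linked_dataset, linked_name = identity_link[("cube_a", "vrad")]
mask_b = select_range(datasets[linked_dataset][linked_name], lo, hi)

print(mask_a)
print(mask_b)
```

Note that the selection is stored as a rule (a velocity range), not as a list of rows, so it can be applied to any data set that exposes the linked quantity.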
"Advanced" links, e.g. coordinate transformations
Astronomy coordinate systems can be annoying, owing to the messy spherical geometry and time-dependence needed to convert amongst them. In the version of glue used by astronomers, many of these transformation functions can be accessed directly from the linking interface, which offers some "advanced" links by default. Users can also define their own links, of arbitrary complexity.
🧬 A very sophisticated example, using summary statistics, from Biology
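A user-defined link can be thought of as a pair of functions, one for each direction. The sketch below is a simplified stand-in written in plain Python, not glue's linking interface: instead of a full sky-coordinate transformation it uses the standard parallax-to-distance relation (distance in parsecs = 1000 / parallax in milliarcseconds), and the catalog values are invented for illustration.

```python
def parallax_to_distance(parallax_mas):
    """Convert parallax in milliarcseconds to distance in parsecs."""
    return 1000.0 / parallax_mas


def distance_to_parallax(distance_pc):
    """Inverse transformation: distance in parsecs to parallax in milliarcseconds."""
    return 1000.0 / distance_pc


# Hypothetical catalog that stores parallax only.
catalog = {"parallax": [10.0, 5.0, 2.0]}

# A selection made in distance space (100 to 300 pc) on some other data set
# can be applied to the catalog by transforming its parallax values first.
derived_distance = [parallax_to_distance(p) for p in catalog["parallax"]]
mask = [100.0 <= d <= 300.0 for d in derived_distance]

print(derived_distance)
print(mask)
```

Supplying both directions lets a selection travel either way across the link; a real coordinate transformation would follow the same forward/inverse pattern with more involved math.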
Sometimes, one needs to perform a calculation to transform a subset from one dataset to another. Here, glue genes is used to link an eQTL dataset to a spatial transcriptomics dataset, first by linking the genes across the two datasets and then by calculating a summary statistic on the gene expression matrix so that observations can be color-coded by their expression of that gene subset.
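The two steps described above (link the genes, then summarize expression over the selected genes) might look roughly like the following. This is a conceptual sketch in plain Python with invented gene names, p-values, and expression values; it does not reproduce glue genes' actual implementation, and the choice of the mean as the summary statistic is an assumption made for the example.

```python
from statistics import mean

# Hypothetical eQTL table: one row per gene.
eqtl = [
    {"gene": "G1", "pvalue": 1e-8},
    {"gene": "G2", "pvalue": 0.2},
    {"gene": "G3", "pvalue": 1e-6},
]

# Step 1: a subset chosen in the eQTL data set (here, a significance cut).
selected_genes = {row["gene"] for row in eqtl if row["pvalue"] < 1e-5}

# Hypothetical expression matrix: one row per observation, one column per gene.
gene_names = ["G1", "G2", "G3"]
expression = {
    "spot_a": [4.0, 1.0, 2.0],
    "spot_b": [0.0, 9.0, 1.0],
}

# Step 2: genes are linked by name, so the subset maps to matrix columns.
columns = [i for i, name in enumerate(gene_names) if name in selected_genes]

# Step 3: one summary value per observation, usable for color-coding.
score = {
    spot: mean(values[i] for i in columns)
    for spot, values in expression.items()
}

print(score)
```

The resulting per-observation score is what would drive the color scale in a spatial display.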
Smart Linking
As AI and other new computational approaches to linking emerge, the LIVE team plans to expand "smart" linking options, saving the user a good deal of time and reducing the opportunity for accidental errors. Read more on our AI page.