| connector | server | visualization |
|---|---|---|
| gogr (for R) | gogd (in Clojure) | gog-dummy (toy example) |
| gogpy (for Python) | | gog-charted.co (the charted.co interface) |
| | | gogi (general scatterplot) |
`gog` separates data processing and data visualization. Everybody wants to have nice interactive visualizations in a browser anyway. `gog` is a three-piece architecture:
- connector from data processing environment to server
- gog server to pass data from connector to visualization
- browser-based data visualization that accepts data from server
All the pieces can be swapped around and even hosted in different places, allowing quite a few combinations.
Analogy with ggplot2: `ggplot(data=your_data)`
All you need is a function (`gog`) that HTTP POSTs your data to a `gog` server. As currently implemented, that means POST to `http://localhost:4808/data`. Currently, data is passed as a JSON array of simple objects, like `[{"var_name": 5, ...`.
- gogr: an R package for sending data to a gog server
- gogpy: a Python package for sending data to a gog server
These are super easy to make in any language with support for JSON and HTTP.
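As a rough illustration of how small a connector can be, here is a minimal sketch in Python using only the standard library. It is not the actual gogpy code. The helper name `build_request`, the `timeout` parameter, and the `Content-Type` header are assumptions made for this example; the URL and the JSON-array-of-objects payload come from the description above.

```python
import json
import urllib.request

# Default endpoint taken from the text above; change it if your server differs.
GOG_URL = "http://localhost:4808/data"


def build_request(data, url=GOG_URL):
    """Build (but do not send) the POST request carrying `data` as JSON."""
    body = json.dumps(data).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the server does not require this header, but it is polite.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def gog(data, url=GOG_URL, timeout=5):
    """POST a list of simple dicts to a gog server and return the HTTP status."""
    with urllib.request.urlopen(build_request(data, url), timeout=timeout) as resp:
        return resp.status
```

With a gog server running, a call such as `gog([{"height": 5, "weight": 10}])` (hypothetical field names) would send one data set to every connected visualization.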
Analogy with ggplot2: you don't need a server because everything's in `R`.
As currently implemented, a gog server runs on port 4808. That port is also used by the game "Command and Conquer Red Alert" and it is certainly acceptable to use another port.
As currently implemented, a gog server accepts a POST body at `/data` and rebroadcasts it to all clients listening to the websocket at `/data`. The server only passes the contents through, as text.
These are super easy to make in any language with support for HTTP and websockets.
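The pass-through behavior can be sketched without any networking. The example below is not gogd and does not speak HTTP or websockets; it only models the fan-out step, with one `asyncio.Queue` standing in for each websocket client. The `Relay` class and its method names are hypothetical.

```python
import asyncio


class Relay:
    """Fan out each posted body, unchanged, to every connected client."""

    def __init__(self):
        self.clients = set()  # one asyncio.Queue per listening client

    def connect(self):
        """Register a client; a real server would do this on websocket open."""
        queue = asyncio.Queue()
        self.clients.add(queue)
        return queue

    def disconnect(self, queue):
        """Forget a client; a real server would do this on websocket close."""
        self.clients.discard(queue)

    def post(self, body):
        """Handle a POST body: pass the text through to all clients as-is."""
        for queue in self.clients:
            queue.put_nowait(body)
        return len(self.clients)


async def demo():
    relay = Relay()
    first, second = relay.connect(), relay.connect()
    relay.post('[{"var_name": 5}]')
    # Both clients receive the same text, which the relay never parsed.
    return await first.get(), await second.get()
```

Running `asyncio.run(demo())` returns the same string twice, once per client. A real server would wrap `post` in an HTTP handler and `connect`/`disconnect` in websocket open and close handlers.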
Analogy with ggplot2: `aes(x=variable) + geom_histogram()` etc., or:

"Dear internet, please port ggplot from R to Javascript" - Joshua Gourneau, 2011
These are just HTML/CSS/JavaScript, viewed in a browser. They connect via websocket to `http://localhost:4808/data` and accept incoming JSON arrays of simple objects, like `[{"var_name": 5, ...`. Then they present a data visualization and support some level of interactivity.
- gogi: a toy example that just displays the received text data
- gog-charted.co: the charted.co interface extended to support `gog`
It's not super easy to make a good component here, but a component can then be used with any language/environment/system that sends data into `gog`.
It would be nice to have visualizations that support useful features like exporting to common formats, maintaining a history of recent data sets and visualizations, and switching between common visualization types.
Something like the "graphboard" from Wilkinson's Grammar of Graphics would be nice.
You should be able to use whatever language you want for data processing and still have all the same visualization tools at your fingertips.
You should be able to visualize interactively—both quickly making new plots and interacting with your current plots—regardless of what machine(s) your data code is running on.
You should be able to have total control over your data and visualization systems, without handing data over to or otherwise relying on external providers.
We need more and better browser-based data visualization tools that are gog-compatible, sufficiently flexible, and sufficiently feature-rich.
There are places where the separation between data processing and data visualization is not always clear. Which end of the system is responsible for binning a histogram?
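To make the histogram question concrete, here is a small sketch of the connector-side option: binning before sending, so that the visualization only has to draw bars. The function `bin_values` and the `bin_start`/`count` field names are made up for this example and are not part of any gog component. The alternative is to send the raw rows and let the visualization do the binning.

```python
from collections import Counter


def bin_values(values, bin_width):
    """Count values per bin of fixed width; bins are [start, start + width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return [
        {"bin_start": start, "count": counts[start]}
        for start in sorted(counts)
    ]


# Option 1: send the raw rows and let the visualization bin them.
raw = [{"x": v} for v in [1, 2, 2, 7, 11, 12]]

# Option 2: bin in the data environment and send only the counts.
binned = bin_values([row["x"] for row in raw], bin_width=5)
```

Both `raw` and `binned` are JSON-ready arrays of simple objects, so either one fits the current data format. The trade-off is that pre-binned data is smaller but the bin width can no longer be changed interactively in the browser.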
Ad hoc development and extension of `gog` could break compatibility between components.
- Gosh a lot of it is just building out cool front-end pieces.
- Add an additional control channel for interacting with visualizations from programming environments.
- Develop a format, or adopt an existing one, for representing a visualization for interoperability.
- Some clever scheme for dynamic port assignments and so on.
- Would it make sense to implement with web components somehow?
- Bundle a gog server with some good visualizations and distribute as an easy-to-run package.
- A web service for sharing visualizations.
ggobi, successor to XGobi, is a system for multivariate data visualization. It's mostly separate from data processing tools because of its focus on visualization. There is a package (rggobi) for interacting with `ggobi` from R. So with that, the architecture is something like `gog`. Unfortunately, `ggobi` doesn't seem to be actively maintained, and it uses Gtk2. But it does have a lot of neat features that aren't available in many other places. Hadley says that `ggvis` and `tourr` will eventually succeed `ggobi`.
imMens (read as "immense") is a cool project that has a `gog`-style split architecture but very tight coupling between the data/server and browser side. It does pre-processing of possibly large datasets, then passes data encoded as PNG graphics to a browser where it is further processed and displayed using clever WebGL. The whole idea is a lot of fun and they say (in the paper) that they're working on making it easier to create `imMens` visualizations.
htmlwidgets is a very neat project that makes it easy to generate web visualizations from and in R. It's all very R-based, and the functions that get produced can take any sort of input data and arguments. The way they've standardized the approach, however, means it would likely be relatively straightforward to take an `htmlwidget`-ized visualization and transform it to a `gog` visualization.
Plotly is pretty neat. It's similar to `gog` but has more requirements for articulating data and plot options in the connectors (specifying `traces`, etc.), and the server and front-ends come from Plotly's machines. Also, Plotly is a business that needs to make money, and their products are not Free or open source.