Can you recommend basic data visualization tools?

6

What are some good basic visualization tools to help present and analyze data sets, whether for internal use or to embed on a site?

Asked April 13, 2010

15 Answers

5

Chris's answer hits many of the high points. Others worth looking at:

Flash-based (generally to be avoided, but sometimes useful):

JavaScript-based:

Miscellaneous:

  1. Flare is exceptionally cool, but last time I tried it, the documentation was really vexing. The demo is brilliant, but you essentially have to reverse-engineer the source code to figure out how it’s doing what it’s doing. (And that code is not easy to follow.)

    Though that was a year ago now, and I haven’t tried it since; does anyone else have a more recent experience?

550
5

R, specifically the lattice and ggplot2 packages.

Learn the power of conditioning on a categorical variable to make multiple plots:

library(lattice)
xyplot(numphone ~ year | country, data=wp, type="b", scales="free")

Full Example (gist)

The key here is the "| country" notation, which "conditions" on country to create one plot for each country in the dataset. This can be a great way to rapidly explore a dataset of counties, states, school districts, whatever. You decide what you want on your x and y axes, then condition on an appropriate categorical variable; in one line of code you can have thousands of plots showing how your x and y vary across categories. This is the simplest case; there is much more power under the hood.
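Since the wp data frame lives in the linked gist and isn't shown here, a self-contained sketch may help. The numbers below are invented purely for illustration; only the column names (numphone, year, country) follow the call above.

```r
library(lattice)

# Hypothetical stand-in for the `wp` data frame: invented numbers,
# three countries observed in three years each.
wp <- data.frame(
  country  = factor(rep(c("A", "B", "C"), each = 3)),
  year     = rep(c(1990, 2000, 2010), times = 3),
  numphone = c(10, 25, 60, 5, 8, 40, 100, 120, 130)
)

# "| country" conditions on country: one panel per level of the factor.
# scales = "free" gives each panel its own axis ranges.
p <- xyplot(numphone ~ year | country, data = wp, type = "b", scales = "free")

# At the console the plot draws automatically; inside a script or a
# function, print it explicitly.
print(p)
```

Swapping country for any other categorical column in your own data should give one panel per category in the same way.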

To produce multiple pages and save them to a PDF, just do:

pdf('~/path/to/mydoc.pdf', width=11, height=8.5)
xyplot( numphone ~ year | country, data=wp, type="b", scales="free", layout=c(1,1) )
dev.off()

You'll then get one page for each country in the dataset.

ggplot2 has a similar feature known as "faceting" the data.

This concept of "conditioning" and "faceting" can let you slice massive datasets into digestible morsels very rapidly.
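For comparison, here is a rough ggplot2 equivalent of the lattice call above, using facet_wrap(). It's a sketch rather than a drop-in: the data frame is a made-up stand-in for wp (invented numbers, same column names), and it assumes ggplot2 is installed.

```r
library(ggplot2)

# Hypothetical stand-in for the `wp` data frame: invented numbers,
# three countries observed in three years each.
wp <- data.frame(
  country  = rep(c("A", "B", "C"), each = 3),
  year     = rep(c(1990, 2000, 2010), times = 3),
  numphone = c(10, 25, 60, 5, 8, 40, 100, 120, 130)
)

# facet_wrap(~ country) plays the role of lattice's "| country":
# one panel per country, each with its own axis ranges.
p <- ggplot(wp, aes(x = year, y = numphone)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ country, scales = "free")

print(p)
```

geom_line() plus geom_point() approximates lattice's type="b" (points joined by lines).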

  1. Yes yes yes. After using it for a few weeks I can’t believe I forgot to mention R + lattice. It's tremendously powerful. +1

50
3

I'm a fan of ManyEyes, but the limited color schemes bug me and it can be a browser crasher. Swivel is one alternative. Here's an example where we used it: http://www.pbs.org/newshour/rundown/2010/04/a-century-of-coal-mining-deaths.html

Of those two, ManyEyes will probably give you a faster look at the data. Swivel will give you a nicer presentation at the end.

Processing (and NodeBox, if you prefer Python to Java) is harder to learn but will do things Swivel and ManyEyes won't.

When you just need a chart, there's always Google Charts.

785
2

Simile Widgets can be used at a basic level or a highly advanced one, and they can be pretty robust. I'm a big fan of the Exhibit framework, which I used for a project last summer at DMN.

  1. As one of the project leads I’m biased, but I think Exhibit is quite powerful and very easy to use. You don’t need to install anything; it’s just a JavaScript library. You serve up data in a file (CSV, JSON, XML) or link to a Google spreadsheet, then add “html” tags to your page. Exhibit understands those tags and turns them into interactive visualization widgets. There’s also a WordPress plugin, Datapress, that lets you put visualizations into your blog posts. Several newspapers have tried it; I’ve put links to them at http://people.csail.mit.edu/karger/Exhibit/CAR/

554
1

ManyEyes by IBM is one that comes to mind:

http://manyeyes.alphaworks.ibm.com/manyeyes/

Also just saw this post mentioned on NICAR-L that discusses Processing:

http://flowingdata.com/2010/04/13/data-visualization-tutorial-in-processing/

385
1

For internal use, R is great. (It's on my long list of things to learn, but watching Amanda Cox work magic with it convinced me it is absolutely the thing to learn.)

For external use, I'll toss in another vote for Google Charts. They are quite easy to style, and they work great. These charts are Google Charts, and so are these.

Many Eyes is OK, but it is built with Java and is just kind of a rotten user experience. However, keep an eye on this company: it was started by Fernanda Viégas and Martin Wattenberg, the geniuses behind Many Eyes. If anyone is capable of designing the off-the-shelf data viz toolset journalists have been waiting for, it's them.

  1. I really like Google Charts, and have been using it for a large visualization project we’re launching next month. But I also find it to be pretty slow when dealing with a lot of visualizations (or in general).

    But of course I may not be using it optimally.

208
1

The JIT (JavaScript InfoVis Toolkit) is a good option when dealing with networks or any tree-like visualizations.

One great thing is that you can feed data to it in JSON format :) It also supports Ajax for dynamic loading of nodes.

Website: http://thejit.org/
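
To give a feel for the JSON side, here is a small sketch in R (the language used elsewhere in this thread) that builds a nested tree and serializes it. It assumes the jsonlite package is installed and that nodes use the id / name / data / children layout, which is my understanding of the JIT's tree format; the node values are invented, so check the field names against the toolkit's own documentation.

```r
library(jsonlite)

# Hypothetical two-level tree. Field names (id, name, data, children)
# are assumed to match the JIT's tree node layout; values are invented.
tree <- list(
  id = "root",
  name = "All sources",
  data = list(count = 3),
  children = list(
    list(id = "n1", name = "Source A", data = list(count = 1), children = list()),
    list(id = "n2", name = "Source B", data = list(count = 2), children = list())
  )
)

# auto_unbox = TRUE writes length-one vectors as scalars ("root")
# instead of one-element arrays (["root"]).
json <- toJSON(tree, auto_unbox = TRUE, pretty = TRUE)
cat(json, "\n")
```

The resulting string could be saved to a file and served to the page that loads the toolkit.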

10
0

When it comes to building more complex data displays for the web, Adobe Flex is really cool; I'm surprised more papers aren't using it.

It generates Flash files, but unlike Flash, it was designed to make web apps, not web animations.

The pro version comes with a handful of components to display data -- charts, graphs, all that -- and it's pretty easy to "bind" data to bits of the page, so that records can be easily displayed. At the Palm Beach Post, where I work, we built this with it (on deadline) in two days: http://stimulus.palmbeachpost.com/map/

95
0

Excel jockeys who want to make the leap to R will get a lot of help from the Learn R tutorials.

If you like Excel but sometimes need R functions, the Excel add-in RExcel is the best tool out there for integration.

But I think I've gotten away from the original question about basic dataviz tools...

395
0

ChrisA already mentioned Google Charts, but I just wanted to add that those include Interactive Charts via their Visualization API. Haven't used 'em yet, but their Gallery is impressive.

63
0

Given that the discussion of KMS tends to be highly abstract, and that certain kinds of articles may be more amenable to data-ification than others, Daniel's exercise of extracting possible data from real-life news pieces on different beats seems fruitful.

Anybody want to take a stab at a couple of typical articles from the Chicago News Cooperative, where I work?

http://www.chicagonewscoop.org/the-chicago-way-city-treasurer-threatens-defamation-suit-over-policeman%E2%80%99s-articles/

http://www.chicagonewscoop.org/library-buys-14th-century-book-by-catholic-rebels/

  • Subject of article: 14th century book that was purchased by The Newberry
  • Physical object: handwritten text covered in a blue pattern
  • Book language: Latin
  • Number of pages in book: 120
  • Book author: Peter John Olivi
  • Book author personality: "Described as charismatic and brilliant, Olivi was regarded as especially dangerous by church authorities for his intellectual influence on Catholics"
  • Cost of the book: $45,000 (for a larger collection)
  • Where it was purchased from: auction at Christie's
  • Source of funding: B.H. Breslauer Foundation
  • Prior owners of book: Dominican library in France, Hispanic Society of America
  • Book location: The Newberry, 60 West Walton Street near North Clark Street
  • Ancillary information about The Newberry: "includes first-edition King James Bible, as well as letters from Napoleon III and Thomas Jefferson, among its collection of 1.5 million books"

http://www.chicagonewscoop.org/a-post-with-little-to-do-but-plenty-of-money-to-do-it/

  • Subject of article: vice mayor of Chicago's budget (which is derivative of the Chicago budget)
  • Name of position: vice mayor of Chicago
  • Salary of position: unpaid
  • When position was created: 1976
  • Budget amount: $114,232
  • Current person: Alderman Bernard L. Stone
  • Various opinions

I'm still a little unclear on what the range of applications of (semi-)structured data might be.

And on what it would take, technically speaking, to do it. (We're currently building out our site... is this the kind of thing that has to be planned early on and built in to the backend? I'd hate to miss out on an opportunity out of ignorance.)

  1. “is this the kind of thing that has to be planned early on and built in to the backend?”

    Almost certainly not. Systems planned early and built into the core in the absence of clear use cases and requirements almost inevitably become unwieldy burdens.

    Seek to design a clean simple system with well defined functions so that later it can be mixed with other simple systems. Don’t worry about accounting for everything at the beginning, as long as you can be confident that you aren’t closing off any options…

15
0

If you are a software developer programming in .NET and Silverlight, ActiveAnalysis by GrapeCity is a component and library that allows you to embed rich, interactive OLAP and data visualization features into your applications without writing a single line of code. Plus, you can customize the user experience using standard programming techniques.

http://www.datadynamics.com/Products/ActiveAnalysis

Of particular relevance to this thread is the free stand-alone visualization app installed with the free trial download, which you can use to generate visualizations without opening any programming tool.

There are no run-time royalties. You can embed and deploy on the web and on an unlimited number of desktops at no cost beyond the developer license.

0
0

For those of us more hack than hacker, has anyone tried Tableau? I just ran across it while browsing Koci's Twitter feed, but I don't have a Windows machine to try it out. Seems like a few papers have started to use it already, though:

WSJ: Tale of 100
Wired: Console Wars!
Seattle Times: Worker Injury Claims

0
0

You can try our product, FMiner: http://www.fminer.com. We will now create an extraction project for free for every new user.
