Yes yes yes. After using it for a few weeks I can’t believe I forgot to mention R + lattice. It's tremendously powerful. +1
Can you recommend basic data visualization tools?
What are some good basic visualization tools to help present and analyze data sets, whether for internal use or to embed on a site?
15 Answers
Chris's answer hits many of the high points. Others worth looking at:
Flash-based (generally to be avoided, but sometimes useful):
Javascript-based:
Miscellaneous:
- Google Fusion Tables
- Wordle (word clouds)
R, specifically the lattice and ggplot2 libraries.
Learn the power of conditioning on a categorical variable to make multiple plots:
xyplot( numphone ~ year | country, data=wp, type="b", scales="free")
The key here is the "| country" notation, which "conditions" on country to create one plot for each country in the dataset. This can be a great way to rapidly explore a dataset of counties, states, school districts, whatever. You decide what you want on your x and y axes, then condition on an appropriate categorical variable; in one line of code you can have thousands of plots showing how x and y vary across categories. This is the simplest case; there is much more power under the hood to unleash.
To do multiple pages and store to pdf just do:
pdf('~/path/to/mydoc.pdf', width=11, height=8.5)
xyplot( numphone ~ year | country, data=wp, type="b", scales="free", layout=c(1,1) )
dev.off()
You'll then get one page for as many countries as are in the dataset.
ggplot2 has a similar feature known as "faceting" the data.
This concept of "conditioning" and "faceting" can let you slice massive datasets into digestible morsels very rapidly.
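The ggplot2 faceting mentioned above might look roughly like this. It is only a sketch: the original `wp` data frame isn't shown in this thread, so a hypothetical stand-in is built here from R's bundled WorldPhones matrix (which holds world regions rather than countries), reusing the same column names as the lattice call.

```r
library(ggplot2)

# ASSUMPTION: stand-in for the `wp` data frame used in the lattice example.
# WorldPhones ships with R; its columns are regions, relabelled as "country"
# here only so the column names match the xyplot() call above.
wp <- as.data.frame(as.table(WorldPhones), stringsAsFactors = FALSE)
names(wp) <- c("year", "country", "numphone")
wp$year <- as.numeric(wp$year)

# One panel per country; scales = "free" gives each panel its own axes,
# like scales="free" in the lattice version.
p <- ggplot(wp, aes(x = year, y = numphone)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ country, scales = "free")

print(p)
```

Here facet_wrap(~ country) plays the role of "| country" in the lattice formula, and geom_line() plus geom_point() approximates type="b".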
I'm a fan of ManyEyes, but the limited color schemes bug me and it can be a browser crasher. Swivel is one alternative. Here's an example where we used it: http://www.pbs.org/newshour/rundown/2010/04/a-century-of-coal-mining-deaths.html
Of those two, ManyEyes will probably give you a faster look at the data. Swivel will give you a nicer presentation at the end.
Processing (and NodeBox, if you prefer Python to Java) is harder but will do things Swivel and ManyEyes won't.
When you just need a chart, there's always Google Charts.
Simile Widgets, which could be used at a basic level or highly advanced, can be pretty robust. I'm a big fan of the Exhibit framework, which I used for a project last summer at DMN.
ManyEyes by IBM is one that comes to mind:
http://manyeyes.alphaworks.ibm.com/manyeyes/
Also just saw this post mentioned on NICAR-L that discusses Processing:
http://flowingdata.com/2010/04/13/data-visualization-tutorial-in-processing/
For internal use, R is great. (It's on my long list of things to learn, but watching Amanda Cox work magic with it convinced me it is absolutely the thing to learn.)
For external use, I'll toss in another vote for Google Charts. They are quite easy to style, and they work great. These charts are Google, and so are these.
Many Eyes is OK, but it is done with Java and is just kind of a rotten user experience. However, keep an eye on this company: It was started by Fernanda Viégas and Martin Wattenberg, the geniuses behind Many Eyes. If anyone is capable of designing the off-the-shelf data viz toolset journalists have been waiting for, it's them.
The JIT (JavaScript InfoVis Toolkit) is a good option when dealing with networks or any tree-like visualizations.
One great thing is that you can feed data to it in JSON format :) It also supports Ajax for dynamic loading of nodes.
Website: http://thejit.org/
When it comes to building more complex data displays for the web, Adobe Flex is really cool; I'm surprised more papers aren't using it.
It generates Flash files, but unlike Flash, it was designed to make web apps, not web animations.
The pro version comes with a handful of components to display data -- charts, graphs, all that -- and it's pretty easy to "bind" data to bits of the page, so that records can be easily displayed. At the Palm Beach Post, where I work, we built this with it (on deadline) in two days: http://stimulus.palmbeachpost.com/map/
Excel jockeys who want to make the leap to R will get a lot of help from the Learn R tutorials.
If you like Excel but sometimes need R functions, the Excel add-in RExcel is the best tool out there for integration.
But I think I've gotten away from the original question about basic dataviz tools...
ChrisA already mentioned Google Charts, but I just wanted to add that those include Interactive Charts via their Visualization API. Haven't used 'em yet, but their Gallery is impressive.
Given that the discussion of KMS tends to be highly abstract, and that certain kinds of articles may be more amenable to data-ification than others, Daniel's exercise of extracting possible data from real-life news pieces on different beats seems fruitful.
Anybody want to take a stab at a couple of typical articles from the Chicago News Cooperative, where I work?
http://www.chicagonewscoop.org/library-buys-14th-century-book-by-catholic-rebels/
- Subject of article: 14th century book that was purchased by The Newberry
- Physical object: handwritten text covered in a blue pattern
- Book language: Latin
- Number of pages in book: 120
- Book author: Peter John Olivi
- Book author personality: "Described as charismatic and brilliant, Olivi was regarded as especially dangerous by church authorities for his intellectual influence on Catholics"
- Cost of the book: $45,000 (for a larger collection)
- Where it was purchased from: auction at Christie's
- Source of funding: B.H. Breslauer Foundation
- Prior owners of book: Dominican library in France, Hispanic Society of America
- Book location: The Newberry, 60 West Walton Street near North Clark Street
- Ancillary information about The Newberry: "includes first-edition King James Bible, as well as letters from Napoleon III and Thomas Jefferson, among its collection of 1.5 million books"
http://www.chicagonewscoop.org/a-post-with-little-to-do-but-plenty-of-money-to-do-it/
- Subject of article: vice mayor of Chicago's budget (which is derivative of the Chicago budget)
- Name of position: vice mayor of Chicago
- Salary of position: unpaid
- When position was created: 1976
- Budget amount: $114,232
- Current person: Alderman Bernard L. Stone
- Various opinions
I'm still a little unclear on what the range of applications of (semi-)structured data might be.
And on what it would take, technically speaking, to do it. (We're currently building out our site... is this the kind of thing that has to be planned early on and built in to the backend? I'd hate to miss out on an opportunity out of ignorance.)
If you are a software developer programming in .NET and Silverlight, ActiveAnalysis by GrapeCity is a component and library that allows you to embed rich, interactive OLAP and data visualization features into your applications without writing a single line of code. Plus you can customize the user experience using standard programming techniques.
http://www.datadynamics.com/Products/ActiveAnalysis
Of particular relevance to this thread is the free stand-alone visualization app that is installed with the free trial download that you can use to generate visualizations without opening up any programming tool.
There are no run-time royalties. You can embed and deploy on the web and on an unlimited number of desktops at no additional cost beyond the developer license.
For those of us more hack than hacker, has anyone tried Tableau? Just ran across it while browsing Koci's twitter feed, but don't have a Windows machine to try it out. Seems like a few papers have started to use it already though:
WSJ: Tale of 100
Wired: Console Wars!
Seattle Times: Worker Injury Claims
You can try our product, FMiner: http://www.fminer.com. We will now build an extraction project for free for every new user.
Protovis is really easy to use and extremely powerful.
The examples are awesome: http://vis.stanford.edu/protovis/ex/
Flare is exceptionally cool, but last time I tried it, the documentation was really vexing. The demo is brilliant, but you essentially have to reverse-engineer the source code to figure out how it’s doing what it’s doing. (And that code is not easy to follow.)
Though that was a year ago now, and I haven’t tried it since; does anyone else have a more recent experience?