Anybody using Drupal to run a news/local info site?
I have a bit of a love/hate relationship with Drupal. It has all this awesome power, but for non-developers like me it’s not really that accessible. I’m curious to hear what modules you guys might like, or any other Drupal-related tips or resources that might help me get along better with this nasty ol’ beast.
Some good video resources I’ve found:
- http://www.drupaltherapy.com/screencasts (GMap + Location vid is great)
Update (May 23): Based on a few of the answers that are rolling in (very helpful, by the way) I’m going to add a few more details about exactly what I’m hoping to do. It’s going to be a site about Tokyo, where I currently live, and there are going to be blogs, listings of geographical points of interest, and maybe news if I can figure out a way to fund it. At worst, a few friends and I will write it to start. I’ve chosen Drupal because:
- Any Tokyo site should use train stations and lines as ONE of the ways to classify content. I’ve created content types ‘stations’ and ‘lines’ and created a node relationship between them. ‘Listings’ nodes will similarly relate to stations in a hierarchical manner (h/t Chris Amico)
- In English, there’s a big blind spot for content outside the most popular areas. I hope to fill that need. All geo-specific content will be plotted on a GMAP.
Simply put, the site that I myself want to use in Tokyo doesn’t exist – so I’m making it. I figure it’ll be a good learning process too. There are a bunch of English language media joints in this city, but with due respect to them they seldom meet my needs.
For more info, there’s a mammoth explanation/discussion of this project on Google Wave.
The best tool for the job is the scripting language your students know best. Pretty much any half-decent one will do. I would, however, urge your students to download the pages first (with a time delay built in), and then parse them in a second step so they don't hit the same servers 8 million times while they're developing their code. You might suggest/require they all scrape different pages for the same reason.
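To make the download-first, parse-second idea concrete, here is a minimal sketch in Python using only the standard library. The function names, the cache file naming and the two-second default delay are my own assumptions for illustration, not a recommendation from any of the linked tutorials. Pages already saved to disk are never requested again, so students can rerun their parsing code as often as they like without touching the server.

```python
import time
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()


def download_pages(urls, cache_dir, delay_seconds=2.0, fetch=fetch_bytes):
    """Step 1: save each page to disk, pausing between real requests."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        target = cache / f"page_{i:04d}.html"  # hypothetical naming scheme
        if not target.exists():
            # Only hit the server for pages we have not cached yet.
            target.write_bytes(fetch(url))
            time.sleep(delay_seconds)
        saved.append(target)
    return saved


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def parse_links(path):
    """Step 2: parse a saved page from disk, with no network access."""
    collector = LinkCollector()
    collector.feed(Path(path).read_text(encoding="utf-8", errors="replace"))
    return collector.links
```

A student would call `download_pages(list_of_urls, "cache")` once, then iterate on `parse_links` (or a BeautifulSoup equivalent) against the files in `cache/`.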
I'd emphasize that scraping is really the approach of last resort for getting government data both because you should be able to request the records directly and because it's hard to prove that a dataset you've scraped is complete (maybe the government web developer missed some records). Scraping is way more useful to grab timely information (like election results, or crime/arrest summaries if you have an awesome police department) or private info (be creative).
Here are some random links to scraping instructions in a few languages. (There's also publicly available code for govtrack.us http://www.govtrack.us/developers/ and everyblock http://code.google.com/p/ebcode/ if you wanna see how the pros do it.)
Ruby + general programming (Dan Nguyen): http://danwin.com/works/coding-for-journalists-101-a-four-part-series/
Stuff from this year's NICAR: Perl (me): http://nicar-phoenix.s3.amazonaws.com/scraping-presentation.html
Python + other resources (James Wilkerson): http://www.slideshare.net/jameswilkerson/web-scraping
Dan Nguyen and I are scheduled to do a scraping talk at IRE in Vegas, which all of y'all should check out.
Big Ugly Datasets For Thumb-Fingered Journalists:
Say you're a hack covering sports, politics, business -- doesn't matter.
Somewhere out there is a file that ends in three letters: CSV. It could be megabytes and megabytes big. It will probably be so big, in fact, that it will be nearly impossible to navigate in Excel and not much easier in Access.
But it has all kinds of useful information that will help you cover your beat -- if only you could load the file, get the data you want from it, and do analysis.
The "Data-Driven Journalism" side of this curriculum should address, for journalists:
- An understanding of what a "flat file" is
- An introduction to DBMS and how they work
- How to clean a big data file for use in a DBMS using a tool like a Python script or a robust text editor like vi
- How to load data into a DBMS
- How to use a DBMS to get what you need out of large (100,000+ records) data files, export that to a spreadsheet, and plug that into viz tools like ManyEyes or Socrata to get the chart you want
And for hackers and journalists together:
- How to collaborate on a data-driven journalism project, and who to collaborate with
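As a rough illustration of the load, query and export steps in the list above, here is a minimal sketch using Python's built-in `csv` and `sqlite3` modules. SQLite stands in for whichever DBMS the course would actually use, and the table layout and sample rows are invented purely for the example.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for a much larger CSV file.
SAMPLE_CSV = """team,player,salary
Giants,A. Example,500000
Giants,B. Example,1200000
Athletics,C. Example,750000
"""


def load_csv(conn, csv_file):
    """Clean each row lightly and load it into a SQLite table."""
    conn.execute("CREATE TABLE salaries (team TEXT, player TEXT, salary INTEGER)")
    rows = [
        (r["team"].strip(), r["player"].strip(), int(r["salary"]))
        for r in csv.DictReader(csv_file)
    ]
    conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def payroll_by_team(conn):
    """Ask the database a question a spreadsheet would struggle with at scale."""
    cur = conn.execute(
        "SELECT team, SUM(salary) AS payroll FROM salaries "
        "GROUP BY team ORDER BY payroll DESC"
    )
    return cur.fetchall()


def export_csv(rows, header, out_file):
    """Write the small summary back out for a spreadsheet or viz tool."""
    writer = csv.writer(out_file)
    writer.writerow(header)
    writer.writerows(rows)


conn = sqlite3.connect(":memory:")  # use a file path to keep the database
load_csv(conn, io.StringIO(SAMPLE_CSV))
summary = payroll_by_team(conn)
out = io.StringIO()
export_csv(summary, ["team", "payroll"], out)
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by `open("your_file.csv", newline="")`, and the summary CSV is what gets uploaded to a charting tool.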
The Art of Engagement. If journalism is a conversation, what does it take to host conversations that matter? What are the patterns of interaction that show up? How do you create a welcoming environment that makes room for diverse perspectives in civil dialogue?
An idea that echoes some of what's been said:
Changing culture, within and without
1. How do hacks/hackers change the culture of newsrooms and IT departments for the better in the new journalism world?
2. How do journalists and journalism organizations work to change information culture to a more open and transparent place within their communities?
3. Why should cultures even be changed in the first place, and is it even possible to change existing institutions or is it better to start from scratch?
Courses you want to teach: Basic journalism course for new people doing journalism online. At some point, I might try this independently.
Photography/video along with basic journalism photography, with a strong F2F component. Others could teach this better than I could. I can just vouch that there is strong demand.
I took Joi's course, and I've proposed a basic journalism course in the past. The tools and the subject matter interest me greatly, and I'd love to hear how this idea progresses.
Slides and links to open education tools, built during Joi's class:
MediaShift story about open, free journalism classes gaining ground:
All about APIs. What they are, how they work -- from popular examples like Twitter and Google Maps to lesser-known gems. How you can create cool new stuff out of other people's apps; when, how, and why you should let other people easily make their own cool new stuff out of your apps.
Thusly show journalists how to get their feet wet with the "open web" in specific, practical ways. Hopefully inspire new, open applications of reporting.
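For a sense of what "how they work" could look like in a first session, here is a minimal Python sketch of the usual pattern with a web API: build a URL with query parameters, request it, and read the JSON that comes back. The endpoint `https://api.example.org/v1/stories` and the `results`/`title`/`url` field names are hypothetical; a real API such as Twitter's or Google Maps' has its own URLs, fields and authentication rules.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only to show the shape of a request.
BASE_URL = "https://api.example.org/v1/stories"


def build_request_url(base_url, **params):
    """Turn keyword arguments into a query string appended to the URL."""
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def parse_stories(payload):
    """Pull (title, url) pairs out of a JSON response body.

    Assumes the hypothetical API wraps its items in a "results" list.
    """
    data = json.loads(payload)
    return [(item["title"], item["url"]) for item in data.get("results", [])]


def fetch_stories(**params):
    """Make the request and parse the response in one call."""
    with urlopen(build_request_url(BASE_URL, **params), timeout=30) as resp:
        return parse_stories(resp.read().decode("utf-8"))
```

Keeping `build_request_url` and `parse_stories` separate from the network call means a class can practice on canned responses before anyone signs up for an API key.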
+1 on the Python/BeautifulSoup approach mentioned in Michele's post. The BeautifulSoup documentation is a good resource too.
Stands to reason that if the utility/value of screen scraping is readily apparent to a student then they'd be motivated/interested in giving it a shot.
"Mad for Metadata." What is it, how can it be used, how is it being used in newsrooms (in particular: how is it being managed in the cms and standardized across the newsroom), what are best practices outside of the industry for categorizing and organizing information and what ideas can newsrooms steal from them? (kinda broad; this could probably be narrowed a bit more)
Web Design/Content strategy: A course that explores how best to organize online the content a newsroom produces. It's a LOT of content. Very often news sites pick up their print sections and place them into the navigation structure of the website. But does that work? We could talk about how to tell (usability testing, surveys, etc.)
But we could also explore, from the developer community side, information architecture and best practices. And, from the journo side, have print designers describe their best practices. How to marry the two? Should the two be married? ... Some entry points to the discussion might be the topic-based navigation & database libraries of some of the online-only news startups.
I don't know about the structure, but I think some thought needs to be given to the format. The p2pu Digital Journalism course just wrapped up, and that was conducted over Ustream with class recordings available online for viewing afterwards.
Speaking as a programming noob, I'd love to see classes on Ustream where the instructor walks through a screencast demonstration. I always trip up when following text-based instructions, but give me a screencast so I can watch how it's done and I'm usually ok.
Defining Collaborative Journalism Protocols on the Open Internet
How can traditionally vertical news organizations and practices open up and collaborate in the ideally open Internet, given limited resources and specific institutional goals that don't always overlap?
Is there a collaborative protocol that would enable independent producers to collaborate and aggregate outside of traditional vertical news structures?
This may merge with Andriak's answer about changing newsroom cultures to create more open and transparent venues for the work of journalism.
I do think it's about culture and practice, however, not necessarily about specific technological applications. This is an issue of expanding the information commons and open discourse.
(Again: I have extracted this from a previous, more lengthy answer, that I decided not to edit and delete due to the interesting but distinct conversation it provoked.)
As a current j-school participant, I think it's really important for the practice and sharing of reporting, writing, rich media production, data literacy and programming skills to be continuously embedded in the experience of producing news "stories." My experience with j-school has been that there often isn't time in the curriculum for the skills that all my classmates bring with them to come together in innovative and context-relevant ways.
Bringing programmers into the newsroom isn't just a matter of adding new tools, it's adding different ways of thinking and making.
Rather than breaking a curriculum into modules based on skills (programming, GIS, interviewing, video editing) it would be better to structure the modules around "news problems" that teach multiple skills together and value the diverse perspectives and skills that students are bringing with them.
Off the top of my head, here are some ideas of "news problems" that seem like they would involve multiple reporting and hacking competencies:
- The city has released a new data set with thousands of records of information. Find the story in the data and explain it clearly to your audience.
- Neighborhood residents are divided over a proposed development project. How do you capture neighborhood sentiment around the controversy and encourage community members to stay engaged as new information becomes available?
Some examples of media sites and some newspaper sites using Drupal. There's also a Newspapers on Drupal group. If you see something you like on one of those sites, I'm sure you could get more info from their web person.
Also, I shared a link to this question on Twitter and cc'ed Steve Yelvington, who would have some good insights.
I can relate to your love/hate relationship with Drupal! I am in the process of producing an install profile geared for public radio news departments for a project called Radio Engage. We have our first beta site up for a San Francisco radio station here: http://kalwnews.org.
The Radio Engage configuration includes features that provide original blog & audio content repositories (with mapping and semantic tagging), aggregation of external content and user engagement features. We are using Drupal 6 and over 80 contributed modules. More details coming soon as we are providing the code to several other stations and will provide a download for anyone to use.
In addition to the links Greg listed there is an install profile called Open Publish (http://openpublishapp.com/) that provides a comprehensive Drupal configuration geared for news sites. It is quite comprehensive and is getting a fair amount of use.
There is also an online training site coming along quite nicely here: http://drupalkata.org. They are hosting live sessions and also creating a training repository.
The frustration you feel is a major pain point for Drupal and thankfully one that I think is getting a lot of attention moving forward.
Happy to answer more questions as needed!
I once spent a week trying to build a site in Drupal. After losing 10 pounds and half my hair, I built it in Wordpress in one day.
I use Drupal to power VancouverObserver.com, an online news source that has grown from its pre-Drupal days (4,000 uniques a month) to its post-Drupal days (50,000 to 100,000 uniques a month, rising and falling depending on the month). More than 250 contributors from the Vancouver area, strong writers who are experts in their fields, volunteer and use VO as a platform for their columns, blogs, videos and investigative reports, and we are about to launch a community-generated content feature as well. I'm a very big fan of Drupal and find it accessible, even though I'm not a tech head. I rely on my Drupal consultant/developer, David Egan, for support. Our assistant publisher, Meghan Strain, has learned enough programming to make basic changes to the code on the site. Next up for us with Drupal: building a Drupal mobile component. Get in touch with us if you want to know more! email@example.com
Reporting Standards in the Digital Age: The Sherrod incident is a perfect example of why we need to maintain "old fashioned" standards in the 24 hour news cycle universe. We may not be able to do everything as we once did but surely there are new ways to do basic fact and/or provenance checks more rapidly and still get a piece up in a competitive time frame. It is astonishing, given the source, that the Sherrod video wasn't vetted. It's perfect evidence for why we need to find ways to remain competitive and still be what we ought to be for our patrons (readers? viewers? consumers?)
I have to say, though Drupal can be a gigantic pain, it is still by far the best out of the box installation for a multi-user news site.
You can pretty much get everything working without even touching PHP (except for extending functionality of core or modules).
I recommend a lightweight 4 Kitchens Pressflow installation.
You can also check out Prosepoint, a somewhat more out-of-the-box news setup.
Evanstonnow.com, a site serving Evanston, IL, runs on Drupal.
Bill Smith, the proprietor, has done the setup/coding himself.
You can reach him at firstname.lastname@example.org.
I've used Drupal to build two newspapers' websites: Cornell Daily Sun and Spare Change News. Drupal is a great boost in getting a full-featured news site running, and while people are working on an out-of-the-box configuration to just flip a switch and have a news site, that's currently not the case (see here for an explanation on why from Yelvington himself).
If your paper's website is a strategic part of the organization (which it probably should be), there are a lot of decisions you need to think about, make, and ultimately configure yourself for it to work out well. That means a much steeper learning curve and getting your hands really dirty in the meantime, but most local Drupal user groups are very helpful, the Newspaper Group on Drupal is very active, and I think you'll find you don't even need to touch much PHP to get a quality result.
If you have any specific questions or trouble, I'd be happy to try and answer or just head over to Drupal Groups itself.
I think that the first thing to do is decide what you want to do with your site, and then decide on a CMS or framework.
Make a list of your needs, and then see how these systems stack up against that.
There was a great article recently on the structure of news websites, how to build out content, and the semantic web that I really suggest reading as well. http://stdout.be/2010/we-are-in-the-information-business/
Personally, I feel like if you need a site in a hurry, or something relatively basic and out of the box, easy to use, Wordpress is great. We used that for the Chauncey Bailey site, and I use it for my own personal site.
Drupal is good for projects that are not going to use a lot of different content types, where you want community support in place, and are willing to work with a developer. So something a little scaled up from Wordpress. We use Drupal for the Center for Investigative Reporting and California Watch.
If you have a lot of different content types, need a lot of different templating, and want more flexibility, then something like Django is probably more up your alley. We use Django for the California Watch projects server, where we build more data-rich and graphics-rich applications.
Hi! Our online news site uses Drupal and we have actually developed a great Drupal Instance that anyone can use to get their news site off the ground. We also offer updates and support for a monthly fee. Our news site is www.countynewslive.com. Please contact us if you have any questions or if you want more info on our Drupal Instance. Thanks! Brian
Project management: How do you take an idea from the conceptual to launch? Although there are variations, developers usually have very particular processes they go through to meet deadlines & project goals. Have them share the different project stages and the whys behind the process.
Building, Linking and Sustaining Decentralized Newsrooms.
How can we take journalism out of hierarchical legacy institutions, and turn it into a widespread, open-source practice among peers? Here are elements of that question that we at Newsdesk.org have been exploring, and that we want to share with our colleagues:
- Methodologies for identifying issues that matter and communities in need
- Developing co-op/peer-driven editorial models
- Aggregation beyond mere summary
- News items as "social objects" in the decentralized medium
- Local journalism fundraising: individual donors, crowdsourcing, grants
- Building a shared back office to support aggregate operations and marketing.
We consider this a community effort akin to open-source software development; as such it requires collaboration between peers who may be embedded in or beholden to non-collaborative systems. In my experience this is one of the most important issues and challenges of all.
Another question: What actionable outcomes will result from this class or discussion? A protocol for collaboration? A consortium of like-minded news producers who want to take the lessons learned and apply them in practice? Food for thought.
I took an online computer-assisted reporting class a few months ago, but I gained only a little knowledge from it. I think the course that Hacks/Hackers conducts should also cover how the Internet can help journalists improve their reporting, data gathering and, possibly, data verification. There should be tips on how journalists can obtain data/information from official government websites. In some countries, government officials tend to conceal public information such as budget amounts or officials' monthly pay. Is there a way for journalists to obtain such information? -- Siswoko
"Defining Articles As Social-News Objects"
What combination of reporting methodology, physical article structure and internal markup/coding defines an article as a "social object" -- something that is transferable in its entirety between platforms, open to comment/discourse, and that has longitudinal value, i.e., that doesn't get stale as a story, that has legs, that is able to update itself? For example, is this updating process automatic or manually curated? Etc.
(I have extracted this from a previous, more lengthy answer, that I decided not to edit and delete due to the interesting but distinct conversation it provoked.)