TV News Search and Borrow: Knight Foundation Funds Expansion of Internet Archive Service

The Internet Archive announced this week that it received a $1 million donation from the Knight Foundation to expand its TV News Search and Borrow archive of television news clips. As of now, the archive has just over 400,000 clips that the public can access, link to, or borrow as a hard copy for a fee.

“We want to make all knowledge available to everyone, forever, and for free. So it’s an ambitious mission,” laughs Roger Macdonald, the archive’s television news project director.

And it all comes down to closed captioning.

The San Francisco-based non-profit records broadcasts and teases out the news using closed captioning tags and other meta-data. Twenty-four hours after the first airing, the clip is available in the archive. It’s an invaluable resource for journalists, researchers, and documentarians to study what was said, when, where, and in what context. Want to play Jon Stewart? Go ahead and search clips of ‘Benghazi’ on Fox last week. It can also be used for more noble causes, like tracking political speech.

Right now, the archive picks up all US national news and local news in the Bay Area and the D.C. metro area. But with the new influx of cash, they’re hoping to expand into other local areas as well. Macdonald explains:

“Specifically, we’re looking at the potential for journalists and scholars and civic organizations to record news in Virginia this year, for the statewide election for a governor among two very ideological and well-funded opponents. So we think that might be of interest – for its value and insight regarding the elections and a test run for what this kind of collaborative work might contribute for the national elections in 2014 and 2016. We also want to tease out campaign commercials and make those available for deep search and link them up to other crowdsourced organizations who want to look into funding.”

To expand the breadth of content, the Archive has been working closely with networks regarding their own closed captioning services and archives. As you may have noticed, most networks don’t even bother to spellcheck the closed captions. Now that the FCC has required networks to carry over closed captioning to online-only video, Macdonald is looking forward to seeing innovations in their archival processes, and to expanding the ways in which news clips can be shared, say, in an online news article. Right now, you can link to a video in the archive, or borrow a hard copy, which isn’t exactly ideal for digital journalists.

“We’re not facilitating embedding right now, but we’re interested in the opportunity to do that. Our thinking is that we want to make sure that it’s embedded with our own player, one that we’ll create that links back to the network, hopefully deeply, and the canonical meta-data… The meta-data on YouTube for example isn’t accurate at all. We’re librarians, we believe in the import of accurate and rich, granular data.”

So what does that mean for journos like us? One hope is that as news organizations create online-only video channels, they’ll include that accurate meta-data.

“For a good search index, you need granular data,” says Macdonald.

So make sure your videos are using accurate ‘named entity’ or ‘semantic entity’ tagging. “This helps search engines, and other machines, disambiguate terms that have multiple meanings,” he says. Take ‘Lincoln,’ which could mean the Continental, the city in Nebraska, or the President.

“This way, it gets discovered, and it’s enduring content where we can reflect back on it, and that doesn’t exist right now,” he continues. 
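For the curious, here is a rough idea of what that kind of tagging could look like in practice. This is only a toy sketch in Python, not how the Internet Archive or any network actually does it: the `CANDIDATES` catalog, the entity identifiers, and the `tag_entities` function are all made up for illustration, and real systems rely on far richer context than a handful of hint words.

```python
import re

# Hypothetical entity catalog (an assumption for this sketch): each sense of
# an ambiguous term gets an identifier, a type, and a few context words.
CANDIDATES = {
    "lincoln": [
        {"id": "person/abraham-lincoln", "type": "Person",
         "hints": {"president", "gettysburg", "emancipation"}},
        {"id": "place/lincoln-nebraska", "type": "Place",
         "hints": {"nebraska", "city", "mayor"}},
        {"id": "product/lincoln-continental", "type": "Product",
         "hints": {"continental", "car", "sedan"}},
    ]
}


def tag_entities(caption):
    """Return one tag per ambiguous term found in a caption string."""
    words = set(re.findall(r"[a-z]+", caption.lower()))
    tags = []
    for term, senses in CANDIDATES.items():
        if term not in words:
            continue
        # Pick the sense whose hint words overlap most with the caption.
        best = max(senses, key=lambda s: len(s["hints"] & words))
        if best["hints"] & words:
            tags.append({"term": term, "entity": best["id"], "type": best["type"]})
        else:
            # No supporting context: leave the term unresolved, don't guess.
            tags.append({"term": term, "entity": None, "type": None})
    return tags
```

Feed it a caption like "The mayor of Lincoln spoke about the Nebraska budget" and the surrounding words point to the city rather than the car or the President. A caption with no helpful context is left unresolved, which is usually better for a search index than a wrong guess.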

“We’re excited to be part of this real shift underway right now,” he says. “We don’t know the best way to do this interface, or how to meet the needs of our various users. And we are excited when we hear new cases and we are incredibly open to adapting how we do things to facilitate public use, like direct data inquiries to the archives, and pushing data out.”

So there you have it, journos. Have you used the archives for a story? How would you best use the television search and borrow service?