Marc Lavallee and Max Cutler earlier today, identifying different sources of news organization data.

As part of the Hacks/Hackers post-ONA hackathon at NPR, we rekindled NewsAnalyzr. The core idea is to build a flexible tool for analyzing the news and those who make it. A necessary foundation is a structured database of news organizations with pertinent metadata such as URL, print circulation, or number of employees. Once you have this, you can, say, easily create an API that returns the title of a news organization given its URL, or scrape the homepages of the top 200 newspapers at 15-minute intervals on election night.
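To make the URL-to-title idea concrete, here is a rough sketch in Python of what such a lookup might look like. It is not NewsAnalyzr code: the `NewsOrg` record, the helper names, and the two placeholder organizations are all made up for illustration, and a real version would read from the database rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional
from urllib.parse import urlparse


@dataclass
class NewsOrg:
    """Hypothetical record shape; field names are assumptions."""
    title: str
    url: str
    print_circulation: Optional[int] = None
    employees: Optional[int] = None


def normalize_host(url: str) -> str:
    """Reduce a URL to a lowercase hostname without a leading 'www.'."""
    # urlparse only fills in the host when a scheme or '//' prefix is present.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def build_index(orgs: Iterable[NewsOrg]) -> Dict[str, NewsOrg]:
    """Map each organization's normalized hostname to its record."""
    return {normalize_host(org.url): org for org in orgs}


def title_for_url(index: Dict[str, NewsOrg], url: str) -> Optional[str]:
    """Return the organization title for any URL on a known site, else None."""
    org = index.get(normalize_host(url))
    return org.title if org else None


# Placeholder organizations on reserved .test domains, not real data.
ORGS = [
    NewsOrg("Example Times", "http://www.example-times.test"),
    NewsOrg("Sample Daily Herald", "https://sampleherald.test"),
]
INDEX = build_index(ORGS)
```

With that in place, `title_for_url(INDEX, "http://example-times.test/politics/story.html")` would look up the article's hostname and hand back the matching title. Matching on hostname alone is a simplification; organizations that share a domain or publish across several would need a richer key.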
You should follow our progress on GitHub, and I’ll write a more detailed post once we have something useful to show.