College Publisher to WordPress conversion script is now open source

Alternate title for this post: Let the exodus continue. The Python conversion script CoPress used to migrate over 50 student publications to the glorious free and open source WordPress is now itself licensed under GPL version 2. It’s optimized for College Publisher 4 and College Publisher 5 databases, but will also work with almost any database you can turn into a flat CSV file. You can fork it on GitHub or download the brand new 1.0 release.

Right off the bat, I’d like to say that the most awesome bit about the conversion script is its ease of use. Granted, you do have to run it on the command line and it does often throw mythical, unintelligible errors if your data is screwy, but it’s about 100 to 1,000 times easier than what Sean Blanda or Brian Schlansky had to go through. Furthermore, it spits out WordPress eXtended RSS files that WordPress imports natively. Depending on the size of your archives, you could even do the entire migration in less than a half hour.

There are detailed instructions in the README, which I encourage you to read thoroughly, but here’s how you’d migrate your site, step by step.

Back up your database using Sequel Pro. This is a critically important step, as you’ll definitely want a clean version to revert to if the import goes awry.

Place the conversion script and your archives in a folder you can access from the command line. Both College Publisher 4 and College Publisher 5 migrants should receive an articles file that will need to be renamed “stories.csv”. Publications migrating from College Publisher 4 will have all of their image references stored in a file that will need to be renamed “media.csv”. Navigate to that directory from your terminal prompt and run “python CoPress-Convert.py”.

Once the script is running, you’ll be asked a series of questions to configure the conversion process. Most options are self-explanatory, and all are explained fully in the README file packaged with the script. The most important thing I’d like to note in this post is that, unless you have fewer than 500 authors in your archives, I’d highly, highly recommend importing your authors as custom fields instead of users. WordPress is not optimized to add a large number of new users through its import process. We learned this the hard way migrating CM Life’s database last summer.
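If you’re not sure which side of that 500-author line you fall on, a quick count of the distinct bylines in your export will tell you. The following is only a rough sketch and not part of the conversion script: the “author” column name, the sample rows, and the helper names are all assumptions for illustration, so check them against the header row of your own stories.csv.

```python
import csv
import io


def count_authors(csv_file, author_column="author"):
    """Count distinct, non-empty author names in a CSV export.

    The column name is an assumption; use whatever your export calls it.
    """
    reader = csv.DictReader(csv_file)
    authors = set()
    for row in reader:
        name = (row.get(author_column) or "").strip()
        if name:
            # Compare case-insensitively so "Jane Doe" and "jane doe" match.
            authors.add(name.lower())
    return len(authors)


def recommend_author_mode(author_count, threshold=500):
    """Apply the rule of thumb from this post: users only below the threshold."""
    return "users" if author_count < threshold else "custom fields"


# Tiny in-memory sample standing in for a real stories.csv (hypothetical data).
sample = io.StringIO(
    "headline,author\n"
    "Story A,Jane Doe\n"
    "Story B,jane doe \n"
    "Story C,John Roe\n"
    "Story D,\n"
)
n = count_authors(sample)
print(n, recommend_author_mode(n))
```

To run it against a real export, open the file with open("stories.csv", newline="") and pass the file object in place of the sample.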

When the script is done, you’ll have a series of WordPress eXtended RSS files you can easily upload into WordPress.
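For the curious, here is roughly what one story looks like once it has been turned into a WordPress eXtended RSS item with its author stored as a custom field. This is a minimal sketch using Python’s standard library, not the code from the conversion script. The dictionary keys, the “author” meta key, and the export namespace version are assumptions, and a complete file would also need the surrounding rss and channel elements.

```python
import xml.etree.ElementTree as ET

# Namespaces commonly seen in WordPress export files. The version at the
# end of the "wp" URI varies between WordPress releases (assumed here).
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.0/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return "{%s}%s" % (NS[prefix], tag)


def story_to_item(story, author_as_custom_field=True):
    """Turn one story dict (hypothetical keys) into an <item> element."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = story["headline"]
    ET.SubElement(item, q("content", "encoded")).text = story["body"]
    ET.SubElement(item, q("wp", "post_date")).text = story["date"]
    ET.SubElement(item, q("wp", "post_type")).text = "post"
    ET.SubElement(item, q("wp", "status")).text = "publish"
    if author_as_custom_field:
        # Store the byline as post meta instead of creating a user.
        meta = ET.SubElement(item, q("wp", "postmeta"))
        ET.SubElement(meta, q("wp", "meta_key")).text = "author"
        ET.SubElement(meta, q("wp", "meta_value")).text = story["author"]
    else:
        ET.SubElement(item, q("dc", "creator")).text = story["author"]
    return item


story = {
    "headline": "Senate approves budget",
    "body": "<p>Story text goes here.</p>",
    "date": "2008-10-01 12:00:00",
    "author": "Jane Doe",
}
item = story_to_item(story)
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

Calling story_to_item(story, author_as_custom_field=False) writes the byline as a dc:creator element instead, which is the shape to use when importing authors as users.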

Mad props go to Miles Skorpen for the long hours he spent on the conversion script, and to Albert Sun, Will Davis, and Max Cutler for their later contributions.

Feel free to send along any suggestions for improvement, bugs, fixes or general comments. I intend to maintain it for the indefinite future, since it’s good Python practice when everything else I’m working on is PHP, but code contributions are always welcome. There is a short list of upgrades under consideration at the top of the script.

Provide links to context, please

In an article published today, the Daily Emerald reveals that Sam Dotters-Katz has proposed changes to the ASUO Constitution (which I would link to but it’s apparently not available online):

Dotters-Katz and Papailiou’s proposal would add two justices to the existing five-member court and require it to submit rules changes to the Senate for approval. It would also re-establish the Elections Board as an independent entity to avoid conflicts of interest that Papailiou said were endemic under previous administrations.

There’s something wrong with this picture. I’m not talking about the proposed changes, but about the way the information is presented. Because the article is reported in the “traditional” news brief format, the reader (me) is left more confused than informed. There is almost zero context associated with the article, and I haven’t the faintest idea what the information presented actually means.

Such is the old paradigm. Newspapers are dead; long live newspapers. I’m of the opinion however that the new paradigm, the one that everyone’s afraid of, is actually improving journalism. Go figure.

For instance, if the Daily Emerald had the innovation, talent, and tools, I would have been presented an array of options to expand my knowledge about Dotters-Katz, how the ASUO runs, and why he proposes a change to the Constitution. There would likely be a list of previous posts on this issue, a small topical wiki in the sidebar synthesizing the pulse of ASUO, and curation of student blogosphere reactions to the announcement (like this one and one from the Oregon Commentator), among other forms of information.

Instead, the readers get nothing better than a press release and I have to use Google, coincidentally, to educate myself further. Google is taking the place of the news organization largely because the newspapers are flailing. Get with the times, please, and use the infinitely useful and flexible platform the internet gives you to empower your community with information.

Oh wait, the Daily Emerald runs College Publisher. Make sure your CMS is open source, and then innovate.

Why I’m leaving

As of yesterday evening, I am no longer an employee of the Oregon Daily Emerald.

My decision comes after two months of frustration trying to get the Daily Emerald off of College Publisher. College Publisher, for those who are unaware, is a proprietary, locked, and nearly obsolete content management system (CMS). In my opinion, the first step student newspapers must take to survive in this “digital era” is to invest significantly in adopting an open source platform for their web presence. Open source allows a student newspaper to truly evolve into a student news organization. It offers the ability for you to have the final say in how, where, and why you publish your content. In proprietary systems, you leave this technological innovation up to the company to whom you’ve contracted out the work.

A metaphor for the people who have grown up with print: open source means your newspaper design and layout can be just whatever the heck you want them to be. Proprietary code means that you only have a certain number of colors, fonts, and article lengths to work with. Your sections always stay in the same location, and you can only adjust the placement of the stories to the smallest degree. All of those innovative front page newspaper designs from last Wednesday? Those wouldn’t be possible with proprietary code.

At the Daily Emerald, however, I was told we must first hire a publisher before we can consider any changes to our CMS. On top of that, we have a contract with College Publisher for at least the next six months (although we receive very little money from the deal so I’m not exactly sure what the Daily Emerald would lose by breaking the contract). Furthermore, the board meetings are closed. This means that I, the guy with Google Doc upon Google Doc of ideas, have to be invited to participate in the decision making process. To me, this sounds completely illogical. Instead, I have to pester the already overworked EIC with the things I’d like to do, and then have those suggestions go up the “chain of command.” It’s not a functional system for the real change which needs to happen.

Although I completely understand how busy the Daily Emerald newsroom is in producing a daily paper, it is busy work distracting the organization from what really needs to be built: a strategic vision for what student news is in the coming years. If I were in charge, I’d call an emergency board weekend retreat that anyone with expertise would be invited to. Student newspapers, just like the traditional media giants, need to completely rethink themselves because, by not innovating on the web, they’re making themselves completely vulnerable to one potentially huge problem:

Competition from the people who get it.

Three threats for student newspapers

Sometimes it’s difficult being the web guy at a student newspaper. Although you’re absolutely certain “online” is going to play a significant role in the future of your organization, you’re not able to articulate the urgency of your position well enough to make the decision-making wheels turn. It’s frustrating, to say the least. From the thinking and idea stealing I’ve done in the past week, I think there are at least three threats facing student newspapers that don’t reinvent themselves as multi-medium digital news organizations:

Threat one: Monetary. Advertising revenue dries up on the print side, print costs go up, and your online product isn’t compelling enough to generate the same type of revenue. That, or your online product is College Publisher and you can’t even boost the advertising revenue if you wanted to. One counter argument is that student newspapers could just go to student government to up their funding, a “bailout” of sorts, but I don’t think that could ever be a long term solution.

Threat two: Staff disappearance. Students no longer want to work at their student newspaper because their industry of choice has a bleak future. Jessica DaSilva is already facing this challenge at the Independent Florida Alligator and, as I commented, this could be the greatest short term threat, especially if your paper isn’t perceived as all that digitally progressive.

Threat three: Dearth of talent. Publishing and monetizing news online is quite different than print, and requires a skill set that potentially isn’t represented by current staff. The further a newspaper gets behind, the more it will have to invest when it does decide to make the gigantic leap in the future. This financing to buy talent might have to come out of its investments or from a significant fundraising drive.

At the moment, this is threat identification and analysis. I don’t have exact solutions to any of these issues right now. My hope, though, is that by studying and mapping out the specifics of each threat we can develop strategic plans to make the transition and keep campus journalism alive.

Gauging the state of the ecosystem

Just a gentle reminder that the first ever CoPress survey is online and looking for respondents. We want input preferably from the online editors at student news organizations, although others are welcome to contribute if the online editor has not been hired yet.

This first survey is to gauge the current state of the ecosystem. We want to know what CMS you’re running, how many developers you have, and what languages they know, among other things. The survey will be open until 5 PM PST on 10 October. After we’ve spent time creating bar graphs and geo-mashups, we’ll release our first report. It should answer questions such as, “What is the average satisfaction with College Publisher 4 versus Drupal?” As far as we can tell, this hasn’t ever been done in our sector.

Along with the release of the report, we’ll be announcing our second CoPress survey. Our intent with this follow up survey will be to have a better understanding of what people want from a digital distribution platform. We truly value your input.

Also, props to Bryan Murley of Innovation in College Media for pointing out that it is not, in fact, September 2009. Not to be too stuck in the future, we’ve updated the survey title and links accordingly.

Introducing CoPress

One of the rather positive outcomes of my case against College Publisher from a few weeks back has been the formation of a diverse group of people around a new project to provide an alternative: CoPress. A product of the sudden realization that many online editors across the country have many of the same opinions I do, CoPress is an initiative to build a technical ecosystem of student newspapers working together and supporting each other on a common, open source content management system. Until this point, it has been largely the case that, when building and maintaining digital platforms, student newspapers have found success only on their own, with their own developers, creativity, and fortitude.

We hope to change things up. 

Together we have strength. I think I can speak for everyone involved when I say that the collective vision of CoPress emphasizes the community, and how the community can work in harmony. Innovative, standards-compliant software is one immediate issue we’re trying to solve, but it isn’t the only one. Bryan Murley, of the Center for Innovation in College Media, suggests that hosting is also an issue. From that discussion, we’ve also learned that supporting a piece of software with the technical expertise to keep it updated is critical. These problems will have to be addressed in order for any student newspaper to survive. It’s more powerful to work together than individually. We’re not profit driven, although the consortium will need to be financially sustainable. We’re driven by a genuine interest to work together because, when we do, we can create beautiful ways for student newspapers to flourish in the digital age.

In the interest of radical collaborative openness, we’re doing as many things as transparently as possible. The motivation for this comes from a concept I call an “open source organization,” although I’m well aware “open source” has become a buzzword for many recent projects. It started with Whitman Direct Action, I’m evolving it with Oregon Direct Action, and I think it is applicable here, too. The idea is simple: put all of the data about what you’re doing online, and structure the data such that your audience, be it the team, the partners, or the community, can follow along to the degree they would like to participate. Clay Shirky says we have a lot of cognitive surplus floating around. It’s time we put it to use.

Our conference calls are recorded and available as an MP3 download, with near future plans to create a podcast that will make listening in even easier. We synthesize research and coordinate efforts on our wiki. Information is also shared through Twitter, delicious, and Flickr. We connect via a Google Group and, if you don’t find a piece of information you need, you’re more than welcome to contact CoPress.

At the moment, we’re working on a few things. First, we’re beginning to research the software options we’re most interested in: WordPress, Drupal, and the Populous Project (built on Django). CoPress would love to support the Populous Project, another student project, and eagerly awaits their alpha release in the coming weeks. WordPress and Drupal, however, have deployability and hackability characteristics that will be hard to match. Second, we’re compiling the names of online editors, webmasters, and internet geeks at student newspapers around the country who might have interest in what CoPress will have to offer. From this, our hope is to do a series of surveys gauging the technical expertise in today’s newsroom. We want to make sure as best we can that we’re serving the needs of everyone, not just ourselves. Last but not least, we’re continually evolving our web presence as a tool to help better achieve our aims.

And this is just the beginning. Thanks to Adam Hemphill, Greg Linch, Kevin Koehler, Joey Baker, Bryan Murley, Jared Silfies, Albert Sun, the Populous Team, and anyone I’ve missed. I look forward to working closely with you and others in the coming months to make all of these ideas and more our collective reality.

The plot thickens

On my argument against College Publisher, and for an open source coalition of student newspapers, Brad Arendt of The Arbiter presents several good points about the advantages of using College Publisher. Considering the time he took writing a well-detailed comment, I thought I would clarify several things I think he missed.

First, I think student newspapers should actively work on developing one or two alternatives to CP. This may not mean collaboratively building a CMS from scratch; rather, it’s more likely to mean facilitating a developer ecosystem specific to our needs around common platforms. For anyone familiar with WordPress, which I’ve helped implement for the Whitman Pioneer and, most recently, Oregon Direct Action (which is a work in progress), its strength is an abundance of plugins and themes you can add to your install. A developer ecosystem is important for continued innovation and, as far as I can tell, CP doesn’t have one.

Cost is certainly an issue. CP, WordPress, Django, and Drupal are all “free,” but the critical difference is that CP comes working out of the box for student newspapers while the others require a developer. One stated goal is to have an open source alternative that can be quickly up and running with full functionality. If the paper has resources to develop its platform beyond that point, it would be able to do so with the support of other developers across the country. This platform would also be available to local papers, although that is not the intended market. Furthermore, I do see a business model in this, in a very Ubuntu and WordPress-esque fashion.

Quoting Brad,

There are some rather innovative and creative things which the CP4.0 system does offer. I would not say it limits creativity, rather it is the students you have on staff who know what to do with the tools that limits your creativity more than CP4.0. The Daily Pennsylvanian has done some very creative stuff in the LAMP environment, which is open source. The Daily Tar Heel has also figured out an interesting work around for blogs, granted done via WordPress but the 4.0 system and the students figured out how to “fit” it in.

Personally, I think arguing that College Publisher allows for innovation is completely erroneous. LAMP, which stands for Linux, Apache, MySQL, and PHP, is a generic open source stack and isn’t specific to any one product. I don’t mean to discount your example; I’m just not sure how you mean to imply CP is innovative by allowing hacking outside the platform. Furthermore, any server environment should allow working with and around the software running on it. Allowing WordPress to be installed as a blogging platform is not a sellable strength of College Publisher.

Brad also mentions that CP does provide backups of your site for the scenario in which it disappears. Unfortunately, these, I imagine, are only backups of your data, not the content management system your data is living in. If your site were to go down, you would have to install and develop an alternative CMS, as well as port your database, before you have a live site. You shouldn’t have to completely rebuild your website if College Publisher disappears. When the web presence becomes the only presence, having your site suddenly not exist would have very real consequences.

One case against College Publisher

When you control the platform, you also control the content and innovation associated with it.

In the school news industry, College Publisher, now branded as the College Media Network, desperately needs a competitor. Owned by MTV, a subsidiary of Viacom, College Publisher provides a content management system now used by “550 going on 600” student newspapers across the country. It offers under-staffed and under-funded newsrooms an easy way to get their content online at a price that can’t be beat.

Why is Viacom interested in managing the online platforms for as many college newspapers as possible? To deliver advertising, of course. As a part of the contract for a cheap, if not free, way to get your stories and images online, College Publisher reserves the top placements on your site for their own use. This allows an even bigger media giant (Viacom) to directly make money off a school newspaper’s content, either by selling advertising slots to big corporations like T-Mobile and Bank of America or by running advertisements for their other properties. Student newspapers are especially valuable to Viacom because they largely produce for its key demographic: the college student. Most, too, are held captive to this partnership because there isn’t the motivation, manpower, or vision for more innovative options.

Should any independent student newspaper want any part of this? No.

College Publisher, unfortunately, is not the innovation aspiring journalists and reporters should depend on in this changing media environment. Claiming RateMyProfessors.com and a CMN Facebook app are “national media outlets” is not creativity. Rather than outsourcing the heavy lifting to College Publisher, student newspapers need to allocate resources internally to running and developing their own platform. This can seem somewhat paradoxical, adding to your staff when you’re losing more and more revenue, but it is a necessity for survival. The future isn’t all that bleak; we’re just in a time of transition.

At Publishing 2.0, Scott Karp argues that newspapers need to take a hint from General Motors and learn how to innovate. Most newspapers have had roughly the same business model since the 1950s, which they’re now largely attempting to reapply to the internet. It’s not the same medium, though. Advertising and classifieds were king in past years, but the playing field is now open to the most ambitious entrepreneurs. Maybe a model like Spot.us will succeed, maybe it won’t. Without trying new things, there’s no way to find out.

Part of the innovation that has to happen, I would like to add, is how you manage, display, and distribute your content online. For student newspapers, the solution isn’t College Publisher. It’s too restrictive, poorly developed, and proprietary, locking innovative students to a platform that limits creativity. Page load times are atrocious because of far too much JavaScript, and if College Publisher goes out of business, your website goes down. The answer, instead, is open source.

One component of a strategy for student newspapers to move forward is a consortium dedicated to collaboratively building an open source content management system which best fits everyone’s needs. We need a robust, free to use platform that thrives under many of the same values which the open source movement holds dear. The growth of such a community around the publishing software used by student newspapers would be of tremendous value to everyone, especially because most papers aren’t in competing markets. Collaborative innovation is a win-win for these types of organizations, a fact I think few have realized.

As the start of a transition I hope to begin with the Oregon Daily Emerald in the winter, I’m taking steps forward. At this point, my work involves researching mature platforms already in the ecosystem, such as WordPress, Drupal, and Django, contacting people at what I think are progressive school newspapers, and working to identify the crucial features for any online newsroom (like managing media assets and placing advertisements). While I recognize there are already many content management systems on the market, my paradoxical goal is for a platform as easy to use and install as WordPress that also offers advanced management features. I want software that any student newspaper can install, and can also develop further if it has the resources to do so.

I’m passionate about making this happen. Let’s do it.

Ironically, the College Media Network blog runs WordPress. They obviously aren’t drinking their own Kool-Aid.