Rdio vs. Spotify

Some may know me as a hardcore Rdio fan. I love the UI, appreciate their story, and have mostly been satisfied with the service.

Last year, Andy Baio wrote two articles comparing Rdio to Spotify. In both instances he concludes Rdio comes out slightly on top — their catalog is just a little bit better.

But I’ve never been able to get the Blue Scholars or St. Germain on Rdio, so I always questioned how the catalogs really compared. Yesterday, after scrapping my first hackathon project at 7:30 pm, I decided to use the Echo Nest API to answer my question once and for all. Which is better, Rdio or Spotify?

As it turns out, Spotify. By a landslide. Of the 92,343 songs I had time to pull down, 74,703 are available on Spotify and 56,988 are available on Rdio. Furthermore, Spotify had 21,370 songs that weren’t available on Rdio, whereas Rdio only had 3,655 songs that weren’t available on Spotify.
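Those four counts can be cross-checked with plain set arithmetic. This is a minimal sketch, not the script from the hackathon: the numbers are the ones above, and the variable names are only for illustration.

```python
# Counts reported above; everything derived below is plain set arithmetic.
total_sampled = 92_343
on_spotify = 74_703
on_rdio = 56_988
spotify_only = 21_370   # on Spotify but not on Rdio
rdio_only = 3_655       # on Rdio but not on Spotify

# Songs on both services, computed two ways as a consistency check.
both_from_spotify = on_spotify - spotify_only
both_from_rdio = on_rdio - rdio_only

# Songs on at least one service, and songs on neither.
on_either = both_from_spotify + spotify_only + rdio_only
on_neither = total_sampled - on_either


def share(count, total=total_sampled):
    """Return count as a percentage of the sampled songs, rounded to 0.1."""
    return round(100 * count / total, 1)


print(both_from_spotify, both_from_rdio, on_neither)
print(share(on_spotify), share(on_rdio))
```

Both routes give an overlap of 53,333 songs, so the figures are internally consistent. That leaves 13,985 of the sampled songs (about 15%) on neither service, with roughly 80.9% on Spotify and 61.7% on Rdio.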

Plus Spotify has the Blue Scholars and St. Germain. Time to make a switch.

Why I will never pursue cheating again

In other words, my theory is: Cheating (on a systematic level) happens because students try to get an edge over their peers/competitors. Even top-notch students cheat, in order to ensure a perfect grade. Fighting cheating is not something that professors can do well in the long run, and it is counterproductive by itself. By channeling this competitive energy into creative activities, in which you cannot cheat, everyone is better off.

Panos Ipeirotis: Why I will never pursue cheating again. A computer scientist teaching in a business school details a year of trying to combat cheating on assignments. Overall, he spent 45 hours addressing the problem during a 32-hour lecture course, and 22 of 108 enrolled students admitted to cheating. Solutions could include:

  • Public projects – All of the work ends up public, so embarrassment is the deterring factor.
  • Peer review – Students have to present their work in class, and are judged by others.
  • Competitions – Grades are performance-based (e.g. students build websites to attract the greatest number of unique visitors).

Takeaway: If plagiarism is your biggest worry, you’re doing it wrong.

Preliminary results from our informal Knight News Challenge survey

Infographic by Lauren Rabaino, updated April 18th

In preparation for a roundtable discussion this weekend about the Knight Foundation’s commission on the information needs of communities, a few of us decided to survey past News Challenge grantees. A big thanks to Chris Amico, Will Mitchell, Max Linsky, and Lauren Rabaino for helping out with various parts. We wanted to pull together data like how many of the projects are still active, whether the grantees started their projects before receiving funds, and whether the amount they received was sufficient to achieve their objectives. On a program-wide scale, we wanted to know the percentage breakdown of content vs. education vs. software projects, the average lifespan of a project, and what type of institutions typically received funding. Some of this we were successful in collecting; some, not so much. All of our data is available as a Google Spreadsheet.

Background information on our survey of Knight News Challenge projects

If you’re reading this post because I, Chris Amico, or one of two other collaborators emailed you this link, congratulations! You’re one of the 64 projects funded since 2007 through the Knight Foundation’s News Challenge contest. These projects have been granted $21.9 million over the last four years, and we’re curious to hear how they ended up.

A bit of background. Next weekend, David Cohn of Spot.us (not one of the trouble-makers) is bringing a couple dozen of us together for Hardly Strictly Young. It’s at the Reynolds Journalism Institute, sponsored by the Knight Foundation, and will be my first trip to Missouri. Over two full days, we’ll discuss facets of the Knight Foundation’s commission on the information needs of communities. Part of the point, or at least what those of us running the survey think it should be, is to help the Knight Foundation learn from the first four years of the News Challenge. It is arguably the most significant effort from news industry actors to inspire innovation within said industry. In other words, it’s been our only hope.

Unfortunately, there’s not much data for us to work with. Yet. The Knight Foundation has all of the winners listed on the News Challenge website, along with their project descriptions and amount granted, but very little information on outcomes. This is where you fit into our crowdsourced reporting project.

We have two sections on our survey form. The first asks for quantitative information on your project, and is intentionally required for you to submit the form. We want to know whether your project is still active, how much of your grant you actually spent, and whether you achieved your stated objectives. These responses will go on the big ol’ spreadsheet of data we’ll eventually release. The second (optional and/or anonymous) section asks for a qualitative perspective on your project, including how it was successful, what challenges you faced, and what you thought of your experience with the News Challenge. These questions are intentionally broad. If you decide to respond anonymously, we won’t publish the remarks with your name (if we choose to publish them).

This data is quite important. Thank you in advance for taking at least a few minutes to respond. To make things fun, we’ll be updating a public list of who has and hasn’t yet responded. So encourage your friends who haven’t yet replied to do so. I’d like to thank On The Media for the creative idea.

Researching better search functionality for the CUNY J-School network

Search is currently the dominant information retrieval paradigm, and WordPress’ internal search functionality is one step removed from atrocious. With that in mind, I’d like to significantly improve how search works on the J-School’s WordPress network. These are the notes I’m putting together as a part of my planning process.

A search for my name currently looks something like this:

Ideally, the search functionality should support these requirements:

  • Query across all of the content objects associated with the J-School’s primary website. These objects include posts, pages, events, blogs, databases, members, groups, and (coming soon) job opportunities. Eventually it would be nice to search attachments as well.
  • Expand a query to include content from any of the 216 and counting websites within the network. Filter results to a specific site, or by author, publication date, categories, or tags.
  • Highlight results based on matched keywords. If possible, show the sections of text matching the query.
  • Log queries and (optionally) provide analytics on search trends.

As far as I can tell, the options on the table are Sphinx, Solr, and search as a service from IndexTank. Sphinx appears to be the lowest-hanging fruit; Solr takes a couple of weeks to set up and configure, and IndexTank costs money for anything over 500 queries/day.

For Sphinx, there’s a WordPress plugin making it easier to integrate the two. The author has reasonably detailed documentation for installing Sphinx via the admin, if you choose to do that.

Another sys admin has written a three part series on extending WordPress search with Sphinx.

Extending search sources to custom fields is apparently as simple as adding to the select query.

The best way to dynamically add new blogs to the index for WordPress multisite is by editing the .conf file, although I’ll need to develop a way to add a unique index for every piece of content.

I intend to get Sphinx working on the development environment first, document the steps it took, then implement on production.

Equity research from the CoPress-era

For a friend, these are links I pulled together when researching CoPress’ equity split in fall 2009.

Startup Equity Distribution

Notice that I used the word allocation above. Allocated means not vested. In my mind all founders stock should have either a milestone or time based (or some mixture of the two) vesting schedule. If you want to know why find someone to tell you a story about a cofounder who walked away from the company and is still holding a 25% ownership stake. Trust me. It creates problems. Personally I prefer 25% one year cliff vesting with 6.25% quarterly vesting thereafter combined with individual milestones.
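To make the schedule in that quote concrete, here is a minimal sketch of the time-based part only: 25% at a one-year cliff, then 6.25% per completed quarter. The function name is made up for illustration, and the individual milestones the author also mentions are left out.

```python
def vested_fraction(months_since_grant: int) -> float:
    """Fraction of a founder's grant that has vested after a number of months.

    Assumes a 25% one-year cliff followed by 6.25% per completed quarter,
    which reaches 100% at the four-year mark. Milestones are ignored.
    """
    if months_since_grant < 12:
        return 0.0  # nothing vests before the cliff
    quarters_after_cliff = (months_since_grant - 12) // 3
    return min(1.0, 0.25 + 0.0625 * quarters_after_cliff)


for months in (6, 12, 15, 24, 48):
    print(months, vested_fraction(months))
```

Under these assumptions, a cofounder who walks away at month 11 holds nothing, while one who leaves at the two-year mark keeps half of the allocation.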

It’s all about K.I.S.S. Lance argues against equal equity distribution and for dividing it based on contributions of time and expertise. One approach is to determine the valuation of the company, and then use a function of proposed wages and time contributed to divide up ownership.

Equity distribution amongst startup co-founders?

Technically, equity distribution is proportional to the “value contribution” by each stake holder. In general, tangible contributions (investment, land, resources) are considered much more important than intangible contributions like experience/expertise.

The options seem to be 50/50 or distribution as a function of contributed value. People answering the question lean more towards the latter and offer some suggestions as to how to do it best.

Calculating Partnership Equity Splits

Potential formula for equity distribution: break down money to be invested, time to be invested, and experience of partner into percentages, and then determine percentage contributions of each partner. This breakdown then determines overall split of shares.
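One way to read that formula, as a rough sketch: give each category a weight, work out each partner’s share within each category, and blend the two. The weights, partner names, and amounts below are all hypothetical, and the linked article may combine the pieces differently.

```python
def equity_split(category_weights, contributions):
    """Blend each partner's share of every category into one ownership split.

    category_weights: {"money": 0.5, ...}, expected to sum to 1.
    contributions: {"partner": {"money": raw_amount, ...}} in any units,
    as long as each category uses the same unit for every partner.
    """
    split = {partner: 0.0 for partner in contributions}
    for category, weight in category_weights.items():
        total = sum(c[category] for c in contributions.values())
        if total == 0:
            raise ValueError(f"nobody contributed any {category}")
        for partner, c in contributions.items():
            # Partner's share of this category, scaled by the category weight.
            split[partner] += weight * c[category] / total
    return split


# Hypothetical weights and contributions, purely for illustration.
weights = {"money": 0.5, "time": 0.3, "experience": 0.2}
partners = {
    "alice": {"money": 30_000, "time": 20, "experience": 2},
    "bob": {"money": 10_000, "time": 40, "experience": 8},
}
print(equity_split(weights, partners))
```

With these made-up inputs the split comes out to roughly 51.5% and 48.5%. The hard part, of course, is agreeing on the weights.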

How do I survive when starting a business without a paycheck?

There are very creative ways to live cheaply if you’re dedicated. The best response in my opinion is to live out of your car and buy a gym membership for exercise and showering.

Equity-Split Results, Part 1: When Do Teams Split Equally?

Interesting chart comparing different situations. An equal split is more likely amongst smaller teams coming from similar backgrounds that divide equity at the start of the project or company.

Dividing equity between founders

One thing I’ve also noticed is people tend to overvalue past contributions (coming up with the idea, spending time developing it, building a prototype, etc) and undervalue future contributions. Remember that an equity grant is typically for the next 4 years of work (hence 4 years of vesting). Imagine yourself 2 years from now after working day and night, and ask yourself in that situation if the split still seems fair. Another consideration is if one founder has had greater career success and will therefore significantly improve the odds of getting financed at an attractive valuation. One way to figure out how much this is worth is to estimate how much having that founder increases your valuation at the next financing and then, say, split the difference. So if having her means you can raise $2M by giving away 30% of your company instead of 40% of your company, let that founder have an extra 5%.
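The financing example in that quote is simple arithmetic. A minimal sketch, using the $2M, 30%, and 40% figures from the quote; the function names are made up for illustration.

```python
def founder_bonus_points(dilution_without, dilution_with):
    """Extra equity (in percentage points) for the founder.

    "Split the difference": she gets half of the dilution, in percentage
    points, that her presence saves the company at the next financing.
    """
    return (dilution_without - dilution_with) / 2


def post_money_valuation(amount_raised, percent_sold):
    """Company valuation implied by selling percent_sold for amount_raised."""
    return amount_raised * 100 / percent_sold


print(founder_bonus_points(40, 30))                 # percentage points
print(post_money_valuation(2_000_000, 40))          # without the founder
print(round(post_money_valuation(2_000_000, 30)))   # with the founder
```

Selling 30% instead of 40% for the same $2M implies a post-money valuation of about $6.67M rather than $5M, and half of the ten points saved is the extra 5% in the quote.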

Variables to potentially consider include: past and future contributions, career success, and who had the big ideas (and whether those ideas have any technology or intellectual property associated with them).

On the ground MobilizeMRS Research

Thursday morning, Wayne, Karen, and I went down to the clinic in Arequipa to discuss OpenMRS, FrontlineSMS, and MobilizeMRS with Lilia, the director of the clinic, and Maris, the assistant director of the clinic. There were a few goals to the meeting: understand the rudimentary electronic medical records system (EMR or MRS) in place now, assess the pros and cons of that system vs. OpenMRS, and discuss the possibility of running a clinic efficiency experiment with FrontlineSMS. We got through the first two agenda items pretty well but, being on Peruvian time, didn’t make it very far into the third.

Brain dump and meeting notes ahead.

The clinic has a very limited EMR at the moment. It was developed by a local programmer they still have good relations with and, every time they want expanded functionality, they just ask him (or her) to build it. Furthermore, the clinic staff has been talking over the last year about different ways to expand the tools. At the moment, it captures data about the patient, vital signs, and has a free text area for diagnoses. Continuing development on this software will require significant money, of course, which is why OpenMRS is probably a better long term option. Writing software for a pretty common use case doesn’t make much sense when there are customizable open source options available. Thanks to a relatively fast internet connection today, I was able to upload an HD walkthrough of their current EMR:

Tour of the clinic’s custom EMR from Daniel Bachhuber on Vimeo.

One fairly significant problem we faced Thursday morning, however, was trying to convince the clinic staff of the merits of OpenMRS without a full-featured online demo or video tutorials. I personally haven’t experimented with the software very much, nor do I know all of the useful components of a medical records system, so I couldn’t really sell the software myself.

Wayne, being proactive, took the conversation from step zero so that Lilia and Maris would be able to help assess the merits and demerits of their current system:

Basic needs of a Medical Records System from Daniel Bachhuber on Vimeo.

According to the doctor, the basic needs of a medical records system are three-fold:

  1. Documentation – an EMR should have the ability to take notes and capture information on labs, Rx, Dx imaging, etc. Most importantly, this information should be searchable.
  2. Networking – an EMR should make communication accessible, both internally (within the clinic) and externally.
  3. Decision support – an EMR should be intelligent, and assist the clinic staff in identifying high-risk patients, etc.

Once we had these criteria established, we started talking about the pros and cons of using their current system.

Pros and cons of the current system

The pros of their system are:

  • Easy implementation – the software is already installed on the computer and they know how to use it.
  • Design specific to clinic – they can choose how they want the software to operate because they direct the development of it.
  • Known commodity – they know what they’re dealing with.
  • Personal software provider – the developer is local and can come to the clinic to provide support, etc.
  • Economically speaking + impact – Cheap for what it does.

The cons of their system are:

  • Design specific – the design of the software is tied very much to the needs of their clinic today, and not five years in the future.
  • Expandability – uncertain as to how difficult it is to extend the system.
  • $ for upgrades – have to pay to have the developer build every single upgrade. Also, only the developer knows how to build or maintain the system.
  • Don’t really know “OpenMRS” – don’t have the proper education materials to illustrate the power and flexibility of OpenMRS.

The unfortunate thing is that their current system doesn’t match up to the needs of an EMR very well. As it stands, it’s not much more than a data storage tool. They use it to house basic information about the patient, symptoms, and diagnosis, but it isn’t very useful as a tool to manipulate the information. On top of that, the networking support (connecting computers in the reception with those in the doctor’s rooms and farmacia) has yet to be built, and decision support isn’t cost-effective.

The clinic is interested in OpenMRS, however. On Monday or Tuesday, Wayne will be showing Lilia and Maris a demonstration of the EMR he uses back in the States. This will ideally convince them of the practicality of having a robust EMR. We’d also like to get them to a clinic in Peru that has a working demo of OpenMRS soon. If this proves feasible, then we might be able to send the programmer they have to an implementer’s training with PIH.

A thought on bringing the programmer into the fold: this might actually be an economic enterprise for him or her. My thinking is that there are a number of clinics in Arequipa still using paper records, so if the clinic HBI works with becomes a local model for using OpenMRS, then that might get the other clinics interested in medical records and incentivize the developer to get to know OpenMRS better.

In the interim, though, the clinic will still put a bit more money into the system they already have.

On the note of SMS, we discussed how mobile might help increase clinic efficiency:

Day seven, Arequipa from Daniel Bachhuber on Vimeo.

The idea wasn’t very well received, though, because the assumption is that the demographic that the clinic serves most likely will not have cell phones, and the clinic staff couldn’t really understand how the technology could be useful. Anecdotally, however, a doctor said the penetration of mobiles in this market is near or over 90%, a statistic which doesn’t seem too unrealistic to me. Furthermore, I think that mobiles could play a significant role in improving the efficiency of the clinic.

We’ve got an experiment cooking too. Building upon the pediatric idea briefly outlined in my previous post, we’d like to have a control group, an experimental group which receives a reminder for their appointment, and a second experimental group which receives a unique code for a discount on their appointment. In preparation, the clinic will start collecting cellphone numbers at registration. Ideally, this experiment will run later this spring or early in the summer.
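How patients end up in each group hasn’t been decided yet. One simple option, sketched here under that assumption, is to shuffle the registered patients and deal them round-robin into the three arms so the groups stay the same size. The arm names and patient IDs are placeholders.

```python
import random

# Placeholder names for the three groups described above.
ARMS = ("control", "reminder", "discount_code")


def assign_arms(patient_ids, seed=0):
    """Shuffle patients, then deal them round-robin into the three arms.

    A fixed seed makes the assignment reproducible, so the list can be
    regenerated later without storing it anywhere else.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return {pid: ARMS[i % len(ARMS)] for i, pid in enumerate(shuffled)}


# Hypothetical patient IDs 0..29, ten per arm.
assignment = assign_arms(range(30), seed=1)
```

Randomizing this way keeps the clinic staff from (even unintentionally) choosing who gets a reminder or a discount.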

One last thought on efficiency: we’d also like to run a two-week experiment (probably in February) where patients receive a time-stamp upon checking in to the clinic, and another one when the doctor takes them for their appointment. I think mobile could have a tremendous impact on the clinic’s ability to efficiently deliver healthcare (the concept of being on time for appointments is nearly nonexistent), but baseline numbers will be really important to calculate impact.
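Turning those two time-stamps into a baseline is straightforward. A minimal sketch, with made-up times standing in for real check-in data and the median wait as one possible baseline number:

```python
from datetime import datetime
from statistics import median


def parse_time(text):
    """Parse an "HH:MM" time-stamp (same-day visits assumed)."""
    return datetime.strptime(text, "%H:%M")


def wait_minutes(checked_in, seen_by_doctor):
    """Minutes between check-in and the start of the appointment."""
    return (seen_by_doctor - checked_in).total_seconds() / 60


# Hypothetical (check-in, seen-by-doctor) pairs, not real clinic data.
visits = [("09:00", "09:45"), ("09:10", "10:40"), ("09:30", "10:00")]
waits = [wait_minutes(parse_time(a), parse_time(b)) for a, b in visits]
baseline_wait = median(waits)
```

The median is less sensitive than the mean to the occasional patient who waits all morning, which seems likely here.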