Better Models for Education

A cautionary tale

Four years ago, leading universities jumped on the bandwagon of massive open online courses. The courses haven’t gotten much more attention since then:

Google Trends

This is international data. In the US, interest in MOOCs declined even though respectable institutions kept offering new courses on various topics. Is it a marketing failure, which the best universities would be proud of, or a bad educational technology?

Let’s see. A typical MOOC consists of

  • lecture slides and exercises
  • a talking head that reads the slides
  • a discussion board, barely alive
  • an optional certificate

Even though many professors have good presentation skills, this technology is no different from a textbook. In fact, ten years before MOOCs, MIT offered a much better solution: OpenCourseWare, a guide on how to study like an MIT student. It wasn’t tied to particular enrollment dates, pace, or lecturer. Instead, it showed what a diligent student should complete in one semester.

MOOCs became popular after Sebastian Thrun and Peter Norvig released their open AI course. More than 100,000 students enrolled, and universities decided to supply more courses. But the AI course was backed by exciting new technologies like self-driving cars and text recognition, while a standard university course covered boring rudiments available in any textbook.

The quality of online courses didn’t improve over time. Each professor valued his own brand and didn’t collaborate with colleagues from other universities. So each one had his own course, that is, slides and exercises. For example, a large MOOC provider offers 609 “data science” courses. Students enroll in just a dozen of them, those whose lecturer already has a very good reputation, like Andrew Ng and his machine learning course, based on Stanford’s CS229 and available online since 1999.

The history of MOOCs shows how a lot of smart people keep making things that don’t work. Interestingly, it has to do with their core competencies and not online education itself.

Because someone else did better.

Y Combinator: Engaging educators

University professors have little motivation to work with students. Richard Feynman described teaching as “something [to do] so that when I don’t have any ideas and I’m not getting anywhere I can say to myself, ‘At least I’m living; at least I’m doing something; I’m making some contribution’—it’s just psychological.” So when it comes to research vs teaching, many professors choose research.

Anyway, most universities teach future workers, not researchers or educators. Normally, you expect workers teaching workers. Workers raised by professors are like Tarzan raised by gorillas. An innocent problem in a primary school, but the difference in interests increases as education progresses.

How to align the interests of educators and students? By involving the educator in the student’s real passion. That’s what startup accelerators do.

Y Combinator, the most prestigious of accelerators, invests in early-stage startups and puts their founders through a 3-month training program. The 5% stake that Y Combinator acquires for $120K ensures that the mentor’s wellbeing depends on the performance of his students.

Mentorship and apprenticeship are old business practices, of course. Startup accelerators add a social component by bringing many founders to one place. They also escape the research lab hierarchy, in which a senior faculty member secures funding and employs graduate students as cheap labor.

The MIT Media Lab is perhaps the most famous academic lab that operates like a startup accelerator. Professors join the companies founded by their graduates. That’s not a general practice in other universities, in which offering a stake for better mentoring sounds like an insult.

Khan Academy: Engaging students

Engaging students is the second most important task of an educator after engaging himself. This task takes time, so schools and colleges prefer to get rid of the least motivated troublemakers, instead. Many leave college because they see better options. How can educators decrease attrition?

Khan Academy was a one-man project done by a hedge fund analyst in his spare time. The founder taught math on YouTube years before universities started publishing videos of their own classes.

But arguably the best part of Khan Academy appeared later, when students started solving exercises online and getting immediate feedback. This had been done before, but Khan Academy polished the technology with data:

In brief, Khan Academy sets the sequence of exercises such that students are not discouraged by frequent failures. It’s part of Khan Academy’s gamification mechanism, which keeps learners motivated throughout K-12.

Stack Exchange: Asking and answering questions

Good educators teach the Socratic way, by asking leading questions. This technique does not scale well in a class with 100+ students. A good alternative is a Q&A website, like Stack Exchange or Quora.

Stack Exchange covers many academic subjects up to the graduate level. Its community encourages good questions and punishes ill-prepared ones. Over time, a motivated person learns how to do preliminary research and ask the right questions.

Answering these questions makes more sense than standardized tests or oral exams. Other advantages? Real problems, clear rewards, faster feedback.

Wikipedia: Accumulating knowledge

Wikipedia is fifteen years old, but the education system has integrated only one half of it: students copy-paste Wikipedia content into their essays. It should be the other way around! Instead of assigning essays that no one reads, university professors could assign editing Wikipedia articles.

That’s a real contribution. Wikipedia editors check changes and reject the bad ones. It’s easy to track these edits. The Wikimedia Foundation is always looking for new editors and broader coverage. The content goes straight onto the front page of Google Search.

Despite all the advantages, I saw very few professors who practice this. That’s again about engaging educators, rather than students.

GitHub: Offering creative assignments

GitHub became a Wikipedia for code. Anyone can contribute to a project of interest. The list of open issues suggests possible contributions.

Like Wikipedia and StackExchange, GitHub addresses genuine problems, not synthetic exercises. Software engineers dominate, but any STEM project suits this platform.

Kaggle: Encouraging competition

Though the idea of 3,500 statisticians competing for $50,000 may seem irrational, Kaggle attracted thousands of math-savvy folks to practical problem solving. “Practical” is Kaggle’s key innovation. Competitive problem solving existed before in international olympiads and on websites like HackerRank. Kaggle made such competitions useful, massive, and scalable.

Some CS departments encourage students to take part in Kaggle competitions. Why here and not on Wikipedia or GitHub? Kaggle challenges look much more like standardized testing with a clear-cut ranking. No need to evaluate whether the student made a useful contribution or just cheated.

Code4Startup: Learning for doing

Learning by doing is an old, popular, and effective technique. But task assignment is a trap. Stupid tasks kill motivation, and the rest dies by itself.

The simplest way to improve motivation is to increase the reward. Startup success stories turned out to be a very effective one. More importantly, they are free.

Code4Startup turned this idea into a service. They offer courses showing users how to make a clone of a successful startup. Unlike MOOCs, these courses show how to turn coding and marketing skills into a useful product.

Code School and Treehouse take a similar approach.

An honorary mention goes to McDonald’s and Walmart. These companies employ and train the people whom top universities would never admit (and whom other universities get rid of after admission). Those who complain about students paying them $50K a year should try to teach a person working for the minimum wage.

Code Review: Giving feedback

Feedback prevents bad habits. In music, you want someone to hear you play and to fix your technique before you have ingrained it, because one hundred thousand repetitions later you may still be doing it wrong. And music is complex enough to require a dedicated person sitting next to you and giving tips.

In other areas, technology provides a medium. Kind strangers from Stack Exchange Code Review help developers write better code. Duolingo fixes pronunciation. Show HN lets the developer know if his MVP is good enough to keep going. And elsewhere, video calls connect students with any teachers they can afford.

The calls are the best. Feedback is too complex for technology. Humans have to do it. Skillful people, really. So while lots of apps offer to teach you math or music for $20 a month, they sell new problems, not solutions. And any buck saved on teachers turns into hours of wasted time.

A comment

The services I mentioned have nothing to do with the formal education system. Many of them are not even labeled as educational. But they do what colleges are supposed to do, and do it better.

Three more things. (1) These services never associated themselves with colleges. More importantly, none attempted to reform the formal educational system. That’d be an interesting waste of time, as it was for John Dewey and other reformers. (2) These services scale and depend less and less on the limited supply of really good professors. (3) These services specialize. They don’t teach everything; they make narrow tools to improve specific skills.

Comparing their popularity with that of top universities (MIT is much more popular outside the US; other terms are insensitive to geography):

Google Trends: The United States

Selected services (the two plots have different vertical scales and only trends are comparable; for more, check the links):

Google Trends

So if education is changing, it’s changing outside traditional institutions.

Marketing by Elon Musk

While the Uber story shows that a poorly regulated industry may be a good place to start a new company, Elon Musk suggests another opportunity created by government:

But government is inherently inefficient. So it makes sense to minimize the role of government such that government does only what it has to do, and no more.

After this quote, some people cut their Social Security cards into pieces and run to a libertarian sea platform, away from government. This is, however, not what Musk means. Here’s some background.

It’s not a secret that, since 1958, NASA has received $1 trillion from the federal budget to create the stack of technologies that SpaceX currently uses in its own commercial projects. SpaceX’s initial capital of $100 million is 0.01% of this investment in space odysseys. The other 99.99% came from the government, which is presumably the necessary minimum mentioned by Musk.

And as Musk rightly reminds us in the same interview:
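The two percentages are plain division. A quick check in Python, using only the dollar figures quoted in the paragraph (the variable names are mine, and like the text it treats NASA's funding as the whole investment):

```python
# Figures quoted in the text; variable names are illustrative.
nasa_funding = 1_000_000_000_000   # $1 trillion received by NASA since 1958
spacex_capital = 100_000_000       # SpaceX's initial capital of $100 million

# SpaceX's capital as a fraction of the public investment.
spacex_share = spacex_capital / nasa_funding
government_share = 1 - spacex_share

print(f"SpaceX: {spacex_share:.2%}, government: {government_share:.2%}")
```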

But funded by the government just means funded by the people. Government, by the way, has no money. It only takes money from the people. [Laughter.]

So SpaceX took away dozens of engineers trained by publicly funded NASA and secured at least $500 million in government contracts.

Tesla Motors, another company founded by Musk, sells cars eligible for a $7,500 federal subsidy and numerous state subsidies of a comparable amount. It’s about 20% off each car to help Tesla compete with fossil fuel vehicles.

His third company, SolarCity, also advertises solar tax credits and rebates as its competitive advantage over traditional utilities. It promises that “some [state governments] are generous enough to cover up to 30% of your solar power system cost.”

The subsidies are, of course, not the point here. They are the second-best way toward clean and renewable energy, after complete pricing of fossil fuels (which is broadly supported by economists, see Pigou Club). In practice this transition will happen very much like what Tesla and SolarCity do now.

But for anyone practically or intellectually interested in how this business works, executives happen to be a pretty misleading source. That holds even when these executives write long books about their companies or hire well-known economists without giving them complete data. Instead of the story of how the company really works, the reader gets ideological cliches about business, management, and government. With teachers like them, it’s not a surprise that 9 out of 10 startups end up nowhere.

This happens mostly in hi-tech, with all this sudden success and media exposure. But the most competent executives manage to keep a low profile even here, because they know that they are best at running companies, not at teaching people how things work.

The French Connection

One month ago the French police arrested two Uber executives for running an illegal cab company (yes, Uber), the sort of accusation supported by the French court. I hope the CEO of Uber won’t end up like Al Capone, but I would say a couple of good things about the company in advance.

Uber found an ironclad source of value: a heavily taxed and regulated industry with unsophisticated laws. A few changes in the business model totally confused regulators, and Uber currently enjoys a tax advantage across North America and Western Europe. The company surely shares the profits from this advantage with its drivers and clients, but at the expense of other cab services.

These traditional cab services operate in a boring market where no one makes big profits. High prices usually just include all the payments to the city, like expensive licenses and employee-related taxes. Are these payments a waste? In cities like Paris tourists enjoy clean streets and good roads because they pay this high price for personal transport. And so do locals: a taxi means more jams, more pollution, and more roads. By contrast, when cities keep transport-related taxes low, the mayor gets reelected but the entire city spends each morning in jams.

What about the better drivers that Uber has? Actually, they are paid higher wages:


How’s that? The part-time employment that implies lower taxes (and cost hiding). Which brings us back to the argument above.

Yep, Uber has all those driver ratings and such, but even small traditional taxi companies learned how to get feedback on their drivers. But better employees want wages that are — taxes included — incompatible with the industry.

The second success factor of Uber-like multinational taxi services is the McDonald’s signal. Wherever you happen to be, there’s a company with the known standards of quality. As for taxis, at least you won’t end up in the wrong part of the city with the driver having his first week in a new country. But that’s for folks who travel a lot across cities, so not the biggest deal.

It’s possible, of course, that Uber creates value in other ways, like managing its cab fleet better. They don’t reveal this information. They did reveal the interest in replacing humans with self-driving cars after their raid on Carnegie Mellon, but that’s for the future.

This sounds less revolutionary and disruptive than the “sharing economy” evangelism, but startup founders waste time on ideological companies that fail because sharing by itself creates little value. It’s really better to spend less time on development and more time looking for real sources of value here.

Twitter, Brevity, Innovation

Singapore’s Minister for Education [sic] recollects his lessons from Lee Kuan Yew:

I learned [from Lee] this [economy of effort] the hard way. Once, in response to a question, I wrote him three paragraphs. I thought I was comprehensive. Instead, he said, “I only need a one sentence answer, why did you give me three paragraphs?” I reflected long and hard on this, and realised that that was how he cut through clutter. When he was the Prime Minister, it was critical to distinguish between the strategic and the peripheral issues.

And that’s what Twitter does. It teaches brevity to millions. Academics and other professionals who face tons of information daily must love it. First, because it saves their time. Second, it prioritizes small pieces of important information.

Emails and traditional media do this badly because people can’t resist the temptation to get into “important details.” But my details are important only after you ask for them. And Twitter restrains me from writing them in advance by leaving me only 140 characters (right now, I’m over 100 words already). So, it saves two people’s time. As Winston Churchill, himself a graphomaniac, said, “The short words are the best.”

Short messages earn most interactions (Source)

Like many other good ideas, this wasn’t what the founders initially had in mind. They had to cut all messages to 140 characters to make them compatible with SMS and, thus, mobile. Later on, web services, such as Imgur, borrowed this cutoff, this time not as a technical restriction but to improve user experience. That’s the easy part.

The second part is difficult. Twitter is bad at prioritizing information. Tags and authors remain the major elements of structure. Search delivers an unpleasant experience (maybe this made Twitter cooperate with Google). If you missed something in the feed, it’s gone forever.

This weak structure is partly due to initial engineering decisions. However, structuring information without user cooperation is difficult everywhere. And users won’t comply, as tweets should be effortless by design. It means engineers have to do more of the hard work. In turn, it costs money and time. There must be strong incentives to do this. The incentive is not there because Twitter lacks competition.

Would anyone step in and fix it? Suppose you take the cheap way and ask users to be more collaborative. You can make a Twitter for academics with all the important categories, links, and whatever helps researchers communicate more efficiently. This alternative will likely (if it hasn’t yet) fail to gain a critical mass of users. Even in disciplined organizations, corporate social networks die due to low activity. Individually, employees stay with what others use. The others use what everyone uses, and everyone uses what he used before. You need something like a big push to jump from the old technology.

A big push away from Twitter is more like science fiction now. Whatever deficiencies it has, the loss-making company priced at $30 billion wins over better-designed newcomers. In the end, its 280 million users are centrally planned by Twitter’s CEO. That’s about the population of the Soviet Union by 1991.

It’s not new that big companies lock users in their ecosystems. The difference is, sometimes it’s justified, other times it’s not. For Twitter, it’s difficult to imagine any other architecture because major social media services all impose a closed architecture with third-party developers joining it on slavery-like conditions. To take the richest segment, most iOS developers don’t break even. So, apart from the technical restrictions that the Twitter API has, the company doesn’t offer attractive revenue sharing options to developers who contribute to its capacities and, thus, market capitalization, for example by addressing the structural limitations mentioned before.

All in all, interesting experiments in making communications more efficient end very quickly as startups reach traction. After that moment, they become conservative, careful, and closed. And this is a step backward.

Not So Free Facebook

The IT industry has two types of products: those that save time (think of Google Search) and those that consume time (like Facebook). Though both are free, time spent on Facebook is a sort of opportunity cost, typically equal to the user’s wage or the value of whatever he would do instead.

Even if the consumer formally pays nothing for either of the services, his behavior is not the same. That’s because of demand elasticities. One marginally relevant example:


These are demand curves. Percentages show adoption rates. Never mind the goods on the right. These are not IT products, and not even from the developed world, but this is the most illustrative data of this kind around.

Most goods have elastic demand here. The blue curve also shows the striking difference in demand between zero and any positive price. This is very much like web products: the user base shrinks rapidly when the price becomes positive. For the freemium models, the premium user base is south of 5%. That’s why startups avoid pricing users at early stages.

Facebook also likes to pose itself as a free product. But it’s not really free. According to stats, an average user spends 40 minutes per day on Facebook. Though overstated, such usage is equivalent to $13 paid each day with the median US wage taken as opportunity cost.
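The $13 figure is minutes of use converted to hours and multiplied by an hourly wage. The sketch below backs out the wage implied by the two numbers in the text, roughly $19.50 per hour. That wage is derived here from the post's own figures, not taken from a statistical source, and the function name is mine.

```python
def daily_opportunity_cost(minutes_per_day: float, hourly_wage: float) -> float:
    """Value of the time spent per day, priced at the user's hourly wage."""
    return minutes_per_day / 60 * hourly_wage

# Numbers quoted in the text.
minutes_on_facebook = 40
quoted_daily_cost = 13.0

# Hourly wage implied by those two numbers (an inference, not a quoted statistic).
implied_hourly_wage = quoted_daily_cost / (minutes_on_facebook / 60)

print(f"Implied hourly wage: ${implied_hourly_wage:.2f}")
print(f"Daily cost: ${daily_opportunity_cost(minutes_on_facebook, implied_hourly_wage):.2f}")
```

Plugging in a different wage shows how sensitive the "price" of Facebook is to who the user is.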

Facebook, unlike Google, can set nominal access fees. Users already pay a lot for it, and the equilibrium is around the inelastic zone of the demand curve. A paywalled Facebook would make its shareholders happier because its current valuation of $200 per user would skyrocket with the enhanced cash flow. The current ad-based model is a dead end for Facebook because its ads target cold clients (compared to Google’s and Amazon’s visitors). While current earnings are very low for such a big company, Facebook’s P/E ratio of 75 is what investors are ready to pay knowing the forthcoming switch to a viable business model, and the paywall is one of them.

The logic of low elasticity under positive opportunity costs is relevant for other time-consuming services. Major newspapers got paywalls long ago, but for other reasons: they have fewer users and high labor costs. Genuinely scalable web services are reluctant to experiment with payments and settle on nice-looking “premium” prices, like $5 or $10, which are loosely connected with costs and nearby offers, but never look empirically grounded. Generally, these services prefer rules of thumb to experimentation. Maybe that’s a miss: when the monthly fee is way below the hourly wage, demand is expected to be inelastic, so revenue opportunities must be around.
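Why inelastic demand means untapped revenue can be seen in a toy model. The sketch below assumes a constant-elasticity demand curve, quantity = scale × price^(−elasticity); the functional form, the scale, and the two elasticity values are illustrative assumptions, not estimates for any real service.

```python
def revenue(price: float, elasticity: float, scale: float = 1000.0) -> float:
    """Revenue under an assumed constant-elasticity demand curve."""
    quantity = scale * price ** (-elasticity)  # users willing to pay this price
    return price * quantity

# Compare the two "premium" price points mentioned in the text.
for elasticity in (0.5, 1.5):  # below 1: inelastic; above 1: elastic
    low, high = revenue(5.0, elasticity), revenue(10.0, elasticity)
    direction = "rises" if high > low else "falls"
    print(f"elasticity {elasticity}: revenue {direction} when the price doubles")
```

With elasticity below one, doubling the price loses fewer than half the users, so revenue goes up. That is the case the paragraph argues applies when the fee is small relative to the value of the user's time.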

And yes, that’s possible because the IT industry is basically many monopolies complaining a lot about competition which isn’t there.

It’s a Wonderful Loan: Economics of P2P Lending

The Financial Times wonders why big banks are going after P2P lending. Why do banks need companies like Aztec Money and Lending Club, which have negligible credit portfolios and messy business models? Well, banks themselves might explain their motivation in this case (so far they haven’t), but I can think of a good economic reason why they should pay attention to P2P lending.

This reason is older than the Internet, computers, and banks themselves. It’s information about the borrower. In between conspiracies against the public, banks do a very useful thing: they relieve the lender of the headache about the borrower’s payback. Banks have to know their borrower well. And typically, they do, and they keep the net interest spread low. Here are the rates for banks and credit unions:


Credit unions have been in the industry like forever. They would fit what the FT names “democratizing finance” and have much in common with the ideology behind P2P technologies. Credit unions have higher deposit rates and lower interest in the table because they know more about borrowers. Unions lend only to trusted folks and the number of individual defaults decreases, so you see better rates. Better rates also mean an even lower probability of default, so it’s reinforcing.

The Grameen Bank (and Nobel laureate Yunus) played this idea brilliantly. They radically reduced the market interest rates in poor countries, where high rates coupled with high default rates had been strangling the economy. The Grameen Bank entered very much like a credit union. Borrowers had to provide references from local peers to get access to money. The interest rates were reduced from 50–100% annually to a single-digit number.

The Grameen-type firms and credit unions are limited in geography and expertise. You could back only your neighbor and only in a very simple business. If he tells you he’ll buy a cow to sell milk, you’re okay. But if a guy on the other coast needs a credit line to build “radar detectors that have both huge military and civilian applications,” you want to know the risks better. That’s why in a complex economy, Grameen is no longer relevant. Each loan application requires more information about the borrower, his credit history, and, most importantly, the purpose of the loan.

The purpose is vital for business loans. Banks learned to dig up information about the borrower and to come up with the individual probability of default (you can try to predict yours). But they’re getting worse at knowing the client’s business. First, businesses are getting more complex. Second, banks reduce their human workforce and local branches, while local branches provided a lot of soft information on borrowers and their performance. Jimmy Stewart’s banking was about observing his little town’s economy and deciding what would be creditworthy there. Without this source, banks pool risks and set higher interest rates, deterring borrowers.

Here comes online P2P lending. When a nuclear physicist from CERN lends money to a nuclear physicist from NASA via a P2P system, it tells something about the borrower’s project. The guy from CERN is the right guy to judge. He also throws his own money into this. And that solves both the complexity problem (you can always find a lender-investor with the right expertise) and the neighborhood problem (an expert can come from anywhere). Plus it’s technically free. The CERN physicist has already done the job banks couldn’t do: he found the borrower’s project, evaluated it, and approved it. It looks like an investor’s job, and it is. P2P lending platforms like Kiva do mix investing and lending. Users do informal research before lending money.

This info allows banks to do P2P loan matching (like some VCs and foundations do), buy individually-backed loans, securitize them, and so on. This is a rare example when new technologies are not eating someone else’s pie (like YouTube does to mass media) but create their own. Without this easy expert-loan matching, businesses face higher interest rates, often above their breakeven point, which means no business at all.

Still, P2P platforms themselves seem distracted from this advantage. Most reasoning behind them mentions phantom problems like “predatory interest,” much paperwork, and refused applications in traditional banking. These are not the problems. The financial industry is highly competitive even after the series of post-80s M&A. It evaluates the risks with huge volumes of data, hires good quants, and saves a great deal on scale. In fact, the low market capitalization of major banks indicates that they have no means to “exploit” customers (Google and Amazon do, though in a delicate manner, as here and here). So net interest margin declines:


The bank’s paperwork and rejections are just the costs of low interest rates. It makes no sense for startups to “fix” banking in this direction because it’ll increase the rates, sort of getting the industry back into prehistoric times. The information flow between lenders and borrowers is the real thing to focus on.

One to n: Market Size, Not Innovations

In his popular Zero to One, Peter Thiel singles out original product development as the most important step for entrepreneurs to make. After that, “it’s easier to copy a model than to make something new. Doing what we already know how to do takes the world from 1 to n, adding more of something familiar.”

Of course, building a prototype is important. But it’s not the most important problem in the hi-tech industry. More often, startups pass the zero-to-one step trivially. They fail in what comes next: in going from one to n.

Right from the preface, Peter Thiel supports his thesis with the cases of Microsoft, Google, and Facebook. But these companies never went from zero to one. Their core products were invented and marketed by their predecessors. Unix was there ten years before Microsoft released DOS. AltaVista and Yahoo! preceded Google. LiveJournal had pioneered social networks five years before Mark Zuckerberg founded Facebook. Do a little research on any big company mentioned in the book’s index, and you’ll find someone else who did zero to one before the big and famous.

Now, there’s an obvious merit in what Microsoft, Google, and Facebook did. Reaching billions of customers is more difficult than being a pioneer. However, it fundamentally changes the startup problem. Going from zero to one doesn’t make a great company. Going from one to n does.

And startups pay little attention to their one-to-n problem. Take the minimum: the product’s target market, the n itself. In their stylized business plans, founders routinely misestimate their ns by a few digits. For one example, developers of a healthy-lifestyle app equated this app’s market to all obesity-related spending, including things like liposuction. Naturally, the number was large, but it wasn’t their n.

Many founders sacrifice several years of their lives to ideas with overestimated ns. Back to Thiel’s examples, Microsoft, Google, and Facebook knew their huge ns before they grew big. Moreover, they purposefully increased their ns by simplifying their products on the way. In the end, each human being with Internet access happened to be their potential (and often actual) customer.

What do other founders do, instead? They see a monster like Microsoft and run away from competition into marginal niches. A marginal niche leaves them with a small n, while requiring about the same several years of development. In fact, it’s cheaper to fail early with such a niche product because if a modest project survives, it distracts its founders from bigger markets. The project functions like a family restaurant: good people, nice place, but, alas, no growth.

How to escape competition right? For example, by building a path to a big market right from the start, as Y Combinator suggests when it welcomes a possible competitor to Google.

Here, Zero to One again may mislead if taken literally. The book’s emphasis on innovation and technology sidelines simple facts about successful companies. Successful companies are lazy innovators. In their early years, Microsoft, Google, and Facebook were too small to invest in serious innovations. They were built on simple technologies. Google ran on low-cost consumer hardware and Facebook was a simple content management system written in PHP in a few weeks. Common-sense creativity, not fancy innovations, supported these companies. While their simple initial products remain critical to business performance, their graveyard of failed zero-to-one innovations grows (look at Google’s).

The path to a big market is perpendicular to innovations. In the innovation scenario, founders become scientists who dig into a single topic until the zero-to-one moment. One example is the very advanced DeepMind, which was virtually unknown before Google’s acquisition. In the big market scenario, founders devote their attention to marketing, namely, how to earn new users and retain their loyalty. Often, this task is easier to complete with handwritten postcards to early adopters than by spending years teaching a computer to recognize cat videos. And it’s clearly not a single zero-to-one step, but many steps back and forth, with the foreseeable n in mind.

Not Reinventing Education


Education seems like an attractive market to entrepreneurs. It’s huge. It has low competition. Its core technology comes straight from the Stone Age. Why won’t you create something cool here?

Education is different. Not only because of ideology, but because of the very Stone-Age technology that looks replaceable. Universities didn’t change much since they had come out of monasteries a thousand years ago. And they are highly competitive despite that. You won’t find another industry in which a company remains on top for nine centuries.

Most startups created thus far compete with textbooks, not education. New textbooks now work in a browser and interact with the reader. Online courses offer lectures and materials from the best teachers. But it’s not the university experience, as the teachers themselves agree. Books remain books even online. The best books have sat in libraries for centuries, so access to them was never what held learners back.

Education is cooperation. Cooperation happens in groups, and groups are limited by definition. Harvard could have a million students (after all, Walmart has 245 million customers weekly), but then it would be an ordinary place. As long as groups are small and carefully selected, their members can learn from each other and the faculty. In other words, education is all about one limited resource: people’s attention. IT can’t scale up people’s attention yet. It does routine stuff and does it well, like crawling petabytes of data daily. For this reason, IT is more likely to create the next Google worth $300 bn. than a $10 bn. business in education.

PS: Investors agree:


Startups across countries

A few plots in addition to yesterday’s post on startups.

Startups and economic development

Sources: dataset and Penn World Table 7.0.

That’s not a bad fit for the relation between startups and GDP. The number of startups in the dataset seems to be a good indicator of entrepreneurial activity in general.

Startup nation

Here’s an illustration for Dan Senor and Saul Singer’s thesis about Startup Nation:
Israel has relatively more startups than the US. Tel Aviv and Silicon Valley drive the numbers for their countries, so it’s not exactly a nation-wide phenomenon. You could call the book Startup City, though the result is no less impressive.

Web data and language barriers

Like other sources based on voluntary reporting, CrunchBase may have data biased in one way or another. For example, it may underrepresent countries in which English is not a major language. And we expect a bias in favor of bigger firms. And here’s the case:

China and Russia indeed either have bigger startups on average or just underreport to CrunchBase. The latter is likely the case, because these are exactly the two major countries that sit behind a language firewall. They have their own Facebooks, Twitters, and Amazons. So, we expect them to be less active on CrunchBase. More so:

The surprising break after the 90th percentile separates countries into two groups. What are the groups? Look here:

(US and UK are excluded to make the graph readable. 100+ startup countries included.)

Group 1 consists of countries with < 0.02 startups per 1,000 inhabitants, and Group 2 is the rest. As a result, Group 2 contains countries where the English language plays an explicitly large role. So, the break indeed looks like a language thing.
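The grouping rule is simple enough to write down. Here is a minimal sketch in Python, assuming each country comes with a startup count and a population. The Country class, the function names, and the two sample countries are made up for illustration and are not CrunchBase figures. Only the 0.02 cutoff comes from the text.

```python
from dataclasses import dataclass

# Cutoff from the text: startups per 1,000 inhabitants.
THRESHOLD = 0.02


@dataclass
class Country:
    name: str
    startups: int    # number of startups listed for the country
    population: int  # number of inhabitants


def startups_per_thousand(c: Country) -> float:
    """Startups per 1,000 inhabitants."""
    return c.startups / (c.population / 1000)


def group(c: Country) -> int:
    """Group 1 is below the cutoff, Group 2 is everything else."""
    return 1 if startups_per_thousand(c) < THRESHOLD else 2


# Placeholder countries with invented numbers (not real data).
sample = [
    Country("A", startups=150, population=50_000_000),
    Country("B", startups=400, population=8_000_000),
]
groups = {c.name: group(c) for c in sample}
```

With these invented inputs, country A lands in Group 1 and country B in Group 2. Running the same rule over the real dataset would reproduce the split shown in the graph, assuming the population figures match the ones used there.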

Nevertheless, language per se is not a big factor in development, so it doesn’t bias the data on GDP in a systematic way. (You can also control the very first plot for the percentage of English-speaking population.)

Investing and failures in startups

The efficient market hypothesis got a bad press after 2008. Not surprisingly. It’s a half-truth. For instance, what Robert Shiller identified as genuine mispricing Robert Lucas called a minor deviation. Also, the hypothesis has many interpretations, and here’s one of them.

(data link)

On the left we have the mean amount of money that startups received over their lifetime. On the right is a crude measure of risk: the ratio of acquisitions to closed companies in the respective market. So, enterprise software has three successful acquisitions per one failure. I dropped “operating” startups because it’s difficult to interpret their success.
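The risk measure can be sketched in a few lines of Python. This is an illustration under assumptions, not the original calculation: the function name, the handling of markets with no closures, and the counts are all hypothetical. The counts for enterprise software are chosen only to match the three-to-one ratio mentioned above.

```python
def acquisition_ratio(acquired: int, closed: int) -> float:
    """Acquisitions per closed company in one market.

    Returning infinity for a market with no closures is an assumption
    of this sketch, not something the original analysis specifies.
    """
    if closed == 0:
        return float("inf")
    return acquired / closed


# Invented counts of (acquired, closed) startups per market.
counts = {
    "enterprise software": (30, 10),
    "market X": (5, 10),
}
ratios = {market: acquisition_ratio(a, c) for market, (a, c) in counts.items()}
```

A ratio above one means more acquisitions than failures. A ratio below one, as for the hypothetical market X, means the opposite.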

The graph is interesting because clean tech gets much funding but has one acquisition per two failures. Analytics gets small funds (not as sexy as it was called?), but gives very stable outcomes. These two are exceptions because in general funding matches the risk measure. The same goes for other markets: it’s enough for one product (like housing) to have abnormal pricing for the entire market to be at risk.

That is an attempt to make complex things embarrassingly simple, of course. For example, some may insist that average funding is a measure of capital intensity, not of competition among investors. Or that we should honestly calculate returns, as was done here. But it all seems to be half-truths, including this piece. We have to keep watching.