Answering questions already answered on Stack Exchange

A previous post showed, using the relative distribution of upvotes on Stack Exchange, that readers have little demand for one more answer to a question, because they give at least half of the upvotes to the first-placed answer alone:

(y-axis: the answer’s mean fraction of upvotes in a question; x-axis: the position of the answer)

But the absolute number of upvotes shows why answers still appear:

(y-axis: mean of upvotes; other notation is the same)

Questions with few answers happen to be unpopular in general (that’s why they have fewer answers in the first place). Upvotes for questions with a single answer are barely visible, but they grow with the number of answers, as the graph shows. The last frame describes the case of 16 answers; anything beyond that should be read with care because the original sample has too few questions with many answers.
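The panels above can be reproduced with a simple grouping: split questions by how many answers they have, then average upvotes at each position. Below is a minimal sketch of that idea, not the original analysis; the toy rows, the tuple layout `(question_id, position, upvotes)` and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (question_id, position, upvotes), one row per answer.
answers = [
    ("q1", 1, 2),
    ("q2", 1, 9), ("q2", 2, 3),
    ("q3", 1, 15), ("q3", 2, 5),
    ("q4", 1, 40), ("q4", 2, 12), ("q4", 3, 4),
]

def mean_upvotes_by_position(rows):
    """Return {answers_in_question: {position: mean upvotes}}."""
    # Count how many answers each question has.
    n_answers = defaultdict(int)
    for qid, _, _ in rows:
        n_answers[qid] += 1
    # Collect upvotes in each (question size, position) cell.
    cells = defaultdict(lambda: defaultdict(list))
    for qid, pos, up in rows:
        cells[n_answers[qid]][pos].append(up)
    # Average every cell; one outer key corresponds to one panel of the graph.
    return {n: {p: mean(v) for p, v in sorted(by_pos.items())}
            for n, by_pos in sorted(cells.items())}

panels = mean_upvotes_by_position(answers)
for n, by_pos in panels.items():
    print(n, by_pos)
```

Each outer key plays the role of one frame in the graph. With real data, frames built from only a handful of questions would be as unreliable as the post warns.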

The bottom line is that answering a question with many answers may be more useful to users than answering questions with no answers at all. That holds at least for unconditional means, with nothing else controlled for. Controlling for time is the natural next step in understanding how demand and supply operate in Q&A markets and what can be done to make them more efficient.

Stack Exchange and reward for being on the top

As mentioned in the previous posts, Stack Exchange has a very interpretable structure. It’s a market in which demand for answers meets supply, and supply is paid with upvotes. Such a crude interpretation is necessary for learning how knowledge exchange works.

I once looked into the demand side of Stack Exchange; here are a few points on the supply side. In general, we are interested in the efficient allocation of resources. Given that one answer is sometimes enough (especially for software development questions), many answers may be a waste.

And that’s the distribution of answers per question:

Well, it’s a peak at 2 with a long tail. The details:

number of answers    Freq.   Percent    Cum.
 1                   2,123     18.35    18.35
 2                   2,601     22.48    40.83
 3                   2,138     18.48    59.3
 4                   1,458     12.6     71.9
 5                     967      8.36    80.26
 6                     674      5.82    86.09
 7                     461      3.98    90.07
 8                     325      2.81    92.88
 9                     190      1.64    94.52
10                     135      1.17    95.69

About 80 percent of questions end with five answers or fewer.
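The frequency, percent and cumulative columns follow mechanically from a list of answer counts per question. Here is a minimal sketch of that tabulation; the ten-question sample and the function name `tabulate` are hypothetical and are not the data behind the table.

```python
from collections import Counter

def tabulate(counts):
    """Return rows of (value, freq, percent, cumulative percent)."""
    freq = Counter(counts)
    total = sum(freq.values())
    rows, running = [], 0
    for value in sorted(freq):
        running += freq[value]
        rows.append((value, freq[value],
                     round(100 * freq[value] / total, 2),
                     round(100 * running / total, 2)))
    return rows

# Hypothetical sample: number of answers on each of ten questions.
answers_per_question = [1, 2, 2, 3, 2, 1, 4, 3, 2, 5]
table = tabulate(answers_per_question)
for row in table:
    print(row)
```

The "80 percent" reading corresponds to the cumulative column at five answers.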

The Reward for Being on the Top

But what’s the reward for having your answer on the top of the others? These are the means of fractions of total upvotes by the position a given answer occupies:

It says that the answer on the top has a stable advantage over all other answers to a given question. You can see that after the fifth answer, adding more answers barely reduces the share of upvotes given to the existing answers. And the first answer gets no less than half of all upvotes.

That’s a huge bonus, since multiple other answers have to split the remaining half of upvotes. That may be discouraging for participants, as competition is high and the winner takes all.
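The quantity discussed here, the mean fraction of a question's upvotes by answer position, can be sketched as follows. This is an illustration under assumptions rather than the original computation: the toy rows and the tuple layout `(question_id, position, upvotes)` are made up, and questions with no upvotes at all are skipped because their fractions are undefined.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (question_id, position, upvotes).
answers = [
    ("q1", 1, 6), ("q1", 2, 3), ("q1", 3, 1),
    ("q2", 1, 8), ("q2", 2, 2),
    ("q3", 1, 0), ("q3", 2, 0),  # no upvotes, so the fraction is undefined
]

def mean_fraction_by_position(rows):
    """Return {position: mean share of the question's upvotes}."""
    # Total upvotes over all answers to each question.
    totals = defaultdict(int)
    for qid, _, up in rows:
        totals[qid] += up
    # Each answer's share of its question's upvotes, grouped by position.
    fractions = defaultdict(list)
    for qid, pos, up in rows:
        if totals[qid] > 0:
            fractions[pos].append(up / totals[qid])
    return {pos: mean(v) for pos, v in sorted(fractions.items())}

shares = mean_fraction_by_position(answers)
for pos, share in shares.items():
    print(pos, round(share, 3))
```

In this toy sample the first position averages 0.7 of the upvotes, which mirrors the "at least half" pattern only by construction.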

Sample summary statistics

Variable              Obs      Mean         Std. Dev.    Min          Max
upvotes               45463    5.390141     24.11624     0            1553
downvotes             45463    0.1868992    0.9164532    0            82
net (up – down)       45463    5.203242     23.94713     -19          1552
position              45463    4.449794     6.709625     1            114
total_answers         45463    7.899589     10.94102     1            114
relative position     45463    0.6272573    0.2922457    0.0087719    1
total_uv_b~q          45463    67.75934     221.0013     0            2488
frac_uv               44188    0.2440255    0.3074606    0            1
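Two rows of the summary are derived variables, and the sketch below shows one reading that is consistent with the numbers; it is an assumption, not the author's documented code. The minimum relative position, 0.0087719, equals 1/114, which suggests relative position = position / total_answers. frac_uv has 1,275 fewer observations than the other variables, which would fit a fraction left missing when a question has no upvotes at all. The function and argument names are hypothetical.

```python
def derive(position, total_answers, upvotes, question_upvotes):
    """Return (relative_position, frac_uv); frac_uv is None when undefined."""
    # Assumed definition: position scaled by the number of answers.
    relative_position = position / total_answers
    # Assumed definition: share of the question's upvotes, missing if there are none.
    frac_uv = upvotes / question_upvotes if question_upvotes > 0 else None
    return relative_position, frac_uv

# Top answer of a 114-answer question that received no upvotes at all.
rel, frac = derive(position=1, total_answers=114, upvotes=0, question_upvotes=0)
print(round(rel, 7), frac)
```

Under these assumed definitions, the top answer of a 114-answer question reproduces the table's minimum relative position.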