2020-11-06

Pareto Principle Gives Extreme Results

Intro And Thesis

The Pareto Principle ("80/20 rule") is the observation that "for many outcomes roughly 80% of consequences come from 20% of the causes".  The classic example is that 20% of Italian landowners own 80% of Italian land, and the rule has been extended to claims like 20% of development time/effort can be used to write 80% of desired software functionality.  This can be written as a {0.2, 0.8} scenario.

Sometimes the principle can be applied recursively at multiple scales.  Perhaps 20% of 20% of landowners (4%) own 80% of 80% of land (64%).  In other words: {0.2, 0.8} at all scales implies {0.04, 0.64}.  Perhaps for some matters (land ownership, wealth), things can be that skewed at many scales.  For stuff like time to develop software functionality, I suspect the Pareto Principle can only be applied at a few carefully chosen scales.
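A minimal sketch of that recursion, assuming the same ratio r = 0.8 holds at every scale (the function name `recursive_pareto` is made up for illustration):

```python
def recursive_pareto(depth, r=0.8):
    """Return (input, output) proportions after `depth` recursive applications.

    Assumes the same ratio r applies at every scale: each application keeps
    the top (1 - r) of the causes and r of the consequences.
    """
    return (1 - r) ** depth, r ** depth


for depth in range(1, 5):
    x, y = recursive_pareto(depth)
    print(f"depth {depth}: {{{x:.4g}, {y:.4g}}}")
```

Depth 1 is the familiar {0.2, 0.8} and depth 2 is the {0.04, 0.64} scenario from the paragraph above.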

Let's look at the following table to appreciate how recursive application yields some extreme scenarios.

input proportion    output proportion
3.8e-15             0.01
4.1e-10             0.05
6.1e-08             0.1
9.1e-06             0.2
1.7e-04             0.3
0.001               0.4
0.007               0.5
0.025               0.6
0.076               0.7
0.200               0.8
0.468               0.9
0.691               0.95
0.930               0.99
0.964               0.995
1.000               1
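The table can be regenerated with a few lines of Python. This is only a sketch: it assumes r = 0.8 and uses the relation x = y^(1/p) with p = ln(r)/ln(1-r) from the Math Work section; the names `R`, `P`, `OUTPUTS`, and `input_for_output` are my own.

```python
import math

R = 0.8                               # output ratio for the 80/20 rule
P = math.log(R) / math.log(1 - R)     # exponent p in y = x^p

# Output proportions listed in the table.
OUTPUTS = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7,
           0.8, 0.9, 0.95, 0.99, 0.995, 1]


def input_for_output(y, p=P):
    """Input proportion x needed to reach output proportion y: x = y^(1/p)."""
    return y ** (1 / p)


print(f"{'input proportion':<20}output proportion")
for y in OUTPUTS:
    print(f"{input_for_output(y):<20.2g}{y}")
```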

We see the famous {0.2, 0.8}, but we also see {0.007, 0.5}, which would imply that you can get 50% of desired software functionality with less than 1% of the effort required to get 100% functionality.  There are even more extreme results like {0.001, 0.4}: one-thousandth the effort to get 40% of the benefit.  Putting these two scenarios together: you work some amount (0.001) to get to 40% functionality and you have to work an additional six times that amount to get to 50% functionality...and then you have to work an additional 1,000 times that amount (roughly) to get to 100% functionality.
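Here is the arithmetic behind that comparison, using the rounded values from the table (so treat the multiples as approximate; the variable names are just for illustration):

```python
# Rounded input proportions taken from the table, not exact values.
effort_40 = 0.001    # effort to reach 40% of the functionality
effort_50 = 0.007    # effort to reach 50%
effort_100 = 1.0     # effort to reach 100%

# Additional effort for each step, as a multiple of the initial effort.
extra_to_50 = (effort_50 - effort_40) / effort_40
extra_to_100 = (effort_100 - effort_50) / effort_40

print(f"40% -> 50%:  {extra_to_50:.0f}x the initial effort")
print(f"50% -> 100%: {extra_to_100:.0f}x the initial effort")
```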

My thesis is that the Pareto Principle leads to proportions that are surprisingly extreme, and thus we should be very hesitant to apply the principle beyond a single well-chosen scale.  There might be lots of naturally-sized tasks where ~20% of the effort gets you ~80% of the benefit, like how carefully you hang a curtain and how good it looks.  But I doubt that {0.2, 0.8} and {0.007, 0.5} both apply to hanging a curtain.

Another question arises: if you can get 80% of desired software functionality with 20% of the effort required to get 100% functionality, do we really believe that multiplying the amount of desired software functionality by 1.25 multiplies the required effort by 5 (coming from 1/0.8 and 1/0.2)?  Or imagine someone bloating the desired software functionality to 1.25x so that the actually desired 1x software functionality will supposedly require only 20% of the time of the bloated 1.25x-functionality schedule.  I think these scenarios illustrate that the Pareto Principle is very unlikely to apply when you are considering different amounts of output to ask for.  Usually when you ask for more output, those outputs will contain easy parts and hard parts.  The only way for the Pareto Principle to hold as you increase desired functionality is if you are always adding the easiest parts of the easiest features.  But humans never do that; humans ask to add complete-enough-to-be-useful features, not the easiest 20% of a feature.
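To see where the factor of 5 comes from, here is a small sketch. It assumes the power law y = x^p from the Math Work section holds at every scale, in which case multiplying the output by k multiplies the input by k^(1/p). The function name `effort_multiplier` is hypothetical.

```python
import math


def effort_multiplier(output_multiplier, r=0.8):
    """Factor by which effort grows when desired output grows by `output_multiplier`.

    Assumes y = x^p at all scales, with p = ln(r) / ln(1 - r).
    """
    p = math.log(r) / math.log(1 - r)
    return output_multiplier ** (1 / p)


print(effort_multiplier(1.25))   # 1.25x the functionality
print(effort_multiplier(2))      # 2x the functionality
```

Under that assumption, asking for 1.25x the functionality costs about 5x the effort, and doubling the functionality costs far more than that, which is exactly the kind of result that makes me doubt the principle applies here.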

Math Work

Google spreadsheet.

Terminology:

  • x: the cause/input proportion, like effort or landowners
  • y: the consequence/output proportion, like software functionality or owned land
  • r: the output ratio, r=0.8 for the 80/20 rule
  • p: a reformulation of r that makes the equation much simpler
  • d: how many times the Pareto Principle has been applied recursively; perhaps 'd' can guide how hesitant you should be to predict such a scenario
  • ln: natural logarithm function
  • log(value, base): two-argument logarithm where second argument is the base
  • [] and (): various styles of parentheses so it's easy to visually match them up

Nice equations:

  • p = ln(r)/ln(1-r) = log(r, 1-r);
    • or more generally, p = ln(ExampleOutputProportion)/ln(ExampleInputProportion)
  • y = x^p = x^log(r, 1-r);
  • x = y^(1/p) = y^log(1-r, r);
  • d = log(x, 1-r) = log(y, r);
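The nice equations translate directly into code. This is a sketch with function names I picked for illustration; `math.log(value, base)` plays the role of the two-argument log from the terminology list.

```python
import math


def exponent(r=0.8):
    """p = ln(r) / ln(1 - r)."""
    return math.log(r) / math.log(1 - r)


def output_from_input(x, r=0.8):
    """y = x^p."""
    return x ** exponent(r)


def input_from_output(y, r=0.8):
    """x = y^(1/p)."""
    return y ** (1 / exponent(r))


def depth(x, r=0.8):
    """d = log(x, 1 - r): recursive applications needed to reach input x."""
    return math.log(x, 1 - r)


print(exponent())                 # p for the 80/20 rule
print(output_from_input(0.04))    # output proportion for input 0.04
print(depth(0.04))                # recursion depth for input 0.04
```

For the 80/20 rule p is about 0.139, and the {0.04, 0.64} scenario comes out at depth 2, as expected.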

The original insight was that each {x, y} pair follows these parametric equations:

  • y = r^z
  • x = (1-r)^z

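In English for the 80/20 rule: For each {x, y} scenario, y is 0.8 raised to some power z and x is 0.2 raised to the same power z.  This z is the same quantity as d in the nice equations, except that it is allowed to take non-integer values.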

Steps to get to the nice equations involving x, y, and p:

  • Get rid of z so we can bring x and y together in a single equation
    • x = (1-r)^z  ⇒  z = log(x, 1-r)  ⇒ z = ln(x)/ln(1-r)
      • used identity log(v, b) = ln(v)/ln(b)
    • y = r^z  ⇒  y = r^[ln(x)/ln(1-r)]
  • Simplify equation for y
    • y = r^[ln(x)/ln(1-r)] = (r^[1/ln(1-r)])^ln(x) = x^ln(r^[1/ln(1-r)])
      • we used identity: a^ln(b) = b^ln(a) 
    • Let's say p = ln(r^[1/ln(1-r)]) so we can have y = x^p
    • p = ln(r^[1/ln(1-r)]) = ln(r)/ln(1-r)
      • we used identity ln(a^b) = ln(a)*b
    • Now we have y=x^p and p=ln(r)/ln(1-r)
  • Solve for x as well
    • y = x^p  ⇒  y^(1/p) = (x^p)^(1/p) = x^(p/p) = x
      • raise y=x^p to (1/p) power to isolate x
    • x = y^(1/p)
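The derivation can be spot-checked numerically. The sketch below picks a few arbitrary values of z (my choice, including non-integers), builds {x, y} from the parametric equations, and confirms each step agrees to within floating-point tolerance.

```python
import math

r = 0.8
p = math.log(r) / math.log(1 - r)

for z in (0.5, 1, 2, 3.7):
    # Parametric form: both proportions come from the same power z.
    x = (1 - r) ** z
    y = r ** z

    # Eliminating z: z = ln(x) / ln(1 - r).
    assert math.isclose(z, math.log(x) / math.log(1 - r))
    # The simplified forms y = x^p and x = y^(1/p).
    assert math.isclose(y, x ** p)
    assert math.isclose(x, y ** (1 / p))

# The identity used in the simplification step: a^ln(b) = b^ln(a).
a, b = 0.8, 0.2
assert math.isclose(a ** math.log(b), b ** math.log(a))

print("all checks passed")
```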

 
