Info Gain of Model

The goal for this section is to quantify the information gain of this model and compare it on an apples-to-apples basis to a hypothetical competing model.

Model Characteristics

I previously established the following performance characteristics for this model:

Information gain measures are discussed here. The Venn diagram defining the types of information is linked below.


Mutual Information

Mutual information between default and the test is calculated as follows. This is the entropy of the original base rate minus the conditional entropy of default given the test classification.

$H(X)$, the entropy of the original base rate, is calculated as follows. The letters A through H, used in the formulae below, are as defined in the image here.

$H(X|Y)$, the conditional entropy of default given the test classification, is calculated as

Then, $I(X;Y)$ is calculated as $I(X;Y) = H(X) - H(X|Y) = 0.0860$.

Recall that this value, the mutual information or “information gain,” is measured in average bits per event.
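The calculation described above can be sketched in Python. This is a minimal illustration, not the spreadsheet's method: the function names are my own, the 2x2 confusion-matrix counts are hypothetical placeholders rather than this model's actual counts, and the argument names `tp`, `fn`, `fp`, `tn` do not correspond to the letters A through H used in the formulae.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(tp, fn, fp, tn):
    """Return (H(X), H(X|Y), I(X;Y)) for a 2x2 confusion matrix of counts.

    X is the true outcome (default or no default); Y is the test result.
    """
    n = tp + fn + fp + tn
    # H(X): entropy of the base rate of default.
    h_x = entropy([(tp + fn) / n, (fp + tn) / n])
    # H(X|Y): entropy of X within each test outcome, weighted by P(Y).
    h_x_given_y = 0.0
    for defaults, non_defaults in ((tp, fp), (fn, tn)):  # test positive, test negative
        total = defaults + non_defaults
        if total > 0:
            h_x_given_y += (total / n) * entropy([defaults / total, non_defaults / total])
    return h_x, h_x_given_y, h_x - h_x_given_y

# Hypothetical counts for illustration only (NOT the model's real confusion matrix).
h_x, h_x_given_y, mi = mutual_information(tp=150, fn=100, fp=150, tn=600)
print(f"H(X)={h_x:.4f}  H(X|Y)={h_x_given_y:.4f}  I(X;Y)={mi:.4f} bits per event")
```

Two sanity checks are worth keeping in mind: a test that is independent of default yields zero mutual information, and a perfect test yields $I(X;Y) = H(X)$.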

Percent Information Gain (P.I.G.)

The Percent Information Gain (P.I.G.) is the ratio $\frac{I(X;Y)}{H(X)}$, the share of the original uncertainty about default that the test removes. For this model it is 10.6%.
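As a rough sketch of this ratio in Python: the base rate used here is an assumption of mine, since this section does not state it. A base rate of 25% is used only because it is consistent with the mutual information and P.I.G. figures in the comparison table later in this note. The function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a yes/no event that occurs with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def percent_information_gain(mutual_info, h_x):
    """P.I.G. = I(X;Y) / H(X), returned as a fraction between 0 and 1."""
    return mutual_info / h_x

ASSUMED_BASE_RATE = 0.25  # assumption: not stated in this section
h_x = binary_entropy(ASSUMED_BASE_RATE)
pig = percent_information_gain(0.0860, h_x)
print(f"H(X) = {h_x:.4f} bits, P.I.G. = {pig:.1%}")
```

Because P.I.G. is normalized by $H(X)$, it lets models be compared even when they are evaluated on populations with different base rates.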

Savings Per Bit

Dividing the savings-per-event value by the bits-per-event value just calculated gives a savings-per-bit value. This concept is powerful because it places a financial value on the information content of a model or data source.

The savings-per-event value of $337 was calculated here.

The bits-per-event value, or mutual information, of 0.0860 was calculated above.

Thus the savings-per-bit is given by $\frac{\$337}{0.0860} \approx \$3919$ per bit.
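The same division, written as a small Python helper. The function name is hypothetical; the inputs are the savings-per-event and mutual information values quoted above.

```python
def savings_per_bit(savings_per_event, bits_per_event):
    """Dollar savings attributable to each bit of information gained."""
    if bits_per_event <= 0:
        raise ValueError("bits_per_event must be positive")
    return savings_per_event / bits_per_event

value = savings_per_bit(savings_per_event=337.0, bits_per_event=0.0860)
print(f"${value:,.0f} per bit")
```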

Alternative Model Characteristics

A hypothetical competing model has the following performance characteristics. Note that it performs somewhat better than my model.


Mutual Information

Percent Information Gain (P.I.G.)

Savings Per Bit

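The cost per event for this alternative schema is $838. The savings per event is the baseline cost per event minus this cost. The baseline implied by my model's figures is $\$913 + \$337 = \$1250$, so the savings per event is $\$1250 - \$838 = \$412$, and the savings per bit is $\frac{\$412}{0.1205} \approx \$3419$.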

Points of Comparison

The following table establishes the important points of comparison between the competing models.

| Parameter | My Model | Alternative Model |
| --- | --- | --- |
| Mutual Information (bits per event) | 0.0860 | 0.1205 |
| Percent Information Gain | 10.6% | 14.9% |
| Cost per Event | $913 | $838 |
| Savings per Bit | $3919 | $3419 |

A few important differential comparisons are possible.

The incremental information gain of the alternative model over my model is $0.1205 - 0.0860 = 0.0345$ bits per event.

If my model were available to an organization, the maximum price that the organization should be willing to pay for the alternative model is the reduction in cost per event, $\$913 - \$838 = \$75$ per event.

At this maximum break-even price per score, the incremental value per bit from the alternative model is $\frac{\$75}{0.0345} \approx \$2174$ per bit.
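These differential comparisons can be sketched in Python from the table values. One assumption is mine rather than stated in the note: the break-even price is taken to be the reduction in cost per event when switching from my model to the alternative. The dictionary keys are illustrative names.

```python
# Values copied from the comparison table above.
my_model = {"mutual_info": 0.0860, "cost_per_event": 913.0}
alt_model = {"mutual_info": 0.1205, "cost_per_event": 838.0}

# Extra bits per event delivered by the alternative model.
incremental_bits = alt_model["mutual_info"] - my_model["mutual_info"]

# Assumption: the most an organization should pay per score is the
# cost per event it would save by switching models.
max_price_per_event = my_model["cost_per_event"] - alt_model["cost_per_event"]

# Dollar value of each additional bit at that break-even price.
incremental_value_per_bit = max_price_per_event / incremental_bits

print(f"Incremental information gain: {incremental_bits:.4f} bits per event")
print(f"Maximum price: ${max_price_per_event:.0f} per event")
print(f"Incremental value per bit: ${incremental_value_per_bit:,.0f}")
```

Under this reading, each additional bit from the alternative model is worth less than the average savings per bit of either model, which fits the table: the alternative gains more information but at a lower savings per bit.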

Some content from this note was taken from the spreadsheets listed below. They are distributed as part of the Mastering Data Analysis in Excel course on Coursera, and are licensed by Daniel Egger under CC BY-NC 4.0.

  • AUC_Calculator-and-Review-of-AUC-Curve.xlsx
  • Data_Final-Project.xlsx
  • Information-Gain-Calculator.xlsx

Some other content is taken from my notes on the Coursera course “Mastering Data Analysis in Excel.” It is sponsored by Duke University and the course content is presented by Professor Daniel Egger.