
A simple averaging technique to supplement the Bayes equation


Main article: Aggregation techniques

Background

As we saw previously, the Bayes equation can easily be misapplied to situations that are not based on rigorous probabilistic studies. The example given was along the lines of 100 people who are unsure whether it will be sunny or cloudy tomorrow, either because they individually don't know (P = 50%) or because their probabilities cancel out to 50%. The 101st person, however, is certain that it will be sunny. According to Bayes, this gives you 100% certainty that tomorrow will be sunny, because a 50% opinion, no matter how it is arrived at, simply has no influence.
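This behavior follows if the Bayes equation is applied in its product (odds) form with a uniform prior, an assumption made here because it reproduces exactly the behavior described: the hundred 50% opinions cancel out of the product, and the single 100% opinion forces the result to 1:

$$P_b = \frac{\prod_i p_i}{\prod_i p_i + \prod_i (1 - p_i)} = \frac{0.5^{100} \cdot 1}{0.5^{100} \cdot 1 + 0.5^{100} \cdot 0} = 1$$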

Another, simpler example: 10 people give you a 60% chance of rain tomorrow and thereby, via the Bayes equation, lead you to believe with near certainty that it will rain. The problem here is that the 10 people are not conducting independent tests (or simulations) to judge the probability of rain. More likely, they are simply repeating the single weather report they all watched on TV.
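Working the arithmetic through with the same product form, the ten 60% estimates compound to near certainty:

$$P_b = \frac{0.6^{10}}{0.6^{10} + 0.4^{10}} \approx \frac{0.00605}{0.00605 + 0.000105} \approx 0.983$$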

It is clear that Bayes cannot be used in cases where the source offers no better than a hand-waving estimate of probability arising from a casual opinion. Since this will happen in a large number of cases, we need a more realistic way to combine these opinions.

One way is to simply average the probabilities given by each source. For the 10 people who watched the same weather report, the average is then 60%, reflecting only the single source from which they obtained their information.

So now we would have two methods for combining probabilities: simple averaging and Bayes. It would be left to the user to choose which of these to use.

A weighting factor between simple averaging and Bayes

But these two choices seem like opposite endpoints on a continuum: on one end we have Bayes for rigorous probabilistic tests, and on the other we have simple averaging for the most unrigorous opinions. Unlike Bayes, the averaging technique provides no reinforcement (i.e., the averaged $P_a$ can never be higher than the highest input $p_i$). But it seems like a large crowd should provide some reinforcement. If 10 people say 60%, isn't that sometimes better than a single person saying 60%? What if there were two independent weather predictions that each put the chance of rain at 60%, and the 10 people were divided between these two groups? Then you could apply Bayes to two sources at 60% and get a reinforced result of 69%, as worked out below.
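Using the same product form as above for the two independent 60% sources:

$$P_b = \frac{0.6 \times 0.6}{0.6 \times 0.6 + 0.4 \times 0.4} = \frac{0.36}{0.52} \approx 0.69$$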

A straightforward way to do this is to simply have a user-chosen weighting factor between simple averaging and Bayes:

$$P_c = (1 - w)\,P_a + w\,P_b$$

where:

$P_c$ is the combined probability

$P_a$ is the simple-averaged probability

$P_b$ is the Bayesian combined probability

$w$ is the weighting toward Bayes. If $w = 0$ then only simple averaging is used. If $w = 1$ then the algorithm uses Bayes only.
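For illustration, take the hypothetical rain numbers from above ($P_a = 0.6$ from simple averaging, $P_b \approx 0.983$ from Bayes) with a weighting of $w = 0.2$:

$$P_c = (1 - 0.2)(0.6) + (0.2)(0.983) \approx 0.68$$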

The input probabilities for $P_a$ and $P_b$ are the same. That is, they are modified using trust in exactly the same way (using Sapienza's equation, or using the modified forms of this equation described here).

Combining input probabilities in simple averaging

However, the input probabilities are not rolled up in exactly the same way to create $P_a$ and $P_b$. For Bayes, nodes are combined with their parent to create a $P_b$ for that parent. The $P_b$ values of the parents are then combined with their own parent to create a new $P_b$, and so on all the way up to the topmost node. For simple averaging, doing this would double-count nodes, so the technique is instead to append probabilities to a list as we work up the tree and then take the average once the topmost node is reached, dividing the sum by the number of nodes.
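A minimal sketch of the two roll-up rules in Python (the dictionary tree representation and the function names are my own, and the input probabilities are assumed to be trust-modified already):

from math import prod

def bayes_combine(ps):
    # Product (odds) form of Bayes with a uniform prior: neutral 0.5
    # inputs cancel out, and any 1.0 input forces the result to 1.
    num = prod(ps)
    return num / (num + prod(1 - p for p in ps))

def collect(node):
    # Flatten a subtree into one list of probabilities (no double-counting).
    ps = [node["p"]]
    for child in node.get("children", []):
        ps.extend(collect(child))
    return ps

def rollup_average(node):
    # Simple averaging: append everything on the way up, divide once at the top.
    ps = collect(node)
    return sum(ps) / len(ps)

def rollup_bayes(node):
    # Bayes: combine each node with its children's rolled-up values, bottom-up.
    children = node.get("children", [])
    if not children:
        return node["p"]
    return bayes_combine([node["p"]] + [rollup_bayes(c) for c in children])

The Bayes roll-up replaces each subtree with a single combined value as it works upward, while the averaging roll-up keeps every node's probability in one flat list so that no node is counted twice.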

Example

The following example shows how this works for a small tree in which Node 0 is the topmost node, Nodes 1 and 2 report to it, Nodes 3 and 4 report to Node 1, and Nodes 5 and 6 report to Node 2.

We start at the bottom and find the trust-modified probabilities for the leaf nodes. As noted above, this is no different from what we've always done.

The modified probabilities for Nodes 3 and 4 are then appended to Node 1's probability.

The modified probabilities for Nodes 5 and 6 are appended to Node 2's probability in the same way.

These lists of probabilities are modified again by trust (for the 0-1 and 0-2 connections) and appended to Node 0's own probability.

The average $P_a$ for Node 0 can now be found by dividing the sum of the full list by the number of nodes.

The Bayesian combined probability $P_b$ for this case is computed by the usual bottom-up roll-up.

If we apply a weighting factor of, say, $w = 0.2$ (20%) as discussed above, we obtain the blended probability $P_c = 0.8\,P_a + 0.2\,P_b$.

The snippet below performs this calculation and allows you to change the values and the tree configuration.
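(The original interactive snippet is not preserved here; the following is a minimal stand-in with hypothetical probability values, reusing the helper functions from the sketch above.)

# Hypothetical trust-modified probabilities for the example tree.
# Any trust re-modification along the 0-1 and 0-2 connections is
# assumed to have been applied to these values already.
tree = {"p": 0.70, "children": [
    {"p": 0.60, "children": [{"p": 0.55}, {"p": 0.65}]},  # Node 1 with Nodes 3, 4
    {"p": 0.80, "children": [{"p": 0.75}, {"p": 0.50}]},  # Node 2 with Nodes 5, 6
]}

w = 0.2                          # weighting toward Bayes
p_a = rollup_average(tree)       # simple-averaged probability
p_b = rollup_bayes(tree)         # Bayesian combined probability
p_c = (1 - w) * p_a + w * p_b    # weighted combination
print(f"P_a = {p_a:.3f}, P_b = {p_b:.3f}, P_c = {p_c:.3f}")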

This algorithm can be modified to enhance the privacy of information transmitted up the nodes.