Error bars and a problem with Bayesian modeling

Main article: Technical overview of the ratings system

Error Bars for Incomplete Analyses

Error bars can be placed around incomplete results as they stream in. To do this we compute Pcomb for all the nodes in the network, whether they've answered or not, by setting P = 50% on the nodes that haven't answered. Those nodes then don't affect the calculation, and the calculation stays as simple as possible: we just compute as if we had all the information. As nodes update their probabilities, better values result. Users can set their update time intervals to whatever they want.
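As a minimal sketch, assuming Pcomb is the usual product-rule combination of independent probabilities for a binary outcome (the function and variable names below are illustrative, not the system's actual code):

```python
from math import prod

def p_comb(probs):
    """Combine independent probability estimates for a binary outcome.

    Hypothetical product rule (naive Bayes with a uniform prior): a node
    at exactly P = 0.5 multiplies numerator and denominator by the same
    factor, so it leaves the result unchanged.
    """
    yes = prod(probs)
    no = prod(1.0 - p for p in probs)
    return yes / (yes + no)

answered = [0.9, 0.8, 0.7]            # nodes that have reported so far
pending = [0.5] * 7                   # 7 nodes filled in as neutral
print(p_comb(answered + pending))     # identical to p_comb(answered)
```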

To calculate the error bars we would perform three calculations:

1. As above, with P = 50% on the nodes that haven't answered and the reported P on the nodes that have.
2. With the reported P on the nodes that have answered and the remaining nodes at P = 100%, adjusted for the trust (which we presumably have).
3. With the reported P on the nodes that have answered and the remaining nodes at P = 0%, adjusted for trust.

Calculations 2 and 3 then give us the max and min error around Calculation 1. At every time interval the user will see a graph with all their points and error bars around each.
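A sketch of the three calculations, continuing the hypothetical p_comb above. How the extremes are "adjusted for the trust" isn't spelled out, so here it is assumed that a pending node with trust t is pushed to 0.5 + 0.5t (or 0.5 - 0.5t), meaning an untrusted node cannot move the result and a fully trusted node counts at the full extreme:

```python
def error_bars(answered, pending_trusts):
    """Return (estimate, low, high) for a partially completed poll.

    answered       -- probabilities from nodes that have already reported
    pending_trusts -- trust weights (0..1) for nodes that have not

    Calculation 1: pending nodes at P = 0.5 (the running estimate).
    Calculation 2: pending nodes at P = 100%, adjusted for trust (max).
    Calculation 3: pending nodes at P = 0%, adjusted for trust (min).
    """
    estimate = p_comb(answered + [0.5] * len(pending_trusts))
    high = p_comb(answered + [0.5 + 0.5 * t for t in pending_trusts])
    low = p_comb(answered + [0.5 - 0.5 * t for t in pending_trusts])
    return estimate, low, high

print(error_bars([0.9, 0.8, 0.7], [1.0] * 7))   # (~0.99, 0.0, 1.0)
```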

A Problem with Bayesian Modeling

One of the reasons to do this is that outstanding nodes, even just one, can have a huge influence on the answer. Given that, it is useful to show the potential error so that people don't treat an early result as final and stop their calculation prematurely.

This can certainly happen in cases where there is a close split between two views. To see this, we can run a case (or use this Python script) where the choices are Sunny or Cloudy, N = 10, and 5 nodes think P = 90% Sunny while the other 5 think P = 90% Cloudy. The result is an even split between Sunny (50%) and Cloudy (50%). Now, if one extra node thinks it will be Sunny, the combined probability becomes Sunny (90%) and Cloudy (10%). In other words, the single extra node determines the outcome.
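The script itself isn't reproduced here, but a minimal stand-in using the same hypothetical product-rule combination as in the sketches above shows the behaviour:

```python
from math import prod

def p_comb(probs):
    """Hypothetical product-rule combination of independent probabilities."""
    return prod(probs) / (prod(probs) + prod(1.0 - p for p in probs))

split = [0.9] * 5 + [0.1] * 5     # 5 nodes say 90% Sunny, 5 say 90% Cloudy
print(p_comb(split))              # 0.5 -- a dead heat

print(p_comb(split + [0.9]))      # 0.9 -- one extra opinion decides it
```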

It doesn't matter how many nodes we have; N = 100 gives the same result. See what's wrong with this? How can a single extra opinion, in an otherwise evenly split contest among many, make you almost certain that the one node is right?

The Bayes equation works by modifying a prior probability with new evidence to generate a posterior probability. It doesn't know how the prior was generated: the prior can be one experiment or the result of several. Therefore, if the results of several experiments are uncertain (a 50/50 split), they cancel and the new experiment carries all the weight. It's as if you had just the one experiment.
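In symbols, assuming the hypothetical product-rule combination used in the sketches above, the combined probability from independent estimates p_1, ..., p_n is

$$P_{\text{comb}} = \frac{\prod_{i=1}^{n} p_i}{\prod_{i=1}^{n} p_i + \prod_{i=1}^{n} (1 - p_i)}$$

Any pair of nodes at p and 1 - p contributes the same factor p(1 - p) to both numerator and denominator and cancels exactly, just like a node at 50%; an evenly split body of prior evidence carries no weight at all.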

However, Bayes works only when it is based on properly sampled experiments, each of which is independent. If 100 nodes are 100% certain of an outcome and another 100 are 100% certain of the opposite outcome, they must be sampling their own space incorrectly or introducing some other error. The macro result can't contradict the micro results without someone being wrong.

In cases where there is a clear majority opinion, that opinion will combine, via Bayes, to produce an almost certain result. If we take the above example and add 5 more nodes in the Sunny direction, we will be almost 100% certain of Sunny weather tomorrow. One more node in either direction won't change that, so in a sense we've converged and don't have the problem mentioned above. But we still have a serious problem: if 105 nodes think Sunny and 100 think Cloudy, are you really almost 100% certain of Sunny weather?
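Under the same hypothetical product rule, the 100 opposed pairs cancel and the 5 unopposed nodes at 90% compound into odds of 9^5, i.e. about 59,049 to 1:

```python
from math import prod

def p_comb(probs):
    """Same hypothetical product-rule combination as in the sketches above."""
    return prod(probs) / (prod(probs) + prod(1.0 - p for p in probs))

votes = [0.9] * 105 + [0.1] * 100   # 105 Sunny at 90%, 100 Cloudy at 90%
print(p_comb(votes))                # ~0.99998, i.e. odds of 9**5 to 1
```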

Again, Bayes is only useful with rigorously derived probabilities based on independent, correctly sampled experimental results. Most opinions are not that. The probabilities people generally come up with are simply made up, sampled from invalid groups (e.g. friends who share the same opinion), or repetitions of the same study (e.g. the same weather report on TV).

It seems we will need to provide a different model from Bayes to account for this. One idea would be to simply average the probabilities, weighting them by trust. Stay tuned.
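For instance, a trust-weighted average (a sketch of the idea, not a worked-out design) keeps the 105-versus-100 example near 50% instead of forcing it to near certainty:

```python
def trust_weighted_average(probs, trusts):
    """Average each node's probability, weighted by how much we trust it."""
    total = sum(trusts)
    if total == 0:
        return 0.5                      # no trusted information at all
    return sum(p * t for p, t in zip(probs, trusts)) / total

votes = [0.9] * 105 + [0.1] * 100
print(trust_weighted_average(votes, [1.0] * 205))   # ~0.51, not ~1.0
```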