{{Main|Technical overview of the ratings system}}


== Error Bars for Incomplete Analyses ==


Error bars can be placed around incomplete results as they stream in. To do this we can compute the Pcomb for all the nodes in the network, whether they’ve answered or not, by putting P=50% on the nodes that haven’t answered. This way they don’t affect the calculation, which stays as simple as possible: just compute as if you have all the information. As nodes update their probabilities, better values will result. Users can set their update time intervals to whatever they want.
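
A minimal sketch of this streaming combination, assuming Pcomb is the standard independent-evidence product rule with a uniform prior (an assumption, though it is consistent with the playground numbers quoted below); the names here are illustrative:

<syntaxhighlight lang="python">
def p_comb(probs):
    """Combine per-node probabilities of one outcome into a single Pcomb."""
    num, den = 1.0, 1.0
    for p in probs:
        num *= p        # likelihood of the outcome
        den *= 1.0 - p  # likelihood of its complement
    return num / (num + den)

# Hypothetical snapshot: three nodes have answered, two have not.
answered = [0.9, 0.8, 0.3]
pending = [0.5, 0.5]  # neutral placeholders: they cancel out of the product
assert abs(p_comb(answered + pending) - p_comb(answered)) < 1e-12
print(p_comb(answered + pending))  # ~0.94, unchanged by the P=50% nodes
</syntaxhighlight>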


To calculate the [[wikipedia:Error bar|error bars]] we would perform three calculations:

# As above, with P=50% on the nodes that haven’t answered and P for the nodes that have.
# With P for the nodes that have answered and the remaining nodes at P=100%, adjusted for the [[trust]] (which we presumably have).
# With P for the nodes that have answered and the remaining nodes at P=0%, adjusted for trust.

Calculations 2 and 3 will then give us the max and min error around Calculation 1. At every time interval the user will see a graph with all their points and error bars around each.
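
A sketch of the three calculations, reusing <code>p_comb</code> from the sketch above. How trust enters is an assumption here: an unanswered node’s hypothetical extreme vote is shrunk toward the neutral 0.5 in proportion to its trust.

<syntaxhighlight lang="python">
def adjust_for_trust(p, trust):
    """Pull a hypothetical extreme answer toward neutral 0.5 by trust in [0, 1]."""
    return 0.5 + trust * (p - 0.5)

def error_bars(answered, pending_trusts):
    # Calculation 1: pending nodes held at the neutral P=50%.
    mid = p_comb(answered + [0.5] * len(pending_trusts))
    # Calculation 2: pending nodes at P=100%, adjusted for trust.
    hi = p_comb(answered + [adjust_for_trust(1.0, t) for t in pending_trusts])
    # Calculation 3: pending nodes at P=0%, adjusted for trust.
    lo = p_comb(answered + [adjust_for_trust(0.0, t) for t in pending_trusts])
    return lo, mid, hi

print(error_bars([0.9, 0.8, 0.3], pending_trusts=[0.7, 0.7]))  # (~0.32, ~0.94, ~1.00)
</syntaxhighlight>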


== A Problem with Bayesian Modeling ==


One of the reasons to do this is that outstanding nodes, even just one, can have a huge influence on the answer. Given that, it would be good to see what the potential error is to prevent people from stopping their calculation prematurely.
This can certainly happen in cases where there is a close split between two views. To see this we can run a case (https://peerverity.pages.syncad.com/trust-model-playground/ or use https://gitlab.syncad.com/peerverity/trust-model-playground/-/snippets/138) where the choices are Sunny or Cloudy, N=10, and 5 nodes think P=90% Sunny and 5 nodes think P=90% Cloudy. The result will be an even split between Sunny (50%) and Cloudy (50%). Now, if 1 extra node thinks it will be Sunny then our combined probability will be Sunny (90%) and Cloudy (10%). In other words the single extra node determines the outcome.

It doesn’t matter how many nodes we have. N=100 results in the same thing. See what’s wrong with this? How can a single extra [[opinion]] in an otherwise split contest among many make you almost certain that the one node is right?
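
Under the product-rule assumption sketched earlier, the playground’s numbers reproduce exactly, for N=10 and N=100 alike:

<syntaxhighlight lang="python">
# p_comb as defined in the first sketch above.
sunny_split = [0.9] * 5 + [0.1] * 5   # 5 nodes at 90% Sunny, 5 at 90% Cloudy
print(p_comb(sunny_split))            # 0.5: a dead heat

print(p_comb(sunny_split + [0.9]))    # 0.9: one extra node decides everything

big_split = [0.9] * 50 + [0.1] * 50
print(p_comb(big_split + [0.9]))      # still 0.9 at N=100
</syntaxhighlight>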


The [[Bayes' theorem|Bayes]] equation works by modifying a [[wikipedia:Prior probability|prior probability]] with new evidence to generate a [[wikipedia:Posterior probability|posterior probability]]. It doesn’t know how the prior was generated. The prior can be one experiment or the result of several. Therefore, if the results of several experiments are uncertain (a 50/50 split), they will cancel and the new experiment will carry all the weight. It’s as if you just had the one experiment.
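
This is easiest to see in the odds form of Bayes’ theorem, where each independent report multiplies the running odds by its likelihood ratio:

<math display="block">\underbrace{\frac{P(H \mid E)}{P(\lnot H \mid E)}}_{\text{posterior odds}} \;=\; \underbrace{\frac{P(H)}{P(\lnot H)}}_{\text{prior odds}} \times \underbrace{\frac{P(E \mid H)}{P(E \mid \lnot H)}}_{\text{likelihood ratio}}</math>

Five 9:1 reports and five 1:9 reports multiply out to 1:1 odds, so the prior carries no information at all; the next 9:1 report then sets the posterior at 9:1, i.e. 90%, all by itself.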


However, Bayes works when it is based on properly sampled experiments, each of which is [[wikipedia:Independence (probability theory)|independent]]. If 100 nodes are 100% certain of an outcome and another 100 are 100% certain of the opposite outcome, it means they are incorrectly [[wikipedia:Sampling (statistics)|sampling]] their own space or introducing some other error. The macro result can’t contradict the micro results without someone being wrong.


In cases where there is a clear majority [[Opinion|opinion]], that opinion will combine, via Bayes, to produce an almost certain result. If we take the above example and add 5 more nodes in the Sunny direction, we will be almost 100% certain of Sunny weather tomorrow. One more node in either direction won’t change that so, in a sense, we’ve converged and don’t have the problem mentioned above. But we still have a serious problem. If 105 nodes think Sunny and 100 think Cloudy, are you really 100% certain of Sunny weather?
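
Concretely, if each node holds 90% confidence as in the example, the product rule turns a 105-to-100 split into

<math display="block">\text{odds(Sunny)} = 9^{105} \times \left(\tfrac{1}{9}\right)^{100} = 9^{5} = 59049 \quad\Longrightarrow\quad P(\text{Sunny}) = \frac{59049}{59050} \approx 99.998\%,</math>

near-certainty from what is essentially a coin-flip split of opinion.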


Again, Bayes is only useful with rigorously derived [[Technical overview of the ratings system|probabilities]] based on independent and correctly sampled experimental results. Most opinions are not that. The probabilities people generally come up with are just made up, sampled from invalid groups (e.g. friends who have the same opinion), or repetitions of the same study (e.g. the same weather report on TV).


It seems we will need to provide a different model than Bayes to account for this. One idea would be to simply average the probabilities and weight them using trust. Stay tuned.
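
A minimal sketch of that idea (the function name and weighting scheme are illustrative assumptions, not a settled design):

<syntaxhighlight lang="python">
def trust_weighted_average(probs, trusts):
    """Average per-node probabilities, weighting each node by its trust."""
    return sum(p * t for p, t in zip(probs, trusts)) / sum(trusts)

# The 105-vs-100 split from above, with equal trust everywhere:
probs = [0.9] * 105 + [0.1] * 100
trusts = [1.0] * 205
print(trust_weighted_average(probs, trusts))  # ~0.51, not near-certainty
</syntaxhighlight>

Unlike the product rule, this stays near 51% for the 105-to-100 split, which is much closer to what intuition says such a split should mean.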
