Other possible algorithms for calculating binary predicates
Revision as of 20:33, 30 August 2024
Main article: Aggregation techniques
This is a work in progress, for now just look at it as a collection of notes.
This is the original graph from example 1, just modified to show opinions as true/false with confidence values to avoid confusion.
digraph G {
fontname="Helvetica,Arial,sans-serif"
node [fontname="Helvetica,Arial,sans-serif"]
edge [fontname="Helvetica,Arial,sans-serif"]
layout=dot
0 [label="0, no opinion"]
1 [label="1, no opinion"]
2 [label="2, no opinion"]
3 [label="3, true, confidence 20%"]
4 [label="4, true, confidence 40%"]
5 [label="5, true, confidence 60%"]
6 [label="6, true, confidence 80%"]
0 -> 1 [label="T=0.9"];
0 -> 2 [label="T=0.9"];
1 -> 3 [label="T=0.9"];
1 -> 4 [label="T=0.9"];
2 -> 5 [label="T=0.9"];
2 -> 6 [label="T=0.9"];
}
Previous results
In his analysis, Pete gives the following values:
Pure Bayesian: P_bay = 0.964 (true, with 92.8% confidence)
Simple average: P_ave = 0.616 (true, with 23.2% confidence)
Weighted average: P_ave,w0 = 0.634 (true, with 26.8% confidence)
Derating
At each step, we scale the confidence by our trust in the source. Working our way up, node 1 derates the answers from 3 and 4 using the trust factors, then averages the results:

PT_1 = (0.9 · 0.2 + 0.9 · 0.4) / 2 = (0.18 + 0.36) / 2 = 0.27

PT_2 = (0.9 · 0.6 + 0.9 · 0.8) / 2 = (0.54 + 0.72) / 2 = 0.63

PT_0 = (0.9 · 0.27 + 0.9 · 0.63) / 2 = (0.243 + 0.567) / 2 = 0.405
Which is to say node 0’s computed opinion is true, with a 40.5% level of confidence (normally we’d express this as 0.7025). This makes sense, because if you look at all the nodes with an opinion, they all think it’s true with an average of 50% confidence, but that confidence is watered down by two layers of middlemen.
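The derating rule above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from any existing codebase; the function name and tree layout are made up here, and the trust/confidence values come from the example graph.

```python
# Derating aggregation: each child's confidence is scaled by our trust in
# that child, then the scaled values are averaged.

def derate(children):
    """children: list of (trust, confidence) pairs for a node's sources."""
    return sum(t * c for t, c in children) / len(children)

pt1 = derate([(0.9, 0.2), (0.9, 0.4)])  # node 1 aggregates nodes 3 and 4
pt2 = derate([(0.9, 0.6), (0.9, 0.8)])  # node 2 aggregates nodes 5 and 6
pt0 = derate([(0.9, pt1), (0.9, pt2)])  # node 0 aggregates nodes 1 and 2

print(round(pt1, 3), round(pt2, 3), round(pt0, 3))  # 0.27 0.63 0.405
```

Note how each layer of "no opinion" middlemen multiplies in another trust factor, which is exactly the watering-down effect described above.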
Cons:
With this system, if I’m node number two and I only have node 6 as a source (pretend 5 doesn’t exist), my computed opinion will be: true, with 72% confidence (6’s 80% derated by their 0.9 trust). Fine. But when I add node 5 back in, my computed opinion falls to 63% confidence. This seems wrong. Adding a new node with the same opinion but a smaller confidence × trust value shouldn’t decrease my computed opinion, regardless of whether that value is low because we have low trust in the source or because they have low trust in themselves. If I ask a medical question, and a doctor I have high trust in answers true, I shouldn’t reduce my confidence just because an acupuncturist I have low trust in also answers true. We’re assuming 0 trust means we think the party doesn’t know, not that they’re always wrong.
Possible Fixes:
We could simply take the max of the confidence × trust values, instead of averaging them. We haven’t discussed the case where we have conflicting answers, but my current mental model is: look at max(confidence × trust) of the true responders and max(confidence × trust) of the false responders separately. Our resulting answer is whichever is larger, and our confidence is the difference.
Pros: Additional answers don’t pull the most trusted/confident answer down, which seems good. Additional answers also don’t boost your confidence, which could be good if the answers are all just reporting the same source of information. When you have a lot of conflicting trusted answers, your confidence will be low.
Cons: If you have ten trusted sources that say true and one trusted source that says false, your confidence will be low. This could be ok if the true responders were all just echoing the same source, or bad if they were providing unique data.
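A sketch of the max-based fix, including the conflicting-answer rule. Again this is illustrative only; the function name is invented, and each source is modeled as an (answer, confidence, trust) triple, which isn't spelled out anywhere above.

```python
# Max-based aggregation: take max(confidence * trust) separately for the
# true and false responders; the answer is whichever side is larger, and
# the confidence is the difference between the two sides.

def aggregate_max(sources):
    """sources: list of (answer: bool, confidence, trust) triples."""
    best_true = max((c * t for a, c, t in sources if a), default=0.0)
    best_false = max((c * t for a, c, t in sources if not a), default=0.0)
    return best_true >= best_false, abs(best_true - best_false)

# Node 2 with only node 6 as a source: true at 80%, derated by 0.9 trust.
print(aggregate_max([(True, 0.8, 0.9)]))                    # ≈ (True, 0.72)
# Adding node 5 (true, 60%) no longer drags the result down:
print(aggregate_max([(True, 0.8, 0.9), (True, 0.6, 0.9)]))  # ≈ (True, 0.72)
# A conflicting trusted source drops confidence to the difference:
print(aggregate_max([(True, 0.8, 0.9), (False, 0.7, 0.9)]))
```

The first two calls demonstrate the fix for the con described above: node 5's weaker agreement no longer pulls node 2's computed confidence below 72%.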
Simple weighted averaging
An alternate algorithm (different behavior) would be to take the average of 3 and 4’s answers, weighting by the trust factor:

PT_1_alt = (0.9 · 0.2 + 0.9 · 0.4) / (0.9 + 0.9) = (0.18 + 0.36) / 1.8 = 0.3

PT_2_alt = (0.9 · 0.6 + 0.9 · 0.8) / (0.9 + 0.9) = (0.54 + 0.72) / 1.8 = 0.7

PT_0_alt = (0.9 · 0.3 + 0.9 · 0.7) / (0.9 + 0.9) = (0.27 + 0.63) / 1.8 = 0.5
So with this method, node 0’s computed opinion would be true with a 50% confidence, which is what you’d get from a straight average of all the nodes with an opinion. In other words, in this algorithm, nodes with no opinion just pass on the opinion/confidence of their sources without derating them. The trust factor’s only effect is when a node has multiple sources and needs to value one opinion more highly than the other. I think during discussion, we decided that we actually prefer to have opinions become less potent the further away they are, so that argues against this algorithm.
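The trust-weighted variant differs from the derating version only in its denominator: it divides by the sum of the trust weights rather than the source count, so trust cancels out when all weights are equal. A minimal sketch, with illustrative names and the same example values:

```python
# Trust-weighted average: trust acts only as a relative weight between
# sources, so a chain of single-source nodes passes opinions through
# undiluted (no derating per hop).

def weighted_avg(children):
    """children: list of (trust, confidence) pairs for a node's sources."""
    return sum(t * c for t, c in children) / sum(t for t, _ in children)

pt1_alt = weighted_avg([(0.9, 0.2), (0.9, 0.4)])      # ≈ 0.3
pt2_alt = weighted_avg([(0.9, 0.6), (0.9, 0.8)])      # ≈ 0.7
pt0_alt = weighted_avg([(0.9, pt1_alt), (0.9, pt2_alt)])  # ≈ 0.5
```

Because every edge here carries the same 0.9 trust, the weights cancel and the result equals the straight average of the opinionated leaves, which is why distance from the source has no damping effect under this scheme.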