{{Main|Aggregation techniques}}
<span id="dans-proposal-for-trust-weighted-histograms"></span>
=== Dan’s proposal for trust-weighted histograms ===
Revision as of 20:42, 30 August 2024
This is my (@efrias) interpretation of an algorithm Dan proposed after the July 24th meeting. It behaves differently from previous algorithms:
- the output of this algorithm is a histogram that could be presented to the user as-is
- previous algorithms used the trust factor to pull stronger opinions towards the center. In other words, previous algorithms would treat a source who had 50% confidence in their own opinion (a personal opinion of 25% or 75%) but whom we trusted at 50% exactly the same as a source who was 25% confident in their own opinion (a personal opinion of 37.5% or 62.5%) but whom we trusted at 100% – both would be weighted at 37.5% (or 62.5%). This algorithm preserves the individual sources’ “confidence” in the final result – if every source in your graph has 100% confidence in one outcome or the other, the resulting histogram will only have nonzero values for the bins containing the extreme values of 0% and 100%.
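As a quick check of the arithmetic in that comparison, here is a sketch in Python. The `center_pulled` helper is my hypothetical reading of what the earlier algorithms did (pull an opinion toward the 50% midpoint in proportion to trust); it is not code from any previous proposal.

```python
def center_pulled(opinion, trust):
    """Hypothetical form of the earlier algorithms: scale an opinion's
    distance from the 50% midpoint by how much we trust the source."""
    return 0.5 + trust * (opinion - 0.5)

# 50%-confident source (personal opinion 25%) trusted at 50%:
print(center_pulled(0.25, 0.5))    # 0.375
# 25%-confident source (personal opinion 37.5%) trusted at 100%:
print(center_pulled(0.375, 1.0))   # 0.375
```

Both sources collapse to the same 37.5% weight, which is exactly the distinction the histogram approach is meant to preserve.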
==== Setup ====
Assume a simple “a or b” predicate, and any node that has a personal opinion expresses it as a probability from 0% to 100%. If a personal opinion is 0%, it means they are completely confident that the answer is “a”. If their personal opinion is 100%, it means that they are completely confident that the answer is “b”. If it’s 50%, they have no idea and probably shouldn’t be wasting your time by answering.
Instead of reporting a single answer for their computed opinion, each node will report a histogram.
==== Algorithm ====
We decide on a number of “bins”; let’s assume we choose 10, defined the obvious way:
- bin 1: 0%–10%
- bin 2: 10%–20%
- etc., up to bin 10: 90%–100%
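With 10 equal-width bins over 0%–100%, mapping a probability to its bin is a one-liner. This sketch assumes the upper edge (exactly 100%) folds into the last bin; the original doesn’t specify how boundaries are handled.

```python
def bin_index(p, nbins=10):
    # map a probability p in [0.0, 1.0] to a bin index 0..nbins-1;
    # the upper edge p == 1.0 is folded into the last bin
    return min(int(p * nbins), nbins - 1)

print(bin_index(0.25))  # 2, i.e. the 20%-30% bin
```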
To generate a computed answer, each node will generate a trust-weighted histogram as follows:
- start with a computed opinion of ten zeros
- if they have a personal opinion, set the bin containing their personal answer to 1 (the node trusts itself completely)
- iterate through the sources the node trusts. Each of those nodes will be providing their own trust-weighted histogram. For each source:
  - for each bin: multiply the source’s value for that bin by your trust factor for that source, and add the result to the corresponding bin of your computed opinion
- normalize the histogram by dividing all bins by the highest value in any bin (if the highest value is zero, skip this step). Note: this “normalize to make the highest value = 1” was my initial assumption, but we should investigate alternatives, like “normalize to make the area under the histogram = 1”
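The steps above can be sketched in Python as follows. This is one reading of the algorithm, not a reference implementation: the `sources` structure, the self-trust value of 1, and the bin-boundary handling are my assumptions.

```python
def trust_weighted_histogram(personal_opinion, sources, nbins=10):
    """personal_opinion: a probability in [0, 1], or None if the node
    has no opinion of its own.
    sources: list of (trust_factor, histogram) pairs, where each
    histogram is that source's own trust-weighted histogram."""
    hist = [0.0] * nbins
    # the node trusts itself completely, so its own bin starts at 1
    if personal_opinion is not None:
        hist[min(int(personal_opinion * nbins), nbins - 1)] = 1.0
    # accumulate each source's histogram, scaled by our trust in it
    for trust, source_hist in sources:
        for i, value in enumerate(source_hist):
            hist[i] += trust * value
    # normalize so the highest bin equals 1 (skipped if all bins are zero)
    peak = max(hist)
    if peak > 0:
        hist = [v / peak for v in hist]
    return hist
```

For example, a leaf node with a personal opinion of 0.45 and no sources returns a histogram with 1 in bin 5 (40%–50%) and zeros elsewhere; a node that also trusts that leaf at 0.5 would pick up a 0.5 contribution in that bin.
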
==== Example ====
Let’s work through Pete’s example:
Starting at the leaf nodes, the nodes will just return a histogram containing 1 in the bin with their personal opinion and 0 elsewhere:
On the middle layer of the graph, it gets more interesting:
Node 2 starts with its own personal opinion, in the form of a histogram: . Then it takes the opinion of node , scales it by its trust factor to get , and adds it to its personal opinion to get: . It repeats the process with node , scaling that node’s histogram down to and adding it to the running total to get . Finally, it would normalize the histogram, but the highest value is already 1, so no scaling is necessary, and the final result is:
Node 3 and 4 repeat the same process to get:
Finally, node gets their say. They start with . Then they scale node ’s computed opinion by the trust factor to get: , and add it to their personal opinion to get . They do the same with ’s computed opinion, scaled to , and accumulate it to get . Scaling ’s yields , and adding it gives: . Finally, they scale their answer back down, dividing by to get the final answer of:
<syntaxhighlight lang="json">
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Final Normalized Result",
  "width": 640,
  "data": {
    "values": [
      {"bin": "0%",  "weight": 0},
      {"bin": "10%", "weight": 0},
      {"bin": "20%", "weight": 0.56},
      {"bin": "30%", "weight": 0.5},
      {"bin": "40%", "weight": 1},
      {"bin": "50%", "weight": 0.45},
      {"bin": "60%", "weight": 0.9},
      {"bin": "70%", "weight": 0.45},
      {"bin": "80%", "weight": 0.45},
      {"bin": "90%", "weight": 0.45}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "bin", "type": "nominal", "axis": {"labelAngle": 0}},
    "y": {"field": "weight", "type": "quantitative"}
  }
}
</syntaxhighlight>
This shows a slight preference for the interval, because two sources favored it. Two sources also favored the interval, but those sources were more distant so their effect was watered down.