Sapienza trust model derivation showing equivalence with random answers

So far we have stated that the trust model in Sapienza (https://ceur-ws.org/Vol-1664/w9.pdf) is the same as modeling the untrustworthy part of a source as random. What does this mean? If I trust my source 90%, then 10% of its reporting is untrustworthy, meaning that for that 10% it answers the question at random.

Let's take a look at some examples and then try to derive a general equation which we will then equate to the trust equation in Sapienza.

We will use the real/fake node case because it is simple to follow. We have 100 new nodes, 50 real and 50 fake, and a source that reports with 60% confidence whether a node is real or fake. For purposes of review we'll take the case of 100% trust first. For a single source that says a node is real, we would have a confidence of 60% that the node is real. We just believe the source and no calculation is required.

For two sources at 60% we can model the situation as follows:

100 N
     60%       60%
50 R ==> 30 Tr ==>  18 Tr
                    12 Tf

         20 Tf ==>  12 Tr
                     8 Tf


50 F ==> 30 Tf ==>  18 Tf
                    12 Tr

         20 Tr ==>  12 Tf
                     8 Tr

(Tr = tests real, Tf = tests fake. For the 50 Real nodes, our first 60% confident source reports 30 to be real (Tr) and 20 to be fake (Tf). Our second source independently does the same. The 50 Fake nodes work the same way. It is important to note that if our source is 60% confident about real nodes it is also 60% confident about fake nodes.)

This is two 60% tests in a row. If two tests in a row say it's real, what is the probability of it being real? I.e., how confident should we now be?

18+8 = 26 nodes tested real twice. Of those, 18 are actually real ==> 18/(18+8) = 0.692. Now we are 69.2% sure the node is real. This is the same, btw, as the Bayes eqn in Sapienza: 0.6*0.6/(0.6*0.6+0.4*0.4) = 0.692.
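
As a quick check of the arithmetic above, here is a minimal Python sketch that computes the two-source confidence both by counting nodes and by the Bayes formula. The function name fuse is made up for illustration, and the sketch assumes equal numbers of real and fake nodes, as in the example.

```python
def fuse(confidences):
    """Bayes fusion of independent sources that all say 'real'.

    Assumes equal priors (as many real nodes as fake ones).
    """
    p_real = 1.0
    p_fake = 1.0
    for p in confidences:
        p_real *= p          # chance a real node tests real every time
        p_fake *= 1.0 - p    # chance a fake node tests real every time
    return p_real / (p_real + p_fake)


# Counting version of the tree: 50 real and 50 fake nodes, two 60% tests.
real_twice_real = 50 * 0.6 * 0.6   # 18 real nodes tested real twice
fake_twice_real = 50 * 0.4 * 0.4   # 8 fake nodes tested real twice
by_counting = real_twice_real / (real_twice_real + fake_twice_real)

print(round(by_counting, 3))        # 0.692
print(round(fuse([0.6, 0.6]), 3))   # 0.692
```

With a single source, fuse([0.6]) gives back 0.6, which matches the "just believe the source" case.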

Now let's build the trust factor into the eqn. Let's suppose I trust Source 1 90%. This means 10% of the time my source will report randomly (real or fake with equal probability) no matter what its test says.

SOURCE 1 WITH TRUST:

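100 Nodes With Trust
         S1: 60% confidence, 90% trust (90% of reported answers match the test, 10% are random)
         rTr = reported real, rTf = reported fake. In each bracket the first count is the
         trusted share; the other two are the random share, split evenly.
                                                   Totals (per 50 nodes)
50 R ==> 30 Tr (27 rTr, 1.5 rTr, 1.5 rTf)   ==>    29.5 rTr
         20 Tf (18 rTf, 1 rTr, 1 rTf)              20.5 rTf

50 F ==> 30 Tf (27 rTf, 1.5 rTf, 1.5 rTr)   ==>    29.5 rTf
         20 Tr (18 rTr, 1 rTr, 1 rTf)              20.5 rTr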

If the first test REPORTS that the node is real, what is the probability of it really being real? We have 27 + 1.5 + 1 = 29.5 nodes reported real that are actually real, and 18 + 1.5 + 1 = 20.5 nodes reported real that are actually fake, for 29.5 + 20.5 = 50 nodes reported real in total. 29.5 / 50 = 0.59 = 59%. So with a trust factor of 90%, a "real" report should leave us 59% confident, down from 60%.
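
The counts above can be reproduced with a short Python sketch. The function name reported_real_counts is hypothetical, and the sketch assumes the untrusted share of reports is a fair coin flip, which is how "random" is used in this example.

```python
def reported_real_counts(n_real, n_fake, confidence, trust):
    """Return (actually real, actually fake) counts among 'real' reports.

    Assumption: the untrusted share (1 - trust) of reports is a fair coin
    flip, so half of it comes out 'real' whatever the test said.
    """
    random_real = (1.0 - trust) / 2.0
    # Real nodes: the test says real with prob `confidence`; the report
    # follows the test with prob `trust`, otherwise it is random.
    real_reported_real = n_real * (confidence * trust + random_real)
    # Fake nodes: the test wrongly says real with prob (1 - confidence).
    fake_reported_real = n_fake * ((1.0 - confidence) * trust + random_real)
    return real_reported_real, fake_reported_real


real_rr, fake_rr = reported_real_counts(50, 50, confidence=0.6, trust=0.9)
posterior = real_rr / (real_rr + fake_rr)
print(round(real_rr, 2), round(fake_rr, 2), round(posterior, 2))  # 29.5 20.5 0.59
```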

Now let's show that this is equivalent to the Trust equation in the Sapienza paper:

( 27 + 1.5 + 1 ) / ( 27 + 1.5 + 1 + 18 + 1 + 1.5 ) = 29.5 / 50 = 0.59

( 30*0.9 + 30*(1-0.9)/2 + 20*(1-0.9)/2 ) / ( 30*0.9 + 30*(1-0.9)/2 + 20*(1-0.9)/2 + 20*0.9 + 20*(1-0.9)/2 + 30*(1-0.9)/2 ) = 0.59

( 50*0.6*0.9 + 50*0.6*(1-0.9)/2 + 50*0.4*(1-0.9)/2 ) / ( 50*0.6*0.9 + 50*0.6*(1-0.9)/2 + 50*0.4*(1-0.9)/2 + 50*0.4*0.9 + 50*0.4*(1-0.9)/2 + 50*0.6*(1-0.9)/2 )

We can cancel out the 50:

( 0.6*0.9 + 0.6*(1-0.9)/2 + 0.4*(1-0.9)/2 ) / ( 0.6*0.9 + 0.6*(1-0.9)/2 + 0.4*(1-0.9)/2 + 0.4*0.9 + 0.4*(1-0.9)/2 + 0.6*(1-0.9)/2 )

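Now replace 0.9 with T (the trust) and just look at the numerator for now:

( 0.6*T + 0.6*(1-T)/2 + 0.4*(1-T)/2 )

( 0.6*T + 0.3 - 0.3*T + 0.2 - 0.2*T )

( 0.6*T + 0.5 - 0.5*T )

( 0.5 + (0.6 - 0.5)*T )

( Pnom + (P - Pnom)*T )    where P = 0.6 and Pnom = 0.5 (the nominal probability, i.e. what a purely random answer gives for two outcomes)

Add the denominator back in:

( Pnom + (P - Pnom)*T ) / ( Pnom + (P - Pnom)*T + 0.4*T + 0.4*(1-T)/2 + 0.6*(1-T)/2 )

The three extra terms in the denominator simplify the same way:

0.4*T + 0.4*(1-T)/2 + 0.6*(1-T)/2
0.4*T + 0.2 - 0.2*T + 0.3 - 0.3*T
0.2*T - 0.3*T + 0.5
0.5 - 0.1*T
Pnom + (P - Pnom)*T    where P = 0.4

Call the first probability Pa (0.6) and the second Pb (0.4):

( Pnom + (Pa - Pnom)*T ) / ( Pnom + (Pa - Pnom)*T + Pnom + (Pb - Pnom)*T )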

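This is the same as Pa/(Pa+Pb), except that every probability is adjusted by the trust T. With Pa = 0.6, Pb = 0.4 and T = 0.9 the adjusted values are 0.59 and 0.41, giving 0.59/(0.59+0.41) = 0.59 as before.

This shows that the Trust eqn. in Sapienza is equivalent to answering randomly for the untrusted part.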
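
To check the equivalence beyond the single 60%/90% case, here is a small Python sketch comparing the adjusted form Pnom + (P - Pnom)*T with the random-answer model over a grid of values. The function names are invented for this note, and Pnom = 0.5 is assumed, as in the two-outcome example.

```python
def trust_adjust(p, trust, p_nom=0.5):
    """Trust-adjusted probability in the form Pnom + (P - Pnom)*T."""
    return p_nom + (p - p_nom) * trust


def random_answer_model(p, trust):
    """Report follows the test with prob `trust`, else a fair coin flip."""
    return p * trust + (1.0 - trust) / 2.0


def posterior_real(pa, trust):
    """Confidence that a node is real after one 'real' report, equal priors."""
    pb = 1.0 - pa
    a = trust_adjust(pa, trust)
    b = trust_adjust(pb, trust)
    return a / (a + b)


# The two models agree for every confidence/trust pair on the grid.
for p in (0.5, 0.6, 0.75, 0.9, 1.0):
    for t in (0.0, 0.25, 0.5, 0.9, 1.0):
        assert abs(trust_adjust(p, t) - random_answer_model(p, t)) < 1e-12

print(round(posterior_real(0.6, 0.9), 2))  # 0.59
print(round(posterior_real(0.6, 0.0), 2))  # 0.5
print(round(posterior_real(0.6, 1.0), 2))  # 0.6
```

The two end points behave as expected: zero trust collapses the confidence to 0.5 (a coin flip), and full trust returns the source's own 0.6.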