Modification to the Sapienza probability adjustment for trust to include random lying, bias, and biased lying
Main article: Trust
In the last iteration on this subject we modified the probability adjustment for trust to include lying and bias. We noted that the lying was random in nature; that is, the lie would be distributed evenly among all options that were not the truth. Here we modify this assumption by allowing, in addition, lying that is biased toward a particular outcome. In other words, if the source lies, it tries to lie by favoring a particular outcome, as long as that outcome is not the truth. If that outcome is the truth, it falls back to random lying. Keep in mind that this is in addition to the already established random lying, which we will keep in the model for generality.
The Equation
The equation presented here is a little easier to follow because it accounts explicitly for all possibilities: the bias can be toward any of the options and so can the lies.
Let’s begin by defining some terms, many of which are familiar from the last iteration. Again we will assume a choice between three outcomes: Red, Blue and Green.
$n$ = Number of choices (e.g., 3 in this case)
$N$ = Number of samples (this ultimately cancels out, so it is only useful in the derivation)
$P_R$ = Probability of Red
$P_B$ = Probability of Blue
$P_G$ = Probability of Green
$P_R + P_B + P_G = 1$ (just to be clear)
$T$ = Trust, i.e., the percent of the time the reported outcome is the truth
$R$ = Random portion of $1 - T$ (same as before)
$L$ = Lying-at-random portion of $1 - T$ (same as before)
$L_R$ = Lying with a bias to Red (the source will lie this percent of the time by trying to say Red, as long as Red is a lie; if Red is the truth then this portion becomes random lying)
$L_B$ = Lying with a bias to Blue
$L_G$ = Lying with a bias to Green
$B_R$ = Bias to Red (same as before: the source will answer Red this percent of the time no matter what)
$B_B$ = Bias to Blue (same as before)
$B_G$ = Bias to Green (same as before)
$T + R + L + L_R + L_B + L_G + B_R + B_B + B_G = 1$ (just to be clear)
The equation is as follows for any particular option, e.g., Green:

$$P'_G = T P_G + \frac{R}{n} + \frac{L}{n-1}(1 - P_G) + B_G + L_G (1 - P_G) + \frac{P_B L_B + P_R L_R}{n-1}$$
In general, for any particular choice $i$ among all the other choices $j$, we can write:

$$P'_i = T P_i + \frac{R}{n} + \frac{L}{n-1}(1 - P_i) + B_i + L_i (1 - P_i) + \frac{1}{n-1} \sum_{j \neq i} P_j L_j$$
To be clear, the subscript $j$ represents any choice except $i$, the one we are computing $P'_i$ for.
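To make the bookkeeping concrete, here is a minimal sketch of the general equation in Python. The function and parameter names are ours for illustration; they do not come from the snippet linked below or any established implementation.

```python
def adjusted_probability(i, P, T, R, L, L_bias, B):
    """Adjusted probability that option i is reported.

    P, L_bias, and B are lists indexed by option; the weights are
    assumed to satisfy T + R + L + sum(L_bias) + sum(B) == 1.
    """
    n = len(P)
    # Biased lies toward some other option j fall back to random lying
    # (and so sometimes land on i) whenever j happens to be the truth.
    fallback = sum(P[j] * L_bias[j] for j in range(n) if j != i) / (n - 1)
    return (T * P[i]                    # truth
            + R / n                     # random answers landing on i
            + L / (n - 1) * (1 - P[i])  # random lies landing on i
            + B[i]                      # outright bias toward i
            + L_bias[i] * (1 - P[i])    # biased lies toward i (i is not the truth)
            + fallback)
```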
Derivation
This derivation follows the last one we did in that we begin with a sample size of $N$ and a number of choices $n = 3$ representing colors as above: Red, Blue, and Green. Instead of using a numerical example we will perform the derivation from the outset using the symbolic variable names. The following table can be created to represent the breakdown of actual and reported choices (each row gives the expected number of reports of each color among the samples of a given actual color):

Actual Red ($N P_R$ samples):
  Reported Red: $N P_R \left(T + \frac{R}{n} + B_R\right)$
  Reported Blue: $N P_R \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_R}{n-1} + L_B + B_B\right)$
  Reported Green: $N P_R \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_R}{n-1} + L_G + B_G\right)$

Actual Blue ($N P_B$ samples):
  Reported Red: $N P_B \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_B}{n-1} + L_R + B_R\right)$
  Reported Blue: $N P_B \left(T + \frac{R}{n} + B_B\right)$
  Reported Green: $N P_B \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_B}{n-1} + L_G + B_G\right)$

Actual Green ($N P_G$ samples):
  Reported Red: $N P_G \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_G}{n-1} + L_R + B_R\right)$
  Reported Blue: $N P_G \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_G}{n-1} + L_B + B_B\right)$
  Reported Green: $N P_G \left(T + \frac{R}{n} + B_G\right)$
We choose all the cases where Green is reported, for instance. $P'_G$ is then the sum of all the reported Green cases divided by the total number of samples:

$$P'_G = \frac{1}{N}\left[ N P_R \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_R}{n-1} + L_G + B_G\right) + N P_B \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_B}{n-1} + L_G + B_G\right) + N P_G \left(T + \frac{R}{n} + B_G\right) \right]$$
The $N$ cancels and we can collect terms to obtain:

$$P'_G = P_R \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_R}{n-1} + L_G + B_G\right) + P_B \left(\frac{R}{n} + \frac{L}{n-1} + \frac{L_B}{n-1} + L_G + B_G\right) + P_G \left(T + \frac{R}{n} + B_G\right)$$
and so,

$$P'_G = T P_G + \frac{R}{n}(P_R + P_B + P_G) + \frac{L}{n-1}(P_R + P_B) + L_G (P_R + P_B) + B_G (P_R + P_B + P_G) + \frac{P_R L_R + P_B L_B}{n-1}$$
Rearranging terms and noting that $P_R + P_B + P_G = 1$ leads to:

$$P'_G = T P_G + \frac{R}{n} + \frac{L}{n-1}(P_R + P_B) + B_G + L_G (P_R + P_B) + \frac{P_B L_B + P_R L_R}{n-1}$$
Again noting that $P_R + P_B = 1 - P_G$, some further rearrangement leads to:

$$P'_G = T P_G + \frac{R}{n} + \frac{L}{n-1}(1 - P_G) + B_G + L_G (1 - P_G) + \frac{P_B L_B + P_R L_R}{n-1}$$
This is the same as the equation presented above. We note that the first four terms are the same terms from the equation presented earlier, the one with just random lying and bias. The next term, $L_G(1 - P_G)$, represents the lies biased toward the Green outcome. The last term represents the lies biased toward Blue and Red that turn into random lies, which sometimes land on Green, because a biased lie cannot be used when its target outcome happens to be the truth.
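As a sanity check on the derivation, the following Monte Carlo sketch simulates the reporting process directly, mirroring the breakdown in the table above, and compares the observed frequency of Green reports with the formula. This is our own illustration in Python, not code from the source.

```python
import random

random.seed(1)

colors = ["Red", "Blue", "Green"]
P = {"Red": 0.60, "Blue": 0.30, "Green": 0.10}       # true probabilities
T, R, L = 0.70, 0.08, 0.05                           # truth, random, random lying
L_bias = {"Red": 0.02, "Blue": 0.03, "Green": 0.05}  # biased lying
B = {"Red": 0.01, "Blue": 0.03, "Green": 0.03}       # outright bias

def report(actual):
    """Simulate one reported color for a given actual color."""
    others = [c for c in colors if c != actual]
    u = random.random()
    if u < T:                            # tell the truth
        return actual
    u -= T
    if u < R:                            # answer completely at random
        return random.choice(colors)
    u -= R
    if u < L:                            # lie at random
        return random.choice(others)
    u -= L
    for c in colors:                     # lie with a bias toward c ...
        if u < L_bias[c]:
            # ... falling back to a random lie when c is the truth
            return c if c != actual else random.choice(others)
        u -= L_bias[c]
    for c in colors:                     # answer c no matter what
        if u < B[c]:
            return c
        u -= B[c]
    return actual                        # unreachable: the weights sum to 1

N = 1_000_000
greens = sum(
    report(random.choices(colors, weights=[P[c] for c in colors])[0]) == "Green"
    for _ in range(N)
)
print(greens / N)  # ~0.2047, in agreement with the formula above
```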
Simple Numerical Example
To apply this equation we use a familiar example in keeping with the Red, Blue, and Green choices we’ve been using all along.
$n$ = 3 = Number of choices
$N$ = 120 = Number of samples (this ultimately cancels out, so it is only useful in the derivation)
$P_R$ = 60% = Probability of Red
$P_B$ = 30% = Probability of Blue
$P_G$ = 10% = Probability of Green
$T$ = 70% = Trust, i.e., the percent of the time the reported outcome is the truth
$R$ = 8% = Random portion of $1 - T$
$L$ = 5% = Lying-at-random portion of $1 - T$
$L_R$ = 2% = Lying with a bias to Red (the source will lie this percent of the time by trying to say Red, as long as Red is a lie; if Red is the truth then this portion becomes random lying)
$L_B$ = 3% = Lying with a bias to Blue
$L_G$ = 5% = Lying with a bias to Green
$B_R$ = 1% = Bias to Red (the source will answer Red this percent of the time no matter what)
$B_B$ = 3% = Bias to Blue
$B_G$ = 3% = Bias to Green
$$P'_G = 0.7(0.1) + \frac{0.08}{3} + \frac{0.05}{2}(1 - 0.1) + 0.03 + 0.05(1 - 0.1) + \frac{0.3(0.03) + 0.6(0.02)}{2} = 0.20467$$
Our initial 10% probability for the Green outcome now becomes 20.467% because of Trust, randomness, bias toward the Green outcome, and lies toward the Green outcome.
The following snippet reproduces this, or any similar, calculation:
https://gitlab.syncad.com/peerverity/trust-model-playground/-/snippets/139
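For readers without access to the snippet, a few lines of Python reproduce the worked example. This is our own sketch and not necessarily what the linked snippet contains:

```python
# Worked example: adjusted probability of Green (n = 3 choices)
n = 3
T = 0.70                          # trust
P_R, P_B, P_G = 0.60, 0.30, 0.10  # true probabilities
R, L = 0.08, 0.05                 # random portion, random-lying portion
L_R, L_B, L_G = 0.02, 0.03, 0.05  # biased lying
B_G = 0.03                        # bias to Green

P_G_adj = (T * P_G + R / n + L / (n - 1) * (1 - P_G)
           + B_G + L_G * (1 - P_G)
           + (P_B * L_B + P_R * L_R) / (n - 1))
print(P_G_adj)  # 0.204666..., i.e., 20.467%
```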
The problem of user input
The equation above increases the amount of user input considerably. The untrustworthy part of Trust could previously be calculated as $1 - T$, thus requiring no user input. Now the 3-choice problem above requires 8 new pieces of information from the user beyond $T$ and the probability array $P_i$.
In practice there should be ways to reduce this. The user could be asked first whether there are any biases and what they are. It is likely that the bias will only be toward a few of the choices, so the remaining ones will be zero. The next question could be whether the lying always follows the bias. If it does, no separate lying input is needed: the biased lying is simply set equal to the bias. The next question could be how much random lying takes place, with the expectation that many users will set this to zero since biased lying is more likely. Once this is done the remaining quantity, $R$, can be calculated as follows:

$$R = 1 - T - L - \sum_i L_i - \sum_i B_i$$
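In code, that fallback computation is a one-liner (again a sketch with illustrative names):

```python
def remaining_random_portion(T, L, L_bias, B):
    # Whatever part of the answer mass is not truth, lying, or bias
    # must be the random portion R.
    return 1.0 - T - L - sum(L_bias) - sum(B)

# Using the numbers from the example above:
print(remaining_random_portion(0.70, 0.05, [0.02, 0.03, 0.05], [0.01, 0.03, 0.03]))
# -> 0.08 (up to floating-point rounding)
```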