Revision as of 20:19, 2 February 2025
Predicate analysis
Our immediate goal is a foundational tool that begins with predicates (related to a question – we’ll get to that shortly) and then finds the truth without resorting to preconceived conclusions. This would be a new way of truth-seeking, one that sits alongside established techniques (e.g., the scientific method, adversarial collaboration, expert opinion). This, in my view, is difficult, but let’s try it.
Let’s state from the outset that for this technique to work, communities will have to agree a priori to use it as their truth-finding mechanism for the question at hand and then abide by the conclusions it generates. This is no different from any other truth-seeking mechanism: we agree to use the scientific method, we agree to collaborate adversarially, and so on.
We must start with at least the question of what we’re trying to achieve. The predicates must be related to some subject we have in mind or else they are random. Let’s suppose we have the following (very simple) argument for improving our educational system:
Our teachers are paid less than other similarly qualified professions. Our students score poorly on standardized tests compared to those of other countries. Our educational resources are unequally spread between rich and poor communities. Too much money flows to rich school districts and not enough to poor ones. We devote too much time and resources to extracurricular activities that don’t involve serious learning. Meanwhile we need more skilled workers in the fields of science, technology, and information systems. Therefore, our educational system needs to be improved.
This argument is so simple that each sentence is essentially a predicate that can be evaluated for truthfulness. The last sentence, the conclusion, is also a predicate (or easily convertible into one) and is special because it is supported by the other predicates. Therefore, we have two tasks: to rate each predicate’s truthfulness and to rate the extent to which the conclusion is warranted from the predicates.
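As a minimal sketch of the two tasks, the argument can be held in a small data structure with one slot per predicate rating and one slot for the warrant rating. The names `Predicate`, `Argument`, and `split_argument` are hypothetical, and the one-sentence-per-predicate split is an assumption that only holds for an argument as simple as this one.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Predicate:
    text: str
    rating: Optional[int] = None  # 1-15 rating; None until someone rates it


@dataclass
class Argument:
    predicates: List[Predicate]
    conclusion: Predicate
    warrant: Optional[int] = None  # 1-15: how well the predicates support the conclusion


def split_argument(text: str) -> Argument:
    """Treat each sentence as a predicate and the last sentence as the conclusion.

    ASSUMPTION: one sentence maps to one predicate, which is true only for
    very simple arguments like the education example.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if len(sentences) < 2:
        raise ValueError("need at least one supporting predicate and a conclusion")
    *supports, conclusion = sentences
    return Argument([Predicate(s) for s in supports], Predicate(conclusion))


EDUCATION = (
    "Our teachers are paid less than other similarly qualified professions. "
    "Our students score poorly on standardized tests compared to those of other countries. "
    "Our educational resources are unequally spread between rich and poor communities. "
    "Too much money flows to rich school districts and not enough to poor ones. "
    "We devote too much time and resources to extracurricular activities that don't involve serious learning. "
    "Meanwhile we need more skilled workers in the fields of science, technology, and information systems. "
    "Therefore, our educational system needs to be improved."
)

argument = split_argument(EDUCATION)
print(len(argument.predicates), "supporting predicates")
print("Conclusion:", argument.conclusion.text)
```

Every `rating` and the `warrant` start out empty; filling them in is exactly the two tasks described above.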
One might object that this is not a serious argument because it consists of general statements without real specificity. It does not, for example, cite the scores of our students on standardized tests vs. those of other countries. But it is the type of argument people make. In fact, a person did make it, just now, without looking anything up and without an LLM (which would have done a much better job). The person who made it is “pretty sure” he can back it all up with hard facts from authoritative sources but, being a person, does not remember them right now. We will turn to the issue of sources in a moment.
We might also critique this by saying it is not an example of exorcising our preconceived notions and coming to free-form conclusions from disparate, singular predicates. That is certainly true in the here and now when the argument was written. The writer of this opinion already knew our education system needed improvement. But it was not true before he reached that conclusion in his life. He was not brought up to think that our educational system needed improvement, especially with respect to other countries. Quite the contrary, he was brought up, many decades ago, to believe in American preeminence in pretty much everything. He was brought up to believe in “the system”. It was only through years of discovery (and genuine changes in the world) that this less rosy conclusion was reached.
Let’s evaluate each predicate involved in this argument which, for this simple case, maps to each sentence. We will evaluate them based on the 1-15 scale we developed a long time ago to determine trustworthiness (also below under Prompt). In fairness, this isn’t quite the same as “truthfulness” but let’s try it anyway. I scored this myself and then used Claude to do it for me (see raw prompt/response below in Appendix). I mostly agreed with Claude's assessment, below, but I will note where it differs from mine:
Let me evaluate each predicate and then the conclusion.
- 1) "Our teachers are paid less than other similarly qualified professions."
- Rating: 3 (Statistical data and studies) This can be verified through salary data and comparative studies across professions.
- 2) "Our students score poorly on standardized tests compared to those of other countries."
- Rating: 2 (Expert consensus) This is well-documented through international standardized tests like PISA, with consistent results over many years.
- 3) "Our educational resources are unequally spread between rich and poor communities."
- Rating: 3 (Statistical data and studies) This can be verified through school funding data, resource allocation studies, and facilities assessments.
- 4) "Too much money flows to rich school districts and not enough to poor ones."
- Rating: 8 (Persuasion) While the inequality is factual (covered in #3), the "too much/not enough" judgment is subjective and advocacy-oriented.
- 5) "We devote too much time and resources to extracurricular activities that don't involve serious learning."
- Rating: 11 (Personal opinion) This is largely subjective, both in the "too much" judgment and in defining "serious learning." [Author's note: I would rate this an 8, since it can be documented, though proving it would require its own argument. Still, it is clear why Claude considers it "personal opinion".]
- 6) "Meanwhile we need more skilled workers in the fields of science, technology, and information systems."
- Rating: 3 (Statistical data and studies) This can be verified through labor market data, job vacancy rates, and industry surveys.
- Conclusion, "Therefore, our educational system needs to be improved."
- Warrant Rating: 4 (Theory) The conclusion is reasonably well-warranted by the predicates, though not perfectly. The first three predicates and the last one provide strong factual support for needed improvement. However, predicates 4 and 5 are more opinion-based and weaken the logical chain somewhat. The conclusion also implies a value judgment about what constitutes "improvement," making it impossible to rate it as purely fact-based. Still, the preponderance of evidence-based predicates makes this a fairly well-supported conclusion.
Claude has the benefit of being able to rate each claim against source material. Let’s accept this to avoid the laborious process of looking up the sources for each predicate. In other words, the source material evaluation and ratings are combined into one.
Although I agreed with Claude on most of the predicate ratings, I thought my conclusion was stronger than a 4. I gave myself a 3 before I asked Claude. But Claude correctly marked me down for a few opinionated predicates and, most importantly, for a conclusion that implies a “value judgment”. After reading its reasoning, I conclude that it was right.
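The ratings discussed above can be tabulated in a few lines, which makes it easy to see which predicates Claude treated as evidence-based and which as opinion-based. This is only a sketch: the category labels are copied from the scale in the Prompt, the ratings are Claude's, and the cutoff of 3 that separates the two groups is my own assumption, not part of the scale.

```python
# Category labels copied from the 1-15 scale (index 0 holds category 1).
SCALE = (
    "Established fact", "Expert consensus", "Statistical data and studies",
    "Theory", "Philosophy", "Conjecture and hypothesis", "Speculation",
    "Persuasion", "Marketing", "Anecdotal evidence", "Personal opinion",
    "Ideology / Religion", "Propaganda", "Misinformation",
    "Satire, parody, sarcasm",
)

# Claude's ratings for predicates 1-6, as listed in its response.
CLAUDE_RATINGS = {1: 3, 2: 2, 3: 3, 4: 8, 5: 11, 6: 3}

# ASSUMPTION: a rating of 3 or better counts as "evidence-based".
# The cutoff is illustrative only; the scale itself defines no such line.
EVIDENCE_CUTOFF = 3


def label(rating: int) -> str:
    """Format a rating the way Claude's response does, e.g. '8 (Persuasion)'."""
    return f"{rating} ({SCALE[rating - 1]})"


def partition(ratings, cutoff=EVIDENCE_CUTOFF):
    """Split predicate numbers into (evidence-based, opinion-based)."""
    evidence = [n for n, r in ratings.items() if r <= cutoff]
    opinion = [n for n, r in ratings.items() if r > cutoff]
    return evidence, opinion


evidence, opinion = partition(CLAUDE_RATINGS)
for number, rating in CLAUDE_RATINGS.items():
    print(f"Predicate {number}: {label(rating)}")
print("Evidence-based:", evidence)
print("Opinion-based:", opinion)
```

With this cutoff the split matches Claude's own description: the first three predicates and the last one are evidence-based, while predicates 4 and 5 are not.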
If we go a bit deeper, we can see that the scale can be further broken down, especially for Category 4 – Theory. Obviously theory can be strongly or weakly supported and we can envision a scale for that single category. The same is true for several other categories. Claude may not have a problem delving deep into scales and sub-scales but a human would. It seems like a 1-15 scale for humans is a reasonable place to start.
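One way to picture a sub-scale, offered purely as a sketch, is to pair the main category with a finer level inside it. The `SubRating` name and the `strength` levels are hypothetical; nothing in the 1-15 scale defines them.

```python
from typing import NamedTuple


class SubRating(NamedTuple):
    category: int      # 1-15, from the main scale
    strength: int = 1  # HYPOTHETICAL sub-level within the category; 1 = most strongly supported


strong_theory = SubRating(category=4, strength=1)
weak_theory = SubRating(category=4, strength=3)
philosophy = SubRating(category=5)

# Tuples compare field by field, so the main category always dominates
# and the sub-level only breaks ties within a category.
ranked = sorted([philosophy, weak_theory, strong_theory])
print(ranked)
```

Because the pair sorts by category first, a human can keep working with the coarse 1-15 number while a finer-grained rater refines within it.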
This is an example of how a process of truth-seeking might work. We must be able to cast aside, or hold in reserve, our views while we trust in the judgement of a methodology. We do this for the scientific method fully knowing that our hypotheses might be overturned. We can do it here too.
Conclusion generation
Let's see if Claude can come up with a better Conclusion given the predicates. This, we might say, follows the purest conception of the method: take raw predicates and transform them into meaningful conclusions with no preconceived notions.
Prompt
Let's suppose you were unaware of the Conclusion given here and just had the supporting predicates. What do you think is the best conclusion you could draw from that set?
Response
Given just those predicates, I think a more precise and well-warranted conclusion would be:
"Our educational system shows significant measurable deficiencies in teacher compensation, student achievement, and resource distribution, while failing to meet current workforce demands in technical fields."
This conclusion is better because it:
- Stays closer to the verifiable elements in the predicates (particularly the statistical/study-based ones about teacher pay, test scores, resource inequality, and workforce needs)
- Avoids the value judgment inherent in "needs to be improved" and instead describes observable gaps
- Excludes the more subjective claims about extracurricular activities and the normative judgment about resource allocation being "too much" in some places
- Creates a clearer logical connection between the evidence presented and the conclusion drawn
This conclusion would rate closer to a 2 or 3 on our scale (Expert consensus/Statistical data) rather than the 4 (Theory) of the original conclusion, because it's more tightly bound to the empirically-supported predicates and avoids extrapolating into prescription or value judgment. Would you like to explore how this revised conclusion might lead to different kinds of policy discussions than the original one?
Commentary
Again, Claude seems to have nailed it. This conclusion is certainly "better" than mine in terms of hewing to the strongest predicates. It is the conclusion one might write in a formal academic paper. But as such it is careful and circumscribed. Notably, it is not a call to action. Mine isn't an explicit call to action either, but it comes a lot closer. Viscerally, I would want mine as part of a community campaign to improve education.
However, as a community using an objective truth-seeking methodology, we might have to be satisfied with taking conservative steps in our decision making. In so doing we tamp down the dramatic impulses that motivate rash decisions.
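The idea of hewing to the strongest predicates can be sketched as a filter followed by a "weakest link" check. Both the cutoff and the weakest-link rule are assumptions made for illustration. They are not how Claude scored anything: applied to the full original argument the rule would give 11, far harsher than Claude's warrant rating of 4.

```python
# (short name, Claude's rating) for the six supporting predicates.
RATED = [
    ("teacher pay", 3),
    ("test scores", 2),
    ("resource inequality", 3),
    ("funding 'too much / not enough'", 8),
    ("extracurricular activities", 11),
    ("workforce needs", 3),
]


def retain(rated, cutoff=3):
    """Keep only predicates rated at or better than the cutoff.

    ASSUMPTION: the cutoff of 3 is illustrative, not part of the scale.
    """
    return [(name, rating) for name, rating in rated if rating <= cutoff]


def weakest_link(rated):
    """ASSUMED heuristic: a conclusion is rated no better than its weakest predicate."""
    return max(rating for _, rating in rated)


kept = retain(RATED)
print("Retained:", [name for name, _ in kept])
print("Weakest link among retained predicates:", weakest_link(kept))
print("Weakest link in the full argument:", weakest_link(RATED))
```

The retained set covers teacher pay, test scores, resource inequality, and workforce needs, the same four elements the revised conclusion names, and its weakest link falls in the 2 to 3 range Claude suggested for that conclusion.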
Appendix -- Raw Prompt/Response
Anthropic's Claude 3.5 Sonnet
Prompt
Hi, I'm going to give you a simple argument about improving the quality of our educational system consisting of several sentences and a conclusion. Each sentence can be interpreted as a "predicate", a statement whose truth value can be assessed. The conclusion is also a predicate whose truth value depends on the earlier predicates. I'd like you to evaluate the truthfulness of the predicates (not the conclusion, yet) using the following scale:
1. Established fact -- Generally uncontroversial, many verification methods, understood for a long time ==> Population of NYC. Conservation of energy.
2. Expert consensus -- The result of many studies and review. Almost an established fact ==> Raising interest rates decreases inflation.
3. Statistical data and studies -- Rigorous experimentation or field studies done by experts, or reviews of such work ==> Are eggs good for you?
4. Theory -- A principle supported by scientific study but not completely proven or established (often difficult to prove) ==> big bang theory, string theory, etc.
5. Philosophy -- An abstract general idea that is not explicitly testable but pursued through rigorous inquiry ==> How should we conduct ourselves? What is knowledge?
6. Conjecture and hypothesis -- A testable proposition intended for scientific scrutiny ==> Efficacy of the latest Covid-19 vaccine?
7. Speculation -- Semi-informed prediction, often by knowledgeable people, but may never be tested rigorously ==> Best investment for the remainder of this year?
8. Persuasion -- Subjective viewpoint based on evidence but usually tied to an agenda ==> Newspaper opinion columns.
9. Marketing -- An attempt to sell a product/service. Usually easy to identify. In many cases fact-based but can use propaganda, ideology, etc. ==> What's the best scooter under $1000?
10. Anecdotal evidence -- Personal story, perhaps true but unverified or unverifiable ==> The Wim Hof method makes me feel great!
11. Personal opinion -- Subjective view with no expectation of verification ==> Are Vermeer paintings beautiful?
12. Ideology / Religion -- A systematized set of beliefs held by many (usually) but unverifiable. Often borrows from philosophy, fact, opinion, anecdotal evidence, propaganda, etc. ==> Do you believe in socialism? An afterlife?
13. Propaganda -- Explicitly biased information intended to win people over to a political agenda. Like misinformation but more visceral and less factual ==> The Tutsis are bad people out to get us.
14. Misinformation -- A close cousin. False statements disguised as fact or statistical studies in order to deceive. ==> The Tutsis commit more crimes than we do.
15. Satire, parody, sarcasm -- Not intended as informational but a deliberate distortion for humor or to make a point.
I'd also like you to evaluate the degree to which the conclusion is warranted based on the predicates. Use a 1-15 scale for that too, to be consistent (1 being completely warranted, 15 being unrelated).
Here is the argument: Our teachers are paid less than other similarly qualified professions. Our students score poorly on standardized tests compared to those of other countries. Our educational resources are unequally spread between rich and poor communities. Too much money flows to rich school districts and not enough to poor ones. We devote too much time and resources to extracurricular activities that don’t involve serious learning. Meanwhile we need more skilled workers in the fields of science, technology, and information systems. Therefore, our educational system needs to be improved.
Response
Let me evaluate each predicate and then the conclusion.
1. "Our teachers are paid less than other similarly qualified professions." Rating: 3 (Statistical data and studies) This can be verified through salary data and comparative studies across professions.
2. "Our students score poorly on standardized tests compared to those of other countries." Rating: 2 (Expert consensus) This is well-documented through international standardized tests like PISA, with consistent results over many years.
3. "Our educational resources are unequally spread between rich and poor communities." Rating: 3 (Statistical data and studies) This can be verified through school funding data, resource allocation studies, and facilities assessments.
4. "Too much money flows to rich school districts and not enough to poor ones." Rating: 8 (Persuasion) While the inequality is factual (covered in #3), the "too much/not enough" judgment is subjective and advocacy-oriented.
5. "We devote too much time and resources to extracurricular activities that don't involve serious learning." Rating: 11 (Personal opinion) This is largely subjective, both in the "too much" judgment and in defining "serious learning."
6. "Meanwhile we need more skilled workers in the fields of science, technology, and information systems." Rating: 3 (Statistical data and studies) This can be verified through labor market data, job vacancy rates, and industry surveys.
Conclusion: "Therefore, our educational system needs to be improved." Warrant Rating: 4 (Theory) The conclusion is reasonably well-warranted by the predicates, though not perfectly. The first three predicates and the last one provide strong factual support for needed improvement. However, predicates 4 and 5 are more opinion-based and weaken the logical chain somewhat. The conclusion also implies a value judgment about what constitutes "improvement," making it impossible to rate it as purely fact-based. Still, the preponderance of evidence-based predicates makes this a fairly well-supported conclusion.