Probability Theory - The Logic of Science

202409212300

Status: #book
Tags:
Title: Probability Theory - The Logic of Science
Author: ET Jaynes

----

# Probability Theory - The Logic of Science

## What Kind of Book

Practical / Expository

## Summary

## Outline

### Preface

- Jaynes emphasizes that he builds up probability theory as a form of *fuzzy logic*. It is a way of reasoning about events that "might" or might not happen, whereas traditional logic is about events that will "certainly" or "certainly not" happen
- Contrasts sharply with frequentist approaches, where data are viewed as independent random samples from a distribution and no prior knowledge of the problem is used
- Bayesian statistics is a bit closer but adds too much machinery and is a special case of this "fuzzy logic" approach (for example, we need to specify a model, sampling distribution, prior distribution, etc.)
- Says that we will start off by building up probability from these logical first principles and will find that both Bayesian and frequentist statistics are special cases of this approach that apply when certain conditions hold
- He also makes an interesting point in the introduction: he starts every chapter with prose and a logical explanation because you *must* ground your mathematical rigor in sound reasoning. I like this approach a lot. A strong understanding of the subject in verbal, logical, and intuitive terms must precede a truly rigorous mathematical formalization. Otherwise the foundations of the formalization will be weak and faulty

### Ch. 1 - Plausible Reasoning

- Standard logic as formulated by Aristotle is concerned with logical statements of two general forms:
  - If A is true, then B is true. We find that A is true and therefore B must be true
  - If A is true, then B is true. We find that B is false and therefore A must be false
- These statements express *certainty*, which is not available in many real-world scenarios.
- Scientific reasoning is typically more along the following lines:
  - If A is true, then B is true. We observe that A is false. Therefore, B becomes less plausible.
    - We are ruling out one possible cause for event B. Some other event may cause B, but with A ruled out, the probability that B actually occurs must be lower. Imagine that we hypothesize that a company will fail (B) and have an exhaustive list of causes for a company's failure. If A is one such cause and we investigate and find that A is not true, we are less certain about the company's future failure but cannot rule it out entirely.
  - If A is true, then B is true. We observe that B is true. Therefore, A becomes more plausible.
    - Imagine A is an explanatory framework proposed by scientists (e.g. general relativity) which makes a prediction B (e.g. gravitational waves). When we observe B, it makes the explanatory framework A *more credible*, but since A is not the only explanatory framework that could conceivably have predicted B, we cannot be certain that A is true. Furthermore, A may make other predictions that are *not* true, so observing B alone cannot establish A.
- Yet more reasoning rests on weaker syllogisms of the following type:
  - If A is true, then B becomes more plausible. We observe that B is true. Therefore, A becomes more plausible.
- These "weak syllogisms" do not weaken logic. Instead, they *extend* logic into fuzzier domains where the traditional Aristotelian setup has nothing to say (i.e. scenarios with incomplete information or inherent randomness, or where we learn information that does not give us certainty)
  - For example, if we have $A \implies B$, then we have certainty if we learn that $A$ is true (then $B$ must be true) or that $B$ is false (then $A$ must be false). If we learn that $B$ is true or that $A$ is false, we cannot say anything about the other proposition with absolute certainty.
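
The strong and weak syllogisms above can be checked numerically. The following is a minimal sketch, assuming a made-up joint distribution over A and B in which "A implies B" holds (the numbers and the helper `prob` are illustrative assumptions, not taken from Jaynes). It uses ordinary conditional probability to show which conclusions are certain and which only shift plausibility.

```python
from fractions import Fraction

# Hypothetical joint distribution over (A, B). "A implies B" is encoded by
# giving the combination (A true, B false) probability zero.
joint = {
    (True, True): Fraction(2, 10),
    (True, False): Fraction(0),
    (False, True): Fraction(3, 10),
    (False, False): Fraction(5, 10),
}


def prob(event, given=lambda a, b: True):
    """Return P(event | given) by summing entries of the joint table."""
    denom = sum(p for (a, b), p in joint.items() if given(a, b))
    numer = sum(p for (a, b), p in joint.items() if given(a, b) and event(a, b))
    return numer / denom


def a_true(a, b):
    return a


def a_false(a, b):
    return not a


def b_true(a, b):
    return b


def b_false(a, b):
    return not b


p_a = prob(a_true)                       # prior plausibility of A: 1/5
p_b = prob(b_true)                       # prior plausibility of B: 1/2

# Strong syllogisms: the conclusions are certain.
p_b_given_a = prob(b_true, a_true)       # A true  -> B certain: 1
p_a_given_not_b = prob(a_true, b_false)  # B false -> A impossible: 0

# Weak syllogisms: the conclusions only become more or less plausible.
p_a_given_b = prob(a_true, b_true)       # B true  -> A more plausible: 2/5 > 1/5
p_b_given_not_a = prob(b_true, a_false)  # A false -> B less plausible: 3/8 < 1/2

print(p_a, p_a_given_b)
print(p_b, p_b_given_not_a)
```

Exact fractions are used instead of floats so that the comparisons are not affected by rounding.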
We have to talk about *degrees of certainty or plausibility*, which is what the "weaker" syllogisms presented by Jaynes give us
- Super important quote: "Polya showed that even a pure mathematician actually uses these weaker forms of reasoning most of the time. Of course, **on publishing a new theorem, the mathematician will try very hard to invent an argument which uses only the first kind; but the reasoning process which led to the theorem in the first place almost always involves one of the weaker forms (based, for example, on following up conjectures suggested by analogies).** The same idea is expressed in a remark of S. Banach (quoted by S. Ulam, 1957): 'Good mathematicians see analogies between theorems; great mathematicians see analogies between analogies.'"
- Jaynes asserts, in alignment with von Neumann, that the only real limitations on making machines which think are our own limitations in not knowing exactly what thinking consists of. Says that his approach to fuzzy logic (which he calls the study of common sense) will lead to concrete ideas about the mechanism of thinking that can be implemented in a computer
  - **"Every time we can construct a mathematical model which reproduces a part of common sense by prescribing a definite set of operations, this shows us how to ‘build a machine’, (i.e. write a computer program) which operates on incomplete information and, by applying quantitative versions of the above weak syllogisms, does plausible reasoning instead of deductive reasoning."**
- In line with this idea, Jaynes sets up a thought experiment where we construct a robot brain to think in a way that would be desirable in human brains (i.e. we think that a rational person, on discovering that they were violating one of the definite rules programmed into the robot, would wish to revise their thinking).
- Important note: in logic, the word "implies" in the phrase "A implies B" does *not* mean that B is logically deducible from A.
It simply means that $A = AB$ in the sense of truth values (that is, $A$ has the same truth value as $A$ AND $B$). Thus, if $A$ is true, then $A = AB$ holds for any $B$ which is also true, regardless of whether $A$ and $B$ have any deductive or causal connection. That is, a true proposition implies ANY true conclusion! In addition, if $A$ is false, then $A = AB$ for *any* possible value of $B$, and so a false proposition implies ANY conclusion!
- Every Boolean variable $A$ can take on one of two possible values: $T$ or $F$. If we define a function $f(A_1, \ldots, A_n)$ over $n$ Boolean variables $A_1, \ldots, A_n$, it will output either $T$ or $F$ for any specific set of values $A_1, \ldots, A_n$. Given that there are $n$ variables with 2 possible choices ($T$ or $F$) for each variable, our function $f$ has an input space of size $2^n$. For each of those $2^n$ possible inputs, the function can output 2 possible values ($T$ or $F$). Hence, there are $2^{2^n}$ possible functions of $n$ Boolean variables. Each of these functions can be defined in terms of NAND (not and) or NOR (not or) operations alone; either one can be used to represent any possible logical function. Writing juxtaposition for AND and $+$ for OR, NAND is defined as $\overline{AB} = \overline{A} + \overline{B}$ and NOR as $\overline{A + B} = \overline{A}\,\overline{B}$
- We can put these facts to good use in designing our thinking machine: we design logic gates, which are circuits that have a common ground, two input terminals, and one output terminal. The voltage relative to ground at any of these terminals can take on only two values: for example, +3 volts or 'up' representing $T$ and 0 volts or 'down' representing $F$. A NAND gate is thus one whose output is up if and only if at least one of the inputs is down; or, what is the same thing, down if and only if both inputs are up. For a NOR gate the output is up if and only if both inputs are down.
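
The claims above about implication, NAND, and the $2^{2^n}$ count can be checked by brute force. The following is a minimal sketch in which Python booleans stand in for 'up' and 'down'; the function names (`nand`, `not_`, `and_`, `or_`, `nor`, `implies`, `truth_table`, `count_functions`) are hypothetical helpers chosen for illustration. It wires the usual connectives out of NAND alone and compares their truth tables with Python's built-in operators.

```python
from itertools import product


def nand(a: bool, b: bool) -> bool:
    """Output is down (False) if and only if both inputs are up (True)."""
    return not (a and b)


# Every connective below is built from nand() only.
def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def nor(a, b):
    return not_(or_(a, b))


def implies(a, b):
    # "A implies B" is false only when A is true and B is false.
    return nand(a, not_(b))


def truth_table(f, n=2):
    """Outputs of f over all 2**n input rows, in a fixed order."""
    return tuple(f(*row) for row in product([False, True], repeat=n))


def count_functions(n):
    """Count Boolean functions of n variables by enumerating every
    possible column of outputs over the 2**n input rows."""
    return sum(1 for _ in product([False, True], repeat=2 ** n))


# NAND-only constructions agree with the built-in operators.
assert truth_table(not_, n=1) == truth_table(lambda a: not a, n=1)
assert truth_table(and_) == truth_table(lambda a, b: a and b)
assert truth_table(or_) == truth_table(lambda a, b: a or b)
assert truth_table(nor) == truth_table(lambda a, b: not (a or b))

# "A implies B" has the same truth table as the statement A = AB.
assert truth_table(implies) == truth_table(lambda a, b: a == (a and b))

print(count_functions(1), count_functions(2), count_functions(3))  # 4 16 256
```

Enumerating output columns is only practical for small $n$, since the count grows as $2^{2^n}$.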
One of the standard components of logic circuits is the 'quad NAND gate', an integrated circuit containing four independent NAND gates on one semiconductor chip. Given a sufficient number of these and no other circuit components, it is possible to generate any required logic function by interconnecting them in various ways

## Thoughts & Ideas

## Criticisms

---

# References