A User's Guide to Bayes' Theorem

What is Bayes' Theorem? How is it used in philosophy, statistics, and beyond? How should it NOT be used? Welcome to the ultimate guide to these questions and more.

OUTLINE

0:00 Intro and outline
3:10 What is Bayes’ Theorem?
4:50 Belief and credence
14:46 Interpretations of probability
24:52 Epistemic probability
28:32 Bayesian epistemology
30:50 Core normative rules
33:05 Propositional logic background
42:32 Kolmogorov’s axioms
54:25 Dutch Books
57:25 Ratio Formula
1:02:03 Conditionalization principle
1:10:47 Subjective vs. Objective Bayesianism
1:13:50 Bayes’ Theorem: Standard form(s)
1:39:25 Bayes’ Theorem: Odds form
1:53:06 Evidence
2:09:25 Visualizing Bayes’ Theorem
2:49:41 Common mistakes
2:49:53 Base rate fallacy
2:52:28 Evidence for H vs. Making H probable
2:53:43 Total evidence requirement
2:54:58 Fallacy of understated evidence
2:59:36 Confirmation is comparative
3:01:07 Evidential symmetry
3:07:56 Strength asymmetry
3:11:40 Falsifiability as a virtue
3:12:27 Likelihood ratio rigging
3:19:32 Conclusion and Resources

NOTES

(1) The argument I give around the ten-minute mark admittedly doesn't address the view that (i) credences don't exist, and yet (ii) we can still account for the relevant data about our doxastic lives by appeal to beliefs about probabilities. It also doesn't address the view that while credences exist and are distinct from beliefs, beliefs about probabilities suffice to account for the relevant data.

While I have independent reservations about these views, it's worth noting them nonetheless. The belief/credence part of the video was mainly an exercise in warming listeners up to talk of credences so they would be more receptive to the rest of the video. It's a necessary preamble to the main event: Bayes' Theorem. I grant that a proper defense of belief-credence dualism — and a proper defense of the overly basic motivations I sketched at the ten-minute mark — would need to contend with these alternative proposals!

CORRECTIONS

(1) At 52:55, I meant to say that the SECOND claim entails the FIRST while the FIRST does not entail the SECOND. Oops!

(2) Thankfully, the audio improves at 28:32! Remind me to never record a solo presentation using Zoom…

(3) Here's an important clarification about the roommate/magical marker example given around 2:07:00. In the video, I was not clear about the content of the hypotheses in question and how this affects their likelihoods and priors. Here is how I should have spelled out the example.

Consider two hypotheses:

H1: My friend wrote on my board
H2: My marker by itself wrote on my board

The data is:

D: "Don't forget to take out the trash!" is written on my board

Now, H1 renders D quite surprising, given that my friend knows all about my diligent habits of taking out the trash, etc. If H1 is true, he would most likely have written something on the board then erased it (since he knows I don't like him touching my board, etc.). And even if he didn't erase it, it would be very odd for him to write this given that he knows I'm super diligent about the trash.

But H2 renders D far, far more surprising. Of all the possible things the marker could conceivably have written on the board by magically floating upwards etc., only an absurdly small fraction are even coherent, let alone English words strung together to compose an intelligible, grammatical, contextually relevant English sentence.

Of course, H1 has a higher prior than H2. But the point made in the video stands: the likelihood of H1 (i.e., P(D|H1)) is very low, yet still much greater than the likelihood of H2 (i.e., P(D|H2)). Hence data can be evidence for a hypothesis even when the data is very surprising on that hypothesis.
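This comparison can be put in the odds form discussed at 1:39:25 with toy numbers. All values below are hypothetical placeholders (the video assigns no exact numbers); only their relative sizes matter:

```python
# Odds-form Bayes update for the marker example. Every number below is
# a hypothetical placeholder; only the relative sizes matter.

prior_H1 = 0.10     # P(H1): my friend wrote on the board
prior_H2 = 1e-9     # P(H2): the marker wrote by itself

lik_H1 = 1e-4       # P(D | H1): low, since D is surprising even on H1
lik_H2 = 1e-30      # P(D | H2): astronomically lower

prior_odds = prior_H1 / prior_H2
likelihood_ratio = lik_H1 / lik_H2   # enormous, strongly favoring H1
posterior_odds = prior_odds * likelihood_ratio

# D confirms H1 over H2 despite the low P(D | H1), because the
# likelihood *ratio* is what drives the update.
print(f"likelihood ratio ~ {likelihood_ratio:.1e}")
print(f"posterior odds   ~ {posterior_odds:.1e}")
```

The design point is exactly the one in the correction: confirmation depends on comparing likelihoods, not on any single likelihood being high.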

LINKS

(1) Want the script? Become a patron :)

THE USUAL...

COMMENTS

ShinyBaboon:

Joe released a 3-hour video about Bayes' Theorem? Woohoo! Sounds like I know what I'm doing with my Friday night!

TheAnalyticChristian:

This may be my favorite video you've ever made. It was a perfect intro to a topic that many people (me included) find difficult. I will be applying what I've learned in my own future videos. Thanks Joe!

christthinker:

Literally just read **Bayes' Rule: A Tutorial Introduction to Bayesian Analysis** and was scoping out more information on Bayes. We're on similar wavelengths, Joe!

lizjackson:

Love this video! Amazing job.

From the description: "The argument I give around the ten-minute mark admittedly doesn't address the view that (i) credences don't exist, and yet (ii) we can still account for the relevant data about our doxastic lives by appeal to beliefs about probabilities." True. That said, I actually think much of what you say later could still be true even if credences reduce to beliefs about probabilities, or credences don't exist and probability-beliefs play the relevant roles. Probabilism, conditionalization, etc. would just be normative constraints on our probability-beliefs. (However, those "belief-first" or "belief-only" views are false.)

danielzhang:

I love this channel, since it doesn't push an agenda either way on theism or atheism and the comment section seems to be genuinely respectful towards both sides, unlike certain other channels which seem to demonise theists.

christianidealism:

Good video, Joe. This was fairly unbiased, since even within the Bayesian literature there is disagreement on things like intrinsic probabilities and the structure of probabilities. For example, the orthodox view of Bayesianism says that basic probabilities are the unconditional probabilities of complete worlds, while explanationism claims that basic probabilities are those of atomic hypotheses conditional on potential direct explanations. This means there is disagreement about whether all probabilities are conditional or not. So when investigating any sort of evidence, there are always going to be conditional constraints on such-and-such evidence.

For example, in my research on the problem of evil I have discovered that theism makes its predictions based on axiology. If theism is the proposition that an all-good, all-powerful, all-knowing being created the world, then this proposition only makes predictions about what the world looks like in conjunction with a particular axiology (theory of the good). Crudely, the explanatory diagram is:

{Theism, Naturalism, Other theories of ultimate reality} --> Empirical facts about the world <-- {Axiology1, Axiology2, ...}

(This ignores any explanatory influence of theism on the correct axiology or vice versa, but we can set that aside for present purposes.)

So the basic probabilities (or rather, the relatively basic probabilities, because obviously the above leaves out a lot of detail) here are the probabilities of particular empirical observations conditional on the conjunction of a theory of ultimate reality with an axiology: theism&axiology1, theism&axiology2, naturalism&axiology1, etc. With respect to evil, the conjunctions that most strongly predict the evils we observe will then be confirmed over the conjunctions that predict these evils less strongly.

The probability of evil given theism alone will then be a function of its probability given the conjunction of theism and each axiology, and the probability of each axiology given theism. (This is the law of total probability.) So if you think a particular axiology that makes the evils we observe quite likely has a high prior probability, then this will make the probability of these evils given theism high as well. (Of course, the atheist might also challenge the theist on the claim that that axiology is highly probable. But at that point the PoE gets pushed back to an entirely separate question.)

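The averaging step in the comment above can be sketched in a few lines. The axiologies and every probability below are invented purely for illustration:

```python
# Law of total probability, as used in the comment above:
#   P(E | T) = sum_i P(E | T & A_i) * P(A_i | T)
# T = theism, A_i = candidate axiologies, E = the evils we observe.
# All numbers are invented for illustration.

axiologies = {
    # name: (P(A_i | T), P(E | T & A_i))
    "axiology1": (0.6, 0.50),  # makes observed evils fairly likely
    "axiology2": (0.3, 0.05),
    "axiology3": (0.1, 0.01),
}

# The axiologies must partition the space: the P(A_i | T) sum to 1.
assert abs(sum(p_a for p_a, _ in axiologies.values()) - 1.0) < 1e-12

p_E_given_T = sum(p_a * p_e for p_a, p_e in axiologies.values())
print(round(p_E_given_T, 3))  # dominated by the high-prior axiology1 term
```

As the comment says, a single high-prior axiology that predicts the observed evils well can push P(E | theism) up on its own.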
calebp:

A lovely reason to procrastinate on my history degree. Grateful as always, Joe!

ILoveLuhaidan:

The work you put into your videos is unfathomable.

logos:

This is a great start! If you end up with time to either add a part 2 or splice some things into this vid, here are some suggestions:

1. You mentioned why one might be an objective Bayesian, but not why one might be a subjective Bayesian. A short discussion of van Fraassen's cube factory thought experiment and the Principle of Indifference would be helpful, as would a discussion of which levels of a hypothesis the Principle of Indifference could be applied to.

2. I've now reached a point where I'm jaded about the use of Bayes in philosophy; however, I make an exception for people willing to express things via the Law of Total Probability rather than abstract ratios which hide the fallacies mentioned in this video. Let's consider something like the FTA (I'd link to my blog post about it, but sometimes youtube hides comments with links).

P(LPU) = P(LPU | R) x P(R) + P(LPU | NR) x P(NR)

This splits P(LPU) (Life-Permitting Universe) into Random or Nonrandom causes.

P(LPU) = [P(LPU | R and D) x P(D | R) + P(LPU | R and ~D) x P(~D | R)] x P(R) + P(LPU | NR) x P(NR)

This splits P(LPU | R) into D and ~D, i.e. randomness following a statistical distribution vs. non-statistically-distributed randomness.

P(LPU) = [[P(LPU | R and D and U) x P(U | D and R) + P(LPU | R and D and ~U) x P(~U | D and R)] x P(D | R) + ...

This splits P(LPU | R and D) into U, a Uniform distribution (Principle of Indifference on the constants), vs. ~U (some non-uniform distribution on the constants).

So basically, if you unpack what "N" means in the FTA, it's actually R and D and U (the universe is random, obeys a statistical distribution, with uniformly distributed, independent constants).

Now what happens if this particularized idea of "N" vanishes for the FTA? You still have everything else:

P(LPU) = [P(LPU | R and D and ~U) x P(~U | D and R) x P(D | R) + P(LPU | R and ~D) x P(~D | R)] x P(R) + P(LPU | NR) x P(NR)

Note that FTA proponents could try to shrink this further by saying that P(~U) ought to be very small, but that's not so. Van Fraassen's arguments, etc., may show that even if we have reason to favor P(U), there's no reason the preference ought to be overwhelming. Even if it's 70/30, that still leaves P(~U) at .3, and so if P(LPU | R and D and ~U) isn't super low (maybe it's .7), then you could get a reasonable probability like .21 out of all the P(LPU | R) space. This is especially relevant given that we don't know what P(LPU | R) is in a vacuum (it might be .4 for all we know, and so this option yields over half the probability space).

3. The last issue I have with this philosophical use of Bayes is how to numerically evaluate non-quantifiable propositions. Cancer tests are one thing; the probability that God wants morally relevant creatures, given that God exists, is quite another. I've been told so many times that one ought to think this probability is high, but I haven't been given any "objective" reason to think that it's high, which matters given the supposed "objectivity" of the prior. Without a systematic way to quantify propositions, free from disagreement, this project is DOA. Note that applied Bayes in the real world lacks this problem, because different priors get iterated sequences of new information which should converge them toward the truth. But something like the FTA doesn't give iterated sequences of information to update priors and resolve disagreements. What you see is what you get, so you have to be 100% correct on the first try, or deal with an infinite recursion problem (the probability that my probability that my probability ... that my probability is correct is ...).

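A quick numeric check of the 70/30 illustration in point (2) of the comment above. Every value is hypothetical, including a placeholder fine-tuning-style likelihood for the uniform branch:

```python
# Decomposing P(LPU | R and D) over U vs ~U, using the comment's numbers.
# All values are hypothetical placeholders.
p_U  = 0.7            # P(U | D and R): favored, but not overwhelmingly
p_nU = 0.3            # P(~U | D and R)

p_LPU_given_U  = 1e-60   # P(LPU | R and D and U): tiny, fine-tuning style (placeholder)
p_LPU_given_nU = 0.7     # P(LPU | R and D and ~U): the comment's "maybe it's .7"

contrib_nU = p_LPU_given_nU * p_nU               # the ~U branch alone
p_LPU_given_RD = p_LPU_given_U * p_U + contrib_nU

print(round(contrib_nU, 2))      # the .21 from the comment
print(round(p_LPU_given_RD, 2))  # dominated by the ~U branch
```

Under these assumptions the ~U branch supplies essentially all of P(LPU | R and D), which is the commenter's point about not letting P(~U) be waved away.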
MsJavaWolf:

Cool video! I already knew Bayes' Theorem from high school and my CS degree, but the video made some things more intuitive. I also didn't know about the odds form.

batesthommie:

Statistician here! I absolutely love this take!

STARSS:

I have a strong prior that this video is great, I'll update you after I watch it.

jenst.:

This is amazing. I have been studying Bayesian logic for 3 years now, and this is the best summary and explanation of Bayesian reasoning I have come across so far. It would have been so much easier if I had started with this instead of math books and philosophy papers. If you ever do a re-work/edit of these videos, I suggest three additional slides on:
1) The handling of several pieces of evidence at once, compared to consecutive updates. Although this does not change the outcome, so to speak, it does make a difference in how you present an argument, or rather how easily it can be comprehended and thus scrutinized.
2) The common misconception that a high probability for E given H entails a low probability of E given not-H (it does not). This rarely happens with rough estimations in the odds version, but easily happens when assessing data-based probabilities directly.
3) The use of BT to show what type of evidence would indeed tip the scale. I think this can also help to understand when an argument makes a strong case, compared to an agnostic conclusion that can easily sway to one side or the other.

And my final comment relates to the Principle of Indifference. I think you could put more emphasis on the notion that it's not valid to start with 1:1 odds if you DO have information that tips the prior one way or the other.

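Point (1) of the comment above can be checked in a few lines: assuming the pieces of evidence are conditionally independent given H and given not-H, one batch update on the conjunction matches two consecutive updates (numbers arbitrary):

```python
def update(prior, lik_h, lik_not_h):
    """One Bayes update: returns P(H | E) from P(H), P(E | H), P(E | ~H)."""
    num = lik_h * prior
    return num / (num + lik_not_h * (1 - prior))

prior = 0.2
e1 = (0.8, 0.3)   # P(E1 | H), P(E1 | ~H)  -- arbitrary values
e2 = (0.6, 0.4)   # P(E2 | H), P(E2 | ~H)

# Consecutive updates: condition on E1, then on E2 using the new prior.
sequential = update(update(prior, *e1), *e2)

# One batch update: under conditional independence the likelihoods multiply.
batch = update(prior, e1[0] * e2[0], e1[1] * e2[1])

assert abs(sequential - batch) < 1e-9
print(sequential)  # same answer either way
```

The order of presentation changes only how easy the argument is to follow, not the posterior, exactly as the comment says.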
dougjordan:

I did not know that this could be applied to philosophy, but I have seen it applied to history by Richard Carrier. Like another viewer, I have an engineering background, so thanks for the proofs. Great presentation.

queencabbage:

Dude, you need like 50 times the subscribers you've got; this was magnificent.

MiladTabasy:

Ontology, metaphysics, epistemology, religion, etc. Why is your domain of philosophical investigation so incredibly vast? Thanks for the responsibility and conscientiousness you bring to making these long videos. I wish you a great future. How about your own theories? Tell us about them.

Kevigen:

I've been waiting for this! I have a BS in Engineering, so I took a decent amount of higher-level math (up through Calc 4), and so I know just enough to know how much I don't know. But I can also tell when other people don't know, haha, and I have been super skeptical of the way that some apologist types have been using Bayes' theorem for some time now. Looking forward to watching this one, and likely re-watching a few times!

gleon:

I finally finished this video. I finally understand Bayes' Theorem.

shanesullivan:

This, right here? This is a treasure.