Chapter 01: Introduction
Nihaha, Sensei! Are you ready for our first lesson? Here's what we're covering. Try to pay attention, and I promise I'll try my best to stay on topic:
That's the whole chapter. Stick with me and I promise it'll be more fun than stamping documents all day. ...That's a low bar, I know. But still! Ready, Sensei? Try to keep up ~ I'm not slowing down for anyone. Nihaha!
LESSON 1.1. What is probability?
So! Sensei. Have you ever flipped a coin? (≖ˋ‿ˊ≖) Of course you have, everyone has. And you just kinda... assume it's 50/50, right? Heads or tails, no biggie. But hold on a sec ~ is it really random? Where does that "50/50" even come from? You didn't weigh the coin. You didn't measure your thumb-flick angle. You didn't check the air currents. You just... know. Or you think you know. Nihaha, weird, huh? These are the kinda questions probability is built on, Sensei! And trust me, they're WAY more interesting than the boring stuff the upperclassmen at the Seminar make me do. Way more interesting. Okay so imagine ~ just imagine! ~ you knew literally everything about the flip. The exact weight of the coin, the precise force of your thumb, every air current, the gravity pulling on it, every single atom around it. Could you predict the result perfectly? Maaaybe! And if you could, then that "50/50" was never about the coin at all ~ it was about you, and all the stuff you didn't know! Spooky, right? But wait! It gets weirder! Our best theory at the tiniest scale of the universe ~ quantum mechanics ~ is itself probabilistic at its core. Like, no matter how good your instruments are, you literally cannot know the exact state of every particle. So is the universe actually rolling dice?! Or is that just the limit of what silly observers of the universe like us can meaningfully measure? Hmmm... I dunno! Nobody really knows! That's part of the fun! So before we even write a single equation (I know, I know, I promise to go easy on the math), we've got a big juicy puzzle, Sensei: what the HECK even IS probability? Okay so there's basically two gangs arguing about this. Like rival student councils, kinda! Gang #1 ~ the Frequentists! 
These guys say probability is a real, physical thing in the world. Flip a coin a million times, you'll get roughly half heads. That 50/50 is out there, baby! It's the long-run frequency of the event. Clean idea, right? But here's where I poke holes in it ~ what if you only flip the coin ONCE in your whole life? Like, ever? What does "50% chance" even mean then? You can't observe a frequency from one single event! Hah! Gotcha! Gang #2 ~ the Bayesians! This is more my style, honestly. These guys say probability isn't a thing out in the world ~ it's a thing in your head. It's how confident you are about something, given what you know. The 50/50 lives in your brain, not in the coin! Think about it, Sensei: why do you think a coin is 50/50? Because it has two sides? But dice have six sides and loaded dice are TOTALLY a thing, trust me, I've, uh... heard. Because someone told you? Because you've flipped a bunch before? Now imagine you start flipping and it comes up heads ten times in a row. Twenty! Fifty! At some point you'd be like "okay, this coin is RIGGED." That's you updating your beliefs based on evidence! That right there is the heart of Bayesian thinking! So in this view, probability is basically a bet you're making about the world with the info you've got. Every event has some chance of happening, some chance of not ~ and those numbers reflect what you know, not some hidden secret of reality. In fact, you can even apply this to quantum mechanics ~ that's called QBism! Both sides are useful! But this class is gonna lean Bayesian, 'cause it's just more useful (and more honest, Nihaha!) when you're actually trying to make decisions when you don't know everything. Which is, y'know, basically all of life. Especially mine, when I'm escaping the Self-Reflection Room. Okay, enough philosophy! It's time for math!! Sensei, don't fall asleep! We're getting to the good stuff! No matter which gang you're in, everyone agrees on the math structure. 
And here's the cool part ~ it's not arbitrary! It's built on three super simple rules called the axioms of probability, and once you get why they exist, the rest of probability theory stops feeling like magic. So! We start by saying probability is a number we slap onto an "event." An event is just anything that could happen ~ "the coin lands heads," "it rains tomorrow," "Koyuki escapes detention again" (high probability, btw). We need a scale. And every good scale needs two anchors: One end means "no way, never gonna happen" ↔ The other end means "yep, 100% gonna happen" We pick 0 for impossible and 1 for certain. Why those? Honestly? It's a convention! We coulda picked 0 to 100 (that's percentages!) or 0 to 10, or even 6 to 7 just to be weird (sIiIix sEeEeven (~⁶ ̄▽ ̄)~⁷). But 0 and 1 are super clean for the math because they play nice with multiplication and fractions. 0.5? Halfway between "won't" and "will." 0.25? About a quarter as confident as fully certain. Easy! But why can't probabilities go above 1 or below 0? Oooh, that brings us right to the axioms! Axiom 1 ~ Non-negativity! > For any event A, P(A) ≥ 0. Translation: probabilities are never negative. Duh! Why? Think about it ~ what would "−30% chance of rain" even MEAN, Sensei? Like, the sky un-rains on you? Probability measures either how often something happens or how strongly you believe it. You can't have negative frequency (something can't happen less than zero times!) and you can't have negative belief ~ the worst you can do is believe it's totally impossible, which is just 0. There's no "below impossible." So this axiom isn't some rule handed down from on high ~ it's just the math admitting what the concept actually means! Axiom 2 ~ Normalization! > P(everything that could possibly happen) = 1. Translation: if you list out ALL the possible outcomes, the chance that something on that list happens is 1. Certain. Guaranteed. Why? 'Cause we need a ceiling, Sensei! 
If 0 is "impossible," we need a number for "totally guaranteed." We picked 1. And the only thing truly guaranteed in any situation is: "one of the possible things will happen." Flip a coin ~ it'll land heads, tails, or, okay fine, balanced on its edge in some weird freaky case. Those outcomes cover everything, so they gotta add up to 1. This is also why probabilities can't go ABOVE 1! Nothing is more certain than certain! If your math says probability is 1.2, you broke it, Sensei! You've claimed something is more-guaranteed-than-guaranteed and that's just nonsense. Go back and find your mistake!! Axiom 3 ~ Additivity! > If events A and B can't both happen at the same time, then P(A or B) = P(A) + P(B). Translation: for stuff that can't overlap, "either one" is just adding their chances! Easy example! Roll a fair six-sided die. Chance of a 1? 1/6. Chance of a 2? 1/6. Chance of "1 or 2"? They can't both happen on the same roll, so just add 'em: 1/6 + 1/6 = 2/6 = 1/3. Matches your gut, right? Right!
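Wanna see the three rules hold on that die with your own eyes, Sensei? Here's a minimal sketch in Python. It assumes a fair die, and the names `die` and `prob` are just made up for this illustration:

```python
from fractions import Fraction

# A fair six-sided die (assumption): every face gets probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Probability of an event, given as a set of faces.

    Faces are mutually exclusive outcomes, so by additivity we just add them up.
    """
    return sum((die[face] for face in event), Fraction(0))

p_one = prob({1})
p_two = prob({2})
p_one_or_two = prob({1, 2})

# Axiom 1: no face has a negative probability.
all_nonnegative = all(p >= 0 for p in die.values())
# Axiom 2: "some face comes up" is certain.
total = prob(set(die))

print(p_one_or_two)                   # 1/3
print(p_one + p_two == p_one_or_two)  # True
print(all_nonnegative, total)         # True 1
```

Exact fractions are used instead of floats so that 1/6 + 1/6 comes out as exactly 1/3, with no rounding fuzz.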
The "can't both happen" part ~ fancy word for it is mutually exclusive ~ is super important! If events CAN overlap, you'd be double-counting like a sloppy accountant (not naming names!). Like "chance it rains" + "chance it's cold" isn't the chance of "rain or cold" because plenty of days are both and you'd be counting those twice! There's a fix for that case, we'll get to it later, don't worry! Why's this axiom natural? Because "or" just should work that way when stuff doesn't overlap! If I told you a coin had 0.5 heads and 0.5 tails but the chance of "heads OR tails" wasn't 1, you'd be like "Koyuki, what the heck?!" And you'd be right! Additivity keeps probability from contradicting itself! So there we go, Sensei! Three rules. That's the whole foundation! EVERYTHING else in probability ~ all the Bayesian stuff we're gonna do ~ is built on top of these three little guys:
And none of them came outta nowhere! If you tried to make a system for reasoning under uncertainty and broke even ONE of these, the whole system would contradict itself. That's the real reason we use 'em! Not 'cause some mathematician said so (though, okay, fine, a guy named Andrey Kolmogorov did write 'em down like this in 1933) ~ but because any sensible measure of uncertainty HAS to obey them. Break the axioms and someone can trick you into bets you're guaranteed to lose. Follow them and at least your reasoning has a chance of being coherent! Nihaha! And now, Sensei, with this foundation locked in, we can FINALLY start asking the questions Bayesian statistics is really about: how do we update our probabilities when we learn new stuff? How do we go from "I dunno" to "okay I'm pretty sure" in a way that isn't just vibes? And how do we use all this to make better choices in a universe we'll never fully understand? That's where it gets really fun, Sensei! So pay attention next time, okay? And, uh... if anyone asks, you NEVER saw me teaching probability voluntarily. Got it? It'd ruin my reputation. Yahahaha! ...Wait, are you taking notes? Sensei?! You're actually taking notes?! No way! Maybe I'm better at this than I thought! Nihaha!
LESSON 1.2. Bayes' Theorem
Okay okay okay, Sensei! It's time for the good stuff. Like, the good good stuff~ The whole reason Bayesian whatever-it's-called even exists! The secret sauce, as they say. The jackpot! (≖ˋ‿ˊ≖) But! BUT! I'm NOT just gonna scribble it on the whiteboard and go "memorize this, Sensei!" Ugh, no way. That's how the upperclassmen at the Seminar teach, and you can see how GREAT that's worked out for me (stuck in the Self-Reflection Room, what, three times this month? ...okay fine, maybe four). Nope nope nope. We're gonna build it up! From stuff you already agree with! And by the end you'll hopefully feel like you came up with it yourself, nihaha! 
And then, as a super-special bonus ~ 'cause I like you, Sensei ~ I'm gonna use that exact same idea to swipe twenty bucks off somebody! For educational purposes. ...Probably. Okay! Step one. I brought a deck of playing cards. Don't ask where I got 'em. Ni·Ha·Ha. Warm-up question, and you GOTTA answer honestly, Sensei! No cheating! I'm about to flip the top card ~ remember, there's 52 cards in here. What're the chances it's a red queen? ...Yeah! Two red queens outta fifty-two cards. 2/52, like 3.8%. Easy peasy, right? Now! That number right there has a fancy name ~ they call it a joint probability! 'Cause the card's gotta be two things at the same time! Red AND a queen! Two conditions, one card, one number. We write it like this: $$P(\text{red},\,\text{queen}) = \frac{2}{52}$$ That silly little comma in there? It's just shorthand for "and." That's it! Nothing fancy! That's basically the whole notation lesson for today! Yahaha! Okay, now watch this, Sensei. Watch closely! I'm gonna get the EXACT same answer a totally different way, and that "different way" is gonna turn out to be like, the most important sentence in this whole class. So pay attention pay attention pay attention!! So! I split the deck into two piles. Reds here, blacks here. First question: what're the chances the top card is red in a full deck? Half the deck's red, so... 1/2! Duh! Now ~ given we're already inside the red pile ~ what're the chances the card I pull is a queen? 26 reds, 2 of 'em are queens, so 2/26, which is 1/13. That tiny little word "given"? It's doing a LOT of heavy lifting, by the way! It means: "we shrunk the world down! We're not lookin' at the whole deck anymore ~ only the reds!" We write it with a vertical bar like so: $$P(\text{queen}\mid\text{red}) = \frac{1}{13}$$ Read the bar out loud as "given." "Probability of queen given red." Vertical bar = "we zoomed in." Got it? Got it! Now multiply: 1/2 × 1/13 = 1/26 = 2/52. Same answer as before! 
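If you don't trust my card counting (rude, but fair), here's a minimal sketch that builds a standard 52-card deck and counts both ways: straight from the whole deck, and in two steps through the red pile. The helper names (`is_red`, `is_queen`, `prob`) are made up for this example:

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_colors = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}

# Standard 52-card deck: one card per (rank, suit) pair.
deck = list(product(ranks, suit_colors))

def is_red(card):
    return suit_colors[card[1]] == "red"

def is_queen(card):
    return card[0] == "Q"

def prob(predicate, cards):
    """Fraction of `cards` that satisfy `predicate` (every card equally likely)."""
    return Fraction(sum(1 for card in cards if predicate(card)), len(cards))

# One step: count red queens in the whole deck.
p_joint = prob(lambda card: is_red(card) and is_queen(card), deck)

# Two steps: land in the red pile, then look for queens inside it.
p_red = prob(is_red, deck)
red_pile = [card for card in deck if is_red(card)]
p_queen_given_red = prob(is_queen, red_pile)

print(p_joint)                    # 1/26 (that's 2/52 reduced)
print(p_red, p_queen_given_red)   # 1/2 1/13
print(p_joint == p_red * p_queen_given_red)  # True
```

Notice that "given red" is literally just a smaller list: the conditional probability is the same counting function, pointed at the red pile instead of the full deck.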
And it's not a coincidence, Sensei ~ we're literally just describing what we did, but in two steps: $$P(\text{red},\,\text{queen}) = P(\text{red}) \cdot P(\text{queen}\mid\text{red})$$ In words ~ in Koyuki-words: "the chance of both things happening is the chance the first one happens, TIMES ~ once you're inside that smaller world ~ the chance the second one also happens." First you land in the red pile, then, standing inside that little pile, you ask about queens. Two steps. Same destination! That's it! That's the so-called "chain rule," and it's NOT some magical spell of math wizards, we're just describing the trip we already took! (Mathematicians give boring names to obvious things and pretend they're being super deep. Lame!) Okay, now here's where it gets spicy, Sensei. I coulda done the whole thing backwards! Start with queens. Chance the card's a queen? 4/52 = 1/13. Now, given it's a queen, chance it's red? 2 of the 4 queens are red, so 2/4 = 1/2. Multiply: 1/13 × 1/2 = 2/52. SAME answer again! Nihaha! $$P(\text{queen},\,\text{red}) = P(\text{queen}) \cdot P(\text{red}\mid\text{queen})$$ Which means... look at this, look at this, Sensei ~ $$P(\text{red}) \cdot P(\text{queen}\mid\text{red}) \;=\; P(\text{queen}) \cdot P(\text{red}\mid\text{queen})$$ Both sides equal the same joint probability, so they gotta equal each other! And why do they HAVE to? 'Cause the joint doesn't care which adjective you say first! "Red queen" and "queen that's red" are the same dang card! P(A, B) = P(B, A). Always! It's symmetric! Hold onto that, Sensei, 'cause in about ten minutes I'm gonna crack a fortune cookie outta this equation~ nihahaha... But first! Before I crack it open ~ I gotta show you why you should actually CARE. 'Cause if I just rearrange a bunch of symbols on a page you'll glaze right over and start thinking about lunch, I know how it is. So... Field trip! C'mon, Sensei, get up get up, we're going to that little café across from the school! 
Nope, you don't get to say no. Bring your notes. Yahahaha! ... Okay. We're at the café. I got iced cocoa, you got whatever boring grown-up thing you got, and ~ ohhh ho ho ho ~ look who's two tables over. Sensei. Sensei, look. Don't be obvious about it. Two o'clock. Yeah. Yes. Her. Reisa. From the Vigilante Crew. Nihaha, she's the kind of person who thanks the vending machine. Stays after class to wipe the whiteboard. Cries at dog commercials. And a walking bank account waiting to be drained for... um... educational purposes. Just sit tight, Sensei. Watch. Take notes if you want, but don't ~ don't ~ give me away with your face. You have a tell, y'know! We'll work on that later. "Reisaaaa! Over heeere!"
"Eh? Oh ~ Koyuki! And ~ S-Sensei?! Sensei is here too?! Wah, I ~ am I interrupting something?? Uzawa Reisa reporting, um ~ should I come back later ~?" "Nope nope, sit sit sit, you're perfect timing actually! We're doing a class thing and I need a real live student opinion. You're a real live student, right?" "I ~ I think so?? Last time I checked?? Okay, okay, I-I'll help! What do you need?!" "Easy one! At your school. The kids who get really, really good grades ~ the top of the class types ~ would you say a lot of 'em wear glasses?" "O-Oh! Hmm. Hmmmmm. Yeah! Yeah, actually, like ~ almost all of them? Shimiko-senpai wears them, and I've seen Sakurako-senpai wear them during tests... And! The whole top five at midterms had glasses, I think? S-So... most of them? Like eighty percent? Maybe more?" "Perfect~ Eighty percent. Got that, Sensei? Okay, Reisa, second question! If I told you some random kid at your school wears glasses ~ just, picked one at random, walking down the hall, wearing glasses ~ would you guess they're probably one of the top-grades kids? Like, same eighty percent kinda vibe?" "Eh? Oh! Y-Yeah! That makes sense, right?? If smart kids wear glasses then glasses-kids are smart kids! Th-That's logic! Yeah, eighty percent! Final answer! Hehe..." "Mmmmhm. Mmmmmm-hm. Reisa! You know what this means, right?!!" "...Wh-what does it mean??" "It means if YOU wore glasses... by your own logic... you'd basically BE a top-grades kid! Like ~ eighty percent of the way there! Just from the glasses ~ niha!!" "...Wait. Wait wait wait. R-Really?? Is ~ is that how it works?? J-Just putting them on?! B-But I'm at the bottom of remedial math, Koyuki, would it ~ would it really ~ " "I mean, that's what YOU said yourself, Reisa! I'm just repeating what you said!" "I-I-I!! I gotta go!! Don't move, either of you, don't move! Uzawa Reisa will return! Wait for meeeeee!" (Reisa dashes out the café nearly tripping on her way out the door.) And she's gone~ nihaha! Sensei! What? 
Don't look at me like that!! I'm not finished, y'know? Sip your drink, let's just wait for her to come back... (10 minutes go by and Reisa, breathless and beaming, returns wearing a pair of cheap plastic frames from the nearby hundred-yen store.)
"OKAY!! I'm back!! Test me!! Quiz me!! Anything!! I-I can already feel it, my brain feels ~ it feels CRISPER!! Sensei, look at me, do I look smarter?! I think I look smarter!!" "Nihahahaha! Those look great on you, Reisa! Do you really think they'll work??" "Y-Yes!! I-I can feel the smartness, Koyuki, it's, like, behind my eyes!! It's working!!" "Okay~ Tell you what. Wanna bet on it? (≖ˋ‿ˊ≖)" "...B-Bet?? Like, money bet??" "Three-thousand yen for one math problem! The kinda problem a top-grades kid would crush in three seconds flat! If the glasses really turned you into one, easy money, right? You walk away richer and I cry into my cocoa. But if you miss... nihaha... then you owe me three-thousand!" "H-Hmmm. Hmmmmm. I mean ~ I-I do feel smarter ~ and Sensei is my witness ~ okay! OKAY! Uzawa Reisa accepts the challenge!! Bring it on, Koyuki!!" "Nihehehe... Okay! Right. Here we go. A bat and a ball cost one-hundred and ten yen together. The bat costs a hundred yen MORE than the ball. How much does the ball cost? (≖ˋ‿ˊ≖)" "O-Ohh!! Easy!! TEN YEN!! Bat's a hundred, ball's ten, one-hundred-and-ten total, boom!! Pay up, Koyuki!! Uzawa Reisa is now officially a top-grades ~ " "NIHAHAHA! Wrong!" "...Eh?" "The ball's five yen, the bat is a hundred and five! And the difference? Exactly one hundred! In total, one hundred and ten yen! Count it on your fingers, niha!" "...wh... wha... b-but I have the GLASSES on, Koyuki, the glasses, I ~ th-the glasses!!" "Three-thousand yen, Reisa! A bet's a bet~ Sensei is the witness, you said it yourself~" "...Mmmmmgh ~ ~ ~ " "Nihaha~ Thank you~" "I-I'm going home!! I'm going home and I'm gonna ~ gonna study so hard that next time I'll ~ I'll win it back!! Just you wait, Koyuki!! UZAWA REISA WILL RETURN!!" (Reisa runs out of the café with a look of determination.) Aaaaaaand she's gone! Three-thousand yen, Sensei! That's a week of cocoa! That's TWO weeks of cocoa if I'm careful. Which I won't be. Yahaha! Okay okay okay. 
Don't look at me like that! I KNOW. I know you're doing the Sensei Face. The "Koyuki we need to talk about ethics" face. We will. Later! Right now I need you to look past my crimes and see what actually just happened, 'cause it's like, the entire lesson. Pull your chair closer. Reisa got hit by the same trick twice, and she didn't notice either time! Trick one was in the questions. I asked her two things that sound like they oughta have the same answer:
In symbols, she just told us: $$P(\text{glasses}\mid\text{good grades}) \approx 0.8 \quad\text{AND}\quad P(\text{good grades}\mid\text{glasses}) \approx 0.8$$ And those two numbers? They can't both be right! Not even close! Watch this. Pretend Reisa's school has 1000 students. Top-grades kids ~ let's say it's the top 10%, so 100 of 'em. Her first answer says 80% of THOSE wear glasses, which is 80 kids. Cool, fine, totally believable. But how many kids at the whole school wear glasses? Glasses aren't a smart-person thing, Sensei, they're a my-eyeballs-don't-work thing! Half of any classroom is squinting. Let's say 400 outta 1000. And of those 400 glasses-wearers, how many are top-grades kids? Well, only 80 ~ 'cause we already counted 'em, they're the same 80! So: $$P(\text{good grades}\mid\text{glasses}) = \frac{80}{400} = 0.2$$ TWENTY percent! Not eighty! Same eighty kids, totally different fraction! And the reason is the denominator! The "good grades" room has 100 people in it. The "glasses" room has 400. Same 80 kids are sitting in both rooms ~ but 80 outta 100 is a landslide, and 80 outta 400 is a minority! Same kids. Different room. Different story. And THIS, Sensei, is THE trick. The Big Trick. The capital-T Trick! When you flip the bar in P(A | B), you're not just rearranging words ~ you're walking outta one room and INTO a totally different room with a totally different number of people in it! P(glasses | good grades) means "stand inside the small room of top-grades kids and count glasses." P(good grades | glasses) means "stand inside the big room of glasses-wearers and count top-grades kids." Conditional probability has a direction, y'know? Like an arrow! It's not a comma! It's a one-way street! Flip the arrow and the number changes ~ sometimes a little, sometimes WAY catastrophically. Reisa never even noticed the flip. She just heard "80%" and figured it stuck to the situation no matter which way you read it. That was trick one! 
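Here's the two-rooms thing as a tiny sketch, Sensei. The counts are the made-up school from above (1000 students, 100 top-grades kids, 400 glasses-wearers, 80 kids in both groups), and the variable names are just for illustration:

```python
from fractions import Fraction

# Pretend-school numbers from the example (assumptions, not real data).
top_grades = 100   # size of the "good grades" room
glasses = 400      # size of the "glasses" room
both = 80          # the same 80 kids sit in both rooms

# Same numerator, different denominator: that is the whole trick.
p_glasses_given_top = Fraction(both, top_grades)
p_top_given_glasses = Fraction(both, glasses)

print(float(p_glasses_given_top))  # 0.8
print(float(p_top_given_glasses))  # 0.2
```

Flipping the bar swaps which room you divide by, so the two numbers only match when the two rooms happen to be the same size.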
Trick two was even sneakier, and it's the one that cost her actual cash money. Once she believed P(good grades | glasses) was high, she did the most human thing in the world ~ she tried to make it work in REVERSE! "If glasses-wearers are smart, maybe putting on glasses will make ME smart!" She tried to give herself the effect by buying the thing. Which is silly for like, two whole reasons stacked on top of each other:
But for real, Sensei? Reisa's mistake is the SAME mistake people make EVERYWHERE! "Most pickpockets are men, so most men are pickpockets." "Most NBA players are tall, so most tall people are in the NBA." "If it's raining the sidewalk's wet, so if the sidewalk's wet it's raining." Flip the bar without checkin' the rooms, walk smack into a wall! Doctors do this! Juries do this! Newspapers do this like, every day! Our brains are pattern-matchy little gremlins and they just want P(A | B) to equal P(B | A). The math, though? It does not care what our brains want. Not even a little! And THAT is what's worth way more than twenty bucks, Sensei. Reisa paid for the demo so you didn't have to! We should send her a thank-you card. ...We will absolutely not send her a thank-you card. Okay, drink up, let's head back to the classroom ~ I wanna show you what happens when we take this exact same trick and turn it into a weapon for good instead of cocoa money. ... Okay! Back in the classroom! I've got cocoa money for a week, you've got a new appreciation for the word "given," and Reisa is somewhere studying ball prices with grim determination. Let's cash in! Remember this from before lunch? $$P(A) \cdot P(B\mid A) \;=\; P(A, B) \;=\; P(B, A) \;=\; P(B) \cdot P(A\mid B)$$ The middle bit's the joint, and it's symmetric ~ "red queen" = "queen that's red." But check out the outsides! The outsides are P(A) times one conditional, and P(B) times the OTHER conditional. The conditionals can be wildly different (we just watched Reisa lose three-thousand yen proving it!), but the products still gotta be equal, 'cause the joint doesn't care which way you sliced it. So just look at the two ends and ignore the middle: $$P(A) \cdot P(B\mid A) \;=\; P(B) \cdot P(A\mid B)$$ And now divide both sides by P(B). Just regular ol' algebra. Nothin' fancy: $$\boxed{\,P(A\mid B) \;=\; \frac{P(B\mid A)\cdot P(A)}{P(B)}\,}$$ And THAT, Sensei, is Bayes' theorem. That's it! That's the whole shebang. 
The "secret sauce" the entire field of Bayesian statistics is named after. It fell outta nothin' more than "the joint doesn't care about order," plus one division. No magic, no priesthood, no dusty professor saying "because I said so." Just bookkeeping! And if anyone ever tries to tell ya Bayes' theorem is mystical or hard, smack 'em with a deck of cards from me. But here's the thing, Sensei ~ knowing the formula isn't even the point. The POINT is what each piece of it is actually doing! 'Cause this formula? It's a little machine, kinda like a gachapon. You put your old beliefs in one side, you crank the handle, and your updated beliefs pop out the other side. So let's name the parts of the machine. (Don't worry, the names are WAY scarier than the things, I promise!) Read it like this: $$\underbrace{P(A\mid B)}_{\text{posterior}} \;=\; \frac{\overbrace{P(B\mid A)}^{\text{likelihood}}\cdot \overbrace{P(A)}^{\text{prior}}}{\underbrace{P(B)}_{\text{evidence}}}$$ Let's say A is "what I wanna know" (the hypothesis) and B is "what I just saw" (the evidence). Then:
So the formula, in plain English, says: "My new belief = my old belief × how well the evidence fits my belief, divided by how common the evidence is in general." Sit with that for a sec, Sensei. It's kinda beautiful, right? Look at how each piece does its job:
And here's where I show you the toy ~ scroll down a lil' bit, Sensei. You see that square? That's the whole formula, as a picture!
[Interactive square widget. Readouts shown: Evidence P(B) = 0.30; Posterior P(A|B) = 0.83. Clickable equation: P(A|B) = ( P(B|A) × P(A) ) / P(B)]
Okay! Rules of the square are: the whole square is the entire universe of possibilities, every single thing that could possibly happen. P(everything) = 1. Total area = 1. (Remember Axiom 2 from last time? Normalization, baby! It's BAAACK!) Now I slice the square vertically. The left slice ~ the blue side ~ is "A is true" (let's say, "this person actually has the rare condition I'm scanning for"). That slice's width is P(A), the prior. Skinny slice = rare. Thick slice = common! Inside that left slice, I shade in a portion ~ the green part where the evidence B also happens. That fraction of the slice is P(B | A), the likelihood. "Given the condition's there, how often does the test ping?" Inside the right slice ~ the blank side ~ ("A is false"), I shade in a different portion ~ the yellow false-positive area where B happens outside of A. That's P(B | not A). "Given the condition isn't there, how often does the test ping anyway?" (Tests aren't perfect, Sensei! They never are!) Now ~ the magic! Click on any term in the equation above the square, and the matching region of the square lights right up! Click P(A) ~ the left, blue slice glows relative to the whole square. Click P(B | A) ~ only the green part (where blue and yellow overlap) glows relative to the blue region (the part where you have the thing AND the test pings). Click P(B) ~ BOTH shaded parts (the yellow and where it overlaps with blue) glow together relative to the whole square, 'cause "the test pinged" includes true pings AND false pings. And click P(A | B) ~ the posterior ~ and you'll see the killer: it's the ratio of the green true-ping area to the total yellow area! "Out of all the times the test goes off, what fraction were real?" And now grab the sliders, Sensei! Crank P(A) waaay down (make the thing super rare). Watch what happens to the posterior, even with a 99%-accurate test! 
The false-positive area absolutely dwarfs the true-positive area, 'cause there are SO many more "not-A" people, and even a small percent of a huge group is bigger than a big percent of a tiny group. The square is showing you, visually, why a positive test for a rare condition often still means "you're probably fine." Most of the lit-up area is false alarms! This is the classic Bayesian gotcha, Sensei! Imagine a test for some rare condition that's 99% accurate both ways ~ 99% chance it pings if you have it, 99% chance it stays quiet if you don't. Sounds airtight, right? But say only 1 in 1000 people actually has it. Out of 1000 people, 1 has it and probably pings (1 true positive). The other 999 don't have it, and 1% of them ping anyway (about 10 false positives). So if YOUR test pings ~ you're 1 outta 11 pinged people! P(have it | test pings) ≈ 1/11 ≈ 9%. NOT 99%! Same numbers Reisa fell for, just dressed in a lab coat! P(ping | have it) is huge. P(have it | ping) is small. Different rooms! Drag the sliders around and feel it out! Make the condition more common, watch the posterior climb! Make the test more accurate, watch the false-positive sliver shrink! Bayes isn't a formula you memorize, Sensei ~ it's this shape, this proportion-of-shaded-rectangles thing, and the formula is just the bookkeeping for what your eyes can already see! And see that "Update Prior" button? Hit it! Smack it! What it does is take your current posterior and copy it right into the prior slot. 'Cause here's the thing, Sensei ~ HERE's the thing that makes Bayesian thinking different from every other way of reasoning out there. You don't just do this once! Your after-picture becomes your before-picture for the next round. Got a positive test? Cool, your belief jumped from 0.1% to 9%. Now you take a second, independent test. Plug that 9% back in as your new prior. 
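Don't have the sliders handy? Here's a minimal sketch of the same machine in Python, using the numbers from the example: a 1-in-1000 condition, a 99% chance of a ping if you have it, and a 1% chance of a ping if you don't. The function name `bayes_update` is made up for this illustration, and the second test is assumed to be independent of the first, like I said:

```python
from fractions import Fraction

def bayes_update(prior, p_ping_if_present, p_ping_if_absent):
    """Return P(have it | ping) from the prior and the two ping rates."""
    true_pings = p_ping_if_present * prior          # likelihood x prior
    false_pings = p_ping_if_absent * (1 - prior)    # pings from the "not A" side
    evidence = true_pings + false_pings             # P(ping), the denominator
    return true_pings / evidence

prior = Fraction(1, 1000)               # 1 in 1000 people has the condition
sensitivity = Fraction(99, 100)         # P(ping | have it)
false_positive_rate = Fraction(1, 100)  # P(ping | don't have it)

# First ping: start from "rare".
after_first = bayes_update(prior, sensitivity, false_positive_rate)

# "Update Prior": the old posterior becomes the new prior for a second test.
after_second = bayes_update(after_first, sensitivity, false_positive_rate)

print(after_first, round(float(after_first), 3))    # 11/122 0.09
print(after_second, round(float(after_second), 4))  # 363/400 0.9075
```

The exact first answer is 11/122, a hair over 9%, which matches the "1 outta 11" head-count version once you stop rounding 9.99 false pings up to 10.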
If the second one also pings, your posterior jumps WAY higher this time, 'cause you weren't starting from "rare" anymore ~ you were starting from "kinda suspicious." Stack enough evidence and you can drag a belief from "no way" all the way to "almost certain," one update at a time. That's not a trick! That's just running the same little machine over and over, feeding it what you learn! And THAT, Sensei, is the real reason this whole subject even exists. Probability isn't just for describin' dice rolls and card flips ~ it's a way of moving, y'know? A way of going from "pfft, I dunno" to "okay okay, I'm pretty sure" in steps that don't cheat. Every observation tugs your belief by exactly the amount it deserves to tug it ~ no more, no less ~ scaled by how rare the evidence is and how well it fits. You can't overreact (the denominator won't let ya!). You can't underreact (the likelihood won't let ya!). It's like, the most honest update rule we've got. Nihaha! And it all fell out of a deck of cards and a girl in fake glasses crying about five yen! Yahahaha! See, Sensei? I told you! This stuff is WAY more interesting than the boring junk the upperclassmen make me do. Way, way more! ...You're taking notes again. Sensei. SENSEI. Stop. People are gonna think I'm good at this! Oh ~ and if you see Yuka around campus? You never saw the glasses. You don't know anything about a bet. You especially don't know about the three-thousand yen. We cool? We cool. Nihaha!
LESSON 1.3. Bayesian Statistics
Sensei! Sensei sensei sensei! Over here over here! Yes! I knew you'd come back! After last lesson I've been bouncing off the walls thinkin' about this stuff ~ and also bouncing off the walls 'cause they finally let me out of the Self-Reflection Room and I have approximately seventeen hours of pent-up energy! Nihaha! Okay okay okay, sit. SIT. I have a thing for you. Look. Look at this. (≖ˋ‿ˊ≖) See this little deck? I made it myself. Don't ask with what scissors. 
Two kinds of cards in here ~ blue ones and gold ones. I'm not gonna tell you the mix. Could be half-and-half. Could be mostly blue. Could be mostly gold. I shuffled it and even I don't remember exactly what's in there anymore, that's how fair I am about this. Probably. Here's the game, Sensei. The whole game! Super simple:
No math required. No formulas. Just look at the cards, trust your gut, update your gut, repeat. Go on, Sensei ~ play a couple rounds. I'll wait. (Maybe play five or six rounds! Try lowballing the first bet on purpose just to see what happens. Try going aggressive. Mess around!) Done? Did you win? Did you lose? Did you feel that little adrenaline kick when the cards came up and you went, "ooh, more gold than I thought!" and bumped your guess? That's the feeling, Sensei. That is the whole subject. Hold onto it. Because here's my big confession: I tricked you. (≖ˋ‿ˊ≖) You just did Bayesian statistics. Like, the real thing. The thing people get PhDs in. You did it with your gut, with no formulas, just by playing a stupid card game I made in the Self-Reflection Room out of boredom. And the whole point of today's lesson is to peek behind the curtain and see exactly what you were doing ~ 'cause there was a tiny little Bayesian model running right alongside you the whole game, and you can pull up the reveal table in the results to see how it played against you, round by round. Look at it. The model has a "best guess" column (it calls this θ*, fancy-pants notation for "the value I think is most likely") and a "confidence" column (how much of its belief is stacked on that best guess). Sound familiar? Yeah. That's your guess and your bet. You were running the same machine in your head the whole time, just without notation. Let me show you how. Okay! Cast your mind back to last lesson. We had Bayes' theorem ~ $$\,P(A\mid B) \;=\; \frac{P(B\mid A)\cdot P(A)}{P(B)}\,$$ ~ and we used it on ONE hypothesis at a time. "Do I have the rare disease, yes or no?" A and not-A. Two rooms. Two possibilities. Update once, get a posterior, done. But the card game isn't like that, is it? It's not "is the deck gold-heavy, yes or no?" There are a bunch of possible answers ~ in the game I gave you nine of 'em: the gold percentage could be 10%, 20%, 30%, ..., all the way to 90%. 
Nine little rooms instead of two. Nine little competing stories about what the deck is.

And here's the beautiful, beautiful thing, Sensei: Bayes' theorem doesn't care. It works exactly the same. You just apply it to each hypothesis. For every possible θ ("theta," the Greek letter we use for "the true gold fraction"):

$$\,P(\theta \mid \text{cards}) \;=\; \frac{P(\text{cards} \mid \theta)\cdot P(\theta)}{P(\text{cards})}\,$$

Same formula! Same four pieces! Prior, likelihood, evidence, posterior. The only thing that changed is we're running the gachapon nine times in parallel ~ one for each hypothesis ~ and instead of getting back a single number, we get back a whole little distribution: a posterior probability for each θ. Some are big (the model thinks those are likely), some are tiny (the model thinks those are silly). That's the bar chart in the reveal panel!

Let me walk you through the pieces, 'cause once you see them you'll see what you were doing:
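Wanna see that machine with the lid off? Here's a tiny Python sketch of the nine-room update. It's a sketch, not the game's actual code: I'm assuming a flat starting prior, I'm assuming each card behaves like an independent draw (so the likelihood is binomial), and the three rounds of cards are made up. The names `update` and `best_guess` are just labels I picked for illustration.

```python
from math import comb

# The nine rooms: theta = 10%, 20%, ..., 90% gold (from the lesson).
THETAS = [i / 10 for i in range(1, 10)]


def update(prior, gold, n):
    """One Bayes update over all nine hypotheses at once.

    ASSUMPTION: each card is an independent draw, so the chance of
    `gold` gold cards out of `n` is binomial in theta.
    """
    # Likelihood: P(cards | theta) for each room.
    likelihood = [comb(n, gold) * t**gold * (1 - t) ** (n - gold) for t in THETAS]
    # Prior times likelihood, room by room.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    # Evidence: P(cards), the total over every room.
    evidence = sum(unnormalized)
    # Posterior: renormalize so the nine bars add to 1.
    return [u / evidence for u in unnormalized]


def best_guess(posterior):
    """Return (theta*, confidence): the tallest bar and its height."""
    i = max(range(len(THETAS)), key=lambda j: posterior[j])
    return THETAS[i], posterior[i]


# ASSUMPTION: a flat prior and three made-up rounds of ten cards each.
belief = [1 / 9] * 9
for gold, n in [(7, 10), (9, 10), (8, 10)]:
    belief = update(belief, gold, n)  # last round's posterior is this round's prior
    theta_star, confidence = best_guess(belief)
    print(f"{gold}/{n} gold -> theta* = {theta_star:.1f}, confidence = {confidence:.2f}")
```

Each pass through the loop feeds the previous posterior back in as the prior, so under these assumptions three rounds of ten cards land on the same bars as one big round of thirty.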
See it, Sensei? Your guess each round = the model's posterior mode. Your bet each round = the model's posterior confidence. You were doing it in your head! Of course, you were fuzzier ~ the model can crunch nine bars at once and you can't ~ but the structure was identical. You picked a starting belief, you saw evidence, you updated. The model just did the bookkeeping more neatly.

And remember the "update prior → posterior" button with the square from last lesson? You hit that button after every round of the card game without even thinking about it! Round 1's posterior became Round 2's prior, which got hit with new cards, which made a new posterior, which became Round 3's prior, and on and on. That's what learning is, in Bayes-land. Same little machine, cranking, cranking, cranking. The cards just keep coming.

Quick sanity check that this matches your feeling: did you ever draw a wild round (like 9 gold out of 10) and feel your guess SLAM toward the high end? That's high likelihood for high-θ hypotheses crushing all the low-θ ones in one update. Did you ever start super confident and then get a weird round and feel reluctant to budge? That's a sharp prior resisting moderate evidence ~ you'd need more data to overturn yourself. Did the rounds where you saw 5-out-of-10 feel kinda uninformative, like you didn't really learn much? That's because 5/10 is consistent with tons of θ values (40%, 50%, 60%, they all predict that pretty well!) ~ the likelihood didn't pick a clear winner, so the posterior barely moved. All of that is Bayes' theorem doing its job inside your head! You felt it. The math just names what you felt.

Okay. Cool. So we leveled up from "one hypothesis vs its negation" to "a bunch of hypotheses competing." That's already a huge jump. But Sensei... why nine? Why those particular nine? Why not eight? Why not eighty? Like ~ I picked {10%, 20%, ..., 90%} because they're round and the game is more fun with discrete options.
But the TRUE gold fraction in a real deck (or in any real-world thing!) doesn't have to be one of nine pretty numbers. It could be 73.4%. It could be 0.05%. It could be some weird messy decimal that goes on forever. Real parameters in the wild don't politely round themselves for our convenience! Yahaha! How rude of them!

So let's just... not restrict it. Let every value between 0 and 1 be on the table. Every. Possible. Value. Infinity of them.

"But Koyuki, that's INFINITY hypotheses, how can I spread a finite amount of belief across infinitely many things, that's impossible, my brain is melting, please stop ~"

Nihaha, easy easy easy. We just trade the bar chart for a curve. Instead of nine bars with heights that add to 1, we use a smooth squiggly line where the area underneath adds to 1. Tall parts of the curve = values you think are likely. Low parts = values you think are silly. Want to know how much you believe θ is between 0.3 and 0.5? Shade in that chunk under the curve and check its area! That's it! That's the whole trick! It's just bars with infinity-many infinitely-thin columns, and the area takes the place of the height. Mathematicians call this a probability density and they wave their hands around like it's a big deal but it's literally just "bars but smooth." Don't let 'em intimidate you. Lame!

And once we've made that little upgrade ~ everything else from the card game still works. Prior is a curve now. Posterior is a curve now. Likelihood evaluated at each θ is a curve now. We multiply, we renormalize, we get an updated curve. Same machine! Just smoother!

Here, look at this one, Sensei. It's the card game's cousin ~ except this time we're flipping a coin one toss at a time instead of dealing ten cards in a batch, and we use curves instead of nine little bars:
[Interactive coin-toss demo: a slider for the true probability of heads, sliders for the prior's α and β, and a readout of the tosses so far, the posterior mean and sd, the 95% credible interval, P(p > 0.5 | tosses), and P(p ≤ 0.5 | tosses).]

Okay! Same idea! There's a hidden true probability of heads (the slider at the top ~ in the card game I hid this from you, but here you get to play god and set it yourself and watch what happens). And there's a prior ~ those two sliders, α and β, shape the starting curve for what you believe about p before you've seen any flips. (Why two sliders and not one? 'Cause to draw a curve you need to set its shape, not just a center. α turns up belief in heads-heavy worlds, β turns up belief in tails-heavy worlds. Crank both up and they fight ~ the curve gets sharper around the middle. Both at 1 = totally flat = "I have absolutely no idea, every p between 0 and 1 looks equally plausible to me." Don't sweat the exact mechanic ~ just play with it and watch the curve change!)

Then you toss. Each heads tugs the curve a smidge to the right ("more belief in heads-heavy worlds!"). Each tails tugs it left. Lots of flips? Lots of tugs ~ the curve sharpens up into a tight spike right around the truth. Few flips? Just a few lazy little tugs ~ the curve stays wide and squishy and uncertain.

Now here's the cool part ~ and this is what was hidden from you in the card game! Because we have a whole CURVE now, not just a "best guess" and a "bet," we get to ask way richer questions:
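Here's a hedged little sketch of the kind of numbers a whole curve lets you read off: the posterior mean and sd, a 95% credible interval, and P(p > 0.5 | tosses). Big assumption, Sensei: I'm treating the α and β sliders as the two parameters of a Beta prior, which would make the posterior Beta(α + heads, β + tails). The function name `summarize_posterior`, the 7-heads-3-tails data, and the thin-slices trick for measuring area are my own illustration choices, not necessarily how the demo does it.

```python
import math


def summarize_posterior(heads, tails, alpha=1.0, beta=1.0, cells=20000):
    """Summaries of the belief curve for p after some coin tosses.

    ASSUMPTION: the prior is a Beta(alpha, beta) curve, so the posterior
    is Beta(alpha + heads, beta + tails).
    """
    a = alpha + heads  # each heads bumps alpha
    b = beta + tails   # each tails bumps beta

    # Mean and sd of a Beta(a, b) curve have closed forms.
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

    # "Bars but smooth": slice [0, 1] into thin cells and give each cell
    # a weight proportional to the curve's height at its midpoint.
    width = 1.0 / cells
    mids = [(i + 0.5) * width for i in range(cells)]
    log_h = [(a - 1) * math.log(p) + (b - 1) * math.log(1 - p) for p in mids]
    peak = max(log_h)
    weights = [math.exp(h - peak) for h in log_h]  # subtract peak to avoid underflow
    total = sum(weights)

    # Walk left to right, accumulating area, to read off the interval
    # endpoints and the belief that the coin favors heads.
    running, lower, upper, above_half = 0.0, None, None, 0.0
    for p, w in zip(mids, weights):
        area = w / total
        running += area
        if lower is None and running >= 0.025:
            lower = p
        if upper is None and running >= 0.975:
            upper = p
        if p > 0.5:
            above_half += area
    return {"mean": mean, "sd": sd, "cri95": (lower, upper), "p_gt_half": above_half}


# ASSUMPTION: made-up data (7 heads, 3 tails) on top of a flat prior.
print(summarize_posterior(heads=7, tails=3))
```

Try calling it with zero tosses and then with more and more of them: under the Beta assumption, the interval should start out almost as wide as the whole 0-to-1 range and tighten as the flips pile up.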
Mess around, Sensei! Seriously, play! Try this stuff:
And that, Sensei ~ that is Bayesian statistics. That's the whole field. I am not joking and I am not exaggerating. Everything else you'll ever see in this subject is some flavor of this exact same dance:

"Start with what you believe about an unknown thing. See some data. Update your belief in proportion to how well the data fits each possible value of the unknown thing. Repeat forever. Read the answer off the resulting distribution."

Fancier problems have more unknowns (whole vectors of θ's!), or messier data, or curves that are too gnarly to compute by hand (we'll get to that ~ there's this thing called MCMC that's basically rolling dice to draw the curve, and it's extremely my type of math, but that's for later). But the heart of it is what you just did with the card game and the coin flips. Pinky promise.

Okay! Soapbox time! Sensei, sit comfy, I'm about to be petty for like ten minutes. (≖ˋ‿ˊ≖)

There's another way of doing statistics that you'll run into out in the wild ~ the frequentist way. It's the one that's in every intro textbook, every science paper, every "study says" headline. And it doesn't do this dance. It does something WEIRDER. And if you've followed along this far, you already have the tools to see why it's kinda... not great? Lemme show you.

The big frequentist toy is the p-value. You've heard of these. "p < 0.05! Significant!" Everyone nods and feels smart. But here is what a p-value actually is, in plain Koyuki-words:

"Pretend the boring 'nothing's happening' hypothesis (the 'null') is true. Given that pretend, how likely would I be to see data this extreme or more?"

In symbols:

$$\,\text{p-value} \;=\; P(\text{data this extreme or more} \mid \text{null hypothesis})\,$$

Sensei. Sensei sensei sensei. Look at that. Look at which side of the bar the null is on. Look at which side the data is on. That's P(data | hypothesis). That's the likelihood. That's the forward direction, the easy direction, the one the world hands you for free!
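Here's that forward direction as a teeny Python sketch, just so you can see how little it needs. The setup is my own made-up example, not anything from a real study: 9 heads in 10 tosses, a null of "the coin is fair," and "this extreme or more" read one-sidedly as "at least this many heads." The function name `one_sided_p_value` is only for illustration.

```python
from math import comb


def one_sided_p_value(heads, n, null_p=0.5):
    """P(at least `heads` heads in `n` tosses | the null is true)."""
    return sum(
        comb(n, k) * null_p**k * (1 - null_p) ** (n - k)
        for k in range(heads, n + 1)
    )


# ASSUMPTION: made-up data, 9 heads in 10 tosses, tested against a fair coin.
p_value = one_sided_p_value(heads=9, n=10)
print(f"P(9 or more heads out of 10 | fair coin) = {p_value:.4f}")
```

Notice what the function takes as input: the data and the null. Nothing in there ever asks how believable the null was to begin with.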
But what does anybody actually want to know when they run an experiment? They want "given the data, is my hypothesis true?" That's P(hypothesis | data). The other direction! The posterior! And those two are NOT THE SAME NUMBER! Different rooms, different denominators! We literally watched Reisa lose three thousand yen last lesson because she swapped one for the other!

Sensei, an entire empirical science culture has been built on quietly assuming p-value = posterior, and it is the same flip Reisa fell for. (≖ˋ‿ˊ≖) Reisa would be furious if she found out. I'll let her know later. After she's had time to study.

And it gets worse! Even if frequentists wanted to compute the right thing, they refuse to use priors ~ so they can't! Bayes' theorem literally requires P(A) on the right-hand side, and they've thrown it out, so they can't go from P(data | hypothesis) to P(hypothesis | data). They're stuck with the forward direction and they pretend that's the answer.

Other things to grumble about while I'm here:
...Okay, deep breath. Sensei, you know who'd love frequentist statistics? Hm? Take a guess. *checks behind back and leans in* Yuuka! It's Yuuka! Of course it's Yuuka! She'd ADORE it. Picture it: she balances the Seminar's budget down to the last yen with zero uncertainty intervals ~ the number IS the number, even though half her line items are estimates from a vendor who "swears the invoice is in the mail." No posterior over the true expenditure. No credible interval on the snack budget. Just one number, stamped, final, frowned at me. Frequentist to her core! Nihaha!

And don't even get me started on her priors over me. She insists she's "objective" and "lets the evidence speak for itself" but Sensei, that girl has a prior on Kurosaki Koyuki tighter than any prior I've ever drawn. Every harmless little thing I do gets interpreted as another data point under her null hypothesis of "Koyuki is up to something." P(slightly bent paperclip | Koyuki is up to something) = 1, apparently. She doesn't update, she just accumulates. If she'd just declare her prior I could at least argue with it, but nooo, she's a frequentist about me ~ she'll never admit she has one. Lame!

...Okay! Okay, lesson over! Big lesson! Let me wrap up before you glaze over: today you went from one little Bayes' theorem ~ updating belief about a single yes/no hypothesis ~ to updating belief about a bunch of competing hypotheses at once (the card game), to updating belief about a continuous unknown (the coin), which is the whole game of Bayesian statistics. Prior in, multiply by the likelihood of the evidence, renormalize, posterior out, use posterior as next prior, repeat 'til you've got enough data to make a confident bet. That's it. That's the whole thing. Anyone who tells you it's more mystical than that is selling you a textbook.
Next time we'll start poking at a computer to do this stuff for us, 'cause crunching posteriors by hand is fine for nine bars but becomes a real pain when you've got fifty parameters and a Tuesday deadline. Don't worry, it's gonna be fun, and we might even do some hacking. Nihehehe! The computer doesn't tattle to Yuuka. Probably. ...Sensei. Sensei, you're still taking notes. STOP. People are gonna think we're actually studying in here. We're SUPPOSED to be slacking off! Have some pride! Yahahaha! . . .
Looks like Yuuka was listening... (Image taken from here.)