
BAYES ACADEMY

Koyuki's Official Field Guide ♪ est. 2026
Nihahahaha~ ♪

Chapter 01: Introduction

Nihaha, Sensei! Are you ready for our first lesson?

Here's what we're covering. Try to pay attention, and I promise I'll try my best to stay on topic:

  • What probability actually is ~ y'know, the thing that decides whether I win at cards or end up locked in the Self-Reflection Room. ...Honestly, kind of the same thing. Is it real, though? Who knows!
  • Bayes' theorem ~ some old guy's formula for updating what you believe when new stuff shows up. I do this in my head all the time, apparently. Who knew?
  • Bayesian statistics ~ the whole way of thinking that grows out of that one little formula. Less scary than it sounds. Probably.
  • Julia ~ the programming language we're using. Fast, kinda cute, and way less boring than the password-cracking junk the Seminar makes me do.
  • Probabilistic programming ~ basically, getting the computer to handle the gambling math for us. Which, between you and me, is the best kind of math there is.

That's the whole chapter. Stick with me and I promise it'll be more fun than stamping documents all day. ...That's a low bar, I know. But still!

Ready, Sensei? Try to keep up ~ I'm not slowing down for anyone. Nihaha!

LESSON 1.1. What is probability?

So!

Sensei.

Have you ever flipped a coin? (≖ˋ‿ˊ≖)

Of course you have, everyone has. And you just kinda... assume it's 50/50, right? Heads or tails, no biggie. But hold on a sec ~ is it really random? Where does that "50/50" even come from? You didn't weigh the coin. You didn't measure your thumb-flick angle. You didn't check the air currents. You just... know. Or you think you know. Nihaha, weird, huh?

These are the kinda questions probability is built on, Sensei! And trust me, they're WAY more interesting than the boring stuff the upperclassmen at the Seminar make me do. Way more interesting.

Okay so imagine ~ just imagine! ~ you knew literally everything about the flip. The exact weight of the coin, the precise force of your thumb, every air current, the gravity pulling on it, every single atom around it. Could you predict the result perfectly? Maaaybe! And if you could, then that "50/50" was never about the coin at all ~ it was about you, and all the stuff you didn't know! Spooky, right?

But wait! It gets weirder! Our best theory at the tiniest scale of the universe ~ quantum mechanics ~ is itself probabilistic at its core. Like, no matter how good your instruments are, you literally cannot know the exact state of every particle. So is the universe actually rolling dice?! Or is that just the limit of what silly observers of the universe like us can meaningfully measure? Hmmm... I dunno! Nobody really knows! That's part of the fun!

So before we even write a single equation (I know, I know, I promise to go easy on the math), we've got a big juicy puzzle, Sensei: what the HECK even IS probability?

Okay so there's basically two gangs arguing about this. Like rival student councils, kinda!

Gang #1 ~ the Frequentists! These guys say probability is a real, physical thing in the world. Flip a coin a million times, you'll get roughly half heads. That 50/50 is out there, baby! It's the long-run frequency of the event.
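Wanna watch the Frequentist idea in action, Sensei? Here's a minimal Julia sketch that flips a pretend fair coin more and more times and reports the fraction of heads. The function name `heads_fraction`, the seed, and the flip counts are all made up for illustration, and the simulated coin is fair only because the code says so, not because anybody measured anything.

```julia
using Random

# Flip a simulated fair coin n times and return the fraction of heads.
# `rng` is passed in so the run is repeatable.
function heads_fraction(rng::AbstractRNG, n::Integer)
    heads = count(_ -> rand(rng) < 0.5, 1:n)
    return heads / n
end

rng = MersenneTwister(2026)  # arbitrary seed, just for repeatability
for n in (10, 1_000, 1_000_000)
    println("flips = ", n, "  fraction of heads = ", heads_fraction(rng, n))
end
```

With more flips the printed fraction should settle closer to 0.5, though any single run wobbles a bit.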

Clean idea, right? But here's where I poke holes in it ~ what if you only flip the coin ONCE in your whole life? Like, ever? What does "50% chance" even mean then? You can't observe a frequency from one single event! Hah! Gotcha!

Gang #2 ~ the Bayesians! This is more my style, honestly. These guys say probability isn't a thing out in the world ~ it's a thing in your head. It's how confident you are about something, given what you know. The 50/50 lives in your brain, not in the coin!

Think about it, Sensei: why do you think a coin is 50/50? Because it has two sides? But dice have six sides and loaded dice are TOTALLY a thing, trust me, I've, uh... heard. Because someone told you? Because you've flipped a bunch before? Now imagine you start flipping and it comes up heads ten times in a row. Twenty! Fifty! At some point you'd be like "okay, this coin is RIGGED." That's you updating your beliefs based on evidence! That right there is the heart of Bayesian thinking!

So in this view, probability is basically a bet you're making about the world with the info you've got. Every event has some chance of happening, some chance of not ~ and those numbers reflect what you know, not some hidden secret of reality. In fact, you can even apply this to quantum mechanics ~ that's called QBism!

Both sides are useful! But this class is gonna lean Bayesian, 'cause it's just more useful (and more honest, Nihaha!) when you're actually trying to make decisions when you don't know everything. Which is, y'know, basically all of life. Especially mine, when I'm escaping the Self-Reflection Room.

Okay, enough philosophy! It's time for math!!

Sensei, don't fall asleep! We're getting to the good stuff! No matter which gang you're in, everyone agrees on the math structure. And here's the cool part ~ it's not arbitrary! It's built on three super simple rules called the axioms of probability, and once you get why they exist, the rest of probability theory stops feeling like magic.

So! We start by saying probability is a number we slap onto an "event." An event is just anything that could happen ~ "the coin lands heads," "it rains tomorrow," "Koyuki escapes detention again" (high probability, btw).

We need a scale. And every good scale needs two anchors:

One end means "no way, never gonna happen" ↔ The other end means "yep, 100% gonna happen"

We pick 0 for impossible and 1 for certain. Why those? Honestly? It's a convention! We coulda picked 0 to 100 (that's percentages!) or 0 to 10, or even 6 to 7 just to be weird (sIiIix sEeEeven (~⁶ ̄▽ ̄)~⁷). But 0 and 1 are super clean for the math because they play nice with multiplication and fractions. 0.5? Halfway between "won't" and "will." 0.25? About a quarter as confident as fully certain. Easy!

But why can't probabilities go above 1 or below 0? Oooh, that brings us right to the axioms!

Axiom 1 ~ Non-negativity!

> For any event A, P(A) ≥ 0.

Translation: probabilities are never negative. Duh!

Why? Think about it ~ what would "−30% chance of rain" even MEAN, Sensei? Like, the sky un-rains on you? Probability measures either how often something happens or how strongly you believe it. You can't have negative frequency (something can't happen less than zero times!) and you can't have negative belief ~ the worst you can do is believe it's totally impossible, which is just 0. There's no "below impossible." So this axiom isn't some rule handed down from on high ~ it's just the math admitting what the concept actually means!

Axiom 2 ~ Normalization!

> P(everything that could possibly happen) = 1.

Translation: if you list out ALL the possible outcomes, the chance that something on that list happens is 1. Certain. Guaranteed.

Why? 'Cause we need a ceiling, Sensei! If 0 is "impossible," we need a number for "totally guaranteed." We picked 1. And the only thing truly guaranteed in any situation is: "one of the possible things will happen." Flip a coin ~ it'll land heads, tails, or, okay fine, balanced on its edge in some weird freaky case. Those outcomes cover everything, so they gotta add up to 1.

This is also why probabilities can't go ABOVE 1! Nothing is more certain than certain! If your math says probability is 1.2, you broke it, Sensei! You've claimed something is more-guaranteed-than-guaranteed and that's just nonsense. Go back and find your mistake!!

Axiom 3 ~ Additivity!

> If events A and B can't both happen at the same time, then P(A or B) = P(A) + P(B).

Translation: for stuff that can't overlap, "either one" is just adding their chances!

Easy example! Roll a fair six-sided die. Chance of a 1? 1/6. Chance of a 2? 1/6. Chance of "1 or 2"? They can't both happen on the same roll, so just add 'em: 1/6 + 1/6 = 2/6 = 1/3. Matches your gut, right? Right!

The "can't both happen" part ~ fancy word for it is mutually exclusive ~ is super important! If events CAN overlap, you'd be double-counting like a sloppy accountant (not naming names!). Like "chance it rains" + "chance it's cold" isn't the chance of "rain or cold" because plenty of days are both and you'd be counting those twice! There's a fix for that case, we'll get to it later, don't worry!

Why's this axiom natural? Because "or" just should work that way when stuff doesn't overlap! If I told you a coin had 0.5 heads and 0.5 tails but the chance of "heads OR tails" wasn't 1, you'd be like "Koyuki, what the heck?!" And you'd be right! Additivity keeps probability from contradicting itself!

So there we go, Sensei! Three rules. That's the whole foundation! EVERYTHING else in probability ~ all the Bayesian stuff we're gonna do ~ is built on top of these three little guys:

  • Non-negativity: the scale doesn't dip below "impossible."
  • Normalization: the scale doesn't go above "certain," and something always happens. (sorry chuds...)
  • Additivity: non-overlapping chances add up like "or" should.

And none of them came outta nowhere! If you tried to make a system for reasoning under uncertainty and broke even ONE of these, the whole system would contradict itself. That's the real reason we use 'em! Not 'cause some mathematician said so (though, okay, fine, a guy named Andrey Kolmogorov did write 'em down like this in 1933) ~ but because any sensible measure of uncertainty HAS to obey them. Break the axioms and someone can trick you into bets you're guaranteed to lose. Follow them and at least your reasoning has a chance of being coherent! Nihaha!
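If you wanna poke the three rules yourself, Sensei, here's a small Julia sketch that checks them on the fair die from the additivity example. The helper `prob` and the variable names are illustrative, not anything official, and it only checks this one little table of outcomes.

```julia
# A fair six-sided die as a table: outcome => probability.
# Rational numbers (1//6) keep the arithmetic exact.
die = Dict(face => 1//6 for face in 1:6)

# An event is a set of outcomes; its probability is the sum over that set.
prob(dist, event) = sum((dist[o] for o in event); init = 0//1)

# Axiom 1: no outcome has negative probability.
nonnegative = all(p -> p >= 0, values(die))

# Axiom 2: "something on the list happens" has probability exactly 1.
normalized = prob(die, keys(die)) == 1

# Axiom 3: for events that cannot both happen, "or" is addition.
A, B = Set([1]), Set([2])
additive = isempty(intersect(A, B)) &&
           prob(die, union(A, B)) == prob(die, A) + prob(die, B)

println((nonnegative, normalized, additive))  # (true, true, true)
println(prob(die, union(A, B)))               # 1//3
```

Try changing one entry of `die` to a different value and see which check flips to `false`.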

And now, Sensei, with this foundation locked in, we can FINALLY start asking the questions Bayesian statistics is really about: how do we update our probabilities when we learn new stuff? How do we go from "I dunno" to "okay I'm pretty sure" in a way that isn't just vibes? And how do we use all this to make better choices in a universe we'll never fully understand?

That's where it gets really fun, Sensei! So pay attention next time, okay? And, uh... if anyone asks, you NEVER saw me teaching probability voluntarily. Got it? It'd ruin my reputation. Yahahaha!

...Wait, are you taking notes? Sensei?! You're actually taking notes?! No way! Maybe I'm better at this than I thought! Nihaha!

LESSON 1.2. Bayes' Theorem

Okay okay okay, Sensei! It's time for the good stuff. Like, the good good stuff~ The whole reason Bayesian whatever-it's-called even exists! The secret sauce, as they say. The jackpot! (≖ˋ‿ˊ≖)

But! BUT! I'm NOT just gonna scribble it on the whiteboard and go "memorize this, Sensei!" Ugh, no way. That's how the upperclassmen at the Seminar teach, and you can see how GREAT that's worked out for me (stuck in the Self-Reflection Room, what, three times this month? ...okay fine, maybe four). Nope nope nope. We're gonna build it up! From stuff you already agree with! And by the end you'll hopefully feel like you came up with it yourself, nihaha!

And then, as a super-special bonus ~ 'cause I like you, Sensei ~ I'm gonna use that exact same idea to swipe twenty bucks off somebody! For educational purposes. ...Probably.

Okay! Step one. I brought a deck of playing cards. Don't ask where I got 'em. Ni·Ha·Ha.

Warm-up question, and you GOTTA answer honestly, Sensei! No cheating! I'm about to flip the top card ~ remember, there's 52 cards in here. What're the chances it's a red queen?

...Yeah! Two red queens outta fifty-two cards. 2/52, like 3.8%. Easy peasy, right? Now! That number right there has a fancy name ~ they call it a joint probability! 'Cause the card's gotta be two things at the same time! Red AND a queen! Two conditions, one card, one number. We write it like this:

$$P(\text{red},\,\text{queen}) = \frac{2}{52}$$

That silly little comma in there? It's just shorthand for "and." That's it! Nothing fancy! That's basically the whole notation lesson for today! Yahaha!

Okay, now watch this, Sensei. Watch closely! I'm gonna get the EXACT same answer a totally different way, and that "different way" is gonna turn out to be like, the most important sentence in this whole class. So pay attention pay attention pay attention!!

So! I split the deck into two piles. Reds here, blacks here. First question: what're the chances the top card is red in a full deck? Half the deck's red, so... 1/2! Duh!

Now ~ given we're already inside the red pile ~ what're the chances the card I pull is a queen? 26 reds, 2 of 'em are queens, so 2/26, which is 1/13. That tiny little word "given"? It's doing a LOT of heavy lifting, by the way! It means: "we shrunk the world down! We're not lookin' at the whole deck anymore ~ only the reds!" We write it with a vertical bar like so:

$$P(\text{queen}\mid\text{red}) = \frac{1}{13}$$

Read the bar out loud as "given." "Probability of queen given red." Vertical bar = "we zoomed in." Got it? Got it!

Now multiply: 1/2 × 1/13 = 1/26 = 2/52. Same answer as before! And it's not a coincidence, Sensei ~ we're literally just describing what we did, but in two steps:

$$P(\text{red},\,\text{queen}) = P(\text{red}) \cdot P(\text{queen}\mid\text{red})$$

In words ~ in Koyuki-words: "the chance of both things happening is the chance the first one happens, TIMES ~ once you're inside that smaller world ~ the chance the second one also happens." First you land in the red pile, then, standing inside that little pile, you ask about queens. Two steps. Same destination! That's it! That's the so-called "chain rule," and it's NOT some magical spell of math wizards, we're just describing the trip we already took! (Mathematicians give boring names to obvious things and pretend they're being super deep. Lame!)

Okay, now here's where it gets spicy, Sensei. I coulda done the whole thing backwards! Start with queens. Chance the card's a queen? 4/52 = 1/13. Now, given it's a queen, chance it's red? 2 of the 4 queens are red, so 2/4 = 1/2. Multiply: 1/13 × 1/2 = 2/52. SAME answer again! Nihaha!

$$P(\text{queen},\,\text{red}) = P(\text{queen}) \cdot P(\text{red}\mid\text{queen})$$

Which means... look at this, look at this, Sensei ~

$$P(\text{red}) \cdot P(\text{queen}\mid\text{red}) \;=\; P(\text{queen}) \cdot P(\text{red}\mid\text{queen})$$

Both sides equal the same joint probability, so they gotta equal each other! And why do they HAVE to? 'Cause the joint doesn't care which adjective you say first! "Red queen" and "queen that's red" are the same dang card! P(A, B) = P(B, A). Always! It's symmetric! Hold onto that, Sensei, 'cause in about ten minutes I'm gonna crack a fortune cookie outta this equation~ nihahaha...
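Don't just take my word for it, Sensei ~ we can count the cards. This is a rough Julia sketch that builds a 52-card deck and computes the joint both ways. The names `prob`, `cond_prob`, `is_red`, and `is_queen` are my own helpers for this example, and "conditioning" here is done the pile-splitting way: filter the deck first, then count.

```julia
# Build a 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = [:hearts, :diamonds, :clubs, :spades]
deck  = [(rank, suit) for rank in ranks for suit in suits]

is_red(card)   = card[2] in (:hearts, :diamonds)
is_queen(card) = card[1] == "Q"

# P(event): exact fraction of cards in `cards` that satisfy `event`.
prob(event, cards) = count(event, cards) // length(cards)

# P(event | given): shrink the world to the cards where `given` holds, then count.
cond_prob(event, given, cards) = prob(event, filter(given, cards))

joint       = prob(c -> is_red(c) && is_queen(c), deck)
red_first   = prob(is_red, deck)   * cond_prob(is_queen, is_red, deck)    # 1//2 * 1//13
queen_first = prob(is_queen, deck) * cond_prob(is_red, is_queen, deck)    # 1//13 * 1//2

println(joint, " ", red_first, " ", queen_first)  # 1//26 1//26 1//26
```

All three come out as 1//26, which is just 2/52 in lowest terms.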

But first! Before I crack it open ~ I gotta show you why you should actually CARE. 'Cause if I just rearrange a bunch of symbols on a page you'll glaze right over and start thinking about lunch, I know how it is. So...

Field trip! C'mon, Sensei, get up get up, we're going to that little café across from the school! Nope, you don't get to say no. Bring your notes. Yahahaha!

...

Okay. We're at the café. I got iced cocoa, you got whatever boring grown-up thing you got, and ~ ohhh ho ho ho ~ look who's two tables over. Sensei. Sensei, look. Don't be obvious about it. Two o'clock. Yeah. Yes. Her.

Reisa. From the Vigilante Crew. Nihaha, she's the kind of person who thanks the vending machine. Stays after class to wipe the whiteboard. Cries at dog commercials. And a walking bank account waiting to be drained for... um... educational purposes.

Just sit tight, Sensei. Watch. Take notes if you want, but don't ~ don't ~ give me away with your face. You have a tell, y'know! We'll work on that later.

"Reisaaaa! Over heeere!"

"Eh? Oh ~ Koyuki! And ~ S-Sensei?! Sensei is here too?! Wah, I ~ am I interrupting something?? Uzawa Reisa reporting, um ~ should I come back later ~?"

"Nope nope, sit sit sit, you're perfect timing actually! We're doing a class thing and I need a real live student opinion. You're a real live student, right?"

"I ~ I think so?? Last time I checked?? Okay, okay, I-I'll help! What do you need?!"

"Easy one! At your school. The kids who get really, really good grades ~ the top of the class types ~ would you say a lot of 'em wear glasses?"

"O-Oh! Hmm. Hmmmmm. Yeah! Yeah, actually, like ~ almost all of them? Shimiko-senpai wears them, and I've seen Sakurako-senpai wear them during tests... And! The whole top five at midterms had glasses, I think? S-So... most of them? Like eighty percent? Maybe more?"

"Perfect~ Eighty percent. Got that, Sensei? Okay, Reisa, second question! If I told you some random kid at your school wears glasses ~ just, picked one at random, walking down the hall, wearing glasses ~ would you guess they're probably one of the top-grades kids? Like, same eighty percent kinda vibe?"

"Eh? Oh! Y-Yeah! That makes sense, right?? If smart kids wear glasses then glasses-kids are smart kids! Th-That's logic! Yeah, eighty percent! Final answer! Hehe..."

"Mmmmhm. Mmmmmm-hm. Reisa! You know what this means, right?!!"

"...Wh-what does it mean??"

"It means if YOU wore glasses... by your own logic... you'd basically BE a top-grades kid! Like ~ eighty percent of the way there! Just from the glasses ~ niha!!"

"...Wait. Wait wait wait. R-Really?? Is ~ is that how it works?? J-Just putting them on?! B-But I'm at the bottom of remedial math, Koyuki, would it ~ would it really ~ "

"I mean, that's what YOU said yourself, Reisa! I'm just repeating what you said!"

"I-I-I!! I gotta go!! Don't move, either of you, don't move! Uzawa Reisa will return! Wait for meeeeee!"

(Reisa dashes out of the café, nearly tripping on her way out the door.)

And she's gone~ nihaha! Sensei! What? Don't look at me like that!! I'm not finished, y'know? Sip your drink, let's just wait for her to come back...

(10 minutes go by and Reisa, breathless and beaming, returns wearing a pair of cheap plastic frames from the nearby hundred-yen store.)

"OKAY!! I'm back!! Test me!! Quiz me!! Anything!! I-I can already feel it, my brain feels ~ it feels CRISPER!! Sensei, look at me, do I look smarter?! I think I look smarter!!"

"Nihahahaha! Those look great on you, Reisa! Do you really think they'll work??"

"Y-Yes!! I-I can feel the smartness, Koyuki, it's, like, behind my eyes!! It's working!!"

"Okay~ Tell you what. Wanna bet on it? (≖ˋ‿ˊ≖)"

"...B-Bet?? Like, money bet??"

"Three-thousand yen for one math problem! The kinda problem a top-grades kid would crush in three seconds flat! If the glasses really turned you into one, easy money, right? You walk away richer and I cry into my cocoa. But if you miss... nihaha... then you owe me three-thousand!"

"H-Hmmm. Hmmmmm. I mean ~ I-I do feel smarter ~ and Sensei is my witness ~ okay! OKAY! Uzawa Reisa accepts the challenge!! Bring it on, Koyuki!!"

"Nihehehe... Okay! Right. Here we go. A bat and a ball cost one-hundred and ten yen together. The bat costs a hundred yen MORE than the ball. How much does the ball cost? (≖ˋ‿ˊ≖)"

"O-Ohh!! Easy!! TEN YEN!! Bat's a hundred, ball's ten, one-hundred-and-ten total, boom!! Pay up, Koyuki!! Uzawa Reisa is now officially a top-grades ~ "

"NIHAHAHA! Wrong!"

"...Eh?"

"The ball's five yen, the bat is a hundred and five! And the difference? Exactly one hundred! In total, one hundred and ten yen! Count it on your fingers, niha!"

"...wh... wha... b-but I have the GLASSES on, Koyuki, the glasses, I ~ th-the glasses!!"

"Three-thousand yen, Reisa! A bet's a bet~ Sensei is the witness, you said it yourself~"

"...Mmmmmgh ~ ~ ~ "

"Nihaha~ Thank you~"

"I-I'm going home!! I'm going home and I'm gonna ~ gonna study so hard that next time I'll ~ I'll win it back!! Just you wait, Koyuki!! UZAWA REISA WILL RETURN!!"

(Reisa runs out of the café with a look of determination.)

Aaaaaaand she's gone! Three-thousand yen, Sensei! That's a week of cocoa! That's TWO weeks of cocoa if I'm careful. Which I won't be. Yahaha!

Okay okay okay. Don't look at me like that! I KNOW. I know you're doing the Sensei Face. The "Koyuki we need to talk about ethics" face. We will. Later! Right now I need you to look past my crimes and see what actually just happened, 'cause it's like, the entire lesson. Pull your chair closer.

Reisa got tricked twice in a row, and she didn't notice either time! Trick one was in the questions. I asked her two things that sound like they oughta have the same answer:

  • "Of the top-grades kids, how many wear glasses?" ~ she said 80%.
  • "Of the glasses-wearers, how many are top-grades kids?" ~ she said 80%.

In symbols, she just told us:

$$P(\text{glasses}\mid\text{good grades}) \approx 0.8 \quad\text{AND}\quad P(\text{good grades}\mid\text{glasses}) \approx 0.8$$

And those two numbers? They can't both be right! Not even close! Watch this. Pretend Reisa's school has 1000 students. Top-grades kids ~ let's say it's the top 10%, so 100 of 'em. Her first answer says 80% of THOSE wear glasses, which is 80 kids. Cool, fine, totally believable.

But how many kids at the whole school wear glasses? Glasses aren't a smart-person thing, Sensei, they're a my-eyeballs-don't-work thing! Half of any classroom is squinting. Let's say 400 outta 1000. And of those 400 glasses-wearers, how many are top-grades kids? Well, only 80 ~ 'cause we already counted 'em, they're the same 80! So:

$$P(\text{good grades}\mid\text{glasses}) = \frac{80}{400} = 0.2$$

TWENTY percent! Not eighty! Same eighty kids, totally different fraction! And the reason is the denominator! The "good grades" room has 100 people in it. The "glasses" room has 400. Same 80 kids are sitting in both rooms ~ but 80 outta 100 is a landslide, and 80 outta 400 is a minority! Same kids. Different room. Different story.
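Here's the two-rooms thing as a tiny Julia sketch, Sensei. The head counts are the made-up ones from the walkthrough (1000 students, 100 top-grades kids, 400 glasses-wearers, 80 in both), not real survey data, and the variable names are just for this example.

```julia
# Made-up head counts from the walkthrough above (not real data).
total_students  = 1000
top_grades      = 100   # the top 10%
glasses_wearers = 400
both            = 80    # top-grades kids who also wear glasses

p_glasses_given_top = both / top_grades        # stand in the small room
p_top_given_glasses = both / glasses_wearers   # stand in the big room
p_glasses           = glasses_wearers / total_students

println("P(glasses | good grades) = ", p_glasses_given_top)  # 0.8
println("P(good grades | glasses) = ", p_top_given_glasses)  # 0.2
println("P(glasses)               = ", p_glasses)            # 0.4
```

Same numerator both times; only the denominator (the room) changes.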

And THIS, Sensei, is THE trick. The Big Trick. The capital-T Trick! When you flip the bar in P(A | B), you're not just rearranging words ~ you're walking outta one room and INTO a totally different room with a totally different number of people in it! P(glasses | good grades) means "stand inside the small room of top-grades kids and count glasses." P(good grades | glasses) means "stand inside the big room of glasses-wearers and count top-grades kids." Conditional probability has a direction, y'know? Like an arrow! It's not a comma! It's a one-way street! Flip the arrow and the number changes ~ sometimes a little, sometimes WAY catastrophically.

Reisa never even noticed the flip. She just heard "80%" and figured it stuck to the situation no matter which way you read it. That was trick one!

Trick two was even sneakier, and it's the one that cost her actual cash money. Once she believed P(good grades | glasses) was high, she did the most human thing in the world ~ she tried to make it work in REVERSE! "If glasses-wearers are smart, maybe putting on glasses will make ME smart!" She tried to give herself the effect by buying the thing. Which is silly for like, two whole reasons stacked on top of each other:

  • The two conditionals weren't equal in the first place (different denominators, like we just saw!), and
  • Even if they had been ~ correlation isn't a steering wheel. You don't get smart by buying smart-people accessories. If you could, Sensei, I'd own seventeen lab coats and rule the world.

But for real, Sensei? Reisa's mistake is the SAME mistake people make EVERYWHERE! "Most pickpockets are men, so most men are pickpockets." "Most NBA players are tall, so most tall people are in the NBA." "If it's raining the sidewalk's wet, so if the sidewalk's wet it's raining." Flip the bar without checkin' the rooms, walk smack into a wall! Doctors do this! Juries do this! Newspapers do this like, every day! Our brains are pattern-matchy little gremlins and they just want P(A | B) to equal P(B | A). The math, though? It does not care what our brains want. Not even a little!

And THAT is what's worth way more than twenty bucks, Sensei. Reisa paid for the demo so you didn't have to! We should send her a thank-you card. ...We will absolutely not send her a thank-you card. Okay, drink up, let's head back to the classroom ~ I wanna show you what happens when we take this exact same trick and turn it into a weapon for good instead of cocoa money.

...

Okay! Back in the classroom! I've got cocoa money for a week, you've got a new appreciation for the word "given," and Reisa is somewhere studying ball prices with grim determination. Let's cash in!

Remember this from before lunch?

$$P(A) \cdot P(B\mid A) \;=\; P(A, B) \;=\; P(B, A) \;=\; P(B) \cdot P(A\mid B)$$

The middle bit's the joint, and it's symmetric ~ "red queen" = "queen that's red." But check out the outsides! The outsides are P(A) times one conditional, and P(B) times the OTHER conditional. The conditionals can be wildly different (we just watched Reisa lose three-thousand yen proving it!), but the products still gotta be equal, 'cause the joint doesn't care which way you sliced it.

So just look at the two ends and ignore the middle:

$$P(A) \cdot P(B\mid A) \;=\; P(B) \cdot P(A\mid B)$$

And now divide both sides by P(B). (That's allowed as long as P(B) isn't zero ~ you can't zoom into a room with nobody in it!) Just regular ol' algebra. Nothin' fancy:

$$\boxed{\,P(A\mid B) \;=\; \frac{P(B\mid A)\cdot P(A)}{P(B)}\,}$$

And THAT, Sensei, is Bayes' theorem.

That's it! That's the whole shebang. The "secret sauce" the entire field of Bayesian statistics is named after. It fell outta nothin' more than "the joint doesn't care about order," plus one division. No magic, no priesthood, no dusty professor saying "because I said so." Just bookkeeping! And if anyone ever tries to tell ya Bayes' theorem is mystical or hard, smack 'em with a deck of cards from me.
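And since it's just bookkeeping, Sensei, it fits in a couple lines of Julia. This is a minimal sketch: the function name `bayes` and its argument names are my own choice, and the inputs are the made-up glasses numbers and the card fractions from earlier.

```julia
# Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B).
# The name `bayes` and its argument names are illustrative.
function bayes(likelihood, prior, evidence)
    evidence > 0 || throw(ArgumentError("P(B) must be positive to condition on B"))
    return likelihood * prior / evidence
end

# Glasses example with exact fractions:
# P(glasses | good grades) = 8//10, P(good grades) = 1//10, P(glasses) = 4//10.
println(bayes(8//10, 1//10, 4//10))   # 1//5, i.e. 0.2

# Card example: P(queen | red) from P(red | queen), P(queen), P(red).
println(bayes(1//2, 1//13, 1//2))     # 1//13
```

Both results match what we counted by hand: 80 outta 400, and 2 queens outta 26 reds.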

But here's the thing, Sensei ~ knowing the formula isn't even the point. The POINT is what each piece of it is actually doing! 'Cause this formula? It's a little machine, kinda like a gachapon. You put your old beliefs in one side, you crank the handle, and your updated beliefs pop out the other side. So let's name the parts of the machine. (Don't worry, the names are WAY scarier than the things, I promise!)

Read it like this:

$$\underbrace{P(A\mid B)}_{\text{posterior}} \;=\; \frac{\overbrace{P(B\mid A)}^{\text{likelihood}}\cdot \overbrace{P(A)}^{\text{prior}}}{\underbrace{P(B)}_{\text{evidence}}}$$

Let's say A is "what I wanna know" (the hypothesis) and B is "what I just saw" (the evidence). Then:

  • Prior, P(A). What you believed about A before you saw any evidence. Your starting bet! Just vibes? Sure, sometimes! Educated guess? Better! Hard data from previous experience? Best! But you always got one, even if it's just "I dunno, 50/50." It's your before-picture!
  • Likelihood, P(B | A). If A were true, how likely is it you'd see the evidence B? This one's a forward question, easy to reason about. "If it's raining, how likely is it the sidewalk is wet?" Super likely! Now, notice this is NOT the question we usually wanna answer ~ we usually want the other direction ~ but it's the one the world hands us for free.
  • Evidence, P(B). How likely is the evidence in general, across all possibilities? This is the "size of the room you're zooming into." Remember ~ 80 smart kids in glasses outta 400 glasses-wearers total. Those 400 glasses-wearers outta 1000 students give you P(B) = 0.4. Big rooms make conditionals shrink, little rooms make 'em grow!
  • Posterior, P(A | B). What you NOW believe about A, after accounting for B. Your after-picture. The thing you actually wanted the whole time!

So the formula, in plain English, says:

"My new belief = my old belief × how well the evidence fits my belief, divided by how common the evidence is in general."

Sit with that for a sec, Sensei. It's kinda beautiful, right? Look at how each piece does its job:

  • If the evidence fits your hypothesis way better than it fits other explanations, the likelihood P(B | A) is bigger than the evidence P(B), so your prior gets multiplied by something bigger than 1, and your posterior goes UP. Belief strengthens! "Huh, this lines up. I'm more sure now."
  • If the evidence is super common no matter what (big P(B)!), then it's not really evidence at all, is it? Sidewalk's wet ~ could be anything ~ doesn't tell you much about rain specifically! The big denominator drags your posterior right back down. The math is gently reminding you, "don't get excited, this happens all the time."
  • And your prior is the anchor. If you started out thinkin' A was super rare (tiny P(A)), then even decent evidence might only nudge you a little. To overturn a strong prior, you need really damning, specific, hypothesis-fitting evidence! That's why one weird symptom doesn't mean you have a rare disease. And that's why one weird coincidence isn't proof of a conspiracy! Bayes is keeping you honest, Sensei!
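One way to see all three bullets at once, Sensei: divide both sides of Bayes' theorem by the prior (assuming P(A) isn't zero). It's the same formula, just rearranged:

$$\frac{P(A\mid B)}{P(A)} \;=\; \frac{P(B\mid A)}{P(B)}$$

So the ratio on the right is the "how much should I move?" factor. Bigger than 1, belief goes up; smaller than 1, belief goes down; exactly 1, the evidence told you nothing. With the made-up glasses numbers, that's 0.8 / 0.4 = 2, so the prior of 0.1 doubles to a posterior of 0.2.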

And here's where I show you the toy ~ scroll down a lil' bit, Sensei. You see that square? That's the whole formula, as a picture!

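[Interactive figure: Bayes' theorem drawn as a unit square, with clickable equation terms, sliders, and an "Update Prior" button. Sample readout: Evidence P(B) = 0.30, Posterior P(A|B) = 0.83. Equation shown: P(A|B) = ( P(B|A) × P(A) ) / P(B)]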

Okay! Rules of the square are: the whole square is the entire universe of possibilities, every single thing that could possibly happen. P(everything) = 1. Total area = 1. (Remember Axiom 2 from last time? Normalization, baby! It's BAAACK!)

Now I slice the square vertically. The left slice ~ the blue side ~ is "A is true" (let's say, "this person actually has the rare condition I'm scanning for"). That slice's width is P(A), the prior. Skinny slice = rare. Thick slice = common!

Inside that left slice, I shade in a portion ~ the green part where the evidence B also happens. That fraction of the slice is P(B | A), the likelihood. "Given the condition's there, how often does the test ping?"

Inside the right slice ~ the blank side ~ ("A is false"), I shade in a different portion ~ the yellow false-positive area where B happens outside of A. That's P(B | not A). "Given the condition isn't there, how often does the test ping anyway?" (Tests aren't perfect, Sensei! They never are!)

Now ~ the magic! Click on any term in the equation above the square, and the matching region of the square lights right up! Click P(A) ~ the left, blue slice glows relative to the whole square. Click P(B | A) ~ only the green part (the part where you have the thing AND the test pings) glows relative to the blue slice. Click P(B) ~ BOTH shaded parts (the green and the yellow) glow together relative to the whole square, 'cause "the test pinged" includes true pings AND false pings. And click P(A | B) ~ the posterior ~ and you'll see the killer: it's the ratio of the green true-ping area to the total shaded area, green plus yellow! "Out of all the times the test goes off, what fraction were real?"
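In symbols, Sensei, the square is telling you how to build P(B) out of pieces you already know. The true-ping area plus the false-ping area is everything where the test pings, and the right slice has width 1 − P(A) because the two slices together fill the whole square (that's Axioms 2 and 3 doing their thing):

$$P(B) \;=\; P(B\mid A)\cdot P(A) \;+\; P(B\mid \text{not } A)\cdot \bigl(1 - P(A)\bigr)$$

Plug that into the bottom of Bayes' theorem and the posterior becomes "true-ping area over all ping area":

$$P(A\mid B) \;=\; \frac{P(B\mid A)\cdot P(A)}{P(B\mid A)\cdot P(A) \;+\; P(B\mid \text{not } A)\cdot \bigl(1 - P(A)\bigr)}$$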

And now grab the sliders, Sensei! Crank P(A) waaay down (make the thing super rare). Watch what happens to the posterior, even with a 99%-accurate test! The false-positive area absolutely dwarfs the true-positive area, 'cause there are SO many more "not-A" people, and even a small percent of a huge group is bigger than a big percent of a tiny group. The square is showing you, visually, why a positive test for a rare condition often still means "you're probably fine." Most of the lit-up area is false alarms!

This is the classic Bayesian gotcha, Sensei! Imagine a test for some rare condition that's 99% accurate both ways ~ 99% chance it pings if you have it, 99% chance it stays quiet if you don't. Sounds airtight, right? But say only 1 in 1000 people actually has it. Out of 1000 people, 1 has it and probably pings (1 true positive). The other 999 don't have it, and 1% of them ping anyway (about 10 false positives). So if YOUR test pings ~ you're 1 outta 11 pinged people! P(have it | test pings) ≈ 1/11 ≈ 9%. NOT 99%! Same mistake Reisa fell for, just dressed in a lab coat! P(ping | have it) is huge. P(have it | ping) is small. Different rooms!
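Here's that same arithmetic as a short Julia sketch, Sensei, so you can swap in your own numbers. The function name `posterior` and its argument names are invented for this example; the three inputs are the hypothetical ones from the paragraph above (1 in 1000, 99% accurate both ways).

```julia
# Posterior P(A | B) from a prior and the two ping rates, using
# P(B) = P(B | A) * P(A) + P(B | not A) * (1 - P(A)).
function posterior(prior, p_ping_if_present, p_ping_if_absent)
    evidence = p_ping_if_present * prior + p_ping_if_absent * (1 - prior)
    return p_ping_if_present * prior / evidence
end

prior       = 1 / 1000   # 1 in 1000 people has the condition
sensitivity = 0.99       # P(ping | have it)
false_alarm = 0.01       # P(ping | don't have it) = 1 - 0.99

p = posterior(prior, sensitivity, false_alarm)
println(round(p; digits = 3))   # 0.09
```

The exact value is 99/1098, a hair over 9%, which matches the "1 outta 11" head count.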

Drag the sliders around and feel it out! Make the condition more common, watch the posterior climb! Make the test more accurate, watch the false-positive sliver shrink! Bayes isn't a formula you memorize, Sensei ~ it's this shape, this proportion-of-shaded-rectangles thing, and the formula is just the bookkeeping for what your eyes can already see!

And see that "Update Prior" button? Hit it! Smack it! What it does is take your current posterior and copy it right into the prior slot. 'Cause here's the thing, Sensei ~ HERE's the thing that makes Bayesian thinking different from every other way of reasoning out there. You don't just do this once! Your after-picture becomes your before-picture for the next round.

Got a positive test? Cool, your belief jumped from 0.1% to 9%. Now you take a second, independent test. Plug that 9% back in as your new prior. If the second one also pings, your posterior jumps WAY higher this time, 'cause you weren't starting from "rare" anymore ~ you were starting from "kinda suspicious." Stack enough evidence and you can drag a belief from "no way" all the way to "almost certain," one update at a time. That's not a trick! That's just running the same little machine over and over, feeding it what you learn!
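Wanna watch the machine crank? Here's a minimal Julia sketch of it, Sensei. The helper names `update` and `run_tests` are ones I made up for this example (they're not from the app), and I'm assuming every test is independent and 99% accurate both ways, same as the example above:

```julia
# Hypothetical helper: one Bayes update after a single positive test.
# sens = P(ping | have it), fpr = P(ping | don't have it).
function update(prior, sens, fpr)
    true_ping  = sens * prior          # P(ping AND have it)
    false_ping = fpr * (1 - prior)     # P(ping AND don't have it)
    return true_ping / (true_ping + false_ping)
end

# Feed each posterior back in as the next prior, n positive tests in a row.
function run_tests(prior, n; sens = 0.99, fpr = 0.01)
    history = [prior]
    for _ in 1:n
        push!(history, update(history[end], sens, fpr))
    end
    return history
end

history = run_tests(0.001, 3)          # start from "1 in 1000"
println(round.(history, digits = 4))
```

Under those assumptions the belief goes from 0.1% to about 9%, then about 91%, then about 99.9%. Same little function every time, just a different starting point.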

And THAT, Sensei, is the real reason this whole subject even exists. Probability isn't just for describin' dice rolls and card flips ~ it's a way of moving, y'know? A way of going from "pfft, I dunno" to "okay okay, I'm pretty sure" in steps that don't cheat. Every observation tugs your belief by exactly the amount it deserves to tug it ~ no more, no less ~ scaled by how rare the evidence is and how well it fits. You can't overreact (the denominator won't let ya!). You can't underreact (the likelihood won't let ya!). It's like, the most honest update rule we've got. Nihaha!

And it all fell out of a deck of cards and a girl in fake glasses crying about five yen! Yahahaha! See, Sensei? I told you! This stuff is WAY more interesting than the boring junk the upperclassmen make me do. Way, way more!

...You're taking notes again. Sensei. SENSEI. Stop. People are gonna think I'm good at this!

Oh ~ and if you see Yuuka around campus? You never saw the glasses. You don't know anything about a bet. You especially don't know about the three-thousand yen. We cool? We cool. Nihaha!

LESSON 1.3. Bayesian Statistics

Sensei! Sensei sensei sensei! Over here over here! Yes! I knew you'd come back! After last lesson I've been bouncing off the walls thinkin' about this stuff ~ and also bouncing off the walls 'cause they finally let me out of the Self-Reflection Room and I have approximately seventeen hours of pent-up energy! Nihaha!

Okay okay okay, sit. SIT. I have a thing for you. Look. Look at this. (≖ˋ‿ˊ≖)

See this little deck? I made it myself. Don't ask with what scissors. Two kinds of cards in here ~ blue ones and gold ones. I'm not gonna tell you the mix. Could be half-and-half. Could be mostly blue. Could be mostly gold. I shuffled it and even I don't remember exactly what's in there anymore, that's how fair I am about this. Probably.

Here's the game, Sensei. The whole game! Super simple:

  • You make a guess about the gold percentage of the deck ~ what fraction of cards in there are gold.
  • You place a bet on how sure you are. Small bet = "ehh, I'm just kinda guessing." Big bet = "I'd stake my allowance on it."
  • Then I deal you ten cards off the top. Face up. You count the gold.
  • Now you get to revise! New guess, new bet, based on what you just saw.
  • Then ten more cards! Then revise again! Repeat until you're feeling brave.
  • When you're really confident, you slap down the $100 bet ~ that locks in your final answer and we reveal the truth. Get it right and you walk away rich. Get it wrong... well. We don't talk about that. Nihaha!

No math required. No formulas. Just look at the cards, trust your gut, update your gut, repeat. Go on, Sensei ~ play a couple rounds. I'll wait. (Maybe play five or six rounds! Try lowballing the first bet on purpose just to see what happens. Try going aggressive. Mess around!)

Done? Did you win? Did you lose? Did you feel that little adrenaline kick when the cards came up and you went, "ooh, more gold than I thought!" and bumped your guess? That's the feeling, Sensei. That is the whole subject. Hold onto it.

Because here's my big confession: I tricked you. (≖ˋ‿ˊ≖)

You just did Bayesian statistics. Like, the real thing. The thing people get PhDs in. You did it with your gut, with no formulas, just by playing a stupid card game I made in the Self-Reflection Room out of boredom. And the whole point of today's lesson is to peek behind the curtain and see exactly what you were doing ~ 'cause there was a tiny little Bayesian model running right alongside you the whole game, and you can pull up the reveal table in the results to see how it played against you, round by round.

Look at it. The model has a "best guess" column (it calls this θ*, fancy-pants notation for "the value I think is most likely") and a "confidence" column (how much of its belief is stacked on that best guess). Sound familiar? Yeah. That's your guess and your bet. You were running the same machine in your head the whole time, just without notation. Let me show you how.

Okay! Cast your mind back to last lesson. We had Bayes' theorem ~

$$\,P(A\mid B) \;=\; \frac{P(B\mid A)\cdot P(A)}{P(B)}\,$$

~ and we used it on ONE hypothesis at a time. "Do I have the rare disease, yes or no?" A and not-A. Two rooms. Two possibilities. Update once, get a posterior, done.

But the card game isn't like that, is it? It's not "is the deck gold-heavy, yes or no?" There are a bunch of possible answers ~ in the game I gave you nine of 'em: the gold percentage could be 10%, 20%, 30%, ..., all the way to 90%. Nine little rooms instead of two. Nine little competing stories about what the deck is.

And here's the beautiful, beautiful thing, Sensei: Bayes' theorem doesn't care. It works exactly the same. You just apply it to each hypothesis. For every possible θ ("theta," the Greek letter we use for "the true gold fraction"):

$$\,P(\theta \mid \text{cards}) \;=\; \frac{P(\text{cards} \mid \theta)\cdot P(\theta)}{P(\text{cards})}\,$$

Same formula! Same four pieces! Prior, likelihood, evidence, posterior. The only thing that changed is we're running the gachapon nine times in parallel ~ one for each hypothesis ~ and instead of getting back a single number, we get back a whole little distribution: a posterior probability for each θ. Some are big (the model thinks those are likely), some are tiny (the model thinks those are silly). That's the bar chart in the reveal panel!

Let me walk you through the pieces, 'cause once you see them you'll see what you were doing:

  • Prior, P(θ). Before you saw a single card, what did you think the gold fraction was? That's your starting guess. But it's not just one number ~ it's how you'd spread your belief across all nine possibilities. And here's where your bet sneaks in! When you guessed "30%" with a $10 bet, you were saying "I lean 30%, but eh, could really be anywhere from 10% to 70%, who's to say." That's a wide, spread-out prior. When you guessed "30%" with a $90 bet, you were saying "no, really, it's 30%, I'm SURE." That's a narrow, pointy prior. The game literally translates your bet into a sharpness for the bell-shape ~ small bet = squishy bell, big bet = needle. Look at the "Initial Bayesian Prior" panel at the end ~ that's the model interpreting your guess+bet as a probability distribution!
  • Likelihood, P(cards | θ). Now the cards come down. Say you drew 7 gold out of 10. The likelihood asks, for each θ: "if the truth really were θ, how often would I see 7-out-of-10 like this?" Under θ=0.7? This is exactly what you'd expect, and none of the other eight hypotheses predicts it better! Likelihood is BIG (roughly 27%, if each card counts as an independent draw). Under θ=0.1? Wildly improbable, you'd basically need a miracle. Likelihood is TINY. Under θ=0.3? Possible but unusual. Likelihood is small-ish. The likelihood is a "fit score" ~ each hypothesis gets graded on how well it predicted what you actually saw.
  • Evidence, P(cards). Same job as last lesson! It's "how likely was this outcome overall, across all the hypotheses, weighted by how much I believed in each one." It's the total. The grand denominator. In Reisa's glasses example it was "the size of the glasses-wearer room." Here it's "the total weight of everyone's bid for explaining this." We use it to renormalize ~ to make sure after all the multiplying, our posterior probabilities still add up to 1 (Axiom 2! Normalization! Yahaha!). That's all it does. It doesn't pick favorites. It just makes sure the books balance.
  • Posterior, P(θ | cards). Multiply prior by likelihood for each θ, divide everyone by the total, and BAM ~ you've got an updated distribution. Hypotheses that fit the data well grew. Hypotheses that didn't, shrank. The new "best guess" is wherever the biggest bar is. The new "confidence" is how tall that biggest bar is compared to the others.
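Those four pieces fit in a few lines of Julia, Sensei. This is a sketch and not the game's actual code: I'm assuming each card is an independent draw (so the gold count is binomial), I'm using a flat prior just to keep it short, and the names `likelihood` and `bayes_update` are made up for the example:

```julia
# Assumption: gold count in a round is binomial (independent draws).
thetas = (1:9) ./ 10                        # the nine hypotheses: 0.1, 0.2, ..., 0.9
prior  = fill(1 / length(thetas), length(thetas))   # flat prior, for illustration only

# Likelihood P(cards | θ) for k gold cards out of n
likelihood(theta, k, n) = binomial(n, k) * theta^k * (1 - theta)^(n - k)

function bayes_update(prior, thetas, k, n)
    unnormalized = [likelihood(t, k, n) * p for (t, p) in zip(thetas, prior)]
    evidence = sum(unnormalized)            # P(cards), the grand denominator
    return unnormalized ./ evidence         # posterior, sums to 1
end

posterior = bayes_update(prior, thetas, 7, 10)      # saw 7 gold out of 10
best = thetas[argmax(posterior)]                    # posterior mode, the "best guess"

for (t, p) in zip(thetas, posterior)
    println("θ = ", t, "  posterior ≈ ", round(p, digits = 3))
end
println("best guess: ", best, "  confidence: ", round(maximum(posterior), digits = 3))
```

Swap the flat `prior` for something pointier and you've got the "big bet" version: same update, more stubborn starting bars.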

See it, Sensei? Your guess each round = the model's posterior mode. Your bet each round = the model's posterior confidence. You were doing it in your head! Of course you were fuzzier and the model can crunch nine bars at once and you can't, but the structure was identical. You picked a starting belief, you saw evidence, you updated. The model just did the bookkeeping more neatly.

And remember the "Update Prior" button under the square from last lesson, the one that copies your posterior into the prior slot? You hit that button after every round of the card game without even thinking about it! Round 1's posterior became Round 2's prior, which got hit with new cards, which made a new posterior, which became Round 3's prior, and on and on. That's what learning is, in Bayes-land. Same little machine, cranking, cranking, cranking. The cards just keep coming.
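One neat thing falls out of that chain, and you can check it yourself. In this sketch (binomial likelihood assumed, flat starting prior, and two made-up rounds of 7 gold then 4 gold), updating round by round lands on the same posterior as swallowing all 20 cards at once:

```julia
# Assumption: binomial likelihood for k gold cards out of n, nine hypotheses.
thetas = (1:9) ./ 10

function bayes_update(prior, k, n)
    unnormalized = [binomial(n, k) * t^k * (1 - t)^(n - k) * p
                    for (t, p) in zip(thetas, prior)]
    return unnormalized ./ sum(unnormalized)
end

flat = fill(1 / 9, 9)

# Two rounds, one after the other: round 1's posterior is round 2's prior
after_round1 = bayes_update(flat, 7, 10)
after_round2 = bayes_update(after_round1, 4, 10)

# The same 20 cards (11 gold), taken in one gulp
all_at_once = bayes_update(flat, 11, 20)

# Largest difference between the two answers: floating-point noise only
println(maximum(abs.(after_round2 .- all_at_once)))
```

The order and the batching don't matter here, only the total evidence does. The machine doesn't care how you slice the cards.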

Quick sanity check that this matches your feeling: did you ever draw a wild round (like 9 gold out of 10) and feel your guess SLAM toward the high end? That's high likelihood for high-θ hypotheses crushing all the low-θ ones in one update. Did you ever start super confident and then get a weird round and feel reluctant to budge? That's a sharp prior resisting moderate evidence ~ you'd need more data to overturn yourself. Did the rounds where you saw 5-out-of-10 feel kinda uninformative, like you didn't really learn much? That's because 5/10 is consistent with tons of θ values (40%, 50%, 60%, they all predict that pretty well!) ~ the likelihood didn't pick a clear winner, so the posterior barely moved. All of that is Bayes' theorem doing its job inside your head! You felt it. The math just names what you felt.

Okay. Cool. So we leveled up from "one hypothesis vs its negation" to "a bunch of hypotheses competing." That's already a huge jump. But Sensei... why nine? Why those particular nine? Why not eight? Why not eighty?

Like ~ I picked {10%, 20%, ..., 90%} because they're round and the game is more fun with discrete options. But the TRUE gold fraction in a real deck (or in any real-world thing!) doesn't have to be one of nine pretty numbers. It could be 73.4%. It could be 0.05%. It could be some weird messy decimal that goes on forever. Real parameters in the wild don't politely round themselves for our convenience! Yahaha! How rude of them!

So let's just... not restrict it. Let every value between 0 and 1 be on the table. Every. Possible. Value. Infinity of them.

"But Koyuki, that's INFINITY hypotheses, how can I spread a finite amount of belief across infinitely many things, that's impossible, my brain is melting, please stop ~"

Nihaha, easy easy easy. We just trade the bar chart for a curve. Instead of nine bars with heights that add to 1, we use a smooth squiggly line where the area underneath adds to 1. Tall parts of the curve = values you think are likely. Low parts = values you think are silly. Want to know how much you believe θ is between 0.3 and 0.5? Shade in that chunk under the curve and check its area! That's it! That's the whole trick! It's just bars with infinity-many infinitely-thin columns, and the area takes the place of the height. Mathematicians call this a probability density and they wave their hands around like it's a big deal but it's literally just "bars but smooth." Don't let 'em intimidate you. Lame!
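If you want it in symbols: call the curve f(θ) (that letter is just my pick for this example). "Area adds to 1" and "shade the chunk between 0.3 and 0.5" look like this:

$$\,\int_0^1 f(\theta)\,d\theta \;=\; 1, \qquad P(0.3 \le \theta \le 0.5) \;=\; \int_{0.3}^{0.5} f(\theta)\,d\theta\,$$

The squiggly S is just "add up the infinitely-thin columns." That's all an integral is here.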

And once we've made that little upgrade ~ everything else from the card game still works. Prior is a curve now. Posterior is a curve now. Likelihood evaluated at each θ is a curve now. We multiply, we renormalize, we get an updated curve. Same machine! Just smoother!
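Written out, the curve version of Bayes' theorem looks like this. I'm writing f(θ) for the prior curve and f(θ | data) for the posterior curve (my notation for this example), and the sum in the denominator has turned into an area:

$$\,f(\theta \mid \text{data}) \;=\; \frac{P(\text{data} \mid \theta)\cdot f(\theta)}{\int_0^1 P(\text{data} \mid \theta')\cdot f(\theta')\,d\theta'}\,$$

Top: likelihood times prior, at each θ. Bottom: the same thing totaled over every θ, so the new curve's area comes out to 1 again.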

Here, look at this one, Sensei. It's the card game's cousin ~ except this time we're flipping a coin one toss at a time instead of dealing ten cards in a batch, and we use curves instead of nine little bars:

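(Interactive coin-flip app: a true-probability slider, shown at 0.8, the two prior sliders α and β, and live readouts for tosses so far (heads and tails), posterior mean and sd, 95% CrI, P(p > 0.5 | tosses), and P(p ≤ 0.5 | tosses).)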

Okay! Same idea! There's a hidden true probability of heads (the slider at the top ~ in the card game I hid this from you, but here you get to play god and set it yourself and watch what happens). And there's a prior ~ those two sliders, α and β, shape the starting curve for what you believe about p before you've seen any flips. (Why two sliders and not one? 'Cause to draw a curve you need to set its shape, not just a center. α turns up belief in heads-heavy worlds, β turns up belief in tails-heavy worlds. Crank both up and they fight ~ the curve gets sharper around the middle. Both at 1 = totally flat = "I have absolutely no idea, every p between 0 and 1 looks equally plausible to me." Don't sweat the exact mechanic ~ just play with it and watch the curve change!)

Then you toss. Each heads tugs the curve a smidge to the right ("more belief in heads-heavy worlds!"). Each tails tugs it left. Lots of flips? The tugs pile up ~ the curve sharpens into a tight spike right around the truth. Few flips? Only a handful of lazy little tugs ~ the curve stays wide and squishy and uncertain.
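If you do want the exact mechanic: I'm assuming here (the app doesn't say so in writing!) that the two sliders describe a Beta(α, β) curve. If that's right, then after h heads and t tails the whole update is just counting:

$$\,\text{prior} \;\propto\; p^{\alpha-1}(1-p)^{\beta-1} \quad\longrightarrow\quad \text{posterior} \;\propto\; p^{\alpha+h-1}(1-p)^{\beta+t-1}\,$$

Heads get added to α, tails get added to β. And with α = β = 1 both exponents in the prior are zero, which is exactly the flat "no idea" line.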

Now here's the cool part ~ and this is what was hidden from you in the card game! Because we have a whole CURVE now, not just a "best guess" and a "bet," we get to ask way richer questions:

  • "What's my best single guess?" ~ the posterior mean. The center of mass of the curve. (App shows it!)
  • "How uncertain am I?" ~ the posterior standard deviation, and the 95% credible interval. That CrI is the killer feature, Sensei! It literally says: "given everything I've seen, there's a 95% chance the true p is somewhere inside this range." That is an HONEST uncertainty statement. Not "if I repeated this experiment a thousand times..." ~ just, "right now, here, this is where I think p lives, with this much confidence." It's a real probability statement about the parameter! Hold onto that, I'm about to roast someone over it.
  • "Is the coin biased toward heads?" ~ that's just P(p > 0.5 | tosses). Take all the area under the curve to the right of 0.5. Done. That's a direct, interpretable answer to a yes/no question, with a probability attached to it. No tortured logic. No squinting at thresholds.
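All three of those readouts can be pulled off the curve with plain arithmetic. Here's a rough Julia sketch, assuming (my assumption, not something the app states) a Beta(α, β) prior, and chopping the curve into thin columns on a grid so we don't need any packages. The function name `coin_summary` is made up for the example:

```julia
# Sketch under an assumption: the prior is Beta(α, β), so after `heads` and
# `tails` the posterior is proportional to p^(α+heads-1) * (1-p)^(β+tails-1).
# Everything is approximated on a fine grid of thin columns.
function coin_summary(alpha, beta, heads, tails; n_grid = 10_000)
    dp   = 1 / n_grid
    grid = [(i - 0.5) * dp for i in 1:n_grid]          # column midpoints in (0, 1)
    a, b = alpha + heads, beta + tails
    # Work in logs so large counts do not underflow to zero
    logdens = [(a - 1) * log(p) + (b - 1) * log(1 - p) for p in grid]
    weights = exp.(logdens .- maximum(logdens))
    weights ./= sum(weights)                           # each column's share of the area

    mean_p = sum(grid .* weights)                      # center of mass
    sd_p   = sqrt(sum((grid .- mean_p) .^ 2 .* weights))
    cdf    = cumsum(weights)
    lo     = grid[findfirst(>=(0.025), cdf)]           # equal-tailed 95% interval
    hi     = grid[findfirst(>=(0.975), cdf)]
    p_gt_half = sum(weights[grid .> 0.5])              # area to the right of 0.5
    return (mean = mean_p, sd = sd_p, cri = (lo, hi), p_gt_half = p_gt_half)
end

s = coin_summary(1, 1, 8, 2)    # flat prior, then 8 heads and 2 tails
println(s)
```

With a flat prior and 8 heads out of 10, the mean lands near 0.75 and P(p > 0.5 | tosses) near 0.97. The interval here is the equal-tailed kind (2.5% of the area chopped off each end); there are other ways to pick a 95% chunk, and I don't know which one the app uses.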

Mess around, Sensei! Seriously, play! Try this stuff:

  • Set a strong prior that's WAY off from the truth. Crank α up to like 40 and leave β at 1 ~ now your prior screams "I'm SURE p is near 1!" Then set the true probability to 0.2. Start flipping. Watch the curve kicking and screaming as it drags itself toward the truth. It takes a lot of flips to budge, doesn't it? That's a stubborn prior fighting the evidence. (And that's also a sneak peek of something really important: strong priors need strong evidence to be overturned. The math doesn't let you flip your mind on a whim. It also doesn't let you cling to nonsense forever ~ given enough data, the truth wins out, as long as your prior didn't give it exactly zero chance. Eventually. Bayes is patient!)
  • Set the true probability close to 0.5. Then try 0.9. Notice anything? When the truth is close to 0.5, the data looks like noisy garbage ~ tosses come out roughly half-and-half but with random wiggle, and your posterior takes FOREVER to sharpen. When the truth is way out at 0.9, every toss is screaming "HEADS!" at you and the curve narrows down fast. Why? Because under p=0.9, the data 9-out-of-10-heads is super likely; under p=0.5 it's a roughly 1-in-100 fluke. The likelihood discriminates HARD between hypotheses when the truth is extreme, and discriminates WEAKLY when the truth is close to where alternative hypotheses look about as plausible. Evidence is only evidence when it can tell hypotheses apart! Burn that into your brain, Sensei. Whole arguments hinge on it.
  • Set both prior sliders to 1 (totally flat). Now you're letting the data do all the talking, no opinion injected. This is what people do when they want to be "objective" about a single unknown. (Spoiler: you're never truly opinion-less ~ even "flat" is an opinion! *audibly slaps chest* ~ but it's a way of being as humble as the math will let you be.)
  • Compare two scenarios with the same number of flips but different priors. Notice that with enough data, two reasonable priors converge to almost the same posterior! That's another deep, deep thing: when there's lots of evidence, your starting opinion stops mattering. When evidence is scarce, your prior does the heavy lifting. Bayes is honest about which regime you're in.
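That last experiment is easy to fake on paper too. In this sketch I'm assuming a Beta(α, β) prior again (so the posterior mean has the simple formula in the comment), a true probability of 0.2, and idealized data that comes out at exactly 20% heads. The two priors, flat and the stubborn α = 40, are just example picks:

```julia
# Assumption: Beta(α, β) prior, so the posterior mean is (α + h) / (α + β + h + t).
posterior_mean(alpha, beta, heads, tails) =
    (alpha + heads) / (alpha + beta + heads + tails)

# Compare a flat prior with a stubborn "p is near 1" prior after n tosses.
function mean_gap(n; truth = 0.2)
    heads = round(Int, truth * n)        # idealized data: exactly 20% heads
    tails = n - heads
    flat     = posterior_mean(1, 1, heads, tails)
    stubborn = posterior_mean(40, 1, heads, tails)
    return (flat = flat, stubborn = stubborn, gap = stubborn - flat)
end

for n in (10, 100, 1_000, 10_000)
    println("n = ", n, "  ", mean_gap(n))
end
```

At 10 tosses the two means are miles apart (roughly 0.25 vs 0.82). By 10,000 tosses they agree to within about 0.003. Scarce data: the prior drives. Lots of data: the prior is a passenger.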

And that, Sensei ~ that is Bayesian statistics. That's the whole field. I am not joking and I am not exaggerating. Everything else you'll ever see in this subject is some flavor of this exact same dance:

"Start with what you believe about an unknown thing. See some data. Update your belief in proportion to how well the data fits each possible value of the unknown thing. Repeat forever. Read the answer off the resulting distribution."

Fancier problems have more unknowns (whole vectors of θ's!), or messier data, or curves that are too gnarly to compute by hand (we'll get to that ~ there's this thing called MCMC that's basically rolling dice to draw the curve, and it's extremely my type of math, but that's for later). But the heart of it is what you just did with the card game and the coin flips. Pinky promise.

Okay! Soapbox time! Sensei, sit comfy, I'm about to be petty for like ten minutes. (≖ˋ‿ˊ≖)

There's another way of doing statistics that you'll run into out in the wild ~ the frequentist way. It's the one that's in every intro textbook, every science paper, every "study says" headline. And it doesn't do this dance. It does something WEIRDER. And if you've followed along this far, you already have the tools to see why it's kinda... not great? Lemme show you.

The big frequentist toy is the p-value. You've heard of these. "p < 0.05! Significant!" Everyone nods and feels smart. But here is what a p-value actually is, in plain Koyuki-words: "Pretend the boring 'nothing's happening' hypothesis (the 'null') is true. Given that pretend, how likely would I be to see data this extreme or more?" In symbols:

$$\,\text{p-value} \;=\; P(\text{data this extreme or more} \mid \text{null hypothesis})\,$$

Sensei. Sensei sensei sensei. Look at that. Look at which side of the bar the null is on. Look at which side the data is on. That's P(data | hypothesis). That's the same direction as the likelihood (strictly, a tail area added up from it, but the hypothesis is still the thing being assumed). That's the forward direction, the easy direction, the one the world hands you for free!
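Just to show how "forward" it is, here's a sketch of a one-sided p-value for a coin, with made-up numbers (9 heads in 10 tosses, null hypothesis "the coin is fair"). The function name `p_value_heads` is mine, for illustration:

```julia
# One-sided p-value for a coin: P(at least k heads in n tosses | null: p = p0).
# Hypothetical example numbers, not data from the app.
function p_value_heads(k, n; p0 = 0.5)
    return sum(binomial(n, j) * p0^j * (1 - p0)^(n - j) for j in k:n)
end

pv = p_value_heads(9, 10)
println("p-value for 9 heads out of 10: ", round(pv, digits = 4))
```

That comes out to 11/1024, about 0.011. Notice what went in: the null's p0, and nothing else. No prior anywhere, which means what comes out can't be a probability that the null is true.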

But what does anybody actually want to know when they run an experiment? They want "given the data, is my hypothesis true?" That's P(hypothesis | data). The other direction! The posterior!

And those two are NOT THE SAME NUMBER! Different rooms, different denominators! We literally watched Reisa lose three-thousand yen last lesson because she swapped one for the other! Sensei, an entire empirical science culture has been built on quietly assuming p-value = posterior, and it is the same flip Reisa fell for. (≖ˋ‿ˊ≖) Reisa would be furious if she found out. I'll let her know later. After she's had time to study.

And it gets worse! Even if frequentists wanted to compute the right thing, they refuse to use priors ~ so they can't! Bayes' theorem literally requires P(A) on the right-hand side, and they've thrown it out, so they can't go from P(data | hypothesis) to P(hypothesis | data). They're stuck with the forward direction and they pretend that's the answer.

Other things to grumble about while I'm here:

  • No native uncertainty estimates. Frequentists have "confidence intervals," which sound like credible intervals but are completely different. A 95% credible interval, like the one you've been staring at in the coin-flip app, means "there's a 95% chance the truth is in this range." Honest! Direct! What you want! A 95% frequentist confidence interval means... brace yourself, Sensei... "if I were to repeat this exact experiment a zillion times, 95% of the intervals I'd construct using this procedure would contain the true value." Which not only hurts your brain trying to make sense of it, but it says NOTHING about the interval you actually have in front of you! Practically every scientist on Earth misreads their own confidence intervals as credible intervals. They want the Bayesian thing! They just don't know they do!
  • Assumptions everywhere, hidden in the fine print. Every frequentist test has a million little "we assume the data is normal," "we assume independence," "we assume equal variances" and other fancy words baked into the choice of test, and they're rarely explained or checked. Pick a different test, get a different p-value off the SAME data. Spooky. In Bayes-land you write your model down explicitly, every assumption stated out loud, and if you don't like an assumption you change it and rerun. No hiding!
  • "I don't use priors." Yes you do! Don't lie!! Your choice of test is a prior. Your significance threshold is a prior. Your decision to even run the experiment in the first place is a prior! Everyone has priors, Sensei ~ the only difference is Bayesians WRITE THEM DOWN where you can argue with them, and frequentists pretend theirs don't exist while still using them. I know which one sounds more honest to me.

...Okay, deep breath. Sensei, you know who'd love frequentist statistics? Hm? Take a guess.

*checks behind back and leans in* Yuuka! It's Yuuka! Of course it's Yuuka! She'd ADORE it. Picture it: she balances the Seminar's budget down to the last yen with zero uncertainty intervals ~ the number IS the number, even though half her line items are estimates from a vendor who "swears the invoice is in the mail." No posterior over the true expenditure. No credible interval on the snack budget. Just one number, stamped, final, frowned at me. Frequentist to her core! Nihaha!

And don't even get me started on her priors over me. She insists she's "objective" and "lets the evidence speak for itself" but Sensei, that girl has a prior on Kurosaki Koyuki tighter than any prior I've ever drawn. Every harmless little thing I do gets interpreted as another data point under her null hypothesis of "Koyuki is up to something." P(slightly bent paperclip | Koyuki is up to something) = 1, apparently. She doesn't update, she just accumulates. If she'd just declare her prior I could at least argue with it, but nooo, she's a frequentist about me ~ she'll never admit she has one. Lame!

...Okay! Okay, lesson over! Big lesson! Let me wrap up before you glaze over: today you went from one little Bayes' theorem ~ updating belief about a single yes/no hypothesis ~ to updating belief about a bunch of competing hypotheses at once (the card game), to updating belief about a continuous unknown (the coin), which is the whole game of Bayesian statistics. Prior in, multiply by the likelihood of what you saw, renormalize, posterior out, use posterior as next prior, repeat 'til you've got enough data to make a confident bet. That's it. That's the whole thing. Anyone who tells you it's more mystical than that is selling you a textbook.

Next time we'll start poking at a computer to do this stuff for us, 'cause crunching posteriors by hand is fine for nine bars but becomes a real pain when you've got fifty parameters and a Tuesday deadline. Don't worry, it's gonna be fun, and we might even do some hacking. Nihehehe! The computer doesn't tattle to Yuuka. Probably.

...Sensei. Sensei, you're still taking notes. STOP. People are gonna think we're actually studying in here. We're SUPPOSED to be slacking off! Have some pride! Yahahaha!

. . .

Looks like Yuuka was listening... (Image taken from here.)

Lessons in this Module