The Monty Hall Paradox - SAS vs. Python
Recently, one of my sons came to me and asked about something called “The Monty Hall Paradox.” They had discussed it in school and he was having a hard time understanding it (as you often do with paradoxes).
For those of you who may not be familiar with the Monty Hall Paradox, it is named for the host of a popular TV game show called “Let’s Make a Deal.” On the show, a contestant would be selected and shown a valuable prize. Monty Hall would then explain that the prize was located behind one of three doors and ask the contestant to pick a door. Once a door was selected, Monty would tease the contestant with cash to get him/her to either abandon the game or switch to another door. Invariably, the contestant would stand firm, and Monty would proceed to show the contestant what was behind one of the other doors. Of course, it wouldn’t be any fun if the prize was behind the revealed door, so after showing the contestant an empty door Monty would ply them with even more cash, in the hopes that they would abandon the game or switch to the remaining door.
Almost without fail, the contestant would stand firm in their belief that their chosen door was the winner and would not switch to the other door.
So where’s the paradox?
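When left with two doors, most people assume that they've got a 50/50 chance at winning. However, the truth is that the contestant will double his/her chance of winning by switching to the other door: the first pick is right only one time in three, and because Monty never reveals the prize, switching wins in the remaining two cases out of three.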
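The 1/3 versus 2/3 split can be checked exactly by listing all nine equally likely (prize, choice) combinations. This is a minimal sketch for illustration, not part of the original program; the name `exact_odds` is made up for this example, and it assumes the host always opens an empty door that the contestant did not pick.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)

def exact_odds():
    """Return (stay, switch) win probabilities by enumerating every case."""
    stay_wins = 0
    switch_wins = 0
    cases = list(product(DOORS, DOORS))  # all (prize, choice) pairs
    for prize, choice in cases:
        # Host opens a door that is neither the pick nor the prize.
        reveal = next(d for d in DOORS if d != choice and d != prize)
        # Switching means taking the one door that is left.
        switched = next(d for d in DOORS if d != choice and d != reveal)
        stay_wins += (choice == prize)
        switch_wins += (switched == prize)
    total = len(cases)
    return Fraction(stay_wins, total), Fraction(switch_wins, total)

stay, switch = exact_odds()
print(f"stay: {stay}, switch: {switch}")  # stay: 1/3, switch: 2/3
```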
After explaining this to my son, it occurred to me that this would be an excellent exercise for coding in Python and in SAS to see how the two languages compare. Like many of you reading this blog, I’ve been programming in SAS for years, so the struggle for me was coding this in Python.
I kept it simple. I generated my data randomly and then applied simple logic to each row and compared the results. The only difference between the two is in how the languages approach it. Once we look at the two approaches, we can look at the answer.
First, let's look at SAS:
data choices (drop=max);
  do i = 1 to 10000;
    u = ranuni(1);
    u2 = ranuni(2);
    max = 3;
    prize = ceil(max*u);
    choice = ceil(max*u2);
    output;
  end;
run;
I started by generating two random numbers for each row in my data. The first random number will be used to randomize the prize door and the second will be used to randomize the choice that the contestant makes. The result is a dataset with 10,000 rows each with columns ‘prize’ and ‘choice’ to represent the doors. They will be random integers between 1 and 3. Our next task will be to determine which door will be revealed and determine a winner.
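For comparison, here is a rough Python sketch of the same data-generation step. It is an illustration under stated assumptions and not the code from the original post: the function name `make_choices`, its parameters, and the use of a list of dictionaries as the "dataset" are all choices made for this example. Python's generator is not SAS's RANUNI, so the individual values will differ from the SAS run even with the same seed.

```python
import math
import random

def make_choices(n=10000, seed=1, max_door=3):
    """Build n rows, each with a random prize door and a random chosen door."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rows = []
    for i in range(1, n + 1):
        # 1 - random() lies in (0, 1], so ceil() always yields 1..max_door
        u = 1.0 - rng.random()
        u2 = 1.0 - rng.random()
        rows.append({
            "i": i,
            "prize": math.ceil(max_door * u),
            "choice": math.ceil(max_door * u2),
        })
    return rows

choices = make_choices()
print(choices[:3])  # first three rows
```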
If our prize and choice are two different doors, then we must reveal the third door. If the prize and choice are the same, then we must choose a door to reveal. (Note: I realize that my logic in the reveal portion is somewhat flawed: when the prize and choice are the same door, the IF…ELSE IF chain always reveals the same door instead of picking one of the other two at random. Because the prize and choice are themselves random, this doesn’t introduce any bias into the win rates, and this way of coding it was much simpler.)
data results;
  set choices;
  by i;
  if prize in (1,2) and choice in (1,2) then reveal = 3;
  else if prize in (1,3) and choice in (1,3) then reveal = 2;
  else if prize in (2,3) and choice in (2,3) then reveal = 1;
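The same reveal rule can be sketched in Python. The function name `reveal_door` is hypothetical; the branches mirror the IF…ELSE IF chain above, and the loop at the end checks every combination to confirm that the revealed door is never the prize door and never the contestant's door.

```python
def reveal_door(prize, choice):
    """Return a door that is neither the prize nor the contestant's pick."""
    if prize in (1, 2) and choice in (1, 2):
        return 3
    elif prize in (1, 3) and choice in (1, 3):
        return 2
    else:  # only (2, 3) and (3, 2) reach this branch
        return 1

# Check all nine combinations of prize and choice.
for prize in (1, 2, 3):
    for choice in (1, 2, 3):
        reveal = reveal_door(prize, choice)
        assert reveal != prize and reveal != choice
```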
Once we reveal a door, we must give the contestant the option to switch. The switch column holds the door of a contestant who always switches; neverswitch holds the door of a contestant who never does.
  if reveal in (1,3) and choice in (1,3) then do;
    switch = 2;
    neverswitch = choice;
  end;
  else if reveal in (2,3) and choice in (2,3) then do;
    switch = 1;
    neverswitch = choice;
  end;
  else if reveal in (1,2) and choice in (1,2) then do;
    switch = 3;
    neverswitch = choice;
  end;
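In Python, the switching step can be written more compactly. This sketch swaps the IF…ELSE IF chain for a bit of arithmetic, so it is a different technique from the SAS code, and the name `switch_door` is made up for this example. It assumes the revealed door is never the chosen door, which the reveal logic guarantees.

```python
def switch_door(choice, reveal):
    """Return the one door that is neither picked nor revealed."""
    # Doors are numbered 1, 2, 3 and sum to 6, so the leftover door
    # is whatever remains after subtracting the other two.
    return 6 - choice - reveal

print(switch_door(choice=1, reveal=3))  # 2
```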
Now we create a column for the winner. 1=win 0=loss.
  switchwin = (switch=prize);
  neverswitchwin = (neverswitch=prize);
run;
Next, let’s start accumulating our results across all of our observations. We’ll take a running tally of how many times a contestant who switches wins, as well as for the contestant who never switches.
data cumstats;
  set results;
  format cumswitch cumnever comma8.;
  format pctswitch pctnever percent8.2;
  retain cumswitch cumnever;
  /* start both tallies at zero, then count every row, including the first */
  if _N_ = 1 then do;
    cumswitch = 0;
    cumnever = 0;
  end;
  cumswitch = cumswitch + switchwin;
  cumnever = cumnever + neverswitchwin;
  pctswitch = cumswitch/i;
  pctnever = cumnever/i;
run;
proc means data=results n mean std;
  var switchwin neverswitchwin;
run;
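A self-contained Python sketch of the whole experiment, including the running tally, might look like the following. It is an illustration and not the code from the original post: the function `simulate` and the variable names are assumptions, `random.randint` stands in for the CEIL/RANUNI approach, and the reveal step simply takes the first door that is neither the prize nor the pick.

```python
import random
from itertools import accumulate
from statistics import mean

def simulate(n=10000, seed=1):
    """Play n games; return win flags (1=win, 0=loss) for both strategies."""
    rng = random.Random(seed)
    switch_wins, never_wins = [], []
    for _ in range(n):
        prize = rng.randint(1, 3)
        choice = rng.randint(1, 3)
        # Host opens a door that is neither the prize nor the pick.
        reveal = next(d for d in (1, 2, 3) if d != prize and d != choice)
        switch = 6 - choice - reveal  # the remaining unopened door
        switch_wins.append(int(switch == prize))
        never_wins.append(int(choice == prize))
    return switch_wins, never_wins

switch_wins, never_wins = simulate()

# Running totals and running win percentages, one value per game played.
cum_switch = list(accumulate(switch_wins))
cum_never = list(accumulate(never_wins))
pct_switch = [c / i for i, c in enumerate(cum_switch, start=1)]
pct_never = [c / i for i, c in enumerate(cum_never, start=1)]

print(f"switch: {mean(switch_wins):.2%}  never switch: {mean(never_wins):.2%}")
```

With 10,000 games, the switching percentage should settle near two thirds and the never-switch percentage near one third, though the exact figures depend on the seed.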