Monte Carlo Club

From QiuLab
Revision as of 20:43, 25 February 2017 by imported>Weigang
Jump to navigation Jump to search

(This week) Feb 24, 2017 "Idiots" (Due 3/3/2017)

  • Source: Paul Nahin (2000), "Dueling Idiots". Problem #2: "When Idiots Duel"
  • Game: Idiots A and B decide to duel with a gun in the following way: They will insert a single bullet into the gun's cylinder. The cylinder has a total of 6 slots. Idiot A will spin the cylinder and shoot at B. If the gun doesn't fire, then A will give the gun to B, who will spin the cylinder and then shoot at A. This back-and-forth duel will continue until one fool shoots (and kills) the other.
  • Questions: (1) what is the probability that A will win (and B dies); (2) What is the average number of trigger pulls before someone dies?

(Last week) Feb 17, 2017 "Birthday" (Due 2/24/2017)

B-day.png
  • Problem: What is the probability NONE of the N people in a room sharing a birthday?
  1. Randomly select N individuals and record their B-days
  2. Count the B-days NOT shared by ANY two individuals
  3. Repeat (for each N) 1000 times, and obtain probability by averaging the previous counts (i.e., divided by 1000)
  4. Vary N from 10 to 100, increment by 10
  5. Plot probability of no-shared B-Day (Y-axis) versus N (x-axis), with either a stripchart or boxplot, or both
Submitted Codes
  • By weigang
days <- 1:365;

find.overlap <- function(x) { return(length(which(table(x)>1))) }

output <- sapply(1:100, function(x) { # num of people in the room
  ct.no.overlap <- 0;
  for (k in 1:100) {
    bdays <- sample(days, x, replace = T);
    ct <- find.overlap(bdays);
    if (!ct) { ct.no.overlap <- ct.no.overlap + 1}
  }
  return(ct.no.overlap);
})
 
plot(1:100, output/100, xlab="Group size", ylab = "Prob (no shared b-day)", las=1, main="B-day (sim=100 times)", type="l")
abline(h=0.5, lwd=2, col=2, lty=2)
abline(v=seq(0, 100, 5), col="gray")

(Previous week) Feb 10, 2017 "Dating" (Valentine's Day Special; Due 2/17/2017)

Dating.png
  • Source: Paul Nahin (2008), "Digital Dice". Problem #20: "An Optimal Stopping Problem"
  • Problem: What is the optimal time point when one should stop dating more people and settle on a mate choice (and live with the decision)
  • Your best strategy is to date an initial sample of N individuals, rejecting all, and marry the next one ranked higher than any of your N individuals. The question is what is the optimal number for N.
  1. The problem could be investigated by simulating a pool of 10 individuals, ranked from 1-10 (most desirable being 1) and then take a sample of N
  2. You may only date one individual at a time
  3. You cannot go back to reach previously rejected candidates
  4. Simulate N from 0 to 9 (zero means marrying the first date, a sample size of zero)
  5. For each N, obtain the probability of finding the perfect mate (i.e., ranked 1st) by running simulation 1000 times
  6. Plot barplot of probability versus sample size N.
  7. Expected answer: N=4
Submitted Codes
  • By Weigang
pick.candidate <- function(min, array) {
  for (i in 1:length(array)) {
    if (array[i] < min) {
      return(array[i])
    } else {next}
  }
  return(0) # No 1. has been sampled and rejected
}
candidates <- 1:10;
output <- sapply(0:9, function(x) {
  ct <- 0;
  for(k in 1:1000) {
    if (x==0) { # no sample, marry the 1st guy 
      sampled <- sample(candidates, 1);
      if (sampled == 1) {ct <- ct+1}
    } else {
      sampled <- sample(candidates, x);
      not.sampled <- candidates[-sampled];
      not.sampled <- sample(not.sampled);
      if (pick.candidate(min =  min(sampled), array = not.sampled) == 1) {ct <- ct+1}
    }
  }
  return(ct);
})
barplot(output/1e3, names.arg = 0:9, xlab = "number of sampled dates", las=1, main = "Optimal stopping for dating (N=10 candidates)", ylab = "Prob(marrying No.1)")

(Previous week) Feb 3, 2017 "US Presidents" (Due 2/10/2017)

Sim-presidents.png
  • Download : 1st column is the order, 2nd column is the name, the 3rd column is the year of inauguration; tab-separated
  • Your job is to create an R, Perl, or Python script called “us-presidents”, which will
  1. Read the table
  2. Store the original/correct order
  3. Shuffle/permute the rows and record the new order
  4. Count the number of matching orders
  5. Repeat Steps 3-4 for a 1000 times
  6. Plot histogram or barplot (better) to show distribution of matching counts
  7. Hint: For R, use the sample() function. For Perl, use the rand() function.
Submitted Codes
  • By Mei
pres.list <- lapply(1:1000, function(x) pres[sample(nrow(pres)),])
cts <- sapply(pres.list, function(x) {
  ct.match <- 0;
  for (i in 1:45){
    if (pres$order[i] == x[i,1]){
      #cat(as.character(x[i,2]), x[i,1], "\n")  #as.character to avoid the factor info
      ct.match <- ct.match + 1;
      #cat(ct.match)
    }
  }
  ct.match #return
})
barplot(table(cts),xlab = "Number of matches per shuffle", border = "hotpink", col = "pink", ylab = paste("Frequency total of:",length(pres.list)), main = "US Presidents")
  • By John
import pandas as pd
import numpy as np
from pandas import DataFrame
import matplotlib.pyplot as plt

df = pd.read_table("presidents.txt", names=["num", "name", "presidency"])

name_list = list(df.name)

# create a list to store matches after each shuffle
shuffle_record = []

# create a function in which the first argument is the original dataset; second argument is number of shuffles
def shuffler2(original, n):
    record = []
    for i in range(n):
        num = 0
        each_shuffle = {}
        temp = original.reindex(np.random.permutation(original.index)) # do shuffling for each
        compare = original.num == temp.num
        matched_df = original.ix[compare[compare == True].index]
        for i in matched_df.index:
            each_shuffle[i] = matched_df.name[i]
        shuffle_record.append(each_shuffle)
        try:
            num = compare.value_counts()[1]
        except:
            pass
        record.append(num)
    return(record)
result2 = shuffler2(df, 1000)

plt.hist(result2, color="yellow")
plt.title("Histgram for President Data")
plt.show()
  • By Weigang
p <- read.table("Presidents.txt", sep="\t", header=F)
colnames(p) <- c("order", "name", "inaug.year")
# Use "sapply" or "lapply" for loops: no need to pre-define a vector to store results
p.sim <- sapply(1:10000, function (x) {
  length(which(sample(p$order) == p$order))
  }) 
barplot(table(p.sim)/1e4, las=1)
p.exp <- rpois(1e4, 1) # draw 10000 Poisson random deviates
mp <- barplot(table(p.exp)/1e4, las=1, xlab = "Num of matching presidents") # mid-point on x-axis
lines(mp[1:7], table(p.sim)/1e4, type="b", col=2) # add a line (Poisson-expected) to the barplot (simulated)
legend("topright", c("Simulated", "Poisson expectation"), col=1:2, lty=1)

(Future weeks) March 3, 2017 (Due March 10, 2017) Pi & E=