Study some basic properties of elementary distributions as


Bitcoin

In this Lab, we are going to study some basic properties of elementary distributions. As a motivating example, we are going to examine a simplified version of the crypto currency Bitcoin.

README

For questions where you are asked to plot a histogram, unless otherwise specified, please submit 3 graphs corresponding to ρ = 1, 3, 6. Failure to include graphs will result in a lower grade.

Question 1. Recall that we assume SHA256 is indistinguishable from a random oracle. Under this assumption, let p be the probability that a randomly chosen nonce rj confirms that some block Bj comes after block Bj-1. Determine p in terms of ρ. Hint- recall that a when a random oracle (on {0, 1}256 is queried on a new input, its response is uniformly chosen from {0, 1}256 and becomes fixed for that input.

Question 2. Let X be a random variable that for a single trial takes on 1 if a coin is mined and 0 otherwise. Explain why a X is a Bernoulli random variable. Write a python function function Bernoulli mine simulates X in terms of ρ.

Question 3. Lets prove some basic properties of a Bernoulli random variable. Let X be a Bernoulli random variable with probability of success p. Prove that the distribution on X is a valid probability distribution. For this lab, when I say prove that a certain discrete RV has a valid distribution, you need only show

x∈Range(X) Pr[X = x] = 1

The remaining properties are either tedious or painfully obvious.

Question 4. Now, lets try actually mining Bitcoin. To accomplish this, we will be using SHA256 from python's hashlib library and the code below to create generate nonces. Write a function called hash mine that takes as input the old block, the proposed block and a nonce. The function should either return [False,"] which indicates the mine was unsuccessful, or [True, the nonce] indicating the a successful mine.

import hashlib , binascii
from numpy . random import randint
def hash mine ( b old , b new , rho ) :
pass
#generatearandomnonce l ength=8#d e f u l t
nonce= ' ' . join ( [ str ( randint ( 0 , 9 )) for i inrange ( l ength ) ] )

Question 5. For this problem, lets assume that block Bj-1 has value wubba lubba (note the space) and block Bj has value dub dub. Let X be a random variable that takes on value 1 if we successfully found a nonce, and 0 otherwise Write a function called bernoulli hist(n,ρ) that does the following:

- Simulates the PMF of X where the number of leading zeroes needed for a success is ρ.
- Plot a histogram of frequencies of success and failures. Scale the y-axis so that we see frequencies
like we did with the function dieRoll() from Lab 1.
- Under the assumption that each mine can be modeled by a Bernoulli random variable, plot the theoretical PMF on top of the histogram. Note that there are only two points to plot, so a good way to ensure both graphs are visible is to use a scatter plot for the theoretical PMF and use a different color.

To accomplish this, generate a random nonce and append it to Bj-1||Bj, hash it using SHA256 and count the number of leading zeroes. If the number of leading zeroes is greater than or equal to ρ, this is a success (1) and a failure otherwise (0). In order to get a good estimate for the PMF, repeat this process 100,000 times and plot the resulting histogram. Submit your code.

Question 6. Prove that X ∼ G(p) is a probability distribution. (Hint: use the Geometric Sum formula)

Question 7. Prove that the expected value of the geometric random variable is the expected value E[X] = 1/p. (Hint, try taking the derivative of ∑i=1(1 - p)i. If you are skeptical about differentiating an infinite sum, Google Uniform Convergence. Alternatively, you could use the law of total expectation )

Question 8. Prove that for any s, t ∈ N and XT ∼ G(p),

Pr[XT ≥ s + t|XT ≥ t] = Pr[XT ≥ s]

Or in English, if you observed t failures in a string of Bernoulli Trials, the probability of (for the same Geometric random variable) observing a success on the s + tth index for some s is the same as the probability of observing a success on the sth time. This is commonly called the Lack of Memory Property.

Question 9. For this problem, lets assume that block Bj-1 has value 'wubba lubba ' (note the space) and block Bj has value dub dub. Complete the python function geometrix hash mine that takes as input an integer ρ, the block B = Bj-1||Bj and finds a nonce r such that SHA256(Bj-1||Bj||r) has at least ρ leading zeros. This function should return [num trials, r] where num trials is the number of trials it took to successfully find a verifying nonce r.

Question 10. Let X be a random variable that denotes the first index at which a verifying nonce was found. In this question you will be estimating the PMF of X. Write a function called geometric hist(b old, b new, ρ) that takes as input the number of leading zeroes ρ and B = Bj-1||Bj (old block and new block)and plots the histogram of 100,000 calls to geometric hash mine. Note you only need the first element of the output for each call (IE don't plot the nonce, only the number of tries it took successfully find a verifying nonce). On top of the histogram, plot the actual PMF of X based on ρ. Explain why the Geometric distribution is a good model for this experiment.

Question 11. Write a function called binomial mine(ρ, n) that simulates n attempts at finding a verifying nonce r according to a binomial model where the probability of mining a coin on any attempt is depends on ρ (again, the number of leading zeroes). This function should output the number of successes. Use bernoulli mine(ρ) as subroutine in your function. Submit your code.

Question 12. Write a function binom hash mine(b_old, b_new, ρ, n) which takes parameters B = Bj-1, Bj, n, ρ, and returns the number of coins successfully mined in n calls to hash mine Submit your code.

Question 13. Write a function called binom hist(b_old, b_new, ρ, n)) which plots the result of 1,000 calls to binom hash mine. Scale the y-axis so we see probabilities like with the dieRoll() graph. Plot the theoretical PMF of the Binomial distribution with parameters n and p where p depends on ρ. Explain why the Binomial distribution is a good model for this experiment. Submit your code and the plots you create by calling

Question 14. Prove that for X ∼ B(n, p), then E[X] = np. (Hint, we are counting the number of success of a sequence of n Bernoulli trials. Use the linearity of expectation.)

Question 15. Verify that an exponential distribution is indeed a probability distribution.

Question 16. Let X ∼ Exp(λ). Compute E(X) (Hint: a direct method would be to use integration by parts)

Question 17. Explain conceptually why {Y < n} is the same event as {Xn > 1} Since we do not know the exact distribution of Y , there is nothing to compute. Simply give an explanation for why this is at least plausible.

Question 18. Lets return to the Binomial distribution. We can model a block being accepted as a Bernoulli trial where a success is a new block being accepted. Further, since there are roughly 2000000∗1000000000000 hashes a second, these are Bernoulli trials with n >> 0 and p << 0 (p is near 0, n is very large). Let X be the random variable counting the number of new blocks an hour, and let λ = E(X). Prove that

limn→∞ (n/k)(λ/n)k(1 - λ/n)n-k = e λk/k!

In particular, X ∼ Pois(λ) where the PMF is

fX (k) = eλk/k!

(Hint, limn→∞ (1 + x/n)n = ex)

Question 19. Now we will show that the number of Bitcoins mined in a particular period of time is well approximated by a Poisson distribution.

Get the file real bitcoin.csv. This file contains a list of 60 entries where the first column is the date and the second column is the number of Bitcions in circulation at the beginning of that day. Using python, open the csv file. For this question, you need to determine how "good of a fit" the Poisson distribution is to the process of mining Bitcoins.

The data here shows the total number of Bitcoins in circulation each day. Notice that this number is monotonically increasing. This is because new bitcoins were mined each day. Recall that each time a block is confirmed, new bitcoins are created (mined). For this dataset, each mined block creates exactly 12.5 bitcoins.

Use this dataset to determine how many blocks were confirmed each day.

Then, plot a (scaled) histogram showing the number of blocks confirmed per day. Use 6 buckets.

This histogram should be well approximated by a Poisson distribution with some parameter λ. If you estimate λ by the average number of blocks confirmed per day, what is λ? Plot the pmf with parameter λ on top of the histogram. Be sure to scale the time interval. Submit your code and plot.

Question 20. Finish the python program poiss hist. Note that it should be identical to Binomial hist, with the only difference being the theoretical PMF plot. For what values of rho ∈ {1, 3, 6, 10} does the Poisson adequately approximate the empirical results? Justify your answer. Submit all plots.

Question 21. Does your header have all the requested information include your header?

Attachment:- Distributions.zip

Request for Solution File

Ask an Expert for Answer!!
Computer Engineering: Study some basic properties of elementary distributions as
Reference No:- TGS02255582

Expected delivery within 24 Hours