
Introduction to Monte Carlo Method
Kadi Bouatouch
IRISA
Email: kadi@irisa.fr
Continuous Probability
• A continuous random variable x is a variable that randomly takes on a value from its domain
• The behavior of x is completely described by the distribution of values it takes on
Why Monte Carlo Integration?
• To generate realistic looking images, we need to solve integrals of dimension two or higher
– Pixel filtering and lens simulation both involve solving a 2-dimensional integral
– Combining pixel filtering and lens simulation requires solving a 4-dimensional integral
• Normal quadrature algorithms don't extend well beyond one dimension
Continuous Probability
• The distribution of values that x takes on is described by a probability distribution function p
– We say that x is distributed according to p, or x ~ p
– p(x) ≥ 0
– ∫_{−∞}^{+∞} p(x) dx = 1
• The probability that x takes on a value in the interval [a, b] is: P(x ∈ [a, b]) = ∫_a^b p(x) dx
Expected Value
• The expected value of x ~ p is defined as E(x) = ∫ x p(x) dx
– As a function of a random variable is itself a random variable, the expected value of f(x) is E(f(x)) = ∫ f(x) p(x) dx
• The expected value of a sum of random variables is the sum of the expected values: E(x + y) = E(x) + E(y)

Multi-Dimensional Random Variables
• For some space S, we can define a pdf p: S → ℝ
• If x is a random variable, and x ~ p, the probability that x takes on a value in S_i ⊂ S is: P(x ∈ S_i) = ∫_{S_i} p(x) dμ
• The expected value of a real-valued function f: S → ℝ extends naturally to the multidimensional case: E(f(x)) = ∫_S f(x) p(x) dμ
Monte Carlo Integration
• Suppose we have a function f(x) defined over the domain x ∈ [a, b]
• We would like to evaluate the integral I = ∫_a^b f(x) dx
• The Monte Carlo approach is to consider N samples, selected randomly with pdf p(x), to estimate the integral
• We get the following estimator: ⟨I⟩ = (1/N) ∑_{i=1}^N f(x_i)/p(x_i)
• Taking the expected value of the estimator we get:
E[⟨I⟩] = E[(1/N) ∑_{i=1}^N f(x_i)/p(x_i)]
= (1/N) ∑_{i=1}^N E[f(x_i)/p(x_i)]
= (1/N) · N ∫ (f(x)/p(x)) p(x) dx
= ∫ f(x) dx = I
• In other words: lim_{N→∞} (1/N) ∑_{i=1}^N f(x_i)/p(x_i) = I
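• A minimal Python sketch of this estimator (the function name and the test integrand x² are illustrative, not from the slides), using a uniform pdf p(x) = 1/(b − a):

    import random

    def mc_estimate(f, a, b, n):
        # Estimator: <I> = (1/N) * sum f(x_i) / p(x_i), with p(x) = 1/(b - a)
        total = 0.0
        for _ in range(n):
            x = random.uniform(a, b)          # x_i ~ p
            total += f(x) / (1.0 / (b - a))   # f(x_i) / p(x_i)
        return total / n

    # Example: estimate the integral of x^2 over [0, 1] (exact value: 1/3)
    print(mc_estimate(lambda x: x * x, 0.0, 1.0, 100_000))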
Variance
• The variance of the estimator is σ² = (1/N) ∫ (f(x)/p(x) − I)² p(x) dx
• As the error in the estimator is proportional to σ, the error is proportional to 1/√N
– So, to halve the error we need to use four times as many samples

Variance Reduction
• Increase the number of samples
• Choose p such that f/p has low variance (f and p should have similar shape)
– This is called importance sampling because if p is large where f is large, and small where f is small, there will be more samples in important regions
• Partition the domain of the integral into several smaller regions and evaluate the integral as the sum of integrals over the smaller regions
– This is called stratified sampling
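• As a sketch of importance sampling (a toy choice of mine, not from the slides): take f(x) = x² on [0, 1] and p(x) = 3x², which has exactly the shape of f, so f/p is constant and the estimator has zero variance:

    import random

    def importance_estimate(n):
        # p(x) = 3x^2 on [0, 1]; CDF P(x) = x^3, so x = xi^(1/3)
        total = 0.0
        for _ in range(n):
            xi = 1.0 - random.random()          # in (0, 1], avoids x = 0
            x = xi ** (1.0 / 3.0)               # inverse-CDF sample of p
            total += (x * x) / (3.0 * x * x)    # f(x_i)/p(x_i) = 1/3
        return total / n

    print(importance_estimate(1000))  # exactly 1/3 for any n, since f/p is constant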
Variance Reduction: Stratified sampling
• Region D is divided into n disjoint sub-regions D_i
• D_i = [x_i, x_{i+1}]: sub-region; P_i = ∫_{D_i} p(x) dx, with ∑_i P_i = 1
• I = ∫_0^1 f(x) dx = ∫_{x_0}^{x_1} (f(x)/p(x)) p(x) dx + ∫_{x_1}^{x_2} (f(x)/p(x)) p(x) dx + ... + ∫_{x_{n−1}}^{1} (f(x)/p(x)) p(x) dx
• [Figure: plot of f(x) over the domain [0, 1], partitioned at x_1, ..., x_{n−1}]

Variance Reduction: Stratified sampling
• N_i samples per sub-region i
• p(x)/P_i: pdf for sub-region i, so ∫_{D_i} (p(x)/P_i) dx = 1
• ∫_{D_i} (f(x)/p(x)) p(x) dx = P_i ∫_{D_i} (f(x)/p(x)) (p(x)/P_i) dx
• I = ∑_{i=0}^{n−1} P_i ∫_{D_i} (f(x)/p(x)) (p(x)/P_i) dx
• I ≈ ∑_{i=0}^{n−1} (P_i/N_i) ∑_{k_i=1}^{N_i} f(X_{k_i})/p(X_{k_i})
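• A short Python sketch of this estimator (names are illustrative), using equal-width sub-regions of [0, 1] and the uniform pdf p(x) = 1, so that P_i = 1/n:

    import random

    def stratified_estimate(f, n_strata, n_i):
        # I ~= sum_i (P_i / N_i) * sum_k f(X_ki) / p(X_ki), with p(x) = 1
        width = 1.0 / n_strata                      # P_i = width of D_i
        total = 0.0
        for i in range(n_strata):
            sub = 0.0
            for _ in range(n_i):
                x = (i + random.random()) * width   # sample inside D_i
                sub += f(x)                         # f(X_ki)/p(X_ki), p = 1
            total += width * sub / n_i
        return total

    # Example: integral of x^2 over [0, 1] (exact value: 1/3)
    print(stratified_estimate(lambda x: x * x, 16, 4))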
Variance Reduction: Stratified sampling
• 2D example: the domain is divided into an N × N grid of cells, with one random sample taken inside each cell → N² samples
• [Figure: N × N grid with one jittered sample point per cell]
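• A minimal sketch of this jittered N × N pattern in Python (function name is mine):

    import random

    def jittered_samples(n):
        # One uniform sample inside each cell of an n x n grid over the
        # unit square -> n^2 stratified samples in total
        return [((i + random.random()) / n, (j + random.random()) / n)
                for i in range(n) for j in range(n)]

    print(len(jittered_samples(4)))  # 16 samples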
Sampling Random Variables
• Given a pdf p(x), defined over the interval [x_min, x_max], we can sample a random variable x ~ p from a set of uniform random numbers ξ_i ∈ [0, 1)
• To do this, we need the cumulative probability distribution function: prob(α < x) = P(x) = ∫_{x_min}^x p(μ) dμ
– To get x_i we transform ξ_i: x_i = P⁻¹(ξ_i)
– P⁻¹ is guaranteed to exist for all valid pdfs

Example
• Sample the pdf p(x) = 3x²/2, x ∈ [−1, 1]:
– First we need to find P(x): P(x) = ∫ 3x²/2 dx = x³/2 + C
– Choosing C such that P(−1) = 0 and P(1) = 1, we get P(x) = (x³ + 1)/2 and P⁻¹(ξ) = ∛(2ξ − 1)
– Thus, given a random number ξ_i ∈ [0, 1), we can 'warp' it according to p(x) to get x_i: x_i = ∛(2ξ_i − 1)

Sampling 2D Random Variables
• If we have a 2D random variable α = (α_x, α_y) with pdf p(α_x, α_y), we need the two-dimensional cumulative pdf: prob(α_x < x & α_y < y) = P(x, y) = ∫_{y_min}^y ∫_{x_min}^x p(μ_x, μ_y) dμ_x dμ_y
– We can choose x_i using the marginal distribution p_G(x) and then choose y_i according to p(y|x), where p_G(x) = ∫_{y_min}^{y_max} p(x, y) dy and p(y|x) = p(x, y)/p_G(x)
– If p is separable, that is p(x, y) = q(x) r(y), the one-dimensional technique can be used on each dimension instead
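• A small Python sketch of the 1D example above (the helper name is mine): warping uniform numbers through P⁻¹ to sample p(x) = 3x²/2 on [−1, 1]:

    import math
    import random

    def sample_p():
        # CDF P(x) = (x^3 + 1)/2, so x = P^{-1}(xi) = cbrt(2*xi - 1).
        # copysign gives a real cube root (Python's ** rejects negative
        # bases with fractional exponents).
        u = 2.0 * random.random() - 1.0
        return math.copysign(abs(u) ** (1.0 / 3.0), u)

    # Samples cluster near -1 and +1, where p(x) = 3x^2/2 is largest
    print([round(sample_p(), 3) for _ in range(5)])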
2D example: Uniform Sampling of Triangles
• To uniformly sample a triangle, we use barycentric coordinates in a parametric space
• Let A, B, C be the 3 vertices of a triangle
• Then a point P on the triangle is expressed as:
– P = αA + βB + γC
– α + β + γ = 1
– α = 1 − γ − β
– P = (1 − γ − β)A + βB + γC
• [Figure: triangle A, B, C with interior point P; parametric space with β on the axis from (0,0) to (0,1), γ on the axis from (0,0) to (1,0), bounded by the line β + γ = 1]

2D example: Uniform Sampling of Triangles
• Integrating the constant 1 across the triangle gives ∫_{γ=0}^{1} ∫_{β=0}^{1−γ} dβ dγ = 0.5
– Thus our pdf is p(β, γ) = 2
• Since β depends on γ (or γ depends on β), we use the marginal density for γ, p_G(γ):
p_G(γ′) = ∫_0^{1−γ′} 2 dβ = 2 − 2γ′
2D example: Uniform Sampling of Triangles
• From p_G(γ′), we find p(β′|γ′) = p(γ′, β′)/p_G(γ′) = 2/(2 − 2γ′) = 1/(1 − γ′)
• To find γ′ we look at the cumulative pdf for γ:
ξ_1 = P_G(γ′) = ∫_0^{γ′} p_G(γ) dγ = ∫_0^{γ′} (2 − 2γ) dγ = 2γ′ − γ′²
• Solving for γ′ we get γ′ = 1 − √(1 − ξ_1)
• We then turn to β′:
ξ_2 = P(β′|γ′) = ∫_0^{β′} p(β|γ′) dβ = ∫_0^{β′} 1/(1 − γ′) dβ = β′/(1 − γ′)
2D example: Uniform Sampling of Triangles
• Solving for β′ we get: β′ = ξ_2 (1 − γ′) = ξ_2 (1 − (1 − √(1 − ξ_1))) = ξ_2 √(1 − ξ_1)
• Thus given a set of random numbers ξ_1 and ξ_2, we warp these to a set of barycentric coordinates sampling a triangle: (β, γ) = (ξ_2 √(1 − ξ_1), 1 − √(1 − ξ_1))
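• A direct Python transcription of this warp (function name is mine; the usage example uses 2D vertex tuples, but any dimension works):

    import math
    import random

    def sample_triangle(a, b, c):
        # (beta, gamma) = (xi2 * sqrt(1 - xi1), 1 - sqrt(1 - xi1))
        xi1, xi2 = random.random(), random.random()
        s = math.sqrt(1.0 - xi1)
        beta, gamma = xi2 * s, 1.0 - s
        alpha = 1.0 - beta - gamma
        # P = alpha*A + beta*B + gamma*C, applied per coordinate
        return tuple(alpha * ai + beta * bi + gamma * ci
                     for ai, bi, ci in zip(a, b, c))

    # Uniform point on the triangle (0,0), (1,0), (0,1)
    print(sample_triangle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))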
2D example: Uniform Sampling of Discs
• Disc
– Generate a random point on a disk of radius R with probability density: p(x) = 1/(π·R²)
– φ ∈ [0, 2π] and r ∈ [0, R]
– dμ = dS = r·dr·dφ
– 2D CDF: F(r, φ) = ∫_0^φ ∫_0^r (r′/(π·R²)) dr′ dφ′ = ∫_0^φ ∫_0^r (2r′/R²)(1/(2π)) dr′ dφ′
– Compute the marginal pdf and associated CDF
– ζ_1 and ζ_2 ∈ [0, 1] are uniform random numbers:
φ = 2πζ_1 and r = R·√ζ_2
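• A minimal Python sketch of this disc warp (function name is mine):

    import math
    import random

    def sample_disc(radius):
        # phi = 2*pi*zeta_1, r = R*sqrt(zeta_2); the sqrt compensates for
        # the factor r in dS = r dr dphi, giving uniform density 1/(pi R^2)
        phi = 2.0 * math.pi * random.random()
        r = radius * math.sqrt(random.random())
        return (r * math.cos(phi), r * math.sin(phi))

    print(sample_disc(1.0))  # uniform point on the unit disc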
2D example: Uniform Sampling of Spheres
• Sphere (directions over the hemisphere)
– Θ = (θ, φ) ⇒ dμ = dω_Θ = sin θ · dθ · dφ
– [Figure: hemisphere of directions around the surface normal, showing a direction (θ_in, φ_in) and its solid angle dω_Θin; θ ∈ [0, π/2], φ ∈ [0, 2π]]
– p(Θ) = cos θ / π ⇒ θ = acos(√ξ_1) and φ = 2πξ_2
– p(Θ) = ((n+1)/(2π)) cosⁿ θ ⇒ θ = acos(ξ_1^{1/(n+1)}) and φ = 2πξ_2
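• A sketch of the cosine-weighted case in Python (function name is mine; the returned direction is expressed in a frame where z is the surface normal):

    import math
    import random

    def sample_cosine_hemisphere():
        # p(Theta) = cos(theta)/pi: theta = acos(sqrt(xi_1)), phi = 2*pi*xi_2
        theta = math.acos(math.sqrt(random.random()))
        phi = 2.0 * math.pi * random.random()
        # Convert (theta, phi) to a unit direction vector
        sin_t = math.sin(theta)
        return (sin_t * math.cos(phi), sin_t * math.sin(phi), math.cos(theta))

    print(sample_cosine_hemisphere())  # unit vector in the upper hemisphere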
Summary
• Given a function f(μ), defined over an n-dimensional domain S, we can estimate the integral of f over S by a sum:
∫_S f(μ) dμ ≈ (1/N) ∑_{i=1}^N f(x_i)/p(x_i)
where x ~ p is a random variable and x_i are samples of x selected according to p.
• To reduce the variance and get faster convergence we:
– Use importance sampling: p should have a similar shape as f
– Use stratified sampling: subdivide S into smaller regions, evaluate the integral for each region and sum these together