Sobol (Quasi Random) sequence simplified
A number of major investment banks, hedge funds and Investment research firms use Sobol as their primary Random number generator for pricing OTC products especially for low dimension problems.
There are several technical papers already available online on Sobol sequence generation algorithm. Few such links have been provided towards the end of the article. This article however, does not intend to be too technical but rather aims at providing a simplified 'Dummies' guide to implementing Sobol sequence generation.
Before getting into the heart of the algorithm, let us briefly describe the following concepts.
Bitwise XOR operator
Represented by a sign. BitXor is used to perform a logical exclusion on two expressions. For example,
Primitive polynomials over GS(2)
[GS(2) simply means that the coeffecients ak(s) in the polynomial below can either be 0 or 1]
To generate sobol quasi random sequence, we'll have to source in primitive polynomials over GS(2) as input to the algorithm. Various researchers have already generated and published these polynomials and can be used directly (and we recommend) while implementing the algorithm. One such primitive polynomial table is published at A011260.
So what if we have these primitive polynomials already available with us, it is nevertheless good to understand what these are. At this stage you could read on for further information or skip this section if you are in a hurry.
So what is a primitive polynomial?
Consider an irreducible polynomial over GS(2) of degree sj
xsj + a1xsj-1 + a2xsj-2+.....+ asj-1 x + 1
This polynomial is said to be primitive if it has the order 2sj - 1.
[Order of a polynomial??? The order of a polynomial P(x) with P(0) <> 0 is the smallest integer e for which P(x) completely divides xe + 1 . For example, consider 1 + x + x2 completely divides x3 - 1 and hence its order is 3 (3 = 22 - 1,Sj = 2). ]
Given below is a list of a few primitive polynomials over GS(2).
sj Primitive polynomials
1 1 + x
2 1 + x + x2
3 1 + x + x3 , 1 + x2 + x3
4 1 + x + x4 , 1 + x3 + x4
5 1 + x2 + x5 , 1 + x + x2 + x3 + x5 , 1 + x3 + x5 , 1 + x + x3 + x4 + x5 ,
1 + x2 + x3 + x4 + x5 , 1 + x + x2 + x4 + x5
Direction numbers vi(s) and initialisation number mi(s).
So now we understand what primitive polynomials are, what do we need next?
Direction numbers!!!!
To generate sobol low discrepancy quasi random sequence, we will now have to assume an initial set of initialisation numbers mi(s) - m1, m2, m3, . . . msj. (Remember sj is also the degree of primitive polynomial we discussed previously). Each assumed mi (ith initialisation number), however, must satisfy two criterias - it should be an odd integer and it should be less than 2i. That is, m1 < 2, m2 < 22, m3 < 23 and so on.
However, this does not mean that any integer that satisfies these two criterias would generate a good quality low discrepancy sequence. In fact, the quality of the random number depends quite a bit on the assumed set of initialisation numbers. There are a few authors who have researched and arrived at a set of initial numbers which produce quality random numbers. See [3] for example.
The values for msj+1 and beyond are determined using the following recursive equation:
mk = 2a1mk-1 22a2mk-2 ... 2sj-1asj-1mk-sj+1 2sj mk-sj mk-sj
where ak are the coeffecients in the primitive polynomial as represented in xsj + a1xsj-1 + a2xsj-2+.....+ ask-1 x + 1.
Example
Suppose we have a primitive polynomial x3 + x + 1. (a1 = 0 and a2 = 1) so that the recursive equation for mi simplifies to the following:
mi = 4mi-2 8mi-3 mi-3,
At this stage we assume the initialisation values m1 = 1, m2 = 3, and m3 = 7.
We calculate m4, m5 .. etc using the recursive equation above.
m4 = 12 8 1 = 5,
m5 = 28 24 3 = 7, and so on..
Now that we know how to calculate initialisation numbers mi(s), we shall move on to understand how direction numbers are calculated.
Direction numbers vk are simply defined by the equation vk = mk/2k
So that, if m1 = 1, m2 = 3 , m3 = 7, m4 = 5, m5 = 7 then the direction numbers v1 = 0.5, v2 = 0.75, v3 = 0.875, v4 = 0.3125 and v5 = 0.21875.
The Gray code
They gray code for a number n is defined by the equation:
G(n) = n[n/2] [.] represents the integral part function
Using the above equation we get, G(1) = 1, G(2) = 3, G(3) = 2, G(4) = 6, G(5) = 7 and so on..
The Gray code algorithm
The Sobol low discrepancy sequence can now be generated using the equation:
xn = g1v1 g2v2 g3v3 ........
where ....g3g2g1 is the binary representation of the G(n)
Using this we get, x1 = 0.5
x2 = 0.5 XORoperator0.75 = 0.25 [G(2) = 3]
x3 = 0.75 [G(3) = 2]
x4 = 0.75 0.875 = 0.125 [G(4) = 6] .. etc and so on.
A still faster method
So now we know how to generate the sequence. But there is still a more efficient way of implementing the alorithm. This was shown by Antonov and Saleev [2]. Before that let us define one more sequece {ci}
ci = index of the first 0 digit from the right in the binary representation of i = (. . . i3i2i1)2. We have c0 = 1, c1 = 2, c2 = 1, c3 = 3, c4 = 1, c5 = 2, etc.
The sequence can now be generated using the following equation :
xn+1 = xn vcn
As we will show, the above equation also generates the same sequence as that generated using the equation mentioned in the previous section.
The starting point of the sequence is assumed to be x0 = 0.
x1 = vc0 = v1 = 0.5
x2 = 0.5 vc1 = 0.5 v2 = 0.5 0.75 = 0.25
x3 = 0.25 vc2 = 0.25 v1 = 0.25 0.5 = 0.75
x4 = 0.75 vc3 = 0.75 v3 = 0.75 0.875 = 0.125
Higher dimension problems
We showed you how to generate sobol sequence for one dimension. So how do we generate this sequence for more dimensions?
The answer is simple. For each dimension use one primitive polynomial, assume a set of initialisation numbers based on the criteria above, generate rest of the initialisation numbers using the recursive equation described before, calculate direction numbers vi(s) and finally use the equation stated in the previous section to generate the sequence.
Using Sobol low discrepancy sequence for Monte Carlo simulations
A lot of Monte Carlo based pricers use sobol for spot path generations. It would thus be justified to make a final comment, before closing this article, on how to use sobol within the monte carlo closed pricer.
Lets suppose we need to simulate spot paths for 'm' underlyings with 'n' time steps. Simply generate the sobol sequences for mxn dimensions and use random numbers from a unique dimension for each time step and underlying.
Random Number Generator (xll) Excel addin
DeltaQuants' Random Number Generator library can be downloaded from here. The library has been implemented in C++. The library documentation as well as other excel libraries can be downloaded from here.
References
- Bratley, P. and Fox, B. L. (1988), "Algorithm 659: Implementing Sobol’s quasirandom sequence generator".
- Antonov, I.A. and Saleev, V.M. (1979) "An economic method of computing LPτ-sequences". Zh. Vych. Mat. Mat. Fiz. 19: 243–245 (in Russian); U.S.S.R Comput. Maths. Math. Phys. 19: 252–256 (in English).
- http://web.maths.unsw.edu.au/~fkuo/sobol/
- SOBOL’, I. M. Points which uniformly fill a multidimensional cube. Math. Cybern. 2 (1985) Znanie, Moscow (in Russian).
- S. Joe and F. Y. Kuo, Remark on Algorithm 659: Implementing Sobol's quasirandom sequence generator, ACM Trans. Math. Softw. 29, 49-57 (2003).