-
Notifications
You must be signed in to change notification settings - Fork 6
/
Copy pathrand31pmc.cpp
662 lines (607 loc) · 27.2 KB
/
rand31pmc.cpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
#include <iostream>
#include <string>
#include <cmath>
using namespace std;
// Compile with the GCC C++ compiler:
//
// g++ rand31-park-miller-carta.cc -o rand31-pmc
// rand31-park-miller-carta.cc
// ---------------------------
//
// Robin Whittle [email protected] 2005 September 21
//
// For more information and updates: http://www.firstpr.com.au/dsp/rand31/
//
// This code contains two separate implementations of the 1988 Park Miller
// "minimal standard" 31 bit PRNG (Pseudo-Random Number Generator):
//
// rand31pm The conventional Park Miller "Integer Version 1" algorithm
// but implemented rather slowly in double-precision floating
// point to ensure accuracy and portability.
//
// This is intended as a reference.
//
// rand31pmc My own fast integer implementation of David G. Carta's
// approach to this PRNG, using only 32 bit integer maths
// and no divisions.
//
// My aim is to show that the two are functionally identical, to explain
// Carta's implementation, and to provide code which can be used by others.
//
// Please grab the rand31pmc class and use it in your own projects. A
// separate pair of .c and .h files are available for incorporation into
// ANSI C projects.
//
// When running the C versions of the Carta implementation and the
// plain floating point Park Miller aglorithm, on an 800MHz Pentium III with
// no compiler optimisation, the Carta version produced 13 million results a
// second (61 CPU clock-cycles per result) and the floating point version
// was about 1/4 this speed.
//
// The two classes below implement the same multiplicative linear
// congruential PRNG (pseudo-random number generator). This type of
// PRNG is not suitable for crytpography and does not produce sequences
// long enough or random enough for most simulation purposes.
//
// Most of the detailed analytical work on PRNGs has gone into those
// suitable for crypto and simulation. PRNGs well respected in these
// fields - such as the Mersenne Twister for simulation - are slow, tricky,
// or require excessive memory compared with a linear congruentual
// generator.
//
// This leaves people who are searching for a fast, well studied, linear
// congruential PRNG looking at a varied and often troublesome bunch of
// stuff, including algorithms written up in textbooks or incorporated in
// libraries, many of which produce poor results. Since it can be hard to
// recognise the problems in a PRNG, it is best to use one which is well
// studied and respected.
//
// Unfortunately, linear congruential generators have many bad algorithms
// and/or bad implementations. Most publicly available implementations
// of the good algorithms involve division. (The major exception I know
// of is Ray Gardner's rg_rand.c, which implements the Carta algorithm.)
//
//
// There is a widely used 32 bit integer implementation of Park-Miller's
// "minimal standard" PRNG, using Shrage's approach: Park Miller's
// "Integer Version 2". This requires a division. I don't have this
// implementation here. The "Integer Version 1" implementation exists
// to provide the reference sequence of pseudo-random numbers by which we
// test the second implementation.
//
//
// The second class "rand31pmc" is functionally identical to, and was
// inspired by, a snippet of C code by Ray Gardner which dates from as
// early as 1995 (according to Google searches in 2005):
//
// http://c.snippets.org/snip_lister.php?fname=rg_rand.c
//
// This cites Carta, and implements Carta's approach, but has no explanation
// of how this approach works. Carta's paper contains a far more complex
// explanation than the one I give below.
//
// I don't have contact details for Ray Gardner, or any of the people
// mentioned in the references. Eric Raymond mentions he wrote some
// hypertext-like code with Ray Gardner in the early 90s, and I think
// Ray Gardner may have moderated Usenet comp.lang.c.
//
// There are two important aims which are achieved both by David Carta's
// algorithm, Ray Gardner's code and class "rand31pmc":
//
// 1 - The code can run with 32 bit math operations - including a 16 x 16
// = 32 bit multiply.
//
// 2 - There is no need for a division.
//
// David Carta's 1990 paper gave no code examples.
//
// Stephen K. Park and Keith W. Miller rejected Carta's algorithm in
// 1993 "... the generator as implemented by Carta is ~not~ the minimal
// standard; it isn't even a full-period generator. We know of no good
// reason to use Carta's generator."
//
// But here, by compiling and running this program, you can see that
// David Carta's algorithm produces identical results to the Park Miller
// algorithm.
//
// Assuming Ray Gardner really did write his code (rather than copy
// it from somewhere else), my best guess is that he started off
// trying to implement Shrage's approach (as the comments and the
// unused Shrage-specific constants suggest) and then made his code
// do what David Carta suggested. I guess that since it worked,
// he left it at that, without updating the comments and constants.
//
// My explanation of why Carta's algorithm works is in the code, where it
// belongs. Please don't strip the comments out.
//
// It is reasonable to expect that Carta's approach would run faster
// than Shrage's approach in "Integer Version 2" - if only because
// there is no division.
//
// My main interest here is proving that Carta's approach works
// properly, because I want to implement it on CPUs such as the dsPIC
// which have a slow divide. Most CPUs have a slow divide anyway, as
// far as I know, so if you are in a hurry, Carta's approach is
// probably the best.
//
// References:
//
// Stephen K. Park and Keith W. Miller
// Random Number Generators: Good Ones are Hard to Find
// Communications of the ACM, Oct 1988, Vol 31 Number 10 1192-1201
//
// Like the other two papers, this one is normally only available
// from the ACM site via subscription. You should be able to
// access this paper electronically or in print at a university
// library. You may also be able to find the .PDF wild on the
// Net. Search for "p1192-park.pdf". For instance:
//
// http://www-scf.usc.edu/~csci105/links/p1192-park.pdf
//
// David F. Carta
// Two Fast Implementations of the "Minimal Standard" Random Number Generator
// Communications of the ACM, Jan 1990, Vol 33 Number 1 87-88 (p87-carta.pdf)
//
// George Marsaglia; Stephen J. Sullivan; Stephen K. Park, Keith W. Miller,
// Paul K. Stockmeyer
// Remarks on Choosing and Implementing Random Number Generators
// Communications of the ACM, Jul 1993, Vol 36 Number 7 105-110 (p105-crawford.pdf)
//
// The following code is public domain. If you use this code, I request that
// you keep the comments with it, to save some poor sod from having to figure
// out the history behind it.
//////////////////////////////////////////////////////////////////////////////
//
// rand31pm
//
// 31 bit Pseudo Random Number Generator based on Park Miller "Integer
// Version 1" - but done with double-precision floating point so we are not
// concerned with the limits of integer operations. This is not intended for
// fast operation - but *maybe* it would be faster than the integer
// implementation on some CPUs.
//
// Methods:
//
// seedi Set seed with a 31 bit unsigned integer between 1 and
// 0x7FFFFFFE inclusive. Don't use 0!
//
// ranlui Provides the next pseudorandom number as a long unsigned
// integer (31 bits).
//
// ranf Provides the next pseudorandom number as a float between
// nearly 0 and nearly 1.0.
class rand31pm {
// The sole item of state - a 32 bit
// integer.
long unsigned int seed31;
public:
// Constructor sets seed31 to 1,
// in case no seedi operation is
// used.
rand31pm() {seed31 = 1;}
// Declare methods.
void seedi(long unsigned int);
long unsigned int ranlui(void);
float ranf(void);
private:
// nextrand()
//
// Generate next pseudo-random number.
// Multiplier constant = 16807 = 7^5.
// Park and Miller in 1993 recommend
// 48271.
#define consta 16807
// #define consta 48271
// Modulus constant = 2^31 - 1 =
// 0x7FFFFFFFF. Use .0 to deter compiler
// from complaining about a very large
// integer constant.
#define constm 2147483647.0
long unsigned int nextrand()
{
double const a = consta;
double const m = constm;
// This is the linear congrentual
// generator:
//
// Multiply the old seed by constant a
// and take the modulus of the result
// (the remainder of a division) by
// constant m.
seed31 = (long)(fmod((seed31 * a), m));
return (seed31);
}
};
/////////////////////////////////////////
//
// Implementations of the methods.
// seedi()
//
// Set the seed from a long unsigned
// integer. If zero is used, then
// the seed will be set to 1.
void rand31pm::seedi(long unsigned int seedin)
{
if (seedin == 0) seedin = 1;
seed31 = seedin;
}
// ranlui()
//
// Return next pseudo-random value as
// a long unsigned integer.
long unsigned int rand31pm::ranlui(void)
{
return nextrand();
}
// ranf()
//
// Return next pseudo-random value as
// a floating point value.
float rand31pm::ranf(void)
{
return (nextrand() / constm);
}
//////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////////
//
//
// rand31pmc
//
// Robin Whittle 2005 September 20
//
// 31 bit pseudo-random number generator based on:
//
// Lehmer (1951)
// Lewis, Goodman & Miller (1969)
// Park & Miller (1983)
//
// implemented according to the optimisation suggested by David G. Carta
// in 1990 which uses 32 bit math and does not require division.
// Park and Miller rejected Carta's approach in 1993. Carta provided no
// code examples. Carta's approach produces identical results to Park
// and Miller's code.
//
// Copyright public domain . . . *but*:
//
// * Please leave the comments intact so inquiring minds have a chance of
// * understanding how this implementation works and chasing the
// * references to see the strengths and limitations of this particular
// * pseudo-random number generator.
//
// Output is a 31 bit unsigned integer. The range of values output is
// 1 to 2,147,483,646 and the seed must be in this range too. The
// output sequence repeats in a loop of this length = (2^31 - 2).
//
// The output stream has some predictable patterns. For instance, after
// a very low output, the next one or two outputs will be relatively low
// (compared to the 2 billion range) because the multiplier is only 16,807.
// Linear congruential generators are not suitable for cryptography or
// simulation work (such as Monte Carlo Method), but they are probably
// fine for many uses where the output is sound or vision for human
// perception.
//
// The particular generator implemented here:
//
// New-value = (old-value * 16807) mod 0x7FFFFFFF
//
// is probably the best studied linear congruentual PRNG. It is not the very
// best, but it is far from the worst.
//
// For the background on this implementation, and the Park Miller
// "Minimal Standard" linear congruential PRNG, please see:
//
// http://www.firstpr.com.au/dsp/rand31/
//
// Stephen K. Park and Keith W. Miller
// Random Number Generators: Good Ones are Hard to Find
// Communications of the ACM, Oct 1988, Vol 31 Number 10 1192-1201
//
// David G. Carta
// Two Fast Implementations of the "Minimal Standard" Random Number Generator
// Communications of the ACM, Jan 1990, Vol 33 Number 1 87-88
//
// George Marsaglia; Stephen J. Sullivan; Stephen K. Park, Keith W. Miller,
// Paul K. Stockmeyer
// Remarks on Choosing and Implementing Random Number Generators
// Communications of the ACM, Jul 1993, Vol 36 Number 7 105-110
//
// http://random.mat.sbg.ac.at has lots of material on PRNG quality.
//
//
// The sequence of values this PRNG should produce includes:
//
// Result Number of results after seed of 1
//
// 16807 1
// 282475249 2
// 1622650073 3
// 984943658 4
// 1144108930 5
// 470211272 6
// 101027544 7
// 1457850878 8
// 1458777923 9
// 2007237709 10
//
// 925166085 9998
// 1484786315 9999
// 1043618065 10000
// 1589873406 10001
// 2010798668 10002
//
// 1227283347 1000000
// 1808217256 2000000
// 1140279430 3000000
// 851767375 4000000
// 1885818104 5000000
//
// 168075678 99000000
// 1209575029 100000000
// 941596188 101000000
//
// 1207672015 2147483643
// 1475608308 2147483644
// 1407677000 2147483645
// 1 2147483646
// 16807 2147483647
//
// Carta refers to two registers p (15 bits) and q (31 bits) which
// together hold the 46 bit multiplication product:
//
// | | | |
// 4444 4444 3333 3333 3322 2222 2222 1111 1111 11
// 7654 3210 9876 5432 1098 7654 3210 9876 5432 1098 7654 3210
//
// q 31 qqq qqqq qqqq qqqq qqqq qqqq qqqq qqqq
// p 15 pp pppp pppp pppp p
//
// The maximum 46 bit result occurs
// when the seed is at its highest
// allowable value: 0x7FFFFFFE.
//
// 0x20D37FFF7CB2
//
// which splits up like this
//
// q 31 111 1111 1111 1111 0111 1100 1011 0010
// p 15 10 0000 1101 0011 0
// = 100 0001 1010 0110
//
// In hex, these maxiumum values are:
//
// q 31 7FFF7CB2 = 2^31 - (2 * 16807)
// p 15 41A6 = 16807 - 1
//
//
// The task is to combine the two partial products p and q as if they were
// both parts of a 46 bit number, with the final result being modulo:
//
// 0111 1111 1111 1111 1111 1111 1111 1111
//
// when we are actually only doing 32 bits at a time.
//
// Here I explain David G. Carta's trick - in a different and much simpler
// way than he does.
//
// We need to deal with the p bits "pp pppp pppp pppp p" shown above.
// These bits carry weights of bits 45 to 31 in the multiplication product
// of the usual Park Miller algorithm.
//
// David Carta writes that in order to calculate mod(0x7FFFFFFF) of the
// complete multiplication product (taking into account the total value
// of p and q) we should simply add the bits of p into the bit positions
// 14 to 0 of q and then do a mod(0x7FFFFFFF) on the result!
//
// | | | |
// 4444 4444 3333 3333 3322 2222 2222 1111 1111 11
// 7654 3210 9876 5432 1098 7654 3210 9876 5432 1098 7654 3210
//
// 31 qqq qqqq qqqq qqqq qqqq qqqq qqqq qqqq
// 15 + ppp pppp pppp pppp
// = Cxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
//
// Highest possible value,
// for q, with a value for
// p which would allow it:
//
// 7FFFFFFF 111 1111 1111 1111 1111 1111 1111 1111
// + 41A5 100 0001 1010 0101
// = 8000041A4 1000 0000 0000 0000 0100 0001 1010 0100
//
// The result can't be larger than 2 * 0x7FFFFFFF = 0xFFFFFFFE. So when we
// do the modulus operation, we will have to subtract either nothing or just
// one 0x7FFFFFFF. With this model of addition, the subtraction only
// occurs very rarely.
//
// David Carta's explanation for why this produces the correct answer is too
// long to repeat here. Mine is easy to understand.
//
// Lets define some labels:
//
// Q = 31 bits 30 to 0.
// P = 15 bits 14 to 0.
//
// If we were doing 46 bit math, the multiplication product (seed * 16807)
// would be:
//
// Q
// + (P * 0x80000000)
//
// Observe that this is the same as:
//
// Q
// + (P * 0x7FFFFFFF)
// + (P * 0x00000001)
//
// However, we don't need or want a 46 bit result. We only want that result
// mod(0x7FFFFFFF). Therfore we can ignore the middle line above and use for
// our result:
//
// Q
// + P
//
// This is a lot snappier than using a division, as the Schrage technique
// requires.
//
// Methods:
//
// seedi Set seed with a 31 bit unsigned integer between 1 and
// 0x7FFFFFFE inclusive. Don't use 0!
//
// ranlui Provides the next pseudorandom number as a long unsigned
// integer (31 bits).
//
// ranf Provides the next pseudorandom number as a float between
// nearly 0 and nearly 1.0.
class rand31dc {
// The sole item of state - a 32 bit
// integer.
long unsigned int seed31;
public:
// Constructor sets seed31 to 1, in case
// no seedi operation is used.
rand31dc() {seed31 = 1;}
// Declare methods.
void seedi(long unsigned int);
long unsigned int ranlui(void);
float ranf(void);
private:
// nextrand()
//
// Generate next pseudo-random number.
// Multiplier constant = 16807 = 7^5.
// This is 15 bits.
//
// Park and Miller in 1993 recommend
// 48271, which they say produces a
// somewhat better quality of
// pseudo-random results.
//
// 48271 can't be used with the
// following implementation of Carta's
// algorithm, since it is 16 bits and
// would result in bit 31 potentially
// being set in lo in the first
// multiplication. (A similar problem
// occurs later with the higher bits of
// hi.)
#define consta 16807
// Modulus constant = 2^31 - 1 =
// 0x7FFFFFFF. We use this explicitly
// in the code, rather than define it
// somewhere, because this is a value
// which must not be changed and should
// always be recognised as a zero
// followed by 31 ones.
long unsigned int nextrand()
{
// Two 32 bit integers for holding
// parts of the (seed31 * consta)
// multiplication product which would
// normally need a 46 bit word.
//
// lo 31 bits 30 - 0
// hi 30 bits 45 - 16
//
// These overlap in their value.
//
// Bit 0 of hi has the same weight in
// the result as bit 16 of lo.
long unsigned int hi, lo;
// lo = 31 bits:
//
// low 16 bits (15-0) of seed31
// * 15 bit consta
lo = consta * (seed31 & 0xFFFF);
// hi = 30 bits:
//
// high 15 bits (30-16) of seed31
// * 15 bit consta
hi = consta * (seed31 >> 16);
// The new pseudo-random number is the
// 46 bit product mod(0x7FFFFFF). Our
// task is to calculate it with 32
// bit words and maths, and without
// division.
//
// The first section is easy to
// understand. We have a bunch of
// bits - bits 14 to 0 of hi -
// which overlap with and carry the
// same weight as bits 30 to 16 of
// lo.
//
// Add the low 15 bits of hi into
// bits 30-16 of lo.
lo += (hi & 0x7FFF) << 16;
// The result may set bit 31 of lo, but
// it will not overflow lo.
//
// So far, we got some of our total
// result in lo.
//
// The only other part of the result
// we need to deal with is bits
// 29 to 15 of hi.
//
// These bits carry weights of bits
// 45 to 31 in the value of the
// multiplication product of the usual
// Park-Miller algorithm.
//
// David Carta writes that in order
// to get the mod(0x7FFFFFF) of the
// multiplication product we should
// simply add these bits into the
// bit positions 14 to 0 of lo.
//
lo += hi >> 15; //
// In order to be able to get away with
// this, and to perform the following
// simple mod(0x7FFFFFFF) operation,
// we need to be sure that the result
// of the addition will not exceed:
//
// 2 * 0x7FFFFFFF = 0xFFFFFFFE
//
// This is assured as per the diagrams
// above.
// Note that in the vast majority of
// cases, lo will be less than
// 0x7FFFFFFFF.
if (lo > 0x7FFFFFFF) lo -= 0x7FFFFFFF;
// lo contains the new pseudo-random
// number. Store it to the seed31 and
// return it.
return ( seed31 = (long)lo );
}
};
/////////////////////////////////////////
//
// Implementations of the methods.
// seedi()
//
// Set the seed from a long unsigned
// integer. If zero is used, then
// the seed will be set to 1.
void rand31dc::seedi(long unsigned int seedin)
{
if (seedin == 0) seedin = 1;
seed31 = seedin;
}
// ranlui()
//
// Return next pseudo-random value as
// a long unsigned integer.
long unsigned int rand31dc::ranlui(void)
{
return nextrand();
}
// ranf()
//
// Return next pseudo-random value as
// a floating point value.
float rand31dc::ranf(void)
{
return (nextrand() / 2147483647.0);
}