Generating Random numbers to conform a distribution

William Prothero prothero at earthlearningsolutions.org
Wed Jun 8 11:18:43 EDT 2022


Mark W has it. The random number generator creates a "uniform" distribution. The distribution of the means of a collection of randomly generated uniform number sequences approaches a Gaussian as the length of each sequence goes to infinity (but you don't need infinitely long sequences to get a good approximation of a Gaussian distribution). Mark has also demonstrated how to do the scaling.
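To make the point about means concrete, here is a minimal sketch in Python (used purely for illustration, not LiveCode Script). The function names, the seed, and the choice of 12 draws per mean are assumptions made for this example, not something from the thread.

```python
import random
import statistics

def mean_of_uniforms(k, rng):
    """Return the mean of k uniform draws from [0, 1)."""
    return sum(rng.random() for _ in range(k)) / k

def sample_means(n, k, seed=0):
    """Return n values, each the mean of k uniform draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [mean_of_uniforms(k, rng) for _ in range(n)]

means = sample_means(n=10_000, k=12)

# A uniform draw on [0, 1) has mean 1/2 and variance 1/12, so the mean
# of 12 draws should have SD close to sqrt(1/12/12) = 1/12.
print(statistics.fmean(means))
print(statistics.stdev(means))
```

A histogram of `means` should already look bell-shaped at 12 draws per mean; larger values tighten the approximation.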

Good luck,
Bill P

William A. Prothero, PhD
Prof Emeritus, Dept of Earth Science
University of California, Santa Barbara

> On Jun 8, 2022, at 12:08 AM, Mark Waddingham via use-livecode <use-livecode at lists.runrev.com> wrote:
> 
> On 2022-06-07 21:51, David V Glasgow via use-livecode wrote:
>> Quite a lot of stats and maths packages offer a feature whereby the N,
>> the Mean and the SD are variables specified by the user, and N random
>> numbers are then generated with the required mean and SD.  I remember
>> the venerable and excellent Hypercard  HyperStat
>> <https://link.springer.com/content/pdf/10.3758/BF03204668.pdf> (1993)
>> by David M Lane doing exactly that.
>> Or is there an elegant formula?  I have Googled about and can’t see
>> one, but maybe I don’t know the magic words.  And if someone wanted to
>> script this in LC what would be the best approach? (just general
>> guidance here, wouldn’t want anyone to invest their valuable time in
>> what is at present just vague musings)
>> Any hints from the stats gurus?
> 
> I'm not a stats guru but...
> 
> I think all you need to do here is to use some of the intrinsic 'properties' of the Mean and SD.
> 
> Let's say you have a collection X of numbers; then the following things are always true:
> 
>  P1: Mean(c * X) = c * Mean(X)
>  P2: Mean(X + k) = k + Mean(X)
>  P3: SD(c * X) = abs(c) * SD(X)
>  P4: SD(X + k) = SD(X)
> 
> In English, scaling a set of numbers scales their mean by the same amount, and offsetting a set of numbers offsets their mean by the same amount. Similarly, scaling a set of numbers scales their SD by the absolute value of that amount, and offsetting a set of numbers makes no difference to the SD (as the SD is a relative quantity - it cares about distance from the mean, not magnitude).
> 
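The four properties can be checked numerically. The following is a small sketch in Python (for illustration only); the sample values, c and k are arbitrary, and it assumes the sample standard deviation, although the same identities hold for the population SD.

```python
import math
import statistics

xs = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # arbitrary illustrative data
c, k = -3.0, 10.0                    # negative c exercises abs(c) in P3

mean = statistics.fmean
sd = statistics.stdev                # sample SD (assumption)

p1 = math.isclose(mean([c * x for x in xs]), c * mean(xs))
p2 = math.isclose(mean([x + k for x in xs]), k + mean(xs))
p3 = math.isclose(sd([c * x for x in xs]), abs(c) * sd(xs))
p4 = math.isclose(sd([x + k for x in xs]), sd(xs))

print(p1, p2, p3, p4)
```
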
> Now, hopefully we can agree that if you generate a set of random numbers, then scaling and offsetting them all by the same amounts does not reduce the randomness (randomness here means the numbers form a uniform distribution over the range of generation; if you scale and offset then all you are doing is changing the range - not the distribution).
> 
> So with this in mind, let TMean and TSD be the target mean and target SD. Then:
> 
>  1. Generate N random numbers in the range [0, 1] - S0, ..., SN
> 
>  2. Compute SMean := Mean(S0, ..., SN)
> 
>  3. Compute SSD := SD(S0, ..., SN)
> 
> Now we take a small diversion from a sequence of enumerated steps to ask "what offset and scale do we need to apply to the set of numbers so that we get TMean and TSD, rather than SMean and SSD?".
> 
> The amount we need to scale by is mandated by the SD, specifically:
> 
>     c := TSD/SSD
> 
> If we scale our source numbers by c and apply SD then we see:
> 
>     SD(c * S0, ..., c * SN) = c * SD(S0, ..., SN) [P3 above]
>                             = c * SSD
>                             = TSD / SSD * SSD
>                             = TSD
> 
> i.e. Our scaled input numbers give us the desired SD!
> 
> So now we just need to play the same 'game' with the Mean. We have:
> 
>     Mean(c * S0, ..., c * SN) = c * Mean(S0, ..., SN)
>                               = c * SMean
> 
> However we really want a mean of TMean so define:
> 
>     k := TMean - c * SMean
> 
> Then if we translate our (scaled!) source numbers by k and apply Mean then we see:
> 
>    Mean(c * S0 + k, ..., c * SN + k) = c * Mean(S0, ..., SN) + k [P1 and P2 above]
>                                      = c * SMean + k
>                                      = c * SMean + TMean - c * SMean
>                                      = TMean
> 
> i.e. Our scaled and offset input numbers give us the desired Mean!
> 
> Note that SD is invariant under offsetting (P4) so SD(c * S0 + k, ..., c * SN + k) = SD(c * S0, ..., c * SN) = TSD!
> 
> We can now return to our sequence of steps:
> 
>  4. Compute c := TSD/SSD
> 
>  5. Compute k := TMean - c * SMean
> 
>  6. Compute the target random numbers, Tn := c * Sn + k
> 
> So, assuming my maths is correct above, T0, ..., TN will still be 'random' (for some suitable definition of random), but have a Mean of TMean and an SD of TSD as desired.
> 
> In LiveCode Script, the above is something like:
> 
>   function randomNumbers pN, pTMean, pTSD
>      local tSource
>      repeat pN times
>         put random(2^31) & comma after tSource
>      end repeat
> 
>      local tSMean, tSSD
>      put average(tSource) into tSMean
>      put stdDev(tSource) into tSSD
> 
>      local tC, tK
>      put pTSD / tSSD into tC
>      put pTMean - tC * tSMean into tK
> 
>      local tTarget
>      repeat for each item tS in tSource
>        put tC * tS + tK & comma after tTarget
>      end repeat
> 
>      return tTarget
>   end randomNumbers
> 
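For comparison, steps 1 to 6 can also be sketched in Python (for illustration only, not a replacement for the LiveCode handler). The function name, the seed parameter and the guards are assumptions added for this example, and it uses the sample standard deviation.

```python
import random
import statistics

def random_numbers(n, target_mean, target_sd, seed=None):
    """Return n random numbers rescaled to the target mean and sample SD."""
    if n < 2:
        raise ValueError("need at least two numbers to compute an SD")
    rng = random.Random(seed)
    source = [rng.random() for _ in range(n)]  # step 1: uniform on [0, 1)

    s_mean = statistics.fmean(source)          # step 2
    s_sd = statistics.stdev(source)            # step 3
    if s_sd == 0:
        raise ValueError("source numbers are all equal; cannot rescale")

    c = target_sd / s_sd                       # step 4: scale factor
    k = target_mean - c * s_mean               # step 5: offset
    return [c * s + k for s in source]         # step 6

values = random_numbers(1000, target_mean=100.0, target_sd=15.0, seed=42)

# Both statistics should match the targets up to floating-point rounding.
print(statistics.fmean(values), statistics.stdev(values))
```

Note that the result keeps the flat, uniform shape of the source numbers; only its mean and SD change. If a bell-shaped result is wanted, the source list could instead be built from means of several uniform draws before rescaling, as Bill describes at the top of this message.
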
> Hope this helps!
> 
> Mark.
> 
> -- 
> Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
> LiveCode: Everyone can create apps



