Forum Moderators: coopster

Message Too Old, No Replies

random but consistent. how to do it?

         

httpwebwitch

8:34 pm on Nov 8, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here's an interesting problem:

I have a site with about 12,000 pages.
Each one needs to show the same list of items (it's an array), shuffled in a random order. But here's the catch - each URL must consistently show the items in the same order, all the time.

/index.php => always shows items as B,C,F,G,E,D,A
/about.php => always shows items as C,D,A,G,F,B,E

you get the idea.

I don't care if more than one page shows things in the same order. The number of permutations may be less than the number of pages. In the system, both are unknown - for all I know there may be only 2 things in the array. Or 200. who knows.

Storing the permutations indexed by URL is not an option - this needs to be a formula, not a lookup table.

My hunch is I should leverage an MD5 of the URL, and use that to generate the order of the items in the array.

Any ideas how to do this?

cameraman

12:28 am on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The md5 is ok unless you need more numbers. If so, you could md5 the page and its parent directory separately (explode() on / and md5 the last two array elements, then array_merge() them).

$ary = str_split(md5($_SERVER['PHP_SELF']),2); // produce 16 hex pairs
// Now turn them into base 10 values:
$max = count($your_link_array);
foreach($ary as $idx => $val) {
$ary[$idx] = hexdec($val);
if($ary[$idx] >= $max)
$ary[$idx] %= $max; // if number too high, wrap it around
} // EndFor each array item

for($i = 0; $i < 16; $i++)
echo "<a href=\"" . $your_link_array[$ary[$i]] . "\">link$i</a><br>\n";

httpwebwitch

3:41 am on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hmm. I haven't executed that code, but it seems like it's getting a random set from $myarray, not a random permutation. Is that so?

I don't want to end up with: A,G,B,F,G,B,B,D

for argument's sake, let's say I create it as a function, like this:

$newarray = fn($url,$myarray);

$newarray has to be a "shuffled" version of $myarray, ie no dupes or omissions

here's one idea I had:
1) create a long array of hex pairs (or quads, or sextuples, or whatever magnitude is needed), just like you did, same length as $myarray, call it $hexarray. it isn't random, but it certainly won't be sequential, and it won't be predictable either.
2) sort $hexarray, using a bubblesort or quicksort
3) while sorting $hexarray, the bubble sort will swap elements... for each swap in $hexarray, do the exact same swap to $myarray.

thus I'll end up with a pseudo-random permutation of $myarray, determined by the swapping used to sort $hexarray. The outcome: $newarray will be exactly as unpredictable and nonsequential as $hexarray was (before sorting).

d'you think that'll work?

if I pursue this algorithm, one challenge will be to create the $hexarray using $url, making sure $hexarray is long enough and has enough variety in it to produce a really well-shuffled output. And this, using only $url as input.

for each MD5 I generate, I'll get 32 characters.
to shuffle a 20-element array, I'll need 20 elements in $hexarray
I'll also want the range of values in $hexarray to be large enough to produce lots of noise... something like 16^$n where $n is derived from count($myarray)

Gee, I think I have some math to do

cameraman

7:33 am on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're making it more complicated than it needs to be. If you want unique values, use array_unique() on $ary. As written, $ary can contain values from 0 to 255. If you have more than 256 elements in $myarray you'll need to increase the chunk size in str_split accordingly. After that, if you use the values in $ary as indices to $myarray, you don't need to sort or shuffle anything - you've already got your 'almost-randomness' from the md5 string, so no sorting is necessary. $ary will be a subset of $myarray but can potentially 'reach' any of the elements in $myarray, given sufficient chunk size.

You'd said The number of permutations may be less than the number of pages. How many links do you want to produce? Each character in an md5 string is a hex digit so you only have 16 possible values from each; if you pair them you can have 16 values from 0-255, and if you use triplets you can have 10 values from 0-4095. You could do various things with the md5 string like reversing it and appending it to the original or taking a character from the beginning and a character from the end, 2nd in from beginning and 2nd in from end, etc. You could also use other functions on the url like sha1, hash, base64encode, md5 the file's creation date... I suppose you could md5 the md5 but I would think combining an md5 with an sha1 would be better, although I can't even come close to justifying why I would think that.

I think the only math you need to do is to determine how long the string needs to be in order to produce enough unique numbers at the end. Instead of trying to do that math I would be more likely to check the count after array_unique and proceed to the next mangling function, and on until the array produced enough index values.

httpwebwitch

1:58 pm on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



$ary will be a subset of $myarray

that's the problem - the requirements state that $newarray be a shuffled version of $myarray; no omissions or duplicates. Think: deck of cards, not lottery numbers. Every item in $myarray muct be present in $newarray. Only the sequence will change.

I'm going to do a little coding now, and see what happens

pinterface

9:01 pm on Nov 10, 2008 (gmt 0)

10+ Year Member



The proposed solutions seem unnecessarily complicated. Let's break down your requirements and see where they take us.

You want to [1] shuffle an array, [2] based on a stable value.

[1] is easy, so we'll start with shuffle [us3.php.net]. If you follow the shuffle documentation, it points you to srand [us3.php.net] with which you can seed the pseudo-random number generator used by shuffle. srand takes an int, so now all we have to do is figure out how to convert a string (your URL) to an integer. Fortunately, such a function exists: crc32 [us3.php.net].

Combine everything together and you get:

srand(crc32($_SERVER['PHP_SELF']));
shuffle($array);
srand(); // reseed the RNG; otherwise any other random value will be based on the page's URL, too

and now you have an array shuffle which is stable to the page's URL.

httpwebwitch

2:32 am on Nov 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



pinterface, that's brilliant! Thanks - I'll try it tomorrow

httpwebwitch

5:36 am on Nov 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



it worked, it's elegant, it's fantastic. Thanks again