SE250:lab-5:tsen009
Introduction
How do hash functions perform in theory and in practice?
Task
The task is to use the given functions to compare the randomness of the given hash functions.
Choosing values
The given main function is as follows,
int main( ) {
    int sample_size = ???;
    int n_keys = ???;
    int table_size = ???;

    ent_test( "Buzhash low", low_entropy_src, sample_size, &rt_add_buzhash );

    printf( "Buzhash low %d/%d: llps = %d, expecting %g\n",
            n_keys, table_size,
            llps( n_keys, low_entropy_src, table_size, buzhash ),
            expected_llps( n_keys, table_size ) );

    return 0;
}
The values of sample_size, n_keys and table_size need to be altered.
Compiling
The compile command was figured out after much confusion and frustration. The compile command used was,
gcc lab-5.c randtest.c buzhash.c arraylist.c -o bob && bob.exe
Initial Test
inputs:
int sample_size = 20; int n_keys = 3; int table_size = 50;
Output:
Testing Buzhash low on 20 samples
Entropy = 4.321928 bits per byte.
Optimum compression would reduce the size of this 20 byte file by 45 percent.
Chi square distribution for 20 samples is 236.00, and randomly would exceed this value 75.00 percent of the times.
Arithmetic mean value of data bytes is 117.7500 (127.5 = random).
Monte Carlo value for Pi is 2.666666667 (error 15.12 percent).
Serial correlation coefficient is -0.208584 (totally uncorrelated = 0.0).
Buzhash low 3/50: llps = 1, expecting 1.03487
Initial Questions
WHAT ON EARTH DOES THIS OUTPUT MEAN!!! Seriously? ZOMG! (Sorry for the caps... lol)...
...googles...
Further Testing
Well, it's obvious I have no idea what I'm doing, but meh, I thought I'd pretend that I know what I'm doing and carry on the experiment with different values. Here goes the big massive result dump...
Test 1
int sample_size = 1000; int n_keys = 1000; int table_size = 1000;
Testing Buzhash low on 1000 samples
Entropy = 7.843786 bits per byte.
Optimum compression would reduce the size of this 1000 byte file by 1 percent.
Chi square distribution for 1000 samples is 214.46, and randomly would exceed this value 95.00 percent of the times.
Arithmetic mean value of data bytes is 128.0860 (127.5 = random).
Monte Carlo value for Pi is 3.132530120 (error 0.29 percent).
Serial correlation coefficient is -0.017268 (totally uncorrelated = 0.0).
Buzhash low 1000/1000: llps = 6, expecting 5.51384
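The "Monte Carlo value for Pi" line looks like the standard trick of treating the bytes as random points in a square and counting how many land inside the inscribed quarter circle. The sketch below assumes each group of six bytes is read as a 24-bit X and a 24-bit Y coordinate; that assumption matches the numbers here (1000 bytes give 166 points, and 4 * 130 / 166 = 3.132530120), but the function name monte_carlo_pi and the details are my own guess, not the lab's code.

```c
#include <stddef.h>

/* Hypothetical helper (not from the lab files): estimate Pi from n bytes,
   assuming every 6 bytes form one point with 24-bit X and Y coordinates.
   Returns 0.0 if there are fewer than 6 bytes. */
double monte_carlo_pi(const unsigned char *buf, size_t n)
{
    const double r = 16777215.0;   /* 2^24 - 1, the largest coordinate */
    size_t points = n / 6;         /* leftover bytes are ignored */
    size_t inside = 0;
    size_t i;

    if (points == 0)
        return 0.0;

    for (i = 0; i < points; i++) {
        const unsigned char *p = buf + 6 * i;
        double x = p[0] * 65536.0 + p[1] * 256.0 + p[2];
        double y = p[3] * 65536.0 + p[4] * 256.0 + p[5];

        /* Count the point if it lies inside the quarter circle. */
        if (x * x + y * y <= r * r)
            inside++;
    }
    return 4.0 * (double)inside / (double)points;
}
```

With only a handful of points the estimate can only take a few coarse values, so a large error on a small sample_size is expected and does not by itself mean the hash is bad.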
Test 2
int sample_size = 300; int n_keys = 1; int table_size = 500;
Testing Buzhash low on 300 samples
Entropy = 7.348095 bits per byte.
Optimum compression would reduce the size of this 300 byte file by 8 percent.
Chi square distribution for 300 samples is 239.31, and randomly would exceed this value 75.00 percent of the times.
Arithmetic mean value of data bytes is 131.9200 (127.5 = random).
Monte Carlo value for Pi is 3.040000000 (error 3.23 percent).
Serial correlation coefficient is -0.083985 (totally uncorrelated = 0.0).
Buzhash low 1/500: llps = 1, expecting 0.633119
Test 3
int sample_size = 10000; int n_keys = 200; int table_size = 90;
Testing Buzhash low on 10000 samples
Entropy = 7.985498 bits per byte.
Optimum compression would reduce the size of this 10000 byte file by 0 percent.
Chi square distribution for 10000 samples is 201.50, and randomly would exceed this value 99.00 percent of the times.
Arithmetic mean value of data bytes is 125.8253 (127.5 = random).
Monte Carlo value for Pi is 3.181272509 (error 1.26 percent).
Serial correlation coefficient is -0.000047 (totally uncorrelated = 0.0).
Buzhash low 200/90: llps = 6, expecting 6.64293
Test 4
int sample_size = 999999; int n_keys = 9999; int table_size = 99999;
Testing Buzhash low on 999999 samples
Entropy = 7.999876 bits per byte.
Optimum compression would reduce the size of this 1000000 byte file by 0 percent.
Chi square distribution for 1000000 samples is 171.27, and randomly would exceed this value 99.99 percent of the times.
Arithmetic mean value of data bytes is 127.5198 (127.5 = random).
Monte Carlo value for Pi is 3.135468542 (error 0.19 percent).
Serial correlation coefficient is -0.000567 (totally uncorrelated = 0.0).
Buzhash low 9999/99999: llps = 3, expecting 3.327
Test 5
int sample_size = 7672; int n_keys = 42; int table_size = 1;
Testing Buzhash low on 7672 samples
Entropy = 7.979032 bits per byte.
Optimum compression would reduce the size of this 7672 byte file by 0 percent.
Chi square distribution for 7672 samples is 221.42, and randomly would exceed this value 90.00 percent of the times.
Arithmetic mean value of data bytes is 125.9483 (127.5 = random).
Monte Carlo value for Pi is 3.151799687 (error 0.32 percent).
Serial correlation coefficient is 0.000375 (totally uncorrelated = 0.0).
Buzhash low 42/1: llps = 42, expecting 41.9999
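The llps number at the end of each run is easier to reason about. Assuming llps means the length of the longest probe sequence, i.e. the number of keys that end up in the fullest slot of the table, a rough sketch looks like the code below. The name longest_chain and the idea of passing in precomputed hash values are mine; the lab's own llps takes a key source and a hash function instead. With table_size = 1 every key has to land in the same slot, so the result of 42 for 42 keys in Test 5 is forced no matter how good the hash is.

```c
#include <stdlib.h>

/* Hypothetical helper (not the lab's llps): given n_keys precomputed hash
   values, return how many keys fall in the fullest slot of a table with
   table_size slots. Returns -1 on a bad table size or allocation failure. */
int longest_chain(const unsigned long *hashes, int n_keys, int table_size)
{
    int *slots;
    int longest = 0;
    int i;

    if (table_size <= 0)
        return -1;

    slots = calloc((size_t)table_size, sizeof *slots);
    if (slots == NULL)
        return -1;

    for (i = 0; i < n_keys; i++) {
        /* Map the hash value onto a slot and bump that slot's count. */
        int s = (int)(hashes[i] % (unsigned long)table_size);
        if (++slots[s] > longest)
            longest = slots[s];
    }

    free(slots);
    return longest;
}
```

The interesting comparisons are therefore the runs where table_size is larger than 1, where the measured llps can be set against the expected value for a random hash.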
Conclusion
ZOMG this is annoying /RAGE_QUIT!!!!