SE250:lab-5:jsmi233
I start today’s lab by attempting to compile the given code. Using the line below to compile, I get a copious amount of errors. Just to confirm this I tried gcc *.c, but the same occurs.
gcc arraylist.c buzhash.c randtest.c lab-5.c -o lab-5.exe && .\lab-5.exe
It turns out the problem was that I downloaded lab-5.c in html format somehow. Having re-downloaded this file correctly, the code compiles as expected.
It is taking a long time to work out exactly what I am supposed to be doing.
For my choice of numbers I had:
int sample_size = 1000; int n_keys = 300; int table_size = 3000;
Because I just copied off the guy next to me.
The results of running my program were:
Testing Buzhash low on 1000 samples
Entropy = 7.843786 bits per byte.
Optimum compression would reduce the size of this 1000 byte file by 1 percent.
Chi square distribution for 1000 samples is 214.46, and randomly would exceed this value 95.00 percent of the times.
Arithmetic mean value of data bytes is 128.0860 (127.5 = random). Monte Carlo value for Pi is 3.132530120 (error 0.29 percent). Serial correlation coefficient is -0.017268 (totally uncorrelated = 0.0).
Buzhash low 300/3000: llps = 2, expecting 2.38294
Now I’m going to try and work out what this is supposed to mean. The motivation for desiring very random numbers is obvious. The term “entropy” I assume is being used to refer to the degree of randomness. We are testing the entropy of something. The entropy of the different hash functions. What is the sample size? I still don’t get what the sample size is. Its now 11:09am and I still don’t get what the sample size is. The sample size is the number of times we test the function. Therefore the greater number of samples, the closer we will get to theoretical values.
Hence I chose sample_size = 100000, because with any larger value, the program crashes.
rt_add_buzhash, low entropy source: 7.998297
rt_add_buzhash, typical entropy source: 7.997834
rt_add_buzhashn, low entropy source: 7.998236
rt_add_buzhashn, typical entropy source: 7.998236
rt_add_hash_CRC, low entropy source: 5.574705
rt_add_hash_CRC, typical entropy source: 7.840456
rt_add_Java_Integer, low entropy source: 4.828241
rt_add_Java_Integer, typical entropy source: 4.828241
rt_add_Java_Object, low entropy source: 2.000000
rt_add_Java_Object, typical entropy source: 5.754918
rt_add_Java_String, low entropy source: 7.999587
rt_add_Java_String, typical entropy source: 7.945541
rt_add_base256, low entropy source: 0
rt_add_base256, typical entropy source: 4.022975
rt_add_rand, low entropy source: 7.953082
rt_add_rand, typical entropy source: 7.952724
rt_add_high_rand, low entropy source: 7.998276
rt_add_high_rand, typical entropy source: 7.998075
As suggested, the high random function was more random than the rand function.
The functions in order of randomness for the low entropy sources are:
rt_add_Java_String, low entropy source: 7.999587
rt_add_buzhash, low entropy source: 7.998297
rt_add_high_rand, low entropy source: 7.998276
rt_add_buzhashn, low entropy source: 7.998236
rt_add_rand, low entropy source: 7.953082
rt_add_hash_CRC, low entropy source: 5.574705
rt_add_Java_Integer, low entropy source: 4.828241
rt_add_Java_Object, low entropy source: 2.000000
rt_add_base256, low entropy source: 0
The functions in order of randomness for the typical entropy sources are:
rt_add_buzhashn, typical entropy source: 7.998236
rt_add_high_rand, typical entropy source: 7.998075
rt_add_buzhash, typical entropy source: 7.997834
rt_add_rand, typical entropy source: 7.952724
rt_add_Java_String, typical entropy source: 7.945541
rt_add_hash_CRC, typical entropy source: 7.840456
rt_add_Java_Object, typical entropy source: 5.754918
rt_add_Java_Integer, typical entropy source: 4.828241
rt_add_base256, typical entropy source: 4.022975
Expirament two with load factor of 3:
Hash functions with low entropy source in order of coolness:
Java_String_hash low 3000/1000: llps = 5, expecting 9.98866
Buzhash low 3000/1000: llps = 9, expecting 9.98866
hash_CRC low 3000/1000: llps = 30, expecting 9.98866
base256 low 3000/1000: llps = 3000, expecting 9.98866
Java_Object_hash low 3000/1000: llps = 3000, expecting 9.98866
Hash functions with typical entropy source in order of coolness:
Buzhash typical 3000/1000: llps = 8, expecting 9.98866
Java_String_hash typical 3000/1000: llps = 9, expecting 9.98866
hash_CRC typical 3000/1000: llps = 11, expecting 9.98866
Java_Object_hash typical 3000/1000: llps = 32, expecting 9.98866
base256 typical 3000/1000: llps = 206, expecting 9.98866