SE250:lab-5:jsmi233

From Marks Wiki
Revision as of 05:19, 3 November 2008 by Mark (Sọ̀rọ̀ | contribs) (1 revision(s))
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

I start today’s lab by attempting to compile the given code. Using the line below to compile, I get a copious amount of errors. Just to confirm this I tried gcc *.c, but the same occurs.

gcc arraylist.c buzhash.c randtest.c lab-5.c -o lab-5.exe && .\lab-5.exe

It turns out the problem was that I downloaded lab-5.c in html format somehow. Having re-downloaded this file correctly, the code compiles as expected.

It is taking a long time to work out exactly what I am supposed to be doing.

For my choice of numbers I had:

 int sample_size = 1000;
 
 int n_keys = 300;
 
 int table_size = 3000;

Because I just copied off the guy next to me.

The results of running my program were:

Testing Buzhash low on 1000 samples

Entropy = 7.843786 bits per byte.

Optimum compression would reduce the size of this 1000 byte file by 1 percent.

Chi square distribution for 1000 samples is 214.46, and randomly would exceed this value 95.00 percent of the times.

Arithmetic mean value of data bytes is 128.0860 (127.5 = random). Monte Carlo value for Pi is 3.132530120 (error 0.29 percent). Serial correlation coefficient is -0.017268 (totally uncorrelated = 0.0).

Buzhash low 300/3000: llps = 2, expecting 2.38294

Now I’m going to try and work out what this is supposed to mean. The motivation for desiring very random numbers is obvious. The term “entropy” I assume is being used to refer to the degree of randomness. We are testing the entropy of something. The entropy of the different hash functions. What is the sample size? I still don’t get what the sample size is. Its now 11:09am and I still don’t get what the sample size is. The sample size is the number of times we test the function. Therefore the greater number of samples, the closer we will get to theoretical values.

Hence I chose sample_size = 100000, because with any larger value, the program crashes.

rt_add_buzhash, low entropy source: 7.998297

rt_add_buzhash, typical entropy source: 7.997834

rt_add_buzhashn, low entropy source: 7.998236

rt_add_buzhashn, typical entropy source: 7.998236

rt_add_hash_CRC, low entropy source: 5.574705

rt_add_hash_CRC, typical entropy source: 7.840456

rt_add_Java_Integer, low entropy source: 4.828241

rt_add_Java_Integer, typical entropy source: 4.828241

rt_add_Java_Object, low entropy source: 2.000000

rt_add_Java_Object, typical entropy source: 5.754918

rt_add_Java_String, low entropy source: 7.999587

rt_add_Java_String, typical entropy source: 7.945541

rt_add_base256, low entropy source: 0

rt_add_base256, typical entropy source: 4.022975

rt_add_rand, low entropy source: 7.953082

rt_add_rand, typical entropy source: 7.952724

rt_add_high_rand, low entropy source: 7.998276

rt_add_high_rand, typical entropy source: 7.998075


As suggested, the high random function was more random than the rand function.

The functions in order of randomness for the low entropy sources are:

rt_add_Java_String, low entropy source: 7.999587

rt_add_buzhash, low entropy source: 7.998297

rt_add_high_rand, low entropy source: 7.998276

rt_add_buzhashn, low entropy source: 7.998236

rt_add_rand, low entropy source: 7.953082

rt_add_hash_CRC, low entropy source: 5.574705

rt_add_Java_Integer, low entropy source: 4.828241

rt_add_Java_Object, low entropy source: 2.000000

rt_add_base256, low entropy source: 0


The functions in order of randomness for the typical entropy sources are:

rt_add_buzhashn, typical entropy source: 7.998236

rt_add_high_rand, typical entropy source: 7.998075

rt_add_buzhash, typical entropy source: 7.997834

rt_add_rand, typical entropy source: 7.952724

rt_add_Java_String, typical entropy source: 7.945541

rt_add_hash_CRC, typical entropy source: 7.840456

rt_add_Java_Object, typical entropy source: 5.754918

rt_add_Java_Integer, typical entropy source: 4.828241

rt_add_base256, typical entropy source: 4.022975

Expirament two with load factor of 3:

Hash functions with low entropy source in order of coolness:

Java_String_hash low 3000/1000: llps = 5, expecting 9.98866

Buzhash low 3000/1000: llps = 9, expecting 9.98866

hash_CRC low 3000/1000: llps = 30, expecting 9.98866

base256 low 3000/1000: llps = 3000, expecting 9.98866

Java_Object_hash low 3000/1000: llps = 3000, expecting 9.98866


Hash functions with typical entropy source in order of coolness:

Buzhash typical 3000/1000: llps = 8, expecting 9.98866

Java_String_hash typical 3000/1000: llps = 9, expecting 9.98866

hash_CRC typical 3000/1000: llps = 11, expecting 9.98866

Java_Object_hash typical 3000/1000: llps = 32, expecting 9.98866

base256 typical 3000/1000: llps = 206, expecting 9.98866