<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://wiki.kram.nz/index.php?action=history&amp;feed=atom&amp;title=SE250%3Alab-5%3Azyan057</id>
	<title>SE250:lab-5:zyan057 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.kram.nz/index.php?action=history&amp;feed=atom&amp;title=SE250%3Alab-5%3Azyan057"/>
	<link rel="alternate" type="text/html" href="https://wiki.kram.nz/index.php?title=SE250:lab-5:zyan057&amp;action=history"/>
	<updated>2026-04-29T00:56:49Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://wiki.kram.nz/index.php?title=SE250:lab-5:zyan057&amp;diff=6913&amp;oldid=prev</id>
		<title>Mark: 14 revision(s)</title>
		<link rel="alternate" type="text/html" href="https://wiki.kram.nz/index.php?title=SE250:lab-5:zyan057&amp;diff=6913&amp;oldid=prev"/>
		<updated>2008-11-03T05:19:57Z</updated>

		<summary type="html">&lt;p&gt;14 revision(s)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Task 1==&lt;br /&gt;
OK...&lt;br /&gt;
So we need to find a suitable sample size before we can test and rank the hash functions.&lt;br /&gt;
&lt;br /&gt;
Here is the code I wrote to find a suitable sample size:&lt;br /&gt;
  ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, 10, &amp;amp;rt_add_buzhash );&lt;br /&gt;
  ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, 100, &amp;amp;rt_add_buzhash );&lt;br /&gt;
  ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, 1000, &amp;amp;rt_add_buzhash );&lt;br /&gt;
  ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, 10000, &amp;amp;rt_add_buzhash );&lt;br /&gt;
  ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, 100000, &amp;amp;rt_add_buzhash );&lt;br /&gt;
  ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, 1000000, &amp;amp;rt_add_buzhash );&lt;br /&gt;
  ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, 10000000, &amp;amp;rt_add_buzhash );&lt;br /&gt;
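&lt;br /&gt;
The seven calls could equally be written as one loop over powers of ten; a minimal sketch, assuming the same ent_test, low_entropy_src and rt_add_buzhash declarations are in scope:&lt;br /&gt;
  int n;&lt;br /&gt;
  /* Sweep the sample size 10, 100, ..., 10000000. */&lt;br /&gt;
  for ( n = 10; n &amp;lt;= 10000000; n *= 10 )&lt;br /&gt;
      ent_test( &amp;quot;Buzhash low&amp;quot;, low_entropy_src, n, &amp;amp;rt_add_buzhash );&lt;br /&gt;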
&lt;br /&gt;
Results:&lt;br /&gt;
&lt;br /&gt;
 Testing Buzhash low on 10 samples&lt;br /&gt;
 Entropy = 3.584963 bits per byte.&lt;br /&gt;
 &lt;br /&gt;
 Optimum compression would reduce the size&lt;br /&gt;
  of this 12 byte file by 55 percent.&lt;br /&gt;
 &lt;br /&gt;
 Chi square distribution for 12 samples is 244.00, and randomly&lt;br /&gt;
  would exceed this value 50.00 percent of the times.&lt;br /&gt;
 &lt;br /&gt;
 Arithmetic mean value of data bytes is 126.0833 (127.5 = random).&lt;br /&gt;
 Monte Carlo value for Pi is 2.000000000 (error 36.34 percent).&lt;br /&gt;
 Serial correlation coefficient is -0.341362 (totally uncorrelated = 0.0).&lt;br /&gt;
&lt;br /&gt;
 Testing Buzhash low on 100 samples&lt;br /&gt;
 Entropy = 6.248758 bits per byte. &lt;br /&gt;
 &lt;br /&gt;
 Optimum compression would reduce the size&lt;br /&gt;
 of this 100 byte file by 21 percent.&lt;br /&gt;
 &lt;br /&gt;
 Chi square distribution for 100 samples is 273.76, and randomly&lt;br /&gt;
 would exceed this value 25.00 percent of the times.&lt;br /&gt;
 &lt;br /&gt;
 Arithmetic mean value of data bytes is 129.3100 (127.5 = random).&lt;br /&gt;
 Monte Carlo value for Pi is 3.250000000 (error 3.45 percent).&lt;br /&gt;
 Serial correlation coefficient is -0.092433 (totally uncorrelated = 0.0).&lt;br /&gt;
&lt;br /&gt;
 Testing Buzhash low on 1000 samples&lt;br /&gt;
 Entropy = 7.847331 bits per byte.&lt;br /&gt;
 &lt;br /&gt;
 Optimum compression would reduce the size&lt;br /&gt;
 of this 1000 byte file by 1 percent.&lt;br /&gt;
 &lt;br /&gt;
 Chi square distribution for 1000 samples is 207.81, and randomly&lt;br /&gt;
 would exceed this value 97.50 percent of the times.&lt;br /&gt;
 &lt;br /&gt;
 Arithmetic mean value of data bytes is 126.7080 (127.5 = random).&lt;br /&gt;
 Monte Carlo value for Pi is 3.277108434 (error 4.31 percent).&lt;br /&gt;
 Serial correlation coefficient is 0.007539 (totally uncorrelated = 0.0).&lt;br /&gt;
&lt;br /&gt;
 Testing Buzhash low on 10000 samples&lt;br /&gt;
 Entropy = 7.984998 bits per byte.&lt;br /&gt;
 &lt;br /&gt;
 Optimum compression would reduce the size&lt;br /&gt;
 of this 10000 byte file by 0 percent.&lt;br /&gt;
 &lt;br /&gt;
 Chi square distribution for 10000 samples is 206.87, and randomly&lt;br /&gt;
 would exceed this value 97.50 percent of the times.&lt;br /&gt;
 &lt;br /&gt;
 Arithmetic mean value of data bytes is 126.8134 (127.5 = random).&lt;br /&gt;
 Monte Carlo value for Pi is 3.157262905 (error 0.50 percent).&lt;br /&gt;
 Serial correlation coefficient is 0.008094 (totally uncorrelated = 0.0).&lt;br /&gt;
&lt;br /&gt;
 Testing Buzhash low on 100000 samples&lt;br /&gt;
 Entropy = 7.998378 bits per byte.&lt;br /&gt;
 &lt;br /&gt;
 Optimum compression would reduce the size&lt;br /&gt;
 of this 100000 byte file by 0 percent.&lt;br /&gt;
 &lt;br /&gt;
 Chi square distribution for 100000 samples is 224.41, and randomly&lt;br /&gt;
 would exceed this value 90.00 percent of the times.&lt;br /&gt;
 &lt;br /&gt;
 Arithmetic mean value of data bytes is 127.6864 (127.5 = random).&lt;br /&gt;
 Monte Carlo value for Pi is 3.112444498 (error 0.93 percent).&lt;br /&gt;
 Serial correlation coefficient is 0.000743 (totally uncorrelated = 0.0).&lt;br /&gt;
&lt;br /&gt;
 Testing Buzhash low on 1000000 samples&lt;br /&gt;
 Entropy = 7.999896 bits per byte.&lt;br /&gt;
 &lt;br /&gt;
 Optimum compression would reduce the size&lt;br /&gt;
 of this 1000000 byte file by 0 percent.&lt;br /&gt;
 &lt;br /&gt;
 Chi square distribution for 1000000 samples is 144.94, and randomly&lt;br /&gt;
 would exceed this value 99.99 percent of the times.&lt;br /&gt;
 &lt;br /&gt;
 Arithmetic mean value of data bytes is 127.4920 (127.5 = random).&lt;br /&gt;
 Monte Carlo value for Pi is 3.144948580 (error 0.11 percent).&lt;br /&gt;
 Serial correlation coefficient is -0.000856 (totally uncorrelated = 0.0).&lt;br /&gt;
&lt;br /&gt;
 Testing Buzhash low on 10000000 samples&lt;br /&gt;
 Entropy = 7.999986 bits per byte.&lt;br /&gt;
 &lt;br /&gt;
 Optimum compression would reduce the size&lt;br /&gt;
 of this 10000000 byte file by 0 percent.&lt;br /&gt;
 &lt;br /&gt;
 Chi square distribution for 10000000 samples is 191.37, and randomly&lt;br /&gt;
 would exceed this value 99.50 percent of the times.&lt;br /&gt;
 &lt;br /&gt;
 Arithmetic mean value of data bytes is 127.4860 (127.5 = random).&lt;br /&gt;
 Monte Carlo value for Pi is 3.140720456 (error 0.03 percent).&lt;br /&gt;
 Serial correlation coefficient is 0.000051 (totally uncorrelated = 0.0).&lt;br /&gt;
&lt;br /&gt;
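For reference, the Entropy figure is the Shannon entropy of the byte histogram, in bits per byte, so 8.0 is the maximum. This also explains the first run: a 12 byte file holds at most 12 distinct byte values, capping its measurable entropy at log2(12) = 3.584963 bits per byte, exactly the figure reported, so tiny samples systematically understate entropy. A minimal sketch of the statistic, assuming the hashed output sits in a plain byte buffer (this helper is not part of the lab harness):&lt;br /&gt;
  #include &amp;lt;math.h&amp;gt;&lt;br /&gt;
  #include &amp;lt;stddef.h&amp;gt;&lt;br /&gt;
  &lt;br /&gt;
  /* Shannon entropy of buf, in bits per byte (8.0 = uniform). */&lt;br /&gt;
  double entropy_bits_per_byte( const unsigned char *buf, size_t len ) {&lt;br /&gt;
      size_t count[ 256 ] = { 0 };&lt;br /&gt;
      double h = 0.0;&lt;br /&gt;
      size_t i;&lt;br /&gt;
      for ( i = 0; i &amp;lt; len; i++ )&lt;br /&gt;
          count[ buf[ i ] ]++;          /* byte histogram */&lt;br /&gt;
      for ( i = 0; i &amp;lt; 256; i++ )&lt;br /&gt;
          if ( count[ i ] &amp;gt; 0 ) {&lt;br /&gt;
              double p = (double) count[ i ] / (double) len;&lt;br /&gt;
              h -= p * log( p ) / log( 2.0 );  /* -p * log2(p) */&lt;br /&gt;
          }&lt;br /&gt;
      return h;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;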
The results suggest that sample sizes beyond 100000 make little difference: by then the entropy has essentially reached the 8 bits per byte ceiling, and the arithmetic mean, Monte Carlo Pi estimate and serial correlation have all settled near their ideal values.&lt;/div&gt;</summary>
		<author><name>Mark</name></author>
	</entry>
</feed>