Talk:SE250:lab-5:jsmi233
Comments from John Hamer
- "I just copied off the guy next to me." — this is not an adequate reason. How confident are you that the sample size is ok? This is important. If you get it wrong, you risk collecting worthless data.
- "Now I’m going to try and work out what this is supposed to mean" — excellent. I'm pleased to see you are not intimidated by venturing into the unknown.
- ". Therefore the greater number of samples, the closer we will get to theoretical values. Hence I chose sample_size = 100000, because with any larger value, the program crashes." — thank you! A coherent justification.
- In fact, "entropy" is just one measure of randomness. You can still gerrymander the data so the entropy is high but the data is predictable. The same is true of all the tests. That's why we use a suite of tests -- it's harder to fool them all.
- Why a load factor of 3?
- Java String is worth commenting on: it does so-so in the stats tests, but comes up "better than expected" in practice. How come?
- base256 is also worthy of comment. How can it be so bad (returning the same value for every input!)?
- And what's up with the Java Object hash, suddenly collapsing?
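The point above about entropy being only one measure of randomness can be illustrated with a small sketch. It assumes the entropy test is a first-order Shannon entropy over byte frequencies, which may differ from what the lab's tool actually computes; the names `byte_entropy` and `predictable` are made up for this example.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte.

    Assumption: this looks only at how often each byte value occurs,
    not at the order in which the values appear.
    """
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A counter that cycles 0..255: every byte value is equally frequent,
# yet each value is trivially predictable from the one before it.
predictable = bytes(i % 256 for i in range(256 * 100))

print(byte_entropy(predictable))
```

Every byte value appears equally often, so this measure reports the maximum of 8 bits per byte even though the next value is always the previous one plus 1 (mod 256). A test that looks at the relationship between successive values, such as serial correlation, would be expected to flag this sequence, which is why a suite of tests is harder to fool than any single one.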
This is a solid effort. You show a good understanding of the material, although you skirted the harder parts of probing the meaning of the different stats tests. Your results could have been better presented in fixed-space tables or as graphs.