1)
Message boards :
Projects :
bad news everyone, results of the BOINC Collatz project are invalid
(Message 103311)
Posted 28 Feb 2021 by poppinfresh99 Post: At that speed, you should start up a rival project yourself. How easy is it to run a Boinc server? Sadly, I do not want to for many reasons... (1) I don't have the hardware or Internet package necessary to do this. My newest computer is from 2012, and Xfinity doesn't let me host a server. I am perfectly happy with my hardware at the moment. (2) I honestly think that the Collatz conjecture has been experimentally tested enough. I was motivated to write these codes because I find the algorithms and mathematical tricks to be fascinating. (3) I would much prefer that people use their resources to cure diseases on BOINC. (4) I simply wouldn't enjoy managing a BOINC project, though helping someone set one up would be a fun learning experience. I prefer math and code (and other things). Thanks for asking, but please now respect my choice (others have not been so respectful). But I'd be very happy to let someone else use my code for a BOINC project because, if you are going to use resources, you should be using the best code! In fact, over the summer, I could even help you set it up! I'm kinda curious about how setting up BOINC projects works, and I like the BOINC philosophy of giving people choice to compute whatever they want. |

2)
Message boards :
Projects :
bad news everyone, results of the BOINC Collatz project are invalid
(Message 103305)
Posted 28 Feb 2021 by poppinfresh99 Post: Update: Jon Sonntag now claims that he never was trying to find highest steps even though *all* of the results he posts are of highest steps, and, at massive time and electricity cost, he reduces numbers to 1 while counting the steps it takes to do so. Maybe he is being honest about not trying to find highest steps and is just extremely confused? Either way, there is code that is 7700% or 3500% faster (for CPU and GPU, respectively), making the project a waste of time and electricity regardless of whether or not one feels that the Collatz conjecture should still be experimentally tested. Here are ways he can speed up his code, which have been communicated to him as well (over the past 12 months)... (1) By not reducing to 1 and just reducing below the starting number, you only need to do 3.492652 steps on average, which is a *vast* improvement over the 4.8 * log2(N) steps of his current approach (where log2(N) is approximately 70). What I mean by the starting number is the 9 in the following example... 9 -> 14 -> 7 -> 11 -> 17 -> 26 -> 13 -> 20 -> 10 -> 5 -> 8 -> 4 -> 2 -> 1 Once you get to 7, you stop because it is less than the starting 9. Just 2 steps are needed instead of the above many steps. Then, when you test the starting number of 10, it immediately reduces to 5, so you move to 11. Of course, a good sieve would prevent 9, 10, and 11 from ever being tested, but you get the idea. For this to work, you can't skip anything not excluded by a sieve. (2) For a 2^k sieve and a given k, I exclude about 25% more numbers than he does by removing numbers that join the paths of smaller numbers. (3) My "sieveless" method allows me to run FAR larger sieves. His 2^26 sieve requires me to test 1.3% of numbers, but a 2^60 sieve, which my "sieveless" technique can do, would probably run only 0.05% of numbers. This is a *vast* improvement. (4) Using a 3^1 sieve allows me to run 67% of the numbers he currently runs. A 3^2 sieve allows me to test 55% of the numbers he currently runs. I mentioned BOINC credit as a side note earlier. The fact that his code isn't valid is a larger concern! As for making code that is actually valid, the following has been communicated to the project... (1) One should create checksums of steps-to-reduce-below-starting-number as you go then compare the checksum with another device that is validating the task. These checksums require minimal computational time. David Barina's code makes these checksums if you want to see how they are done. I temporarily put them in my "sieveless" code as well to compare my checksums against his (they were the same), though my "repeated k steps" code hasn't yet been compared to any other code (though it indirectly has via my "reduce to 1" code reproducing known records). (2) Reduce maxStep in the OpenCL kernel. Even better, have a better way of detecting a GPU watchdog timer! Note that the checksum is not enough because the checksum will never be updated for *any* device with a watchdog timer that encounters an infinite cycle that would disprove the Collatz conjecture. (3) Do not use Intel or AMD devices that fail an initial arithmetic-accuracy test. On my webpage, I provide some basic arithmetic tests that many Intel and AMD devices fail (the ones that fail worrisomely give the same wrong answers as each other). He must then start over. But, with the massive speedup of good code and the always-faster GPUs that people have, he could easily reproduce all of his previous results in a matter of months. |

3)
Message boards :
Projects :
bad news everyone, results of the BOINC Collatz project are invalid
(Message 102943)
Posted 9 Feb 2021 by poppinfresh99 Post: Thanks for the reply robsmith. I have taken it up with the project, but Jon Sonntag deleted my posts, so I would like the truth to appear at least somewhere. If you don't mind, I would like to post a bit more information here, so then I can simply link to this thread. I'd rather Jon Sonntag delete a simple link than be able to delete the entire thread. Also, can't BOINC remove the project from their list? https://boinc.berkeley.edu/projects.php I have found smoking-gun proof! By running a task from the BOINC Collatz project, I found a file called sieve26.bin! I ran the following Linux command (on my macOS actually)... od -N 256 -t u4 sieve26.bin Giving the result... 0000000 1037374 0 27 31 0000020 47 63 71 91 0000040 103 111 155 167 0000060 223 251 447 495 0000100 671 703 795 871 0000120 927 1055 1307 1383 0000140 1407 1471 1583 1639 0000160 1695 1743 1819 2047 0000200 2111 2151 2207 2215 0000220 2287 2375 2543 2559 0000240 2715 2751 2815 2943 0000260 3055 3071 3099 3175 0000300 3227 3323 3391 3399 0000320 3431 3567 3631 3711 0000340 3739 3823 3839 3943 0000360 3951 4095 4123 4127 If you run the following simple Python 3 code, you can reproduce the above invalid 2^26 sieve. k = 26 nStart = 3 count = 0 while nStart < 2**k: nStart += 4 # since only n%4 == 3 are in the sieve n = nStart for i in range(k): if n%2 == 1: # odd n = (3*n+1) // 2 else: n = n // 2 if n < nStart: break if i == k-1: if count < 62: print(nStart) count += 1 print(count) I can now for sure conclude that the results of the BOINC Collatz project are invalid! I asked David BaĆina for his expert input, and he agrees that this type of sieve is invalid. He sent me this link for valid sieves... http://www.ericr.nl/wondrous/techpage.html When searching for highest steps, we must only use the sieves described in the "class record algorithm" section of the above link. |

4)
Message boards :
Projects :
bad news everyone, results of the BOINC Collatz project are invalid
(Message 102903)
Posted 9 Feb 2021 by poppinfresh99 Post: I have actually just discovered that certain types of sieves can be used when reducing to 1! A 3^2 (or 3^1) sieve works very nicely, but the 2^k sieve that you can create has a huge what-I-call deltaN that is a real pain for not-wonderful gain. See my webpage for more details! https://www.bradleyknockel.com/collatz/ Anyway, the BOINC Collatz project uses the invalid kind of sieve for reducing numbers to 1. |

5)
Message boards :
Projects :
bad news everyone, results of the BOINC Collatz project are invalid
(Message 102842)
Posted 7 Feb 2021 by poppinfresh99 Post: Bad news everyone, results of the BOINC Collatz project are invalid. The project has the goal of finding the high steps to reduce a number to 1. However, you may not use a sieve when doing this. See the kernel code from this project below, and you'll see that a sieve is used. Please see this paper for a valid algorithm. http://www.ijnc.org/index.php/ijnc/article/download/135/144 See "Algorithm 2". All results of this BOINC Collatz project regarding high steps are invalid. In light of this, there are algorithms that don't find high steps to reduce a number to 1 that are 78 times faster for CPU (7700% faster) and 36 times faster on GPU (3500% faster), so this BOINC project performs 1% as fast on CPU and 3% as fast on GPU as it should. This is assuming that this BOINC project's code is valid in any form, and I have many serious concerns from the small amount of code that I can actually see. Let's look at a 2^2 sieve for a very simple example of what this project is doing wrong. This sieve will only run 25% of the numbers... 3, 7, 11, 15, 19, 23, ... start number... 1: 0 steps 2: 1 step 3: 7 steps 4: 2 steps 5: 5 steps 6: 8 steps ... But notice that 8 steps occurs at 6, which would never be tested with this sieve. I suppose the 1 step of number 2 isn't going to be found either! You can easily confirm for yourself that this project is still invalid by looking at... https://boinc.thesonntags.com/collatz/high_steppers.php It only shows numbers of pattern 3, 7, 11, 15, 19, ... The following are also all very worrisome, especially considering that, in this project, tasks are not validated by another device... (1) most devices at this project do not have ECC RAM (2) maxStep = 2^28-1, which on the many GPUs that have a watchdog timer, may not detect the infinite cycles that would disprove the Collatz conjecture. This would especially be a problem if global_work_size is large! (3) AMD and Intel GPUs are known to commonly make arithmetic errors for the 64-bit integers (ulong) used in the code It is important that EVERY number be tested, and I have zero confidence that this is true. I am even more worried because Jon Sonntag, the project admin, refuses to release *any* amount of source code for review. I have had to struggle to figure out the above! Who knows what other errors exist?! Dare I mention that, as many of us know, the project gives a HUGE amount of BOINC credit! Dare I mention the lack of organized authoritative results? I have emailed BOINC to have them consider removing this project from their published list. Science United uses that list to assign work. BOINC says they are unable to determine if the code is invalid. Jon Sonntag deleted my forum posts at his project, so I am posting at these BOINC forums. I would hate for any scientist to take the results of his project seriously. The lesson to be learned here is to listen to the scientists and to the experts, and don't follow people who are all show and no substance. Just because Jon Sonntag knows more than you about something doesn't mean that he is trained in thinking scientifically. I have very fast and valid code here... https://www.bradleyknockel.com/collatz/ I provide many suggestions for getting this code to work with something like BOINC. My code is open for review, so let me know if you find any improvements! Please let me know if you want to use the code! No one at the BOINC Collatz project understands the science behind the project, so, if you do, please confirm my conclusions! Jon Sonntag's invalid kernel code... //NOTICE: This kernel must define LUTSTEP and SIEVESTEP //Table: power of 3 __constant uint power3[21] = {1u,3u,9u,27u,81u,243u,729u,2187u,6561u,19683u, 59049u,177147u,531441u,1594323u,4782969u,14348907u,43046721u,129140163u,387420489u,1162261467u, 3486784401u }; inline uint2 mul64(const uint a, const uint b){ return (uint2)(a*b, mul_hi(a,b)); } __kernel void kernelSteps ( __global uint4 *results, //Result: s0:max step; s1:low offset; s2:high offset; s3:total steps. Size = global work size __global const uint *sieve, //Numbers needed to compute for every 2^(SIEVESTEP) numbers. size = global work size __global const uint2 *table, //look-up table for jumping in collatz algorithm. Size = 2^(LUTSTEP) __global const uint *delay, //Collatz delays for numbers under 2^(LUTSTEP),. Size = 2^(LUTSTEP) const ulong2 start, //Starting value (128-bit), must be multiple of 2^(SIEVESTEP) const uint kNo //'kernel set' number, kNo = 1 means offset = 1* 2^(SIEVESTEP). kNo = 0 means clear the result ) { const uint gid = get_global_id(0); const uint bitDiff = 32 - LUTSTEP, tableSizeMinusOne = (1u << LUTSTEP) - 1; const uint maxStep = 0xfffffff; //maximum step allowed by this kernel (2^28-1) uint contCarry = 1u, stepCount = 0, p3, overflow; //contCarry : if loop should continue and hold carry bits, stepCount : count collatz Delay, p3 : store power3[] const ulong totalOffset = (ulong)sieve[gid] + ((ulong)kNo * (ulong)(SIEVESIZE)); ulong2 uval = (ulong2)(start.s0 + totalOffset, start.s1); uval.s1 += (uval.s0 < totalOffset); uint2 lut, mulResult, val_h; //lut : table item. mulResult : mul64() result, val_h : val and val_h combined to be a 196-bit integer uint4 val = (uint4)((uint)uval.s0,(uint)(uval.s0>>32), (uint)uval.s1, (uint)(uval.s1>>32)); val_h = (uint2)(0u, 0u ); //Loop while(contCarry){ //get look-up table item lut = table[val.s0 & tableSizeMinusOne]; p3 = power3[lut.s0]; //Do n = (n>>LUTSTEP)*a + b //bit 0-31 mulResult = mul64((val.s0 >> LUTSTEP) + (val.s1 << bitDiff), p3); //Multiply 'a' val.s0 = mulResult.s0 + lut.s1; //Add 'b' contCarry = mulResult.s1 + (val.s0 < mulResult.s0); //count carry bits //bit 32-63 mulResult = mul64((val.s1 >> LUTSTEP) + (val.s2 << bitDiff), p3); val.s1 = mulResult.s0 + contCarry; contCarry = mulResult.s1 + (val.s1 < mulResult.s0); //bit 64-95 mulResult = mul64((val.s2 >> LUTSTEP) + (val.s3 << bitDiff), p3); val.s2 = mulResult.s0 + contCarry; contCarry = mulResult.s1 + (val.s2 < mulResult.s0); //bit 96-127 mulResult = mul64((val.s3 >> LUTSTEP) + (val_h.s0 << bitDiff), p3); val.s3 = mulResult.s0 + contCarry; contCarry = mulResult.s1 + (val.s3 < mulResult.s0); //bit 128-159 mulResult = mul64((val_h.s0 >> LUTSTEP) + (val_h.s1 << bitDiff), p3); val_h.s0 = mulResult.s0 + contCarry; contCarry = mulResult.s1 + (val_h.s0 < mulResult.s0); //bit 160-191 and overflow detection mulResult = mul64((val_h.s1 >> LUTSTEP), p3); val_h.s1 = mulResult.s0 + contCarry; overflow = mulResult.s1 + (val_h.s1 < mulResult.s0); stepCount += (lut.s0 + LUTSTEP); //add step //if val < tableSize or overflow or step >= maxStep, exit loop (set contCarry to 0), else set it to 1 contCarry = ((val.s0 > tableSizeMinusOne) | (val.s1 | val.s2 | val.s3 | val_h.s0 | val_h.s1)) && (!(overflow | (stepCount >= maxStep))); } //If overflow is true, stepCount = maxStep; otherwise, stepCount += delay[val.s0&(tableSize-1)], contCarry is use as a temporary int contCarry = stepCount + delay[val.s0 & tableSizeMinusOne]; stepCount = select(contCarry, maxStep, overflow); //Use val as the result to save space as it's no longer used val = results[gid]; contCarry = (stepCount > val.s0) | (kNo == 0) | ((stepCount == val.s0) && (totalOffset < ((ulong)val.s1 + ((ulong)val.s2<<32)))); val.s0 = select(val.s0,stepCount,contCarry); val.s1 = select(val.s1, (uint)totalOffset, contCarry); val.s2 = select(val.s2, (uint)(totalOffset>>32), contCarry); contCarry = (kNo == 0); val.s3 = select(val.s3 + stepCount, stepCount, contCarry); //save result to output array results[gid] = val; } //end of kernel __kernel void kernelReduce(__global uint4 *results, __global uint4 *out, __local uint4 *temp) { uint gid = get_global_id(0); uint gsz = get_global_size(0); uint lid = get_local_id(0); uint lsz = get_local_size(0); uint4 a = results[gid]; uint4 b = results[gid + gsz]; ulong ua, ub; ua = (ulong)a.s1 + ((ulong)a.s2<<32); ub = (ulong)b.s1 + ((ulong)b.s2<<32); uint x = (a.s0 > b.s0) || ((a.s0 == b.s0) && (ua < ub)) ; temp[lid].s0 = select(b.s0,a.s0,x); temp[lid].s1 = select(b.s1,a.s1,x); temp[lid].s2 = select(b.s2,a.s2,x); temp[lid].s3 = a.s3 + b.s3; barrier(CLK_LOCAL_MEM_FENCE); for(uint stride=lsz/2;stride>1;stride>>=1) { if (lid < stride) { a = temp[lid]; b = temp[lid + stride]; ua = (ulong)a.s1 + ((ulong)a.s2<<32); ub = (ulong)b.s1 + ((ulong)b.s2<<32); x = (a.s0 > b.s0) || ((a.s0 == b.s0) && (ua < ub)) ; temp[lid].s0 = select(b.s0,a.s0,x); temp[lid].s1 = select(b.s1,a.s1,x); temp[lid].s2 = select(b.s2,a.s2,x); temp[lid].s3 = a.s3 + b.s3; } barrier(CLK_LOCAL_MEM_FENCE); } if (lid == 0) { //x = temp[0].s0 >= temp[1].s0; ua = (ulong)temp[0].s1 + ((ulong)temp[0].s2<<32); ub = (ulong)temp[1].s1 + ((ulong)temp[1].s2<<32); x = (temp[0].s0 > temp[1].s0) || ((temp[0].s0 == temp[1].s0) && (ua < ub)) ; a.s0 = select(temp[1].s0,temp[0].s0,x); a.s1 = select(temp[1].s1,temp[0].s1,x); a.s2 = select(temp[1].s2,temp[0].s2,x); a.s3 = temp[0].s3 + temp[1].s3; out[get_group_id(0)] = a; } } |

6)
Message boards :
Questions and problems :
what RAM is limited by settings?
(Message 98002)
Posted 23 Apr 2020 by poppinfresh99 Post: For the computing preference "use at most __% memory", what memory? Is this your total physical memory (4 GB, 8 GB, or whatever), non-BOINC free memory, ...? The default 90% would be a lot if it's total physical memory! If free memory, wouldn't memory compression and swapping of non-BOINC tasks make this complicated? Thanks! |

Copyright © 2021 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.