Thread 'QMC new jobs, but big trouble..'

Message boards : Projects : QMC new jobs, but big trouble..
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 93700 - Posted: 14 Nov 2019, 15:09:48 UTC
Last modified: 14 Nov 2019, 15:20:20 UTC

I saw that QMC had new work, so I re-enabled it on two machines (hyperthreaded quads). One machine downloaded six tasks, the other seven. All had expected runtimes of 2-3 days, no problems with that, and the due date was a couple of weeks away.

The runtimes are optimistic. The tasks crawl along on both of my 4GHz i7s, but finishing looked possible anyway, so I left them going.

Today, the machine with six jobs is back to showing 1.000% complete and the estimated run times have changed: one is 3d 16:06 and counting, two of the others show 370+ days, two more 390+ days and the last 422+ days. This machine has now used upwards of 21 days on these jobs. All aborted - 21 days of crunching wasted.

The other machine has all seven tasks still running, showing from 64.626% complete, (2d 23:37:30 elapsed), down to 39.556% complete, (1d 30:02:25 elapsed). However tempting, I am not aborting them - YET.

I have, obviously, set no new tasks on both systems.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 93700
robsmith
Volunteer tester
Help desk expert
Joined: 25 May 09
Posts: 1301
United Kingdom
Message 93701 - Posted: 14 Nov 2019, 15:38:20 UTC

Have you reported this to QMC? They need to know in case it is a significant problem rather than an isolated case of bad luck.
ID: 93701
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2703
United Kingdom
Message 93702 - Posted: 14 Nov 2019, 16:28:02 UTC - in response to Message 93700.  

Today, the machine with six jobs is back to showing 1.000% complete and the estimated run times have changed: one is 3d 16:06 and counting, two of the others show 370+ days, two more 390+ days and the last 422+ days. This machine has now used upwards of 21 days on these jobs. All aborted - 21 days of crunching wasted.


Wow! That is even longer than in the early days of CPDN, when many tasks took over six months to complete even on a (then) fast machine.
ID: 93702
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 93708 - Posted: 14 Nov 2019, 19:48:07 UTC - in response to Message 93701.  
Last modified: 14 Nov 2019, 20:16:55 UTC

>>> Have you reported this to QMC

If you go to QMC's site, http://qmcathome.org/, it looks somewhat different to most normal BOINC projects. The forum is not working. I tried to trace Martin Korth, so far without success. The link on the website does not work. He says elsewhere that he is no longer there, has his own company elsewhere, and is freelance, so I am not sure he is even the right person to ask.

This evening, the seven workunits on this machine are still running, varying between 3d 03:59:20, (66.860% done), and 2d 00:23:34, (42.589% done), elapsed. The "remaining" drops by a second every 3-4 seconds. They never seem to stop running, so perhaps the integration with BOINC is not well implemented. This machine is running the QMC tasks, and GPU task projects which use the other CPU core.

Both my machines are Windows 8.1 x64, so it would be nice to have a Linux cruncher comment.

>>> Wow!

Indeed. I well remember the days of CPDN monsters. Of course, the machines were a lot less powerful then. The values were, obviously, ridiculous, which is why I dropped them. Something had gone seriously wrong.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 93708
BobCat13
Joined: 6 Dec 06
Posts: 118
United States
Message 93711 - Posted: 14 Nov 2019, 23:04:57 UTC - in response to Message 93700.  

adrianxw wrote:
I saw that QMC had new work, so I re-enabled it on two machines (hyperthreaded quads). One machine downloaded six tasks, the other seven. All had expected runtimes of 2-3 days, no problems with that, and the due date was a couple of weeks away.

The runtimes are optimistic. The tasks crawl along on both of my 4GHz i7s, but finishing looked possible anyway, so I left them going.

Today, the machine with six jobs is back to showing 1.000% complete and the estimated run times have changed: one is 3d 16:06 and counting, two of the others show 370+ days, two more 390+ days and the last 422+ days. This machine has now used upwards of 21 days on these jobs. All aborted - 21 days of crunching wasted.

The other machine has all seven tasks still running, showing from 64.626% complete, (2d 23:37:30 elapsed), down to 39.556% complete, (1d 30:02:25 elapsed). However tempting, I am not aborting them - YET.

I have, obviously, set no new tasks on both systems.

That is the way QMC tasks run. 1.000% done means that the task has reached its first checkpoint. Of all the QMC tasks I have run, the fastest one to finish completed at the 5.000% mark (about 140 hours) and the longest one was at the 18.000% mark (about 750 hours).

If you are the first to complete the task, you will get credit even if the task has passed the deadline. If someone else completes the task before you, you will not receive credit as your result will be marked "Too late". Also, if you see the task listed as having too many errors, don't worry, you will still receive credit if you can complete it successfully.
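As a rough sketch of that credit rule as I have described it (this is not the project's actual server code; the function name and arguments are made up for illustration):

```python
from datetime import datetime
from typing import Optional


def result_status(my_report: datetime,
                  first_other_report: Optional[datetime]) -> str:
    """Hypothetical model of the rule described in this post.

    The deadline is deliberately ignored: what matters is only whether
    another host reported a completed result before ours.
    """
    if first_other_report is None or my_report < first_other_report:
        return "credit"
    return "too late"


# Nobody else finished, so a result reported after the deadline still counts.
print(result_status(datetime(2019, 12, 20), None))
# Someone else finished first, so our result is marked too late.
print(result_status(datetime(2019, 12, 20), datetime(2019, 12, 1)))
```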
ID: 93711
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 93713 - Posted: 15 Nov 2019, 6:43:01 UTC
Last modified: 15 Nov 2019, 7:02:29 UTC

This morning, all the tasks are still there. There has been a development though. Two of the tasks are now back to showing 1.000% complete, one with 358+ days to complete and the other with 375+ days to complete. The time remaining is going up. There is clearly something really weird going on with them.

What do you people think? Should I leave them running, or is it a lost cause?

<edit>
Sorry Bob, I did not see your post before I wrote that. I've released the two suspended tasks. I will not accept new work. If you crunch the thing to completion and then get ignored, they have some navel searching to do. How do they expect people to put up with that? We all know credit is useless, but for the days spent running these things the machines have been prevented from running other projects' work - simply out of order.
They are clearly VERY weird.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 93713
robsmith
Volunteer tester
Help desk expert
Joined: 25 May 09
Posts: 1301
United Kingdom
Message 93714 - Posted: 15 Nov 2019, 9:02:07 UTC

One thing of note is that the project declares itself to have:
First, inofficial test units out.

This is just to check that everything works as it should.


If that is still the case then don't be too saddened by its poor performance, but I would be concerned about the lack of a forum, which would enable users to report back problems, comments etc.
Personally I will wait until they announce they are releasing "proper" test units, and have cleaned up the website.
ID: 93714
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 93717 - Posted: 15 Nov 2019, 11:50:25 UTC

A few hours later, a third job has dropped to 1.000% with 370-odd days left to go. What you say is quite right, but clearly, things are not working as they should.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 93717
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 93719 - Posted: 15 Nov 2019, 13:08:33 UTC - in response to Message 93714.  
Last modified: 15 Nov 2019, 13:19:29 UTC

... something else you might wonder about is that the project, and the other one on the start page, both purport to be Martin Korth's, who is no longer there, and, thus, show invalid contact lines.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 93719
Jim1348
Joined: 8 Nov 10
Posts: 310
United States
Message 93757 - Posted: 17 Nov 2019, 8:18:22 UTC - in response to Message 93700.  

Today, the machine with six jobs is back to showing 1.000% complete and the estimated run times have changed: one is 3d 16:06 and counting, two of the others show 370+ days, two more 390+ days and the last 422+ days. This machine has now used upwards of 21 days on these jobs. All aborted - 21 days of crunching wasted.

After about 1 1/2 days of progress on my i7-9700 (Ubuntu 18.04.3), the same thing happened to me.
I am out. This thing is not ready for prime time.
ID: 93757
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2703
United Kingdom
Message 93758 - Posted: 17 Nov 2019, 12:10:58 UTC - in response to Message 93757.  

Not going to sign up for this one yet but will keep an eye on it as it is an area of interest. Anyway all my cores are busy for the next two weeks at least.
ID: 93758
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 93759 - Posted: 17 Nov 2019, 13:11:38 UTC

>>> i7-9700 (Ubuntu 18.04.3)

Good to see that it is not a Windows-specific problem. I know a lot of research establishments use Linux as it is cheaper. It does make me wonder though: if it is showing bizarre behaviour on both systems, what did they test it on before starting this exercise?

Currently, 6 of my work units are at the 1.000% stage, some have been there since yesterday, the last is at 70.729% done, so has not reached the checkpoint yet.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 93759
Jim1348
Joined: 8 Nov 10
Posts: 310
United States
Message 93760 - Posted: 17 Nov 2019, 14:14:26 UTC - in response to Message 93759.  

>>> i7-9700 (Ubuntu 18.04.3)

Good to see that it is not a Windows-specific problem. I know a lot of research establishments use Linux as it is cheaper. It does make me wonder though: if it is showing bizarre behaviour on both systems, what did they test it on before starting this exercise?

It is also not Intel vs. AMD. The same thing happened (after 2 days 11 hours) on my Ryzen 3700x, also on Ubuntu.
ID: 93760
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 93761 - Posted: 17 Nov 2019, 17:01:50 UTC
Last modified: 17 Nov 2019, 17:13:43 UTC

The last work unit has also dropped back to the 1.000% done state. It was at 70+% earlier today. I'll leave them running for now and see what happens. The remaining time is increasing rapidly; the 600-day barrier will be broken in about an hour at the current rate.

<edit>
Quicker than I thought, broken whilst still able to edit this message!

Maybe someone there will think to look here for feedback.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 93761
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 94041 - Posted: 5 Dec 2019, 20:25:16 UTC
Last modified: 5 Dec 2019, 20:26:53 UTC

Update, and an observation which might help them, should they actually be looking for any help.
All my tasks on here went back to 1%, with enormous "remaining" values. Since then, running 24/7 (i.e. not being swapped out by BOINC), 2 have advanced as far as 7.000%, 2 are at 6.000% and the last 3 at 5.000%. Something I have noticed is the "remaining" column. As I'd mentioned earlier, this has a ludicrous value in it, and it is increasing for all work units, BUT the 2 at 7% appear to be increasing by smaller amounts than the 6%'s, which in turn seem to be increasing slower than the 5%'s. Earlier, I was seeing 600+ days remaining; now they range from 310 days (7% done) up to 420 days (5% done).
All have passed their deadline date by more than a week now.
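A possible explanation, sketched under a big assumption: suppose the "remaining" figure is a plain linear extrapolation, elapsed time multiplied by (1 - fraction done) / fraction done. The real BOINC client may blend in other estimates, so treat this as illustration only. The function names and the 22 days elapsed are made up for the example.

```python
def remaining_days(elapsed_days: float, fraction_done: float) -> float:
    """Time left if the rest of the task ran at the average rate so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_days * (1.0 - fraction_done) / fraction_done


def growth_per_elapsed_second(fraction_done: float) -> float:
    """Seconds added to 'remaining' per second of run time while the
    reported percentage stays stuck at the same value."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return (1.0 - fraction_done) / fraction_done


ELAPSED_DAYS = 22.0  # assumed value, roughly three weeks of running

for pct in (5, 6, 7):
    f = pct / 100
    print(f"{pct}%: about {remaining_days(ELAPSED_DAYS, f):.0f} days remaining, "
          f"growing {growth_per_elapsed_second(f):.1f} s per second of run time")
```

Under that assumption a task stuck at 5% gains about 19 seconds of "remaining" per second of run time, against about 13 at 7%, which would fit the slower increase on the tasks that are further along. It would also fit the earlier 370+ day figures: 3d 16h elapsed at 1% extrapolates to roughly 363 days.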
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 94041
Jim1348
Joined: 8 Nov 10
Posts: 310
United States
Message 94042 - Posted: 5 Dec 2019, 21:02:56 UTC - in response to Message 94041.  

Trying to decipher their status page is a bit of an art. But I think that the fact that "avg runtime of last 100 results" shows "0.00" means that nobody completes them.
It is a new form of black hole.
http://qmcathome.org/server_status.php
ID: 94042
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 94043 - Posted: 5 Dec 2019, 21:12:12 UTC
Last modified: 5 Dec 2019, 21:14:37 UTC

It says earlier in the thread that even if the due date is passed, if you are the first to return a work unit, it is accepted. I can see that for at least a few of mine, I am the only person still running them, others have aborted or crashed them. Another thing is that they complete at different percentages, the shortest being 5% the longest 12%, (I think - don't quote me there).
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 94043
Dayle Diamond
Joined: 5 Apr 13
Posts: 16
United States
Message 94122 - Posted: 9 Dec 2019, 13:39:41 UTC - in response to Message 94043.  

It's not even letting me register for tasks - the CAPTCHA doesn't load during registration.

Maybe they're not ready for anybody's help yet?
ID: 94122
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 94127 - Posted: 9 Dec 2019, 14:41:53 UTC
Last modified: 9 Dec 2019, 14:46:15 UTC

There are certainly issues with the current "cleanmobility.now 6.06" project. My 7 work units are all still running, uninterrupted, more than 3 weeks in now. Two are at 8.000%, two at 7.000%, one at 6.000% and the last two at 5.000%.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 94127
adrianxw
Joined: 2 Oct 05
Posts: 404
Denmark
Message 94260 - Posted: 14 Dec 2019, 12:44:31 UTC

One of my work units jumped from 10.000% to 100.000% overnight; it is still running though. The others: one at 10.000%, three at 9.000%, two at 6.000%. Oh, the excitement.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 94260

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.