Fun with versions 6.4.5 and 6.6.3/4

Message boards : Questions and problems : Fun with versions 6.4.5 and 6.6.3/4
Furlozza
Joined: 9 Feb 09
Posts: 8
Australia
Message 22976 - Posted: 9 Feb 2009, 15:13:09 UTC

I am using a dual boot (Ubuntu 64-bit and XP Pro 32-bit), which had been working well in both until today.

Switching back to Windows, where I had V 6.2.x installed, it started quite nicely as usual and downloaded the units I wanted. Then, when I went to move to the Tasks tab to see what jobs I had, it froze. The WUs continued, but closing the Manager saved them. I installed your latest version and the same thing occurred again. The jobs that were downloaded were for Milkyway (V17) and Seti CUDA (WUs that, I note, cause similar events in Windoze 64 as well). I am also running CPDN and WCG, but they have been suspended since reopening BM the first time today.

I uninstalled BOINC and reinstalled V6.6.4 with the same result (it hangs on trying to move to "Tasks", but WUs continue to be processed). After another uninstall and an install of 6.4.5, clicking on "Tasks" doesn't freeze BOINC, and I can see what is in the 'puter and what is being done.

When I had first installed 6.4.5 back in January and used CUDA, whilst others talked of the need for a cc_config.xml file, I didn't need one, as my E6850 performed on all three cylinders (CPU and GPU). However, this time I needed to use the cc_config.xml file, which I compiled and placed in the correct folder, because it started doing both CUDA for Seti and two Milkyway WUs at once. As I type, it is doing THREE, that's right, THREE Milkyway units at once. Not bad for a dual core plus GPU to do three CPU WUs at once.

Is it possible to get a cure to ANY of the problems before I just decide to quit, period?

EDIT: Under 6.4.5, The process tree for boinc reads:
explorer.exe->boincmgr.exe->boinc.exe->MB_6.08;Milkyway_0.17;Milkyway_0.17; Milkyway_0.17

In 6.6.4 it went:
explorer.exe->boinc.exe->MB_6.08;Milkyway_0.17;Milkyway_0.17

System->smss.exe->winlogin.exe->(services)boincmgr.exe

This is according to Sysinternals Process Explorer. I am not too sure of the exact path for manager under 6.6.4, but am definite in that it was up in the Systems area, and NOT under explorer.exe.
Hope this helps.
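
To make the difference between the two trees concrete, here is a minimal Python sketch. It transcribes the parent links as reported above (process names are kept exactly as written in the post, and the "(services)" label is ignored because its meaning is unclear), then checks whether boincmgr.exe is an ancestor of boinc.exe. The dictionary layout and the helper names are assumptions made for illustration, not output from Process Explorer.

```python
# Parent links (child -> parent) transcribed from the trees reported above.
# Names are kept as written in the post; the science apps are left out.
TREE_645 = {
    "boincmgr.exe": "explorer.exe",
    "boinc.exe": "boincmgr.exe",
}
TREE_664 = {
    "boinc.exe": "explorer.exe",
    "smss.exe": "System",
    "winlogin.exe": "smss.exe",
    "boincmgr.exe": "winlogin.exe",
}


def ancestors(tree, name):
    """Walk the parent links upward and return the chain of ancestors."""
    chain = []
    seen = set()
    while name in tree and name not in seen:
        seen.add(name)  # guard against accidental cycles in the mapping
        name = tree[name]
        chain.append(name)
    return chain


def manager_owns_client(tree):
    """True if the Manager sits above the client in the same branch."""
    return "boincmgr.exe" in ancestors(tree, "boinc.exe")


for label, tree in (("6.4.5", TREE_645), ("6.6.4", TREE_664)):
    print(label, "manager is ancestor of client:", manager_owns_client(tree))
```

Under these assumptions the check is true for 6.4.5 and false for 6.6.4, which is the "two limbs" observation in code form.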
ID: 22976 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 22977 - Posted: 9 Feb 2009, 15:43:28 UTC

I have to ask - what were you hoping to achieve by downloading an application clearly marked

(MAY BE UNSTABLE - USE ONLY FOR TESTING)

It sounds as if these versions are exactly as described: UNSTABLE - lol.

If your intention is to help the developers uncover the bugs - welcome aboard. Please post your error messages and debug logs here.

If you would prefer to continue stable production crunching for your science projects, I suggest you stick with the recommended stable version.
ID: 22977 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 22978 - Posted: 9 Feb 2009, 15:52:55 UTC - in response to Message 22976.  

However, this time I needed to use the cc_config.xml file, which I compiled and placed in the correct folder, because it started doing both CUDA for Seti and two Milkyway WUs at once. As I type, it is doing THREE, that's right, THREE Milkyway units at once. Not bad for a dual core plus GPU to do three CPU WUs at once.

And what was in that cc_config.xml file? By chance the <ncpus>3</ncpus> flag?

Please read what options in cc_config.xml do.
<ncpus>
Act as if there were N CPUs: run N tasks at once. This is for debugging, i.e. to simulate 2 CPUs on a machine that has only 1. Don't use it to limit the number of CPUs used by BOINC; use general preferences instead.

The hanging and crashing of BOINC Manager when going to the Task tab is being looked into. If you can post the last stack trace that is in the stderrgui.txt file, the developers can check that against what they are already suspecting it may be.

Otherwise, just do not use 6.6.x until it's the recommended release.
ID: 22978 · Report as offensive
Furlozza
Joined: 9 Feb 09
Posts: 8
Australia
Message 22980 - Posted: 9 Feb 2009, 16:19:07 UTC - in response to Message 22978.  
Last modified: 9 Feb 2009, 16:33:19 UTC

The config file may not be necessary, as I have found that it was BOINC's habit of allocating priority to jobs that it feels may not complete in time that made the three Milkyway WUs run at once. By suspending all bar 5 WUs, it now uses the GPU and two CPUs.

As for the files, I think they were wiped when I deleted each and every file/folder, firstly via Windows and then manually, for a complete clean install of 6.4.5.

I had been using 6.6.3 without any troubles until I downloaded WUs for both Seti and Milkyway today. Milkyway was the newest program I had joined in Windoze, whereas Seti had been running perfectly well in 6.6.3 as a CUDA with modified app.

I repeat the following:

EDIT: Under 6.4.5, The process tree for boinc reads:

explorer.exe->boincmgr.exe->boinc.exe->MB_6.08;Milkyway_0.17;Milkyway_0.17; Milkyway_0.17

In 6.6.4 it went:

explorer.exe->boinc.exe->MB_6.08;Milkyway_0.17;Milkyway_0.17 AND

System->smss.exe->winlogin.exe->(services)boincmgr.exe


IE, in 6.6.4 there were TWO limbs for Boinc as opposed to 6.4.5's single all encompassing limb.

Oh, and am aware they are not stable.... but when a program goes from stable to inoperable and all that has changed is the downloading of WUs from a new project, ...............

EDIT: here's the file (shows how much I know)



Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.6.3


Dump Timestamp : 02/09/09 23:13:40
Loaded Library : dbghelp.dll
Loaded Library : symsrv.dll
Loaded Library : srcsrv.dll
Loaded Library : version.dll


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.6.3


Dump Timestamp : 02/09/09 23:15:27
Loaded Library : dbghelp.dll
Loaded Library : symsrv.dll
Loaded Library : srcsrv.dll
Loaded Library : version.dll


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.6.3


Dump Timestamp : 02/09/09 23:26:02
Loaded Library : dbghelp.dll
Loaded Library : symsrv.dll
Loaded Library : srcsrv.dll
Loaded Library : version.dll


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.6.4


Dump Timestamp : 02/09/09 23:32:44
Loaded Library : dbghelp.dll
Loaded Library : symsrv.dll
Loaded Library : srcsrv.dll
Loaded Library : version.dll


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.6.4


Dump Timestamp : 02/09/09 23:47:37
Loaded Library : dbghelp.dll
Loaded Library : symsrv.dll
Loaded Library : srcsrv.dll
Loaded Library : version.dll

Apologies if it is abridged....... could reinstall 6.6.4 again and see what transpires if necessary.
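
The repeated entries in the log above can be summarised with a short Python sketch. It assumes the record layout seen in this excerpt (a "Reason:" line followed by a debugger version line and a "Dump Timestamp" line); the function name and the abridged sample are illustrative only, and a full stderrgui.txt may contain more fields than this handles.

```python
import re
from collections import Counter

# Patterns for the three lines of interest in each dump record.
REASON = re.compile(
    r"Reason: (?P<kind>.+?) \((?P<code>0x[0-9a-fA-F]+)\) at address "
    r"(?P<at>0x[0-9a-fA-F]+) (?P<op>read|write) attempt to address "
    r"(?P<target>0x[0-9a-fA-F]+)"
)
VERSION = re.compile(r"BOINC Windows Runtime Debugger Version (?P<version>[\d.]+)")
STAMP = re.compile(r"Dump Timestamp\s*:\s*(?P<stamp>.+)")

# Two records abridged from the log posted above.
SAMPLE = """\
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034
Engaging BOINC Windows Runtime Debugger...
BOINC Windows Runtime Debugger Version 6.6.3
Dump Timestamp : 02/09/09 23:13:40
Reason: Access Violation (0xc0000005) at address 0x7C421FA1 read attempt to address 0x00000034
Engaging BOINC Windows Runtime Debugger...
BOINC Windows Runtime Debugger Version 6.6.4
Dump Timestamp : 02/09/09 23:32:44
"""


def parse_crashes(text):
    """Collect one dict per dump: exception details, version, timestamp."""
    crashes = []
    current = None
    for line in text.splitlines():
        match = REASON.search(line)
        if match:
            current = match.groupdict()  # a new record starts here
            continue
        if current is None:
            continue
        match = VERSION.search(line)
        if match:
            current["version"] = match.group("version")
            continue
        match = STAMP.search(line)
        if match:
            current["stamp"] = match.group("stamp").strip()
            crashes.append(current)  # the timestamp closes the record
            current = None
    return crashes


crashes = parse_crashes(SAMPLE)
summary = Counter((c["version"], c["at"], c["target"]) for c in crashes)
for (version, at, target), count in summary.items():
    print(f"{version}: {count} crash(es) at {at}, {target} accessed")
```

On the full excerpt posted above, this would group all five dumps under the same faulting address, three from 6.6.3 and two from 6.6.4. A read at 0x00000034 is consistent with code dereferencing a null pointer at a small offset, though only the developers' symbols can confirm that.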
ID: 22980 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 22981 - Posted: 9 Feb 2009, 16:35:51 UTC - in response to Message 22980.  

The config file may not be necessary, as I have found that it was BOINC's habit of allocating priority to jobs that it feels may not complete in time that made the three Milkyway WUs run at once.

Um... it will run 2 tasks at maximum (one per CPU or core) in high priority mode (earliest deadline first mode, it won't raise the priority of the application in Windows), perhaps swapping with a third one. But it will never try to run 3 tasks on two CPUs unless you set the NCPUS flag in cc_config.xml to 3. Running 2 tasks on one CPU will only slow the calculation of it down.
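
As a rough illustration of the point above, here is a toy Python sketch of earliest-deadline-first selection capped at the CPU count. It is not the BOINC client's scheduler; the Task class, the deadline values, and the function name are all made up for the example. It only shows why two tasks run with ncpus at 2 and three run once ncpus is set to 3.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    deadline_hours: float  # hours until the report deadline (made-up values)


def pick_running_tasks(tasks, ncpus):
    """Earliest deadline first: run the ncpus tasks with the nearest deadlines."""
    ordered = sorted(tasks, key=lambda t: t.deadline_hours)
    return [t.name for t in ordered[:ncpus]]


# Three hypothetical Milkyway tasks, all close to their deadlines.
QUEUE = [
    Task("milkyway_a", 6.0),
    Task("milkyway_b", 4.0),
    Task("milkyway_c", 5.0),
]

print(pick_running_tasks(QUEUE, ncpus=2))  # two tasks on a real dual core
print(pick_running_tasks(QUEUE, ncpus=3))  # three tasks once ncpus is forced to 3
```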

So what is in your cc_config.xml then?

but when a program goes from stable to inoperable and all that has changed is the downloading of WUs from a new project

That's not the only thing that has changed. A lot of things have changed between 6.4.5 and 6.6.2 (completely new work fetch module and client_side CPU scheduler, both with a lot of bugs). Since that time the developers have tried to fix those bugs and introduced others, some of them can be recognized when you add projects, detach from them, re-allow work fetch from a project that's been on NNT for a long time, etc. etc.

Between 6.6.2 and 6.6.3 the way that Long Term Debt is calculated and used is completely changed. Now no project has a positive LTD, the maximum is zero (well, 1.0, as I found). Do check the changes log for {shock} the changes between versions.
ID: 22981 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 22982 - Posted: 9 Feb 2009, 16:37:26 UTC - in response to Message 22980.  

.... but when a program goes from stable to inoperable and all that has changed is the downloading of WUs from a new project, ....

That's an interesting comment. I doubt that Milky Way made the difference, but downloading SETI tasks would load the GPU and try to display plan_class data into the tasks pane. That's the point where the developers believe the fault lies.

Do we have any verifiable reports for v6.6.3/4 that might suggest that the tasks pane is stable and usable if only CPU tasks are loaded, but crashes as frequently reported when a plan_class is used? That would be a pretty powerful clue.

Also, do we know whether the developers' pre-release test machine (I presume they have one?):

a) is fitted with a CUDA card?
b) is actually loaded with CUDA tasks?
ID: 22982 · Report as offensive
Furlozza
Joined: 9 Feb 09
Posts: 8
Australia
Message 22983 - Posted: 9 Feb 2009, 16:55:52 UTC - in response to Message 22981.  

cc_config read:

<cc_config>
<options>
<ncpus>3</ncpus>
</options>
</cc_config>
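
For anyone wanting to check what such a file asks the client to do, here is a minimal Python sketch that reads the <ncpus> value from the XML shown above. It uses only the standard library; the helper names and the physical_cores argument are assumptions for illustration, and the sketch does not try to locate the real cc_config.xml in the BOINC data directory.

```python
import xml.etree.ElementTree as ET

# File content as posted above.
CC_CONFIG = """<cc_config>
<options>
<ncpus>3</ncpus>
</options>
</cc_config>"""


def read_ncpus_override(xml_text):
    """Return the <ncpus> override as an int, or None if it is not set."""
    root = ET.fromstring(xml_text)
    node = root.find("./options/ncpus")
    if node is None or node.text is None or not node.text.strip():
        return None
    return int(node.text.strip())


def describe_override(xml_text, physical_cores):
    """Hypothetical helper: explain what the override means for this host."""
    ncpus = read_ncpus_override(xml_text)
    if ncpus is None:
        return "no ncpus override; the client uses the detected CPU count"
    if ncpus > physical_cores:
        return (f"ncpus={ncpus} exceeds {physical_cores} cores; "
                f"expect up to {ncpus} CPU tasks at once")
    return f"ncpus={ncpus} on {physical_cores} cores"


print(describe_override(CC_CONFIG, physical_cores=2))
```

With the file above and a dual core, the override is larger than the real core count, which matches the three Milkyway tasks seen running at once.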

The reason for this was that it was a suggested work around when BOINC wasn't picking up CUDAs, and I thought such was the case now with 6.4.5. since the reason for the move to 6.6.3 was that it supposedly read and comprehended that there was a GPU capable of calcs.

The new fetch system is a right proper PITA if wanting to concentrate on one project whilst using a dual core. To be told that you have to wait 24 hrs to have the program deign to download work is downright annoying. And the "debt"???? wasn't aware I was getting back into that cycle.

HMMMMMmmmmm since as you say there were changes in the fetch, I suppose that coming back into Xp from ubuntu created the screw-up. It sure would explain why I had received no new jobs for three separate days operating in XP before returning to ubuntu on each occasion and getting work on a consistent basis. (ABC, milkyway, rosetta and Einstein)

I was going to ask if it would be possible for the actual person operating the computer on which Boinc is operating to actually ask for work, rather than have the messages say that I am not asking for work, but just hitting the update button cause I like to waste time, with a cache that has just two jobs in it neither of which relate to the project I have clicked on requesting the update.

Footnote: and they say sarcasm is a dying art??





ID: 22983 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 22984 - Posted: 9 Feb 2009, 16:56:48 UTC - in response to Message 22982.  

Also, do we know whether the developers' pre-release test machine (I presume they have one?):

a) is fitted with a CUDA card?
b) is actually loaded with CUDA tasks?

If by pre-release machine you mean the computers they themselves run, then the answer is:
David: a+b yes
Rom: a+b yes (Windows only)
Charlie: a yes, b no as Nvidia hasn't released a 64bit driver yet, so no CUDA on Mac.

Other developers:
Eric: a+b yes
Youknowwho: a+b yes.
ID: 22984 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 22985 - Posted: 9 Feb 2009, 17:10:22 UTC - in response to Message 22983.  
Last modified: 9 Feb 2009, 17:16:52 UTC

cc_config read:

<cc_config>
<options>
<ncpus>3</ncpus>
</options>
</cc_config>

The reason for this was that it was a suggested work around when BOINC wasn't picking up CUDAs, and I thought such was the case now with 6.4.5. since the reason for the move to 6.6.3 was that it supposedly read and comprehended that there was a GPU capable of calcs.

The ncpus workaround was only needed on 6.4.5; it has not been needed from 6.5.0 onwards. BOINC will do it for you.

So take out cc_config.xml and restart BOINC.

To be told that you have to wait 24 hrs to have the program deign to download work is downright annoying.

Told by whom? By BOINC or by the project? If the latter, then that's a scheduler message, which has nothing to do with the client. Are you sure the project isn't out of work, or you got the maximum amount of work for the amount of CPUs already? Just following clues I read in that sentence.

HMMMMMmmmmm since as you say there were changes in the fetch, I suppose that coming back into Xp from ubuntu created the screw-up. It sure would explain why I had received no new jobs for three separate days operating in XP before returning to ubuntu on each occasion and getting work on a consistent basis. (ABC, milkyway, rosetta and Einstein)

Um, no. The work fetch module works per BOINC, per hostID, not one for all BOINCs on your system. I mean, you running BOINC in Windows has no consequences for running BOINC in another OS, as both those BOINCs (what's the plural of BOINC?) have their own directories, their own debts, their own work, their own hostID etc. They cannot be compared to each other.

I was going to ask if it would be possible for the actual person operating the computer on which Boinc is operating to actually ask for work

Increase the amount of Additional days of work. But do watch out as the new work fetch can over-ask. I've noticed on a 1.25 Additional Days that my computer would ask way too much work with 6.6.3 (I never tried 6.6.4), maxing out the amount of work from Milkyway (24 tasks, 12 per CPU) on top of a couple of hundred tasks from Primegrid and work from Enigma, Einstein, Cosmology, Leiden and WCG.

Footnote: and they say sarcasm is a dying art??

And today is a quiet day... ignore my stronger posts, I am not relieving my aggression on you. I'll go bash some trolls in Oblivion for the next hour. ;-)
ID: 22985 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 22986 - Posted: 9 Feb 2009, 17:26:00 UTC - in response to Message 22985.  

(what's the plural of BOINC?)

BOINCen?

... maxing out the amount of work from Milkyway (24 tasks, 12 per CPU) ...

That could be where the 24 hours is coming from - daily quota limit reached, backoff set by project. You would need to look back in the logs (stdoutdae.txt, if the manager is playing up) and see what happened just before the 24-hours delay was imposed. There'll be a work fetch request for Milkyway, and a line about quota, if I'm right.
ID: 22986 · Report as offensive
Furlozza
Joined: 9 Feb 09
Posts: 8
Australia
Message 22987 - Posted: 9 Feb 2009, 17:35:13 UTC

had a coffee and two ciggies.... or was it the other way around (it is 4.03AM in Oz)

here's what happened when i started Boinc last night:

Opened manager, suspended CPDN and WCG(in tasks window) and allowed Seti to get new work (in projects window), and it also garnered milkyway work as well. Milkyway had been NNWed last time I was in XP whereas the other three were allowed new work.

Clicked on 'Tasks' and BOINC froze (this is in 6.6.3), so I killed it via Ctrl+Alt+Del.

Opened up Process Explorer and tasks were still running, so I killed them off within Process Explorer.

Reopened BOINC. The same happened, except I killed off the programs via Process Explorer. And again.

Deleted 6.6.3 and installed 6.6.4 (thought it might have been a glitch that had been corrected, although since I had used 6.6.3 before, I was rather surprised, to say the least, when it crashed). I can't be too sure, but it could be that this was the first time since installing 6.6.3 that new Seti work had been downloaded.

Just read the versions change log..... be goood if I comprehended half of it though. .........

Uninstalled 6.6.3 and installed 6.6.4 with the same results and the split process tree. Uninstalled 6.6.4 and installed 6.4.5. It ran, BUT only used 1 CPU whilst running the GPU WU, hence the use of the cc_config file. It worked properly (2 CPU and 1 GPU) until Milkyway started saying it had priority jobs, stopped the GPU and used "three" CPUs (or rather ran two jobs on one CPU and another on the other CPU). That lasted until I suspended all bar three other Milkyway jobs; then Seti started using the GPU again, and Milkyway used the two CPUs as was happening initially.

I know that 6.4.5 supposedly does the right thing with CUDA, but in my case I needed to use the config file as a workaround, or else it would have had one CPU doing at most 5% work while it fed data to the GPU.
ID: 22987 · Report as offensive
Furlozza
Joined: 9 Feb 09
Posts: 8
Australia
Message 22988 - Posted: 9 Feb 2009, 17:56:53 UTC - in response to Message 22985.  

cc_config read:

<cc_config>
<options>
<ncpus>3</ncpus>
</options>
</cc_config>

The reason for this was that it was a suggested work around when BOINC wasn't picking up CUDAs, and I thought such was the case now with 6.4.5. since the reason for the move to 6.6.3 was that it supposedly read and comprehended that there was a GPU capable of calcs.

The ncpus workaround was only needed on 6.4.5; it has not been needed from 6.5.0 onwards. BOINC will do it for you.

So take out cc_config.xml and restart BOINC.


Am using 6.4.5 NOW

To be told that you have to wait 24 hrs to have the program deign to download work is downright annoying.

Told by whom? By BOINC or by the project? If the latter, then that's a scheduler message, which has nothing to do with the client. Are you sure the project isn't out of work, or you got the maximum amount of work for the amount of CPUs already? Just following clues I read in that sentence.


Actually, the new message (at least in 6.6.3) states that this request is being made by the USER and that NO tasks are being requested, but by George, I WAS requesting work. And I had downloaded and used 6.6.3 since 31/01/09 (01/31/09?? for yanks?? *grin*)


I was going to ask if it would be possible for the actual person operating the computer on which Boinc is operating to actually ask for work

Increase the amount of Additional days of work. But do watch out as the new work fetch can over-ask. I've noticed on a 1.25 Additional Days that my computer would ask way too much work with 6.6.3 (I never tried 6.6.4), maxing out the amount of work from Milkyway (24 tasks, 12 per CPU) on top of a couple of hundred tasks from Primegrid and work from Enigma, Einstein, Cosmology, Leiden and WCG.


Actually, I have work set at 2 days, since I use NNW to manage what tasks are done when. Maybe folks can opt out of this extra falderal.
ID: 22988 · Report as offensive
Furlozza
Joined: 9 Feb 09
Posts: 8
Australia
Message 22990 - Posted: 9 Feb 2009, 18:07:55 UTC - in response to Message 22986.  

In Ubuntu, I get a constant stream of work from Milkyway. It requests work at intervals of anything from 10 secs up to an hour plus. Using the modified app, I do at least three WUs per hour per core, and because of the 12 WUs per core limit it requests work regularly, given the 2 days I want in cache (works in ABC as well).
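
The arithmetic behind that steady stream of requests can be sketched in Python. The numbers are the ones quoted above (12 WUs per core, at least three WUs per hour per core, a 2-day cache); the function names are hypothetical, and how the client and the project actually account for the limit is not modelled here.

```python
def hours_of_work_held(task_limit_per_core, tasks_per_hour_per_core):
    """How long a full per-core allowance lasts at the given throughput."""
    return task_limit_per_core / tasks_per_hour_per_core


def cache_shortfall_hours(cache_days, held_hours):
    """Hours of requested cache that the per-core limit leaves unfilled."""
    return max(0.0, cache_days * 24 - held_hours)


held = hours_of_work_held(task_limit_per_core=12, tasks_per_hour_per_core=3)
shortfall = cache_shortfall_hours(cache_days=2, held_hours=held)

print(f"a full allowance lasts about {held:.0f} hours per core")
print(f"the 2-day cache is short by about {shortfall:.0f} hours per core")
```

Since the allowance holds only a few hours of work while the cache setting asks for 48, the cache can never look full under these assumptions, so the client keeps asking for more.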

Of course, the version there is....... 6.4.5. Was going to install 6.6.x but am now glad I didn't since it is enough fun and games changing owners and groups whenever a new modified app comes out.

I would presume (hope??) the same will apply within XP, but am clearing cache before installing the modified app for milky there.
ID: 22990 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 22991 - Posted: 9 Feb 2009, 19:06:53 UTC - in response to Message 22988.  

Actually, I have work set at 2 days, since I use NNW to manage what tasks are done when. Maybe folks can opt out of this extra falderal.

falderal? Additional Work setting? Set it to zero.

With both Connect interval and additional work set to zero, you'll only ask 1 second of work for each project that's allowed to fetch work.
ID: 22991 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 22992 - Posted: 9 Feb 2009, 19:09:26 UTC

There is a way around the crashing manager. Just use the BOINC Manager of an older installation that did work. See Richard's post here in Seti for how to do so... do read on to John Deer's posts, as I have had that same situation happen and had to go with that fix.
ID: 22992 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23010 - Posted: 9 Feb 2009, 23:50:27 UTC

Another interesting post at SETI: message 863921.

This one seems to suggest that the v6.6.4 Manager is OK when showing only Astropulse and v6.08 (plan_class CUDA) tasks, but crashes when SETI v6.03 (no plan class) tasks are mixed with v6.08.

Since the precise benefit of v6.6.x is the ability to mix v6.03 and v6.08, this leaves us in a "damned if you do, damned if you don't" situation wrt upgrading.
ID: 23010 · Report as offensive
Furlozza
Joined: 9 Feb 09
Posts: 8
Australia
Message 23011 - Posted: 10 Feb 2009, 0:56:54 UTC

6.4.5 plus config file are working. Thanks to lightning after my last post, the machine was off and I slept, so the cache clearout is slow, but getting there. I shall be a coward and let those what know more than I (i.e. how the program should work and HOW it should work) do their job without further interruptions from me.

Oh, there is one thing though..... if BOINC says in messages that a new version is available.... could this be restricted to just the stable versions?
ID: 23011 · Report as offensive
Profile Gundolf Jahn

Joined: 20 Dec 07
Posts: 1069
Germany
Message 23013 - Posted: 10 Feb 2009, 8:58:34 UTC - in response to Message 23011.  
Last modified: 10 Feb 2009, 8:59:07 UTC

Oh, there is one thing though..... if BOINC says in messages that a new version is available.... could this be restricted to just the stable versions?

I SECOND THAT! And possibly to machines/OSs that are capable of installing them!

Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
ID: 23013 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.