Possible 7.8.2 error? - see inside

Ulrich Metzner

Joined: 5 Mar 16
Posts: 11
Germany
Message 81580 - Posted: 25 Sep 2017, 21:59:16 UTC

Hi there,

I recently installed 7.8.2, replacing 7.6.33, and suddenly got validation errors on the GT 430; the GT 640 was fine.
Because this was the only thing that changed, I reinstalled 7.6.33 and guess what - the GT 430 runs fine again!
This happened in the MilkyWay@Home project - see: https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4176

Best regards, Uli
ID: 81580 · Report as offensive
Ulrich Metzner

Joined: 5 Mar 16
Posts: 11
Germany
Message 81820 - Posted: 5 Oct 2017, 11:11:21 UTC

If anyone is interested, the last invalid reports in my account can be found via the link in the post above. They will vanish soon.
I think this is a serious bug, because it is 100% reproducible: installing 7.8.2 triggers it and reinstalling 7.6.33 fixes it.
Maybe next week I'll give 7.8.3 a try, but for now I am too busy.
ID: 81820 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 81823 - Posted: 5 Oct 2017, 11:41:42 UTC - in response to Message 81820.  
Last modified: 5 Oct 2017, 11:50:08 UTC

OK, I've looked at an invalid task, and I've found a valid task run with the same application with roughly the same runtime - and yes, that proved to have been run on the GT 430 under BOINC v7.6.33. So, it's plausibly a valid comparison.

I extracted and saved stderr_txt from each task. The first thing to note is that the file for the invalid task is 64 KB (and shows signs that this is the final 64 KB of a longer file - it's truncated): the valid file is 17 KB.

Secondly, the invalid file contains the phrase 'called boinc_finish(0)' five times: the valid file contains it only once, at the very end.

So, I conclude that the application is looping.
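The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `count_finish_calls` and `looks_like_restart_loop` are hypothetical, and the only detail taken from the task output is the phrase 'called boinc_finish(0)'. The assumption that a healthy run reports the phrase exactly once comes from the valid file described above.

```python
# Phrase quoted from the task's stderr output in this thread.
FINISH_MARKER = "called boinc_finish(0)"


def count_finish_calls(stderr_text: str, marker: str = FINISH_MARKER) -> int:
    """Count non-overlapping occurrences of the marker in the stderr text."""
    return stderr_text.count(marker)


def looks_like_restart_loop(stderr_text: str) -> bool:
    """Assumption: a healthy run logs the marker once, at the very end.

    More than one occurrence suggests the task finished and was started again.
    """
    return count_finish_calls(stderr_text) > 1
```

Note that the invalid file is truncated to its final 64 KB, so a count taken from it is a lower bound on the real number of restarts.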

This could be because of the known, reported, bug in v7.8.2, where a slot directory is not fully cleansed after use. This bug has been fixed in v7.8.3, so I don't intend to look any further until test results are available for v7.8.3 - please let us know when you have some.

To save information for future investigation, we're talking about MilkyWay@Home v1.46 (opencl_nvidia_101) on host 616064
ID: 81823 · Report as offensive
Ulrich Metzner

Joined: 5 Mar 16
Posts: 11
Germany
Message 81826 - Posted: 5 Oct 2017, 13:04:56 UTC - in response to Message 81823.  

Secondly, the invalid file contains the phrase 'called boinc_finish(0)' five times: the valid file contains it only once, at the very end.

Thanks for taking a look!

The WUs downloaded from Milkyway are bundles of 5 conventional WUs each.
Could it be that the 7.8.2 client does not understand the way the WUs are packed together, as the 7.6.33 client does?
Perhaps the 7.6.33 client combines the 5 results into one, while the 7.8.2 client handles them separately?

Anyway, strangely enough this only happens on the GT 430, while the GT 640 in the very same machine is peacefully crunching away...
ID: 81826 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 81827 - Posted: 5 Oct 2017, 14:10:38 UTC - in response to Message 81826.  
Last modified: 5 Oct 2017, 14:20:24 UTC

No, I don't think that's a likely explanation. You know the Milkyway application better than I do: you might like to take a look at one of the valid 17 KB files yourself - it did seem rather repetitive to me, but I didn't have the patience to count to five...

I did line up the two files at the bottom, and the structure seemed identical back up to what seemed like a normal start (valid case) or an intrusive 'called boinc_finish(0)' (invalid case). That's what made me suspect a complete task loop, rather than any form of sub-tasking.
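Lining up two files from the bottom can be approximated as counting how many trailing lines they share. The sketch below is a simplification and not the procedure actually used here: it compares lines for exact equality, whereas the comparison above was about matching structure, and the helper name `common_tail_length` is hypothetical.

```python
from typing import Sequence


def common_tail_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Return how many lines, counted from the end, are identical in both files."""
    matched = 0
    # Walk both line lists backwards in step and stop at the first mismatch.
    for line_a, line_b in zip(reversed(a), reversed(b)):
        if line_a != line_b:
            break
        matched += 1
    return matched
```

With this, the line just above the shared tail in each file is `a[len(a) - n - 1]` and `b[len(b) - n - 1]` for `n = common_tail_length(a, b)`, provided that index is not negative. That is where a normal start and an intrusive finish message would show up as the first difference.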

Edit - when you get a chance to test and report using v7.8.3, could you please include the segment of the client Event Log which covers the processing of the test task? You'll be familiar with the BOINC scenario

        "Task %s exited with zero status but no 'finished' file",
        "If this happens repeatedly you may need to reset the project."
- I would wonder whether this might have appeared around the time of the failures.
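A saved copy of the Event Log could be searched for that message with a short script. This is a minimal sketch: the pattern is built from the message template quoted above, with `%s` replaced by a capture group for the task name, while the function name `find_unfinished_exits` and the sample lines in the usage note are made up for illustration and are not real log output.

```python
import re
from typing import Iterable, List

# Built from the quoted template; (\S+) stands in for the %s task name.
UNFINISHED_EXIT = re.compile(
    r"Task (\S+) exited with zero status but no 'finished' file"
)


def find_unfinished_exits(log_lines: Iterable[str]) -> List[str]:
    """Return the task names from every log line containing the message."""
    tasks = []
    for line in log_lines:
        match = UNFINISHED_EXIT.search(line)
        if match:
            tasks.append(match.group(1))
    return tasks
```

For example, `find_unfinished_exits(["Task demo_task_1 exited with zero status but no 'finished' file", "Starting task demo_task_2"])` picks out only `demo_task_1`. A task name that appears several times would support the restart-loop theory.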
ID: 81827 · Report as offensive
Ulrich Metzner

Joined: 5 Mar 16
Posts: 11
Germany
Message 81879 - Posted: 9 Oct 2017, 0:53:47 UTC

Problem seems to be solved by BOINC 7.8.3 - Thank you! :)
ID: 81879 · Report as offensive


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.