Snooze Bug?

Gary Charpentier
Joined: 23 Feb 08
Posts: 2465
United States
Message 62384 - Posted: 30 May 2015, 3:12:44 UTC

Windows 7 Pro SP1, 64-bit, BOINC 7.2.42 (x64), project Find@Home, science app Vina_1.2_windows_x86.

When snooze, or activity suspend, is selected, these tasks do not snooze and continue to run, per the Task Manager. They show as stopped in the BOINC display.

Not sure if the issue is in BOINC or in the project's executable. BOINC may need to handle this even if the bug is in the executable.
ID: 62384
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 62385 - Posted: 30 May 2015, 8:46:55 UTC - in response to Message 62384.  

This is quite possibly caused by a bit of BOINC code called the 'Applications Programming Interface', or API. BOINC supplies this code, but each project compiles and links it into their science applications: it becomes part of the application, and is responsible for handling all communications between the application and BOINC. That includes, in particular, listening for messages telling the application to suspend, and acting on them.

There was a bug in the API, causing apps to (mis-)behave in the way you describe. We called them "zombie tasks", and the bug was fixed - very appropriately - on Halloween last year: f0c39bdf5117d8f7dd5092033971d7f700bd22dc.

So, any science app executable, at any project, deployed before November 2014 is likely to have this bug. It doesn't always show up: it's most likely to be a problem with applications that use GPUs. If those two criteria (date and GPU use) match, ask the project to update their API library and re-compile their application, drawing their attention to the code commit I've linked.
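
To picture the symptom, here is a toy Python simulation, offered only as an illustration. It is not BOINC's API code (which is C++), and every name in it (run_worker, messages, honours_suspend) is made up for the example. It only shows the difference between an app whose main loop acts on suspend messages and a "zombie" that keeps working regardless; it says nothing about the actual cause of the bug.

```python
def run_worker(total_steps, messages, honours_suspend=True, max_ticks=1000):
    """Simulate a science app's main loop, doing one unit of work per tick.

    `messages` maps a tick number to "suspend" or "resume". These are
    hypothetical stand-ins for what the client sends to the app.
    Returns a list of (tick, action) pairs, action being "work" or "idle".
    """
    suspended = False
    done = 0
    log = []
    for tick in range(max_ticks):  # hard cap so a never-resumed task still ends
        if done >= total_steps:
            break
        msg = messages.get(tick)
        if msg == "suspend":
            suspended = True
        elif msg == "resume":
            suspended = False
        if suspended and honours_suspend:
            log.append((tick, "idle"))  # well-behaved app: no work while suspended
        else:
            done += 1
            log.append((tick, "work"))  # zombie app: keeps working anyway
    return log


if __name__ == "__main__":
    script = {1: "suspend", 3: "resume"}
    print(run_worker(3, script))
    print(run_worker(3, script, honours_suspend=False))
```

The first call idles between the suspend and resume messages; the second ignores them, which matches what Task Manager would show for a task that BOINC believes is stopped.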
ID: 62385
Gary Charpentier
Joined: 23 Feb 08
Posts: 2465
United States
Message 62388 - Posted: 30 May 2015, 15:42:39 UTC - in response to Message 62385.  

Find@Home is not a GPU project, and its Applications page says the apps were issued on May 11, 2015. But I have no way of knowing which version of the API the developer linked against. I posted a link to this thread in a thread about the bug on their number crunching board: http://findah.ucd.ie/forum_thread.php?id=224

Thanks
ID: 62388
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 62393 - Posted: 30 May 2015, 16:49:24 UTC

Thanks for the executables, Gary.


Both vina_1.2_windows_intelx86.exe and vina_1.2_windows_x86_64.exe are built with API_VERSION_7.5.0
ID: 62393
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 62395 - Posted: 30 May 2015, 17:02:39 UTC - in response to Message 62393.  

Which gives us an earliest possible download date of 10 June 2014 (64dc838626764f9d7b5f104099dd11149b89b61a), but sadly tells us nothing about whether the 31 October 2014 bugfix was applied.
ID: 62395
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 62404 - Posted: 31 May 2015, 9:33:26 UTC

I didn't pick up this impression from Gary's initial post, but it turns out that this problem is well known and much discussed at the project message board. The project developer expressed an intention to upgrade the apps by compiling against the new API on 24 March 2015, and reported the work completed on 27 April 2015: I think we can assume that the 11 May 2015 deployment doesn't have last year's bug.

Perhaps more significantly, the underlying application doesn't checkpoint - ever. Apparently the BOINC implementation of the code was done by IBM for World Community Grid: is 'not checkpointing' typical of WCG sub-projects?
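
For anyone unsure what checkpointing buys you, here is a generic Python sketch, assuming a simple step-counting computation. It is not the Vina application's code or the BOINC checkpoint API; the function names and the JSON file layout are invented for the example. The idea is only that progress is saved atomically after each step, so a restarted task resumes instead of starting from zero.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint.

    os.replace is atomic, so a crash never leaves a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"step": 0, "total": 0}


def run(path, total_steps, stop_after=None):
    """Sum 0..total_steps-1, checkpointing after every step.

    `stop_after` simulates the task being killed after that many steps
    in this run. Returns (state, steps executed in this run).
    """
    state = load_checkpoint(path)
    executed = 0
    while state["step"] < total_steps:
        if stop_after is not None and executed >= stop_after:
            return state, executed  # simulated interruption
        state["total"] += state["step"]
        state["step"] += 1
        executed += 1
        save_checkpoint(path, state)
    return state, executed
```

An app that never checkpoints behaves like `run` with the save and load calls removed: if it is taken out of memory while suspended, all the work done so far is lost.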
ID: 62404
Gary Charpentier
Joined: 23 Feb 08
Posts: 2465
United States
Message 62406 - Posted: 31 May 2015, 15:26:04 UTC

Richard, I didn't mean to lead anyone astray. Until I read your post, I hadn't looked on their board. However, if this project never checkpoints, is it perhaps "rogue", and should it and any other such projects perhaps not be listed? After all, BOINC is documented to allow pausing for several reasons, and that includes both keeping applications in memory when suspended and removing them from memory.
ID: 62406


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.