BOINC Sometimes Stops Processing (Milkyway@home Units)

DuckyJ

Joined: 18 Aug 13
Posts: 2
United States
Message 50223 - Posted: 18 Aug 2013, 12:04:24 UTC

I seem to have a problem similar to http://boinc.berkeley.edu/dev/forum_thread.php?id=8529&postid=49990, but different enough that I thought a separate thread was in order.

BOINC 7.0.64 (x64), running as a service without the screensaver, on a Dell Optiplex 7010: Intel Core i5-3570 CPU @ 3.40GHz, 8GB RAM, AMD Radeon HD 7470 with about 4GB total graphics memory (1GB dedicated, 3GB shared) on driver version 8.922.0.0, Windows 7 64-bit SP1, DirectX 11.

Projects: LHC@home 1.0, Milkyway@Home, MindModeling@Beta, SETI@home, World Community Grid

The problem always occurs with Milkyway@Home work units, though not all Milkyway@Home units cause it. Every few days I notice two or more untouched Milkyway@Home units sitting alone in the queue with a status of "Ready to start", while BOINC is doing no work at all (confirmed in Windows Task Manager). If I try to download work units from other projects while in this state, BOINC does connect to the project servers, but the request always fails with the message that the project isn't the "highest priority."

The only temporary solution I've found is to abort the stuck work units, at which point BOINC will automatically and successfully start polling my projects for new work units. Completely shutting down BOINC and restarting it when work units are stuck makes no difference. I've tried removing and re-adding the Milkyway@Home project, but the problem persists.

The log below was captured after a complete shutdown and restart of BOINC while two Milkyway@Home units were stuck in the queue (both were subsequently aborted):

8/18/2013 6:16:50 AM | | No config file found - using defaults
8/18/2013 6:16:50 AM | | Starting BOINC client version 7.0.64 for windows_x86_64
8/18/2013 6:16:50 AM | | log flags: file_xfer, sched_ops, task
8/18/2013 6:16:50 AM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
8/18/2013 6:16:50 AM | | Running as a daemon
8/18/2013 6:16:50 AM | | Data directory: C:\ProgramData\BOINC
8/18/2013 6:16:50 AM | | Running under account boinc_master
8/18/2013 6:16:50 AM | | Processor: 4 GenuineIntel Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz [Family 6 Model 58 Stepping 9]
8/18/2013 6:16:50 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx smx tm2 pbe
8/18/2013 6:16:50 AM | | OS: Microsoft Windows 7: Enterprise x64 Edition, Service Pack 1, (06.01.7601.00)
8/18/2013 6:16:50 AM | | Memory: 7.99 GB physical, 9.99 GB virtual
8/18/2013 6:16:50 AM | | Disk: 97.65 GB total, 51.51 GB free
8/18/2013 6:16:50 AM | | Local time is UTC -4 hours
8/18/2013 6:16:50 AM | | VirtualBox version: 4.2.16
8/18/2013 6:16:50 AM | | No usable GPUs found
8/18/2013 6:16:50 AM | LHC@home 1.0 | URL http://lhcathomeclassic.cern.ch/sixtrack/; Computer ID 10295304; resource share 100
8/18/2013 6:16:50 AM | Milkyway@Home | URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 532655; resource share 100
8/18/2013 6:16:50 AM | MindModeling@Beta | URL http://mindmodeling.org/; Computer ID 38793; resource share 100
8/18/2013 6:16:50 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 7050626; resource share 100
8/18/2013 6:16:50 AM | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 2427552; resource share 100
8/18/2013 6:16:50 AM | SETI@home | General prefs: from SETI@home (last modified 15-Sep-2006 11:44:27)
8/18/2013 6:16:50 AM | SETI@home | Computer location: home
8/18/2013 6:16:50 AM | SETI@home | General prefs: no separate prefs for home; using your defaults
8/18/2013 6:16:50 AM | | Reading preferences override file
8/18/2013 6:16:50 AM | | Preferences:
8/18/2013 6:16:50 AM | | max memory usage when active: 4089.23MB
8/18/2013 6:16:50 AM | | max memory usage when idle: 7360.61MB
8/18/2013 6:16:50 AM | | max disk usage: 48.83GB
8/18/2013 6:16:50 AM | | max CPUs used: 2
8/18/2013 6:16:50 AM | | (to change preferences, visit a project web site or select Preferences in the Manager)
8/18/2013 6:16:50 AM | | Not using a proxy
8/18/2013 6:18:10 AM | Milkyway@Home | task de_nbody_07_23_dark_2_1372784655_756186_1 aborted by user
8/18/2013 6:18:11 AM | Milkyway@Home | task de_nbody_07_23_no_dark_2_1372784655_975095_2 aborted by user


Thanks,
DuckyJ
ID: 50223
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 50224 - Posted: 18 Aug 2013, 14:25:34 UTC - in response to Message 50223.  

8/18/2013 6:16:50 AM | | Processor: 4 GenuineIntel Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz [Family 6 Model 58 Stepping 9]

8/18/2013 6:16:50 AM | | Reading preferences override file
8/18/2013 6:16:50 AM | | Preferences:

8/18/2013 6:16:50 AM | | max CPUs used: 2

8/18/2013 6:18:10 AM | Milkyway@Home | task de_nbody_07_23_dark_2_1372784655_756186_1 aborted by user
8/18/2013 6:18:11 AM | Milkyway@Home | task de_nbody_07_23_no_dark_2_1372784655_975095_2 aborted by user

The Milkyway N-Body tasks are multithreaded, meaning that they want to use all the CPU cores your system has for their calculations. You have told BOINC to use only 2 of the 4 cores, so the N-Body application will wait until all 4 cores are available.

Ways around it?
1) Edit the Milkyway project preferences and uncheck the MilkyWay@Home N-Body Simulation application, save the changes on the web site with the 'Update preferences' button at the bottom, then update the project in BOINC.

2) Tell BOINC to use all 4 cores.
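
The mismatch in the quoted log lines can also be spotted with a small script. This is only a rough sketch, assuming the event log keeps the "timestamp | project | message" layout shown in this thread; the function name parse_cpu_limits is made up for illustration and is not part of BOINC.

```python
import re


def parse_cpu_limits(log_text):
    """Return (detected_cores, max_cpus_used) from BOINC startup log text.

    Either value is None if the corresponding line is missing.
    """
    detected = None
    allowed = None
    for line in log_text.splitlines():
        # Event log lines look like: "timestamp | project | message"
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue
        message = parts[2]
        m = re.match(r"Processor: (\d+) ", message)
        if m:
            detected = int(m.group(1))
        m = re.match(r"max CPUs used: (\d+)", message)
        if m:
            allowed = int(m.group(1))
    return detected, allowed


# Two lines copied from the log earlier in this thread.
sample = """8/18/2013 6:16:50 AM | | Processor: 4 GenuineIntel Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz [Family 6 Model 58 Stepping 9]
8/18/2013 6:16:50 AM | | max CPUs used: 2"""

detected, allowed = parse_cpu_limits(sample)
if detected is not None and allowed is not None and allowed < detected:
    print(f"BOINC is limited to {allowed} of {detected} cores")
```

If the script reports fewer allowed cores than detected cores, multithreaded tasks that expect every core are the first thing to check.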
ID: 50224
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 50225 - Posted: 18 Aug 2013, 16:20:30 UTC - in response to Message 50223.  

Do you ever use the BOINC Manager 'Tools|preferences' menu to change that 'max CPUs used' value?

My experience with the version of multithreading being used at Milkyway is that:

The number of cores that N-Body wants to use is set - for all cached tasks - each time work is downloaded from their server. If BOINC is using two cores at that moment, then the Milkyway work is set to require two cores: if BOINC is using all four cores when work is downloaded, then the work is set to require four cores.

The problem might occur if you download new work when four cores are available, but subsequently change your preferences to only allow two cores to run. The work which expects all four cores to be available might then be stalled indefinitely.
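
As a toy illustration of that scenario (not BOINC's actual scheduler; the Task class, the task names, and the "required cores must fit within allowed cores" rule are assumptions drawn only from the behaviour described above):

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    required_cores: int  # assumed to be fixed at the time work was downloaded


def stalled_tasks(tasks, allowed_cores):
    """Names of tasks that need more cores than the client may currently use."""
    return [t.name for t in tasks if t.required_cores > allowed_cores]


# Work fetched while all four cores were allowed (hypothetical task names).
queue = [Task("nbody_a", 4), Task("nbody_b", 4)]

print(stalled_tasks(queue, allowed_cores=2))  # both tasks are stuck
print(stalled_tasks(queue, allowed_cores=4))  # nothing is stuck
```

In this model, lowering the limit after the download leaves every queued task stuck until the limit is raised again or the tasks are aborted, which matches the symptoms reported.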

For more details, see my post on the subject at Milkyway.

There are reports at Milkyway that multithreading works differently on different operating systems - Alinator has had problems with WinXP/64 - but I'm using Win7/64 same as you. However, I did have problems with BOINC v7.0.64 (they were to do with GPUs, so they won't affect you): it's working better with the alpha v7.2.10 client, which may or may not be promoted to 'recommended' sometime this fall.
ID: 50225
DuckyJ

Joined: 18 Aug 13
Posts: 2
United States
Message 50235 - Posted: 19 Aug 2013, 7:55:32 UTC

Thanks, guys.

It turns out the problem was that the setting "On multiprocessor systems, use at most ___% of the processors" was set to 0. I had seen that while trying to figure out the problem and thought it was weird at the time, but I assumed 0 meant "no restrictions", because if it really meant 0% then nothing would work at all. How it got set to 0 (I'm positive I didn't do it), and why 0% means two cores, I have no clue.

To sum up the solution for anyone else having this problem:
Tools -> Computing preferences... -> "processor usage" tab -> Other options -> Set "On multiprocessor systems, use at most ___% of the processors" to 100
ID: 50235
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 50238 - Posted: 19 Aug 2013, 15:04:51 UTC - in response to Message 50235.  

BOINC uses two variables to determine how many CPU cores it can use:
1. Use at most N processors.
2. Use at most X% of the processors.

When you set X% to zero, BOINC falls back to the value you set in N. N can only be set through a project's website, or by manually adding <max_cpus>N</max_cpus> to the global_prefs_override.xml file.
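
A minimal sketch of that fallback, for anyone who wants to check their own override file. The assumptions: the root element is <global_preferences>, the percentage is stored in a <max_ncpus_pct> tag, and a non-zero percentage takes precedence over N. The helper names read_cpu_prefs and effective_cores are invented for this example; this is not the client's own code.

```python
import xml.etree.ElementTree as ET


def read_cpu_prefs(xml_text):
    """Extract the two CPU limits from override-file text (None if absent)."""
    root = ET.fromstring(xml_text)

    def number(tag):
        node = root.find(tag)
        return float(node.text) if node is not None and node.text else None

    return {
        "max_cpus": number("max_cpus"),            # N, tag named in this post
        "max_ncpus_pct": number("max_ncpus_pct"),  # X%, tag name is an assumption
    }


def effective_cores(total_cores, max_cpus, max_ncpus_pct):
    """Cores the client may use under the rule described in this post."""
    if max_ncpus_pct:  # assumed: a non-zero X% wins over N
        return max(1, int(total_cores * max_ncpus_pct / 100))
    if max_cpus and max_cpus < total_cores:  # X% is zero: fall back to N
        return int(max_cpus)
    return total_cores


# Hypothetical file contents matching the situation in this thread.
example = """<global_preferences>
   <max_cpus>2</max_cpus>
   <max_ncpus_pct>0.000000</max_ncpus_pct>
</global_preferences>"""

prefs = read_cpu_prefs(example)
print(effective_cores(4, prefs["max_cpus"], prefs["max_ncpus_pct"]))  # 2
print(effective_cores(4, prefs["max_cpus"], 100.0))                   # 4
```

With X% at 0 and N at 2, a four-core machine ends up limited to two cores, which is the situation shown in the startup log; setting X% to 100 lifts the limit.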
ID: 50238


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.