| Info | Message |
|---|---|
| 41) Message boards : BOINC client : What is "debt" and how do I clear it up in Boinc?
Message 10909 Posted 15 Jun 2007 by Metod, S56RKO |
It seems to me that BOINC in first line of order tries to keep the STD in balance. Indeed. STD is regarded when BOINC decides which project to run next. That's done either when any project app finishes or after the period of time set as Switch between applications every. First, STD for all projects gets recalculated. Projects that have positive STD and have some work in buffer will be considered to run next (and typically those with the largest positive STD will be run rather than those with lower positive STD). LTD is regarded when BOINC decides that it needs to top up the buffer. Only projects that have positive or small negative LTD will be allowed to fetch new work. My observation is that LTD gets recalculated only once per day. I've noticed regularly that STD can get as low as -86400 (that's the number of seconds in a day) and it gets added to LTD once a day. STD is reset to 0 at that moment. |
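The selection rule described in this post can be sketched roughly as follows. This is only an illustration of the observed behaviour, not BOINC's actual code: the `Project` class, the function names and the `ltd_slack` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    std: float      # short-term debt, in seconds
    ltd: float      # long-term debt, in seconds
    has_work: bool  # True if there is work for this project in the buffer


def pick_next_to_run(projects):
    """Return the project with the largest positive STD that has work, or None."""
    candidates = [p for p in projects if p.has_work and p.std > 0]
    return max(candidates, key=lambda p: p.std, default=None)


def may_fetch_work(project, ltd_slack=0.0):
    """Allow work fetch when LTD is positive or only slightly negative.

    ltd_slack is a hypothetical tolerance (in seconds) standing in for
    the "small negative LTD" mentioned in the post.
    """
    return project.ltd > -ltd_slack
```

For example, with three projects where only two have buffered work, `pick_next_to_run` returns the one of those two with the larger positive STD, while `may_fetch_work` is decided from LTD alone.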
| 42) Message boards : BOINC client : What is "debt" and how do I clear it up in Boinc?
Message 10907 Posted 15 Jun 2007 by Metod, S56RKO |
Debt is just a measure of how much CPU time a project has gotten relative to other projects. Actually it's just the other way around: if LTD is negative, then the project has gotten more than its share. BOINC client will not fetch more work for that project until LTD gets close to 0. It'll start to fetch new work when LTD is still slightly negative (with its absolute value proportional to the number of CPUs, project resource share and the setting of Computer is connected to the Internet about every). |
| 43) Message boards : BOINC client : "[error] GUI RPC bind failed: 99" under Debian Etch
Message 10509 Posted 27 May 2007 by Metod, S56RKO |
You'll get all the logging to terminal window ... Actually iptables doesn't prevent a programme from binding (i.e. using) a port. It just blocks access to that particular port from disallowed foreign addresses. The fact that you can not successfully start BOINC even if run as root means that either there's another application running that is already bound to that port or you have some other security mechanism preventing BOINC from binding to that port. According to what you wrote earlier there's no application using the port in question and I don't have any experience with security enhancement tools, so I can't help you here ... After loads of hacking, trial&error and searching the boinc sites (where the trouble shooting sites are out of date, I think ... ;) ) it works now ... :) Running BOINC using --no_gui_rpc cures the symptom but not the cause. Running BOINC without being bound to its GUI RPC port means you can not administer your BOINC remotely - it can be done using BOINC manager or some third party software (eg. BoincView) running on another machine. Other than this BOINC doesn't need this port as it behaves just like any other application that needs net access from time to time (such as Web browser or e-mail client). |
| 44) Message boards : BOINC client : "[error] GUI RPC bind failed: 99" under Debian Etch
Message 10470 Posted 25 May 2007 by Metod, S56RKO |
Do you have any other ideas? Huh, I ran out of them :( It's puzzling me as I have several boxes here running Sarge and Etch and I don't have any problems whatsoever. Just got one ;) ... You could try to run boinc as root - temporarily of course. If it doesn't complain about the same error, then this would mean that you have something installed that prevents normal users from starting certain applications. Not that I'm aware of any such thing that would prevent a user's app from using port 31416, but anyhow. [edit] To run boinc as root, change working directory to the one you have installed boinc into. Then you have two possibilities:
|
| 45) Message boards : BOINC client : "[error] GUI RPC bind failed: 99" under Debian Etch
Message 10439 Posted 23 May 2007 by Metod, S56RKO |
I get the above error under Debian Etch (tested with boinc_5.8.16_i686 and also with boinc_5.9.11). With 5.8.16 it is nearly impossible to stop the client, whereas with 5.9.11 it stops automatically with "[error] GUI RPC bind failed: 99" ... :( Can you check if there's perhaps another programme already using port that BOINC is trying to bind to? Use command netstat -ap as root and check if there's something binding port number 31416. It would be listed in column Local Address and programme using that port would be stated in last column - PID/Program name. Some services are listed using port names rather than numbers. Mapping between number and name is taken from file /etc/services, so you may want to check if there's mapping for port 31416 ... |
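As a complement to the netstat check, one can simply try to bind the port and look at the error name. This is a minimal sketch, assuming Python is available on the box; `try_bind` is a hypothetical helper, and the default of 127.0.0.1 is an assumption about the address being bound. On Linux, errno 99 normally corresponds to EADDRNOTAVAIL (the address cannot be assigned), which is a different condition from EADDRINUSE (the port is already taken).

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to bind a TCP socket to (host, port).

    Returns None on success, or the symbolic errno name (for example
    "EADDRINUSE") if the bind fails.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError as exc:
            # Fall back to the raw number if the platform has no name for it.
            return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

Calling `try_bind(31416)` while BOINC is stopped shows whether an ordinary process can bind that port at all; the returned name then hints at whether the port is taken or the address itself is unavailable.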
| 46) Message boards : BOINC client : A Real Stumper
Message 10431 Posted 23 May 2007 by Metod, S56RKO |
But that would mean benchmarking at Performance Level X, but producing at X-60% depending on P-State achieved... Well, BOINC does record the fraction of time available to projects: <cpu_efficiency>. When new work is requested, this value is taken into account. IMHO that's not a feature but a bug. That BOINC is supposed at low Priority is normal, but running at too low Priority to achieve full CPU performance should be reserved as an Option in the Preferences (e.g. "Grant running Projects Priority Level X when Idle, Level Y when in use"), not be the Default. One of the statements made by the majority of projects is that the project application will use only rest resources ... those resources that would otherwise be idle. Now, if your CPU would otherwise throttle to minimum frequency, that's fine. Actually it's up to you as user or host administrator to set up minimum machine performance and BOINC will happily use all that is available. I second your idea about the user being able to set run priority. It should be set to lowest by default though. Implementation would be a bit tricky though. Windows only gives 6 possible settings, of those I'd only consider 3 safe (normal, low, lowest), while Unixes offer 40, of those 20 safe (from 0 to 19). A bit Off-Topic, but use of User-Definable P-States would allow for excellent Throttling according to user-defined conditions. I'd love that one in Summer and make it dependent off Temperatures, for example. BOINC has nothing to do with P states, just as it has nothing to do with raw disk access, process scheduling and memory management. I'm not going to allow BOINC to play with settings of my production machines. If you'd like to play with yours, you're more than welcome to create cron jobs to play with system settings. I'm running scheduled tasks on a couple of my machines suspending them during working hours. And I can't let BOINC do it automagically as there are not enough venues to choose between.
You could similarly change P-states or whatever based on sensor readings, calendar and level of work cache... |
| 47) Message boards : BOINC client : A Real Stumper
Message 10406 Posted 21 May 2007 by Metod, S56RKO |
It seems the BOINC benchmarking does manage to trigger it in some Versions, but the Project Clients I have seen did not while running. BOINC CC runs at normal priority (pri=0) at all times - including while it runs benchmarks. Well, perhaps some distributions change this behaviour, but Berkeley's BOINC does it like this. Project apps on the other hand run at lowest priority (pri=19). IMHO it's actually a good thing that low priority applications don't cause the CPU to run at high frequency. That's the intention of them being low priority - to consume only resources that would otherwise be wasted. If one builds a mid-priority cruncher, then it's up to the administrator of such a machine to make appropriate changes to the CPU throttling policy. |
| 48) Message boards : BOINC client : BOINC scheduler explanation
Message 10319 Posted 18 May 2007 by Metod, S56RKO |
No, I didn't mean assigning processes to CPUs; I meant requiring a certain percent loading of a CPU. I don't know how that might work though. I've never experimented with this BOINC CC feature. My understanding, though, is that BOINC enforces CPU loading rather coarsely: eg. if user limits BOINC to 33% of CPU usage, it'll run a process for a second at full speed and then suspend it for two seconds. It's a matter of probability and statistics but if you had the setting to limit CPU usage to 50% then you'd probably end up seeing CPU usage of both project applications vary wildly from one refresh of the top screen to another, mostly summing up to 100% (of a single CPU). In this case the linux kernel scheduler could actually end up running both processes on the same physical CPU, but probably not all the time. The easiest way of checking if you do have CPU usage limited is to inspect the contents of file global_prefs.xml. There's a line which looks something like this: <cpu_usage_limit>100</cpu_usage_limit> The line above says use all available CPU, it can be anything between 0 and 100 (%). |
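The check described above can also be scripted. The following is a minimal sketch, not BOINC code: `cpu_usage_limit` and `suspend_seconds` are hypothetical helper names, the `<global_preferences>` root element in the usage note is an assumption about the file layout, and the duty-cycle arithmetic only mirrors the coarse run/suspend behaviour as understood in this post.

```python
import xml.etree.ElementTree as ET


def cpu_usage_limit(xml_text, default=100.0):
    """Return the <cpu_usage_limit> value (in percent) from preferences XML.

    Falls back to `default` when the tag is absent or empty.
    """
    root = ET.fromstring(xml_text)
    node = root.find(".//cpu_usage_limit")
    if node is None or not (node.text or "").strip():
        return default
    return float(node.text)


def suspend_seconds(limit_percent, run_seconds=1.0):
    """Seconds of suspension per run slice that give the requested average load.

    This follows the coarse on/off throttling described in the post;
    it is an illustration, not the client's real scheduling logic.
    """
    if limit_percent <= 0:
        raise ValueError("limit must be greater than 0")
    return run_seconds * (100.0 - limit_percent) / limit_percent
```

Reading the file with `open("global_prefs.xml").read()` and passing the text to `cpu_usage_limit` gives the configured percentage; `suspend_seconds(33)` then comes out at roughly two seconds of suspension per second of running.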
| 49) Message boards : BOINC client : BOINC scheduler explanation
Message 10316 Posted 18 May 2007 by Metod, S56RKO |
Your assumptions were wrong. It's up to the OS's scheduler to assign a process to a certain CPU to execute, as it's the only piece of software in your system that knows everything about all processes. There were customized versions of BOINC CC (mainly targeted at Windows) that enabled so called CPU affinity. This meant that BOINC CC explicitly instructed the OS's scheduler to try to keep a process on a particular CPU and not to migrate it around if not needed.
My guess is that you're also running a rather newer version of Linux kernel than you used to on the old MB. And those changes in scheduler are rather recent. Also the presence of two memory controllers makes a huge difference as you actually didn't have NUMA architecture with the old MB. I can not see anything in your kernel configuration that is much different from the setup I have on my Opteron systems. And I have yet to see a CPU being idle while another CPU executes two project applications ... I do, however, run kernels 2.6.15-gentoo-r7 on Gentoo, 2.6.20 and Debian Etch and 2.6.21.1 on Debian Etch. None of them is exactly the same as yours so it just might make a difference. But: I'd be much surprised if any distribution packagers would go and change such a feature as the scheduler. They mostly tweak parts that interface the kernel with users or hardware (device drivers or such) and mostly stay away from core functionality. Basically I don't have any further idea :-( |
| 50) Message boards : BOINC client : BOINC scheduler explanation
Message 10305 Posted 17 May 2007 by Metod, S56RKO |
I think my problem is related. My BOINC core won't keep my CPUs loaded; it will sometimes run two projects and put 100% of each on each CPU, and then switch to 50% of each both on the same CPU. (And it always chooses the CPU that gets hotter; this seems to be from an assumption that the second CPU will stay cooler, but on this SuperMicro H8DCE that is wrong.) You said you're using Linux. You can check which processor runs which process in real time. Open a terminal window and then run the command top. The processor being used by a particular process is shown in the column named 'P'. By default it won't show the processor being used, so you have to switch on the column:
It's up to BOINC to start as many processes as there are CPUs available in the system. Seems that this task is performed as expected as you have two project apps running in parallel. It's up to the operating system to schedule execution of processes that need CPU time effectively. In a multiprocessor system this also means spreading running processes between processors. My experience is that Linux does this job decently. I'm not saying that it couldn't be done better but there are worse OSes in this regard. What is special with your machine is that you're running a dual processor Opteron system. Opterons are special in the way they handle system memory (RAM). They have the memory controller on chip as opposed to the slightly older architecture (as used by Intel) which features the memory controller on the north bridge. As you're running two processors this means your system has two memory controllers. This means your 4GB of RAM is actually split into two parts, most probably in halves. One half is controlled by the first processor and the second half by the second processor. This kind of memory arrangement is called NUMA (Non-Uniform Memory Access) and is quite well known from massively parallel super-computers while AMD was first to introduce it to the consumer market. BTW, connectivity between processors and between the core of a processor and its memory controller is implemented using Hyper-transport technology. The two sub-families of Opterons (2xxx and 8xxx) mostly differ by the number of Hyper-transport links present: sub-family 2 has one and sub-family 8 has three. Now, when the second processor runs a process that has code or data located in the memory region controlled by the first processor, access to memory has to go through the first processor and its memory controller, which causes a slight penalty. Newer Linux kernels have a process scheduler that tries to keep a process running on the same physical processor as it might benefit from keeping code and data in the CPU's cache.
When running on NUMA architecture, this keeps inter-processor memory accesses to minimum. Side effect is what you're seeing. Even newer Linux kernels have another feature: it can move process' memory from one physical RAM region to another in case process runs on different processor than the one controlling the RAM in use. So to say: memory follows process. Now to your problem: I kind of doubt that both project applications get migrated to one processor just like this. Most probably they get there due to some other process running in your system. After other processes stop using CPUs it might happen that OS scheduler doesn't redistribute processes between CPUs immediately. It'd help if you could state version of Linux kernel you're running. [edit] spelling |
| 51) Message boards : BOINC client : Network status resumes spontaneously
Message 10246 Posted 14 May 2007 by Metod, S56RKO |
I've pretty much reproduced the same behaviour though with project preferences not by manually suspending the network. I'm not sure I understand this: how do you check whether network connectivity is suspended or not? Generally you have 3 possibilities: suspended, based on preferences and always active. I believe that you need to set it to based on preferences. You probably won't be able to see BOINC CC actually suspend networking but rather you should be able to notice a lack of any network activity outside the allowed time frame. There's one exception: if you manually start an action that involves network activity (such as project update), then network connectivity will be enabled for a minute (or so). This might mean that your BOINC will also contact other projects apart from the one you wanted and it might report and/or request new work. |
| 52) Message boards : BOINC client : 5.9.x problem reports
Message 10207 Posted 11 May 2007 by Metod, S56RKO |
I've found out what I've had problems with: obviously a new tag was introduced in app_info.xml: <platform>platform name</platform> Things are back to normal using a modified version of app_info.xml (added the highlighted <platform> line) ...
[pre]
<app_info>
<app>
<name>einstein_S5R2</name>
</app>
<file_info>
<name>einstein_S5R2_4.21_i686-pc-linux-gnu</name>
<executable/>
</file_info>
<file_info>
<name>einstein_S5R2_4.21_i686-pc-linux-gnu.so</name>
</file_info>
<app_version>
<app_name>einstein_S5R2</app_name>
<version_num>421</version_num>
<platform>x86_64-pc-linux-gnu</platform>
<file_ref>
<file_name>einstein_S5R2_4.21_i686-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>einstein_S5R2_4.21_i686-pc-linux-gnu.so</file_name>
</file_ref>
</app_version>
</app_info>
[/pre] |
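To spot this kind of problem quickly, one could scan an app_info.xml for <app_version> entries that lack a <platform> tag. This is just a sketch under the assumption that the file has the layout shown in this post; `versions_missing_platform` is a hypothetical helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET


def versions_missing_platform(xml_text):
    """Return (app_name, version_num) for each <app_version> without <platform>."""
    root = ET.fromstring(xml_text)
    missing = []
    for app_version in root.iter("app_version"):
        platform = app_version.findtext("platform")
        if platform is None or not platform.strip():
            missing.append(
                (app_version.findtext("app_name"),
                 app_version.findtext("version_num"))
            )
    return missing
```

An empty list means every <app_version> names a platform; anything else lists the entries that may need the extra tag with a 5.9.x client.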
| 53) Message boards : BOINC client : 5.9.x problem reports
Message 10110 Posted 10 May 2007 by Metod, S56RKO |
The 64bit client will only download a 32bit application when the project has updated its scheduler to the latest version. I think the latest version is 5.14 Yeah, thought so ... My primary concern, though, is why anonymous platform mechanism doesn't seem to work for me anymore? |
| 54) Message boards : BOINC client : Missing Boinc_5.9.4 .msi File while trying to update.
Message 10102 Posted 10 May 2007 by Metod, S56RKO |
I tried to update to a newer version, and I come up with a missing .MSI file during the installation. It seems to be on a temp directory which does not exist anymore. I seem to remember that somebody wrote that you need to re-install the same version (5.9.4) as the installation file (.msi in this case) is not cached. After that, you can immediately update to newer version. |
| 55) Message boards : BOINC client : 5.9.x problem reports
Message 10081 Posted 9 May 2007 by Metod, S56RKO |
Hi! Seems to me that I'll have to edit this message many times as it triggers some anti-spam check if posted in one step. My problem: I've compiled BOINC CC 5.9.10 for x86_64-pc-linux-gnu. In principle it runs fine, however I have problems running Einstein. As this project doesn't support named platform neither natively nor via alternative platform, I have to use anonymous platform. This was fine until (and including) 5.8.15, but not anymore. Here's output from BOINC client:
2007-05-09 08:46:42 [---] Starting BOINC client version 5.9.10 for x86_64-pc-linux-gnu
2007-05-09 08:46:42 [---] log flags: task, file_xfer, sched_ops
2007-05-09 08:46:42 [---] Libraries: libcurl/7.15.5 OpenSSL/0.9.8c zlib/1.2.3 libidn/0.6.5
2007-05-09 08:46:42 [---] Data directory: /home/metodk/boinc
2007-05-09 08:46:42 [Einstein@Home] Found app_info.xml; using anonymous platform
2007-05-09 08:46:42 [SETI@home] Found app_info.xml; using anonymous platform
2007-05-09 08:46:42 [---] Processor: 4 AuthenticAMD AMD Opteron(tm) Processor 280 [Family 15 Model 33 Stepping 2]
2007-05-09 08:46:42 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cmp_legacy
2007-05-09 08:46:42 [---] Memory: 15.53 GB physical, 9.54 GB virtual
2007-05-09 08:46:42 [---] Disk: 238.31 GB total, 150.62 GB free
2007-05-09 08:46:42 [---] Version change (5.9.8 -> 5.9.10)
2007-05-09 08:46:42 [climateprediction.net] URL: http://climateprediction.net/; Computer ID: 417453; location: work; project prefs: default
2007-05-09 08:46:42 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 574403; location: work; project prefs: default
2007-05-09 08:46:42 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 2271751; location: work; project prefs: default
2007-05-09 08:46:42 [rosetta@home] URL: http://boinc.bakerlab.org/rosetta/; Computer ID: 496760; location: work; project prefs: default
2007-05-09 08:46:42 [SETI@home Beta Test] URL: http://setiweb.ssl.berkeley.edu/beta/; Computer ID: 6546; location: work; project prefs: default
2007-05-09 08:46:42 [---] General prefs: from SETI@home (last modified 2007-05-06 13:04:28)
2007-05-09 08:46:42 [---] Host location: work
2007-05-09 08:46:42 [---] General prefs: no separate prefs for work; using your defaults
2007-05-09 08:46:42 [---] Preferences limit memory usage when active to 7949.60MB
2007-05-09 08:46:42 [---] Preferences limit memory usage when idle to 14309.28MB
2007-05-09 08:46:42 [---] Preferences limit disk usage to 9.31GB
2007-05-09 08:46:42 [---] Running CPU benchmarks
2007-05-09 08:47:17 [Einstein@Home] Sending scheduler request: Requested by user
2007-05-09 08:47:17 [Einstein@Home] (not requesting new work or reporting completed tasks)
2007-05-09 08:47:27 [Einstein@Home] Scheduler RPC succeeded [server version 509]
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__76_S5R2c_1
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__75_S5R2c_1
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__74_S5R2c_1
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__73_S5R2c_0
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__72_S5R2c_0
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__118_S5R2c_3
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__134_S5R2c_3
2007-05-09 08:47:27 [Einstein@Home] Message from server: Resent lost result h1_0287.90_S5R2__122_S5R2c_3
[color=red]
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
2007-05-09 08:47:27 [Einstein@Home] [error] No app version for result: x86_64-pc-linux-gnu -1
[/color]
2007-05-09 08:47:27 [Einstein@Home] Deferring communication for 1 min 0 sec
2007-05-09 08:47:27 [Einstein@Home] Reason: requested by project
Why the highlighted lines? Needless to say that Einstein server thinks my host has those tasks while my host doesn't know anything about it. Here's my app_info.xml for Einstein:
<app_info>
<app>
<name>einstein_S5R2</name>
<user_friendly_name>Hierarchical all-sky pulsar search</user_friendly_name>
</app>
<file_info>
<name>einstein_S5R2_4.18_i686-pc-linux-gnu</name>
<executable/>
</file_info>
<file_info>
<name>einstein_S5R2_4.18_i686-pc-linux-gnu.so</name>
</file_info>
<app_version>
<app_name>einstein_S5R2</app_name>
<version_num>418</version_num>
<file_ref>
<file_name>einstein_S5R2_4.18_i686-pc-linux-gnu</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>einstein_S5R2_4.18_i686-pc-linux-gnu.so</file_name>
</file_ref>
</app_version>
</app_info>
Note: I first tried this with version 5.9.8 and saw these symptoms. When I saw the announcement about problems regarding 64-bit platform and alternative platform being solved, I decided to try 5.9.10. No luck though :( |
| 56) Message boards : BOINC client : unable to download seti work unit
Message 10073 Posted 8 May 2007 by Metod, S56RKO |
I have not been able to download a SETI work unit in weeks!! I'd suggest you to go and check SETI@home Web pages to find out the reason, but I'm quite sure you already did that. |
| 57) Message boards : BOINC client : Linux Installation Problem
Message 9620 Posted 16 Apr 2007 by Metod, S56RKO |
What about your tar and bash version? (I'm assuming your shell is bash). Huh ... are you sure your box is Linux? It might be some other variant of Unix. What does uname -a print out? System utilities (such as gunzip, tar, etc) usually take command line option --version if they are GNU by origin. Bash, for one, is ancient. [edit] So is your gunzip ... One of linux boxes I have access to gives: $ tar --version tar (GNU tar) 1.16 Copyright (C) 2006 Free Software Foundation, Inc. This is free software. You may redistribute copies of it under the terms of the GNU General Public License <http://www.gnu.org/licenses/gpl.html>. There is NO WARRANTY, to the extent permitted by law. Written by John Gilmore and Jay Fenlason. $ bash --version GNU bash, version 3.1.17(1)-release (i486-pc-linux-gnu) Copyright (C) 2005 Free Software Foundation, Inc. $ gunzip --version gunzip 1.3.5 (2002-09-30) Copyright 2002 Free Software Foundation Copyright 1992-1993 Jean-loup Gailly This program comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of this program under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING. Compilation options: DIRENT UTIME STDC_HEADERS HAVE_UNISTD_H HAVE_MEMORY_H HAVE_STRING_H HAVE_LSTAT ASMV Written by Jean-loup Gailly. [/edit] |
| 58) Message boards : BOINC client : Linux Installation Problem
Message 9556 Posted 13 Apr 2007 by Metod, S56RKO |
Please don't give up! If you execute the following: ( read l; read l; read l; exec cat ) < boinc_5.8.17_i686-pc-linux-gnu.sh > foo.tar.gz do you get a valid gzipped tar file? file should be something like this: $ file foo.tar.gz foo.tar.gz: gzip compressed data, from Unix, last modified: Thu Mar 8 03:29:46 2007 The date part will be different in your case of course. |
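If the shell one-liner misbehaves on an old shell, the same extraction can be sketched in Python: skip the first three lines of the installer and copy the rest verbatim, then check for the gzip magic bytes. The three-line header comes from the three `read l` calls in the command above; `extract_payload` and `looks_like_gzip` are hypothetical helper names.

```python
import shutil


def extract_payload(installer_path, out_path, header_lines=3):
    """Copy everything after the first `header_lines` lines to out_path."""
    with open(installer_path, "rb") as src, open(out_path, "wb") as dst:
        for _ in range(header_lines):
            src.readline()  # discard one line of the shell header
        shutil.copyfileobj(src, dst)


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes 1f 8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"
```

Running `extract_payload("boinc_5.8.17_i686-pc-linux-gnu.sh", "foo.tar.gz")` followed by `looks_like_gzip("foo.tar.gz")` gives roughly the same answer as the `file` check. A False result would be consistent with the download having been mangled.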
| 59) Message boards : BOINC client : Linux Installation Problem
Message 9490 Posted 10 Apr 2007 by Metod, S56RKO |
Your browser probably downloaded file as ASCII instead of BINARY. I guess that running Your question is answered on this web page: http://wiki.archlinux.org/index.php/How_to_make_wget_to_work_with_proxy_and_proxy_authentification. |
| 60) Message boards : BOINC client : Linux Installation Problem
Message 9486 Posted 10 Apr 2007 by Metod, S56RKO |
I have downloaded boinc_5.8.15_i686-pc-linux-gnu.sh onto my linux server. When I try installing it by typing: Your browser probably downloaded file as ASCII instead of BINARY. I guess that running wget http://boinc.berkeley.edu/dl/boinc_5.8.17_i686-pc-linux-gnu.sh from linux command prompt should do it properly. |
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.