Feeder

Message boards : Server programs : Feeder
DJStarfox
Joined: 19 Jul 07
Posts: 17
Message 26070 - Posted: 17 Jul 2009, 17:28:54 UTC

Based on a discussion in this SETI thread, I would like to know if adding a double buffer to the feeder would result in a more efficient (and higher throughput) system. As most of you know, SETI is the largest project by number of volunteers, and their servers are extremely busy.

What performance metrics do we have to measure this part of the BOINC server-side system?

My goal is to help BOINC scale up better.
ID: 26070 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 26072 - Posted: 17 Jul 2009, 18:21:30 UTC - in response to Message 26070.  

Forwarded to developers.
ID: 26072 · Report as offensive
David Anderson
Volunteer moderator
Project administrator
Project developer
Joined: 10 Sep 05
Posts: 717
Message 26074 - Posted: 17 Jul 2009, 18:32:23 UTC - in response to Message 26070.  

Not sure what you mean by double buffer.

But in any case, the feeder isn't usually a bottleneck.
When it is a bottleneck, it's because its DB query runs slowly,
and this is a MySQL issue
(typically it means the MySQL server doesn't have enough RAM)
ID: 26074 · Report as offensive
DJStarfox
Joined: 19 Jul 07
Posts: 17
Message 26075 - Posted: 17 Jul 2009, 18:41:56 UTC - in response to Message 26074.  

Having a slow database connection (relative to local storage or system memory) is typical for most applications, so in order to work around that:

What I meant by a double buffer system was to have the feeder have two threads and two buffers (queues). The scheduler pulls from queue 1 while the feeder is busy filling queue 2. When queue 1 is empty, they both swap buffers. In theory, this should give the scheduler more throughput and allow for higher latencies on the database connection when filling the buffer.
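
To make the proposal concrete, here is a minimal sketch of the two-queue swap described above. It is an illustration only, not BOINC code: the class name `DoubleBuffer` and its methods are hypothetical, and it assumes the slow database query runs before `fill_standby` is called, so the lock is held only for the quick hand-off.

```python
import threading
from collections import deque


class DoubleBuffer:
    """Hypothetical sketch: consumers drain `active` while the producer fills `standby`."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = deque()   # queue the scheduler pulls from
        self.standby = deque()  # queue the feeder refills

    def fill_standby(self, jobs):
        # Feeder side: the slow DB query is assumed to have finished already,
        # so the lock is held only while the results are appended.
        with self._lock:
            self.standby.extend(jobs)

    def take(self):
        # Scheduler side: pop from the active queue. When it runs dry and
        # the standby queue has work, the two queues swap roles.
        with self._lock:
            if not self.active and self.standby:
                self.active, self.standby = self.standby, self.active
            return self.active.popleft() if self.active else None
```

With this shape, a call to `take()` never waits on the database; it returns `None` only when both queues are empty.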
ID: 26075 · Report as offensive
ZPM
Joined: 14 Mar 09
Posts: 215
United States
Message 26079 - Posted: 17 Jul 2009, 20:43:00 UTC - in response to Message 26075.  
Last modified: 17 Jul 2009, 20:45:30 UTC

I see what you mean: running off generator 1 while generator 2 is being refilled, then switching back and forth, like a greenhouse running on battery backup (in this case, two battery backup systems), sort of the way we have the splitters. This way, requests for work would be served 24/7 and everyone would get work. But it doesn't help the bandwidth issue one darn bit.

Instead of work being created at 10-minute intervals or whatever, there would be constant work, as long as we have enough raw data to go around.
ID: 26079 · Report as offensive
DJStarfox
Joined: 19 Jul 07
Posts: 17
Message 26083 - Posted: 17 Jul 2009, 22:26:26 UTC - in response to Message 26079.  

By reducing the failure rate (times clients connect but don't get any work because the feeder/scheduler is too busy), clients will request work less often, reducing bandwidth (a little). Also, by making the feeder/scheduler faster, the response time may improve too, further reducing overall bandwidth.

Just a theory. The big assumption is that the feeder is heavily delayed by the database and the scheduler spends a significant amount of time waiting on the feeder.
ID: 26083 · Report as offensive
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 26101 - Posted: 19 Jul 2009, 1:54:43 UTC - in response to Message 26074.  

When it is a bottleneck, it's because its DB query runs slowly,
and this is a MySQL issue
(typically it means the MySQL server doesn't have enough RAM)

Or it may mean that the database layout is horrible (XML blobs causing fragmentation).

ID: 26101 · Report as offensive
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 26102 - Posted: 19 Jul 2009, 1:56:46 UTC - in response to Message 26075.  

What I meant by a double buffer system was to have the feeder have two threads and two buffers (queues). The scheduler pulls from queue 1 while the feeder is busy filling queue 2. When queue 1 is empty, they both swap buffers. In theory, this should give the scheduler more throughput and allow for higher latencies on the database connection when filling the buffer.

I don't understand how that would help. In the current code, while the feeder is busy filling the one and only buffer, the scheduler can still "pull from it". You seem to think the scheduler is locked from using the buffer while the feeder is doing a DB query to refill it.
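
A rough sketch of the single-buffer behaviour described here, assuming the buffer is an array of slots that are each either empty or filled. The names `SlotBuffer`, `refill` and `take` are made up for illustration and are not taken from the BOINC source. The point is that the lock is held only while slots are touched, not during the DB query, so the scheduler can keep claiming filled slots while a refill is pending.

```python
import threading

EMPTY = None  # marker for a slot with no job in it


class SlotBuffer:
    """Hypothetical single shared buffer made of individually fillable slots."""

    def __init__(self, size):
        self._lock = threading.Lock()
        self.slots = [EMPTY] * size

    def count_empty(self):
        # The feeder can use this to decide how many jobs to query for.
        with self._lock:
            return sum(1 for s in self.slots if s is EMPTY)

    def refill(self, jobs):
        """Feeder: place already-fetched jobs into empty slots; return how many fit."""
        placed = 0
        with self._lock:
            for i, slot in enumerate(self.slots):
                if placed == len(jobs):
                    break
                if slot is EMPTY:
                    self.slots[i] = jobs[placed]
                    placed += 1
        return placed

    def take(self):
        """Scheduler: claim any filled slot, or return None if all are empty."""
        with self._lock:
            for i, slot in enumerate(self.slots):
                if slot is not EMPTY:
                    self.slots[i] = EMPTY
                    return slot
        return None
```

Under this model a second buffer adds nothing that more slots would not: the scheduler only comes up empty when every slot has been drained before the next query returns.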
ID: 26102 · Report as offensive
DJStarfox
Joined: 19 Jul 07
Posts: 17
Message 26106 - Posted: 19 Jul 2009, 2:45:10 UTC - in response to Message 26102.  

Well, my assumption was that there is a concurrency issue, and that having two buffers would reduce the latency. But if a single larger buffer would give the same benefit as two smaller buffers, then it would make a lot more sense to just have a bigger buffer (no programming required).
ID: 26106 · Report as offensive


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.