I am posting this question in the API section because most of my questions center on how the API works and how Boinc communicates.
I have found a lot of information while browsing the forums, but I still have many questions. So I will explain what I have and what I am trying to accomplish, and see if anyone has ideas or suggestions.
The project I am working on can generate a couple hundred gigabytes of data a day. We are gearing up to run it on a larger scale, and our first test produced about 500 GB, so we have a lot of data to work through.
We have written a shell script using the basic tools (grep, sed, awk, etc.) that can parse a good chunk of data rather quickly. We set up 20 high-end Linux servers and ran the shell script on each to grab a chunk of data, parse it, and return the results. The results vary in size but rarely exceed 10-20 MB. We are considering a rewrite in a compiled language later down the road, but at this time that isn't feasible without a larger staff.
We have many Windows and Linux computers that do next to nothing at night, and we would like to put them to use. We have already determined that if we copy the executables (grep, sed, awk, etc.) together with the data onto a computer (Linux or Windows), we can parse the data on that system. However, we still need a way to track which computers are doing what, the load on each computer, general information about the computer doing the work, and some way to regulate when each computer is allowed to work. I believe a distributed computing solution is the best fit when so many computers sit unused for long stretches, but we need a way to control them.
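The "regulate when the computer can do work" requirement boils down to a check like the one below. This is only a rough sketch in Python: the 20:00 to 06:00 window, the 0.5 load threshold, and the function names are assumptions made up for illustration, not anything Boinc defines.

```python
from datetime import time


def in_work_window(now, start, end):
    """Return True if `now` falls in [start, end).

    The window may wrap past midnight (for example 20:00 to 06:00).
    If start == end the window is treated as empty.
    """
    if start <= end:
        return start <= now < end
    # Window wraps past midnight: late evening OR early morning.
    return now >= start or now < end


def may_run(now, load_per_cpu, start=time(20, 0), end=time(6, 0), max_load=0.5):
    """Allow work only inside the night window and while the host is mostly idle.

    `load_per_cpu` is whatever load figure the host reports, divided by its
    CPU count; both defaults here are hypothetical values.
    """
    return in_work_window(now, start, end) and load_per_cpu < max_load
```

For example, `may_run(time(23, 30), 0.1)` allows work, while the same host at noon, or at 23:30 with a load of 0.9, would be told to wait.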
Since I have been using Boinc for a while with other projects (Einstein@home, BURP, etc.), I wanted to see whether we could use Boinc to make use of these other computer systems. It provides all of the system monitoring that we are concerned about, and I think that with a little bit of work we can use Boinc for this project. What concerns me most is how Boinc communicates through the API and how Boinc runs its processes.
As I understand it, we would have to send the data through Boinc along with all the executables (probably only once, on first connect), and have a Boinc program that calls the shell/Windows script, which in turn launches the grep, awk, sed, etc. executables; Boinc would then return the result. Is this even possible? Will Boinc support this type of program (scripting), or should we consider something else?
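To make the wrapper step concrete, here is a rough Python sketch of a program that feeds one chunk of data through a chain of external executables and writes the result to a file. It does not use the Boinc API at all; `run_pipeline` and the file names in the usage note are hypothetical, and it assumes the executables are already present on the machine.

```python
import subprocess


def run_pipeline(commands, input_path, output_path):
    """Run commands as a pipe chain: input file -> cmd1 | cmd2 | ... -> output file.

    `commands` is a list of argument lists, one per executable.
    Returns the exit codes in the same order as `commands`.
    """
    procs = []
    with open(input_path, "rb") as src, open(output_path, "wb") as dst:
        upstream = src
        for i, cmd in enumerate(commands):
            is_last = i == len(commands) - 1
            proc = subprocess.Popen(
                cmd,
                stdin=upstream,
                stdout=dst if is_last else subprocess.PIPE,
            )
            # Close our copy of the previous pipe so that only the child
            # processes hold it; this lets end-of-file propagate correctly.
            if upstream is not src:
                upstream.close()
            upstream = proc.stdout
            procs.append(proc)
        # Wait for every stage and collect its exit code.
        return [p.wait() for p in procs]
```

A call such as `run_pipeline([["grep", "ERROR"], ["awk", "{print $1}"]], "chunk.dat", "result.txt")` would behave like `grep ERROR < chunk.dat | awk '{print $1}' > result.txt`, with a non-zero entry in the returned list showing which stage failed. Passing argument lists instead of a shell string avoids depending on a particular shell, which matters if the same wrapper has to run on both Windows and Linux.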
Any comments and suggestions that might help are welcome.
Copyright © 2021 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.