Posts by MarcioCavalcanti

1) Message boards : Questions and problems : Please Help Me if You Can (Message 105965)
Posted 3 Nov 2021 by MarcioCavalcanti
Post:
I'm not in the script writing business, so cannot help you there. Perhaps someone else here can. Or you can find something on the interwebs.


Thanks. I asked here because this would definitely bring many new users from our project to Rosetta@home and, consequently, to BOINC. I hope some good soul with time to spare can help me with a script that downloads the user list daily and fetches the credit information from it for the rewards.
2) Message boards : Questions and problems : Please Help Me if You Can (Message 105958)
Posted 3 Nov 2021 by MarcioCavalcanti
Post:
I did not know that, so thanks for the info. As I said, I'm an amateur at this, but I need to get the data, so I asked some devs, and they unanimously said I would have to write a web scraper.

So how can I get the data from rosetta/stats/? Would I need a script that downloads the user.gz list daily and picks up the info one entry at a time? Would you be able to help me?
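If the daily user.gz file turns out to be the standard BOINC XML stats export, a daily script could decompress it and filter for the team's members without scraping HTML at all. A minimal sketch; the download URL and the tag names (`<user>`, `<name>`, `<teamid>`, `<total_credit>`, `<expavg_credit>`) are assumptions based on typical BOINC exports and should be checked against the actual file:

```python
import gzip
import urllib.request
import xml.etree.ElementTree as ET


def parse_user_dump(xml_bytes, teamid=None):
    """Yield (name, total_credit, expavg_credit) for each <user> entry,
    optionally keeping only members of one team.

    Assumes the usual BOINC stats-export layout: a <users> root whose
    <user> children carry <name>, <teamid>, <total_credit> and
    <expavg_credit> elements (an assumption -- verify against the real dump).
    """
    root = ET.fromstring(xml_bytes)
    for user in root.iter("user"):
        if teamid is not None and user.findtext("teamid") != str(teamid):
            continue
        yield (
            user.findtext("name"),
            float(user.findtext("total_credit", "0")),
            float(user.findtext("expavg_credit", "0")),
        )


def fetch_user_dump(url="https://boinc.bakerlab.org/rosetta/stats/user.gz"):
    # Download and decompress the daily dump; the exact URL is a guess
    # based on the rosetta/stats/ path mentioned in the thread.
    with urllib.request.urlopen(url) as resp:
        return gzip.decompress(resp.read())
```

Running `parse_user_dump(fetch_user_dump(), teamid=30157)` once a day would then give the same name/credit tuples the HTML scraper below tries to extract.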
3) Message boards : Questions and problems : Please Help Me if You Can (Message 105956)
Posted 3 Nov 2021 by MarcioCavalcanti
Post:
Greetings.
I need to set up a script that fetches credit information for some users on a Rosetta@home team and then awards points in a "game" proportionally to how many credits each participant has earned through their Rosetta@home activity.

This is 100% non-profit work on my part, trying to bring in more people to help. In that spirit of helping, I have started writing a web scraper, but honestly I'm terrible at it, since I'm no more than an amateur dev. I don't even know whether this code generates the payout (reward) data properly.
Does anyone know of a Rosetta@home API that could help, or, even better, an existing web scraper for the Rosetta@home website? Something similar to this project would be great: https://github.com/stuckatsixpm/fah_scraper
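On the API question: BOINC-based project sites generally expose per-user stats as XML via `show_user.php?userid=N&format=xml`, which might remove the need for HTML scraping entirely. A minimal sketch; whether Rosetta's server enables this endpoint, and the exact fields it returns, are assumptions to verify:

```python
import requests
import xml.etree.ElementTree as ET


def parse_user_xml(text):
    # Pull the fields needed for rewarding out of the XML reply; assumes
    # the root <user> element has direct <name>, <total_credit> and
    # <expavg_credit> children, as on typical BOINC servers.
    root = ET.fromstring(text)
    return (
        root.findtext("name"),
        float(root.findtext("total_credit", "0")),
        float(root.findtext("expavg_credit", "0")),
    )


def fetch_user_credit(userid, base="https://boinc.bakerlab.org/rosetta"):
    r = requests.get(
        "{}/show_user.php".format(base),
        params={"userid": userid, "format": "xml"},
        timeout=30,
    )
    r.raise_for_status()
    return parse_user_xml(r.text)
```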

Here is what I have managed to write as a web scraper for the Rosetta@home website, but I doubt it works, or works well. If anyone has a few minutes of spare time and can help me out, I would be forever thankful:
"
import argparse
import logging
import sqlite3
import datetime
import json
from collections import namedtuple

import requests
from bs4 import BeautifulSoup

UserStats = namedtuple("UserStats", ["name", "credit", "recent_average_credit"])


def init_db(db_file="./folding_data.db"):
    logging.info("Initializing database.")
    con = sqlite3.connect(db_file)
    cur = con.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS folding_data ("
        "name TEXT PRIMARY KEY, credit INTEGER, recent_average_credit INTEGER, "
        "credit_delta INTEGER, date_delta INTEGER, date INTEGER)"
    )
    con.commit()
    return con


def fetch_stats(teamid=30157):
    url = "https://boinc.bakerlab.org/rosetta/team_display.php?teamid={}".format(teamid)
    r = requests.get(url, timeout=30)
    if r.ok:
        return r.text
    raise Exception("Failed to fetch: {}".format(url))


def log_stats(db, user_stats):
    logging.debug("Logging stats for user: %s", user_stats.name)
    cur = db.cursor()
    # Parameterized queries instead of string formatting: safer (user names
    # can contain quotes) and one SELECT is enough for both columns.
    row = cur.execute(
        "SELECT credit, date FROM folding_data WHERE name = ?", (user_stats.name,)
    ).fetchone()
    prev_credit, prev_date = row if row else (0, 0)

    date = int(datetime.datetime.now(datetime.timezone.utc).timestamp())
    credit_delta = user_stats.credit - prev_credit
    date_delta = date - prev_date

    # name is the PRIMARY KEY, so INSERT OR REPLACE updates in place and
    # replaces the original DELETE-then-INSERT pair.
    cur.execute(
        "INSERT OR REPLACE INTO folding_data VALUES (?, ?, ?, ?, ?, ?)",
        (
            user_stats.name,
            user_stats.credit,
            user_stats.recent_average_credit,
            credit_delta,
            date_delta,
            date,
        ),
    )


def create_snapshot(db, teamid=30157):
    logging.info("Creating snapshot.")
    stats = fetch_stats(teamid=teamid)
    soup = BeautifulSoup(stats, "html.parser")
    members = soup.find_all("table", {"class": "members"})
    rows = members[0].find_all("tr")

    for row in rows[1:]:  # skip the header row
        try:
            _, _, name, credit, recent_average_credit = [
                item.text for item in row.find_all("td")
            ]
            # Credits are rendered with thousands separators ("1,234,567"),
            # so strip commas before converting.
            user_stats = UserStats(
                name.strip(),
                int(float(credit.replace(",", ""))),
                int(float(recent_average_credit.replace(",", ""))),
            )
            log_stats(db, user_stats)
        except Exception as e:
            logging.error("Failed to log data for user: %s", e)

    db.commit()


def save_snapshot(db, output="./folding_data.json"):
    logging.info("Saving snapshot as JSON.")
    cur = db.cursor()
    snapshot = {}
    # One query fetches everything needed, instead of re-querying per name.
    for name, credit, time in cur.execute(
        "SELECT name, credit_delta, date_delta FROM folding_data"
    ):
        if credit > 0:
            snapshot[name] = {
                "credit": credit,
                "time": time,
            }

    with open(output, "w") as outfile:
        json.dump(snapshot, outfile)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("team", type=int, help="The team ID to scrape. Ex. 234980")
    parser.add_argument(
        "--db-file",
        default="./folding_data.db",
        help="The path to the local stats DB. This DB will be created if it doesn't exist.",
    )
    parser.add_argument(
        "--json-file",
        default="./folding_data.json",
        help="The path to the output JSON file containing the credit and time deltas.",
    )
    parser.add_argument("--verbose", action="store_true", help="Print debug logs.")
    args = parser.parse_args()

    logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO)

    db = init_db(db_file=args.db_file)
    create_snapshot(db, teamid=args.team)
    save_snapshot(db, output=args.json_file)


if __name__ == "__main__":
    main()
"




Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.