Version 4 (modified by Kevin Reed, 6 years ago)

Right to Erasure

The European General Data Protection Regulation (GDPR) provides for a "Right to Erasure". One interpretation of this law is that it gives users the right to have all of their data stored in an electronic system deleted from that system if they request it. This proposal is an attempt to satisfy that requirement of the GDPR.

User Experience

The BOINC website will provide a new page that allows a user to start the process of deleting their account. The link to this page will be found on home.php under Your account -> Account information -> Change -> delete account. The page will be <base_url>/request_delete_account.php.

Once the user visits this new page, they will be presented with text that states:

You have the ability to delete your account and all related data. Please note that this cannot be undone once it is completed. The process works as follows:

- Enter your password below and click the “Delete my Account” button.
- You will receive an email that contains a link. Please click on that link.
- On the page displayed, re-enter your password and then click “Delete my Account”.
- At this point the account is scheduled to be deleted. The actual deletion will occur within 48 hours.

At any point before the delete occurs you can return to this page and cancel the scheduled deletion.

Once the user provides their password, an email is sent to the user with a link that is similar to:

<base_url>/confirm_delete_account.php?userid=<userid>&token=<token> 

When they click on the link, they will be taken to a page that asks whether they are sure that they want to delete their account. They must re-enter their password and click the "Delete my Account" button in order to have their account deleted. They will then be informed that their account has been scheduled to be deleted within the next 48 hours.

If the user returns to request_delete_account.php while they have an active token, the page will ask them to check their inbox for the message that was sent and will offer the option to send a new email if they cannot find the first one. Generating the new email will again require the user to enter their password.

If the user returns to request_delete_account.php while the deletion is scheduled, they will see a message stating that their account is scheduled to be deleted and that they can cancel the deletion by clicking a button.
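The three situations described for request_delete_account.php can be summarised as a small decision function. The following is only an illustrative Python sketch, not BOINC code: the names DeletePageState and delete_page_state are hypothetical, and giving a scheduled deletion precedence over an outstanding token is an assumption that the proposal does not spell out.

```python
from enum import Enum
from typing import Optional


class DeletePageState(Enum):
    SHOW_REQUEST_FORM = "request"      # no active token, no scheduled deletion
    SHOW_CHECK_EMAIL = "check_email"   # an unexpired token exists
    SHOW_CANCEL_OPTION = "cancel"      # deletion is already scheduled


def delete_page_state(delete_request_time: Optional[int],
                      token_expire_time: Optional[int],
                      now: int) -> DeletePageState:
    """Pick which variant of the page to render (hypothetical helper)."""
    # Assumption: a scheduled deletion takes precedence over any token.
    if delete_request_time is not None and delete_request_time > 0:
        return DeletePageState.SHOW_CANCEL_OPTION
    # A token counts as active only while it has not expired.
    if token_expire_time is not None and token_expire_time > now:
        return DeletePageState.SHOW_CHECK_EMAIL
    return DeletePageState.SHOW_REQUEST_FORM
```

For example, a user with no scheduled deletion and a token that expires in the future would be shown the "check your email" variant.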

Technical Implementation

New Tables

token

  • token varchar(254) not null pk
  • userid int not null
  • type char not null
  • create_time int not null default unix_timestamp()
  • expire_time int not null

user_deleted

  • userid int not null pk
  • public_cross_project_id varchar(254) not null
  • delete_time int not null default unix_timestamp()

host_deleted

  • hostid int not null pk
  • public_cross_project_id varchar(254) not null
  • delete_time int not null default unix_timestamp()

Altered Tables

user

  • delete_request_time int nullable
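To make the column lists above concrete, here is a hedged sketch that creates them in an in-memory SQLite database from Python. It is not the project's schema file: SQLite stands in for the production database, the one-column user table is a placeholder for the existing table, and the SQLite expression cast(strftime('%s', 'now') as int) is substituted for unix_timestamp(), which SQLite does not provide. The delete_time columns are assumed to be int, like create_time.

```python
import sqlite3

SCHEMA = """
create table user (id integer primary key);  -- placeholder for the existing table

create table token (
    token       varchar(254) not null primary key,
    userid      int not null,
    type        char not null,
    create_time int not null default (cast(strftime('%s', 'now') as int)),
    expire_time int not null
);

create table user_deleted (
    userid                  int not null primary key,
    public_cross_project_id varchar(254) not null,
    delete_time             int not null default (cast(strftime('%s', 'now') as int))
);

create table host_deleted (
    hostid                  int not null primary key,
    public_cross_project_id varchar(254) not null,
    delete_time             int not null default (cast(strftime('%s', 'now') as int))
);

-- The altered table: a nullable request timestamp on user.
alter table user add column delete_request_time int;
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the proposed tables in the given SQLite connection."""
    conn.executescript(SCHEMA)
```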

Token Generation

Tokens will be generated by the random_string() function in inc/util.inc. However, that function currently relies on functions that are not cryptographically secure. As a result, it will be replaced with the following implementation:

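function random_string() {
    // random_bytes() draws from a cryptographically secure source (PHP 7+);
    // 16 bytes hex-encode to a 32-character string.
    return bin2hex(random_bytes(16));
}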

A couple of notes about this choice:

Token Usage

The token generated by random_string() and included in the email will be stored in the token table and will be set to expire after 24 hours. The token type for this will be "D" for delete.

If the user clicks on the link in the email and the token is invalid or expired, they will be presented with a page stating that the link was invalid and that they need to return to request_delete_account.php and request a new link.
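The token lifecycle described above (create a type "D" token, expire it after 24 hours, reject anything invalid or expired) might look roughly like the following. This is a Python illustration under stated assumptions rather than the proposed PHP: the token table is modelled as a plain dictionary, the helper names create_delete_token and is_valid_delete_token are hypothetical, and treating a token as expired at exactly expire_time is a choice the proposal leaves open.

```python
import secrets
import time
from typing import Dict, Optional

TOKEN_LIFETIME = 24 * 3600  # seconds; the 24-hour expiry described above


def random_string() -> str:
    # Python analogue of bin2hex(random_bytes(16)): 32 hex characters.
    return secrets.token_hex(16)


def create_delete_token(tokens: Dict[str, dict], userid: int,
                        now: Optional[int] = None) -> str:
    """Store a new type "D" token for userid and return it."""
    now = int(time.time()) if now is None else now
    token = random_string()
    tokens[token] = {
        "userid": userid,
        "type": "D",
        "create_time": now,
        "expire_time": now + TOKEN_LIFETIME,
    }
    return token


def is_valid_delete_token(tokens: Dict[str, dict], userid: int, token: str,
                          now: Optional[int] = None) -> bool:
    """True only for an unexpired delete token that belongs to userid."""
    now = int(time.time()) if now is None else now
    row = tokens.get(token)
    return (row is not None
            and row["userid"] == userid
            and row["type"] == "D"
            and row["expire_time"] > now)
```

Checking the userid as well as the token mirrors the fact that both values appear in the emailed link.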

Data Exports

BOINC provides a mechanism for the mass export of data (db_dump). GDPR requires that this mechanism also provide notification to consumers of that data that accounts have been deleted.

The current format of the exported user data (user.xml) looks like this:

<user>
 <id>13306</id>
 <name>etest051717a</name>
 <country></country>
 <create_time>1495032737</create_time>
 <total_credit>1218.038168</total_credit>
 <expavg_credit>0.088678</expavg_credit>
 <expavg_time>1504635602.002442</expavg_time>
 <cpid>0213f2f995c5a3fd86aec4b79b08a05d</cpid>
 <teamid>118</teamid>
</user>
<user>
 <id>13384</id>
 <name>etest062917a</name>
 <country></country>
 <create_time>1498749567</create_time>
 <total_credit>6740.232830</total_credit>
 <expavg_credit>0.096624</expavg_credit>
 <expavg_time>1506990001.965522</expavg_time>
 <cpid>a09031094836310f043f0ff8bcfca355</cpid>
</user>

We propose changing this so that users who remain continue to be exported in user.xml as before, while users who have been deleted are no longer exported in this file.

<user>
 <id>13306</id>
 <name>etest051717a</name>
 <country></country>
 <create_time>1495032737</create_time>
 <total_credit>1218.038168</total_credit>
 <expavg_credit>0.088678</expavg_credit>
 <expavg_time>1504635602.002442</expavg_time>
 <cpid>0213f2f995c5a3fd86aec4b79b08a05d</cpid>
 <teamid>118</teamid>
</user>

In addition, a new file called user_deleted.xml will be created that contains a list of users who have been deleted.

<user>
 <id>13384</id>
 <cpid>a09031094836310f043f0ff8bcfca355</cpid>
</user>

The user_deleted file indicates those users that must be deleted from the downstream system. The db_dump utility would first export records from the user table and then export records from the user_deleted table to generate these two files.

Hosts would be handled similarly: the existing host.xml would contain only those hosts that still exist in the host table, and host_deleted.xml would contain those that were deleted. It would look like:

<host>
    <id>884</id>
    <host_cpid>36e9d265f8fe553bedbbef1cd21a6182</host_cpid>
</host>

This portion of db_dump would pull from host and then from host_deleted to create the files.
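As an illustration of the two new export files, the sketch below renders rows from the user_deleted and host_deleted tables in the formats shown above. It is written in Python purely for illustration and is not how db_dump is implemented; the function names and the (id, cross-project id) tuple layout are assumptions.

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import escape


def deleted_users_xml(rows: Iterable[Tuple[int, str]]) -> str:
    """Render (userid, cpid) rows in the user_deleted.xml format."""
    return "".join(
        "<user>\n <id>%d</id>\n <cpid>%s</cpid>\n</user>\n" % (userid, escape(cpid))
        for userid, cpid in rows
    )


def deleted_hosts_xml(rows: Iterable[Tuple[int, str]]) -> str:
    """Render (hostid, host_cpid) rows in the host_deleted.xml format."""
    return "".join(
        "<host>\n    <id>%d</id>\n    <host_cpid>%s</host_cpid>\n</host>\n"
        % (hostid, escape(host_cpid))
        for hostid, host_cpid in rows
    )
```

A downstream consumer would read these files after the regular user.xml and host.xml and remove any matching records from its own store.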

confirm_delete_account_action.php

On the confirm_delete_account.php page, the token should be included as a hidden field. The confirm_delete_account_action.php page that receives the request should validate both the user's password and the user's token, and should process the request only if both are valid. In that case, user.delete_request_time will be set to unix_timestamp().

delete_account.php

Deleting the account is final and is not recoverable in any way. The delete_account.php script would run once a day and select user records where delete_request_time < unix_timestamp() - 24*3600 and delete_request_time > 0 (the second condition guards against a 0 being inserted instead of null).
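The selection rule can be expressed as a single predicate. The sketch below is an illustrative Python version; the name is_due_for_deletion is hypothetical, and GRACE_PERIOD is just a label for the 24*3600 in the condition.

```python
from typing import Optional

GRACE_PERIOD = 24 * 3600  # seconds a request must age before it is acted on


def is_due_for_deletion(delete_request_time: Optional[int], now: int) -> bool:
    """Mirror of: delete_request_time < now - 24*3600 and delete_request_time > 0."""
    # Null (None) and 0 both mean "no deletion requested".
    if delete_request_time is None or delete_request_time <= 0:
        return False
    return delete_request_time < now - GRACE_PERIOD
```

Assuming the script really does run every 24 hours, a request becomes eligible just over 24 hours after confirmation and is picked up by the next run at most 24 hours later, which is consistent with the "within 48 hours" wording shown to the user.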

For each user identified, delete_account.php will perform the following actions:

  • An entry will be inserted into the user_deleted table
  • An entry will be inserted into the host_deleted table for each host record the user has
  • All entries for the user will be deleted from the following tables:
    • badge_user
    • banishment_vote
    • credit_user
    • credited_job
    • donation_paypal
    • forum_logging
    • forum_preferences
    • friend (where either user_src or user_dest equals the deleted userid)
    • host_app_version (for each host the user owns)
    • msg_from_host (for each host the user owns)
    • msg_to_host (for each host the user owns)
    • host
    • notify
    • post_ratings (for each post the user created)
    • post_ratings (for each post the user rated)
    • post (for each post made by the user, find any posts that have it as a parent and set parent_post_id to null)
    • post (remove posts made by the user)
    • private_messages
    • sent_email
    • subscriptions
    • team_admin
    • team_delta
    • user
  • Problematic Tables
    • team (when user_id refers to the deleted user; it is unclear how to remove the reference, since the field is not null)
  • Note that rows in the following are not deleted because these will be deleted in due course and are necessary for technical operation of the system:
    • result
  • Questions.
    • thread – do we need to remove thread?
    • user_submit
    • user_submit_app

Final Removal

A script that runs once a day will be developed that removes entries from the user_deleted and host_deleted tables when delete_time indicates that they are over 60 days old. This gives consumers of the data export sufficient time to receive notification of the deletion and to remove the data from their systems.
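The retention rule can be sketched as a filter over the rows of either table. This Python fragment is illustrative only: purge_expired is a hypothetical name, rows are modelled as dictionaries, and it assumes that the delete_time column defined for user_deleted and host_deleted is the timestamp that determines a row's age.

```python
from typing import Dict, List

RETENTION = 60 * 24 * 3600  # 60 days, in seconds


def purge_expired(rows: List[Dict], now: int) -> List[Dict]:
    """Keep only rows whose delete_time is not more than 60 days old."""
    cutoff = now - RETENTION
    return [r for r in rows if r["delete_time"] >= cutoff]
```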