Right to Erasure

The European GDPR law provides for a "Right to Erasure". An interpretation of this law is that it gives users the right to have all of their data stored in an electronic system deleted from that system if they request it. This proposal exists as an attempt to satisfy that requirement of the GDPR law.

There will be a feature flag that must be enabled in order for this feature to be available on a project website. The feature flag will control the display of the link to the /delete_account_request.php page, and it will prevent the use of /delete_account_request.php unless the flag has been enabled.

User Experience

The BOINC website will provide a new page that allows a user to start the process of deleting their account. The link to this page will be found on home.php under Your account -> Account information -> Change -> delete account. The page will be <base_url>/delete_account_request.php.

If the user has changed their email address within the past 7 days, they will be prevented from deleting their account until the 7 days have elapsed. This is to prevent someone from gaining access to an account, changing the email address, and then deleting the account without the notification going to the old email address. See Email Change Notification for more details about how we are safeguarding email address changes.
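The 7-day hold can be sketched as a simple timestamp comparison. This is a minimal illustration in Python rather than the project's PHP; the function name and the email_change_time argument (a Unix timestamp of the last email change, 0 if never changed) are assumptions made for the example.

```python
import time

EMAIL_CHANGE_HOLD_SECONDS = 7 * 24 * 60 * 60  # the 7-day hold described above


def can_request_deletion(email_change_time, now=None):
    """Return True once 7 days have elapsed since the last email change.

    email_change_time is a hypothetical Unix timestamp of the most recent
    email address change, or 0 if the address has never been changed.
    """
    if now is None:
        now = int(time.time())
    return now - email_change_time >= EMAIL_CHANGE_HOLD_SECONDS
```

A request page could call this check before showing the password form and display an explanatory message when it returns False.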

Once the user visits this new page, they will be presented with text that states:

“You have the ability to delete your account and all related data. Please note that this cannot be undone once it is completed. The process works as follows:

- Enter your password below and click on the “Delete my Account” button
- You will receive an email which contains a link.  Please click on that link.
- On the page displayed, you will need to re-enter your password and then click “Delete my Account”

At this point all information about your account will be immediately deleted.”

Once the user provides their password, an email is sent to the user with a link that is similar to:

<base_url>/delete_account_confirm.php?userid=<userid>&token=<token> 

When they click on the link, they will be taken to a page that asks them if they are sure that they want to delete their account. They must re-enter their password and click the button that says "delete account" in order to have their account deleted. The account will be immediately deleted at this point and the user will be redirected to the project's home page.

If the user returns to the delete_account_request.php while they have an active token, the page will ask them to check their email for the email that was sent and then provide them with an option to generate a new email if they cannot find the first one. This will again require the user to enter their password to generate the email.
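As an illustration of the link format shown earlier in this section, the confirmation URL could be assembled as follows. This is a Python sketch, not the site's PHP code; build_confirm_link is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_confirm_link(base_url, userid, token):
    """Build the confirmation URL that is emailed to the user."""
    # urlencode escapes the values and keeps the userid, token order.
    query = urlencode({"userid": userid, "token": token})
    return f"{base_url.rstrip('/')}/delete_account_confirm.php?{query}"
```

Encoding the query parameters keeps the link intact even if a future token format contains characters that are not URL-safe.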

Technical Implementation

New Tables

token

  • token varchar(254) not null pk
  • userid int not null
  • type char not null
  • create_time int not null default unix_timestamp()
  • expire_time int not null

user_deleted

  • userid int not null pk
  • public_cross_project_id varchar(254) not null
  • delete_time int not null default unix_timestamp()

host_deleted

  • hostid int not null pk
  • public_cross_project_id varchar(254) not null
  • delete_time int not null default unix_timestamp()

Token Generation

Tokens will be generated by the random_string() function in inc/util.inc. However, that function currently relies on functions that are not cryptographically secure. As a result, it will be replaced with the following implementation:

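function random_string() {
    // random_bytes() (PHP 7+) draws from a cryptographically secure source;
    // 16 random bytes become a 32-character hex string.
    return bin2hex(random_bytes(16));
}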

A couple of notes about this choice:

Token Usage

The token generated by random_string() and included in the email will be stored in the token table and will expire after 24 hours. The token type for this will be "D" for delete.

If the user clicks on the link in the email and the token is invalid or expired, they will be presented with a page that states that the link was invalid and that they need to return to the delete_account_request.php and request a new link.
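The token lifecycle described in this section (generate, store with a 24-hour expiry, validate on use) can be sketched as below. This is a Python illustration against an in-memory SQLite table modelled on the token table above, not the project's PHP or MySQL code. The helper names are assumptions, and create_time is supplied explicitly because SQLite has no unix_timestamp() default.

```python
import secrets
import sqlite3
import time

TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # tokens expire after 24 hours
DELETE_TOKEN_TYPE = "D"  # "D" for delete


def create_token_table(conn):
    """Create a table modelled on the token table in this proposal."""
    conn.execute(
        """create table token (
               token varchar(254) not null primary key,
               userid int not null,
               type char not null,
               create_time int not null,
               expire_time int not null)"""
    )


def issue_delete_token(conn, userid, now=None):
    """Generate a delete token, store it, and return it."""
    now = int(time.time()) if now is None else now
    token = secrets.token_hex(16)  # 32 hex chars, like bin2hex(random_bytes(16))
    conn.execute(
        "insert into token (token, userid, type, create_time, expire_time)"
        " values (?, ?, ?, ?, ?)",
        (token, userid, DELETE_TOKEN_TYPE, now, now + TOKEN_LIFETIME_SECONDS),
    )
    return token


def is_delete_token_valid(conn, userid, token, now=None):
    """Return True if the token exists for this user and has not expired."""
    now = int(time.time()) if now is None else now
    row = conn.execute(
        "select 1 from token"
        " where token = ? and userid = ? and type = ? and expire_time > ?",
        (token, userid, DELETE_TOKEN_TYPE, now),
    ).fetchone()
    return row is not None
```

Checking the user ID and type together with the token value means a token issued to one user, or for a different purpose, cannot be used to confirm another user's deletion.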

Data Exports

BOINC provides a mechanism for the mass export of data (db_dump). GDPR requires that this mechanism also provide notification to consumers of that data that accounts have been deleted.

The current format of the data for users looks like (user.xml):

<user>
 <id>13306</id>
 <name>etest051717a</name>
 <country></country>
 <create_time>1495032737</create_time>
 <total_credit>1218.038168</total_credit>
 <expavg_credit>0.088678</expavg_credit>
 <expavg_time>1504635602.002442</expavg_time>
 <cpid>0213f2f995c5a3fd86aec4b79b08a05d</cpid>
 <teamid>118</teamid>
</user>
<user>
 <id>13384</id>
 <name>etest062917a</name>
 <country></country>
 <create_time>1498749567</create_time>
 <total_credit>6740.232830</total_credit>
 <expavg_credit>0.096624</expavg_credit>
 <expavg_time>1506990001.965522</expavg_time>
 <cpid>a09031094836310f043f0ff8bcfca355</cpid>
</user>

We propose changing this so that users who remain continue to be exported in user.xml as before, while users who have been deleted are no longer exported in this file. With the second user above deleted, user.xml would contain:

<user>
 <id>13306</id>
 <name>etest051717a</name>
 <country></country>
 <create_time>1495032737</create_time>
 <total_credit>1218.038168</total_credit>
 <expavg_credit>0.088678</expavg_credit>
 <expavg_time>1504635602.002442</expavg_time>
 <cpid>0213f2f995c5a3fd86aec4b79b08a05d</cpid>
 <teamid>118</teamid>
</user>

In addition, a new file called user_deleted.xml will be created that contains a list of users who have been deleted.

<user>
 <id>13384</id>
 <cpid>a09031094836310f043f0ff8bcfca355</cpid>
</user>

The user_deleted file indicates those users that must be deleted from the downstream system. The db_dump utility would first export records from the user table and then export records from the user_deleted table to generate these two files.
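To illustrate the shape of the user_deleted.xml entries, the following Python sketch renders rows from the user_deleted table into the format shown above. The real db_dump utility is not written in Python, and the function names here are hypothetical.

```python
import xml.etree.ElementTree as ET


def user_deleted_element(userid, cpid):
    """Build one <user> element carrying only the id and cpid."""
    user = ET.Element("user")
    ET.SubElement(user, "id").text = str(userid)
    ET.SubElement(user, "cpid").text = cpid
    return user


def render_user_deleted(rows):
    """Render (userid, public_cross_project_id) rows, one element per line."""
    return "\n".join(
        ET.tostring(user_deleted_element(uid, cpid), encoding="unicode")
        for uid, cpid in rows
    )
```

Only the identifiers are emitted, so the export itself carries no personal data about the deleted user beyond what downstream consumers need to locate the record.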

Hosts would be handled similarly: the existing host.xml would contain only those hosts that still exist in the host table, and a new host_deleted.xml would contain those that were deleted. It would look like:

<host>
    <id>884</id>
    <host_cpid>36e9d265f8fe553bedbbef1cd21a6182</host_cpid>
</host>

This portion of db_dump would pull from host and then from host_deleted to create the files.

delete_account_confirm_action.php

On the delete_account_confirm.php page, the token should be included as a hidden field. The delete_account_confirm_action.php page that receives the request should validate both the user's password and the user's token before processing. Only if both are valid will the user account be deleted, based on the logic below.
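The "both checks must pass" rule can be sketched as follows. This Python illustration takes the password check, token check, and deletion routine as callables, since this page does not specify how each is implemented; all names are assumptions.

```python
def confirm_deletion(userid, password, token,
                     verify_password, verify_token, delete_account):
    """Delete the account only if both the password and the token are valid.

    verify_password, verify_token, and delete_account are hypothetical
    callables standing in for the site's real checks and deletion logic.
    """
    password_ok = verify_password(userid, password)
    token_ok = verify_token(userid, token)
    if not (password_ok and token_ok):
        return False  # nothing is deleted unless both checks succeed
    delete_account(userid)
    return True
```

Returning a single boolean lets the calling page show one generic failure message without revealing which of the two checks failed.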

delete_account.inc

The deletion of the account is final and is not recoverable in any way. This process is designed so that projects can customize what happens when an account is deleted. Two options will be implemented, and a configuration option will be available to choose which method to use. The default will be the 'data anonymization' approach.

data anonymization

full delete

For each user identified, the delete_account.php will consist of the following actions:

  • An entry will be inserted into the user_deleted table
  • An entry will be inserted into the host_deleted table for each host record the user has.
  • All entries for the user will be deleted from the following tables:
    • badge_user (delete from badge_user where user_id = ?)
    • banishment_vote (delete from banishment_vote where userid = ?)
    • credit_user (delete from credit_user where userid = ?)
    • credited_job (delete from credited_job where userid = ?)
    • donation_paypal (delete from donation_paypal where userid = ?)
    • forum_logging (delete from forum_logging where userid = ?)
    • forum_preferences (delete from forum_preferences where userid = ?)
    • friend (delete from friend where user_src = ? or user_dest = ?)
    • host_app_version (delete from host_app_version where host_id in (select id from host where userid = ?) )
    • msg_from_host ( delete from msg_from_host where hostid in (select id from host where userid = ?) )
    • msg_to_host ( delete from msg_to_host where hostid in (select id from host where userid = ?) )
    • host (delete from host where userid = ?)
    • notify ( delete from notify where userid = ? )
    • post_ratings ( delete from post_ratings where post in ( select id from post where user = ? ) )
    • post_ratings ( delete from post_ratings where user = ? )
    • post ( update post set parent_post = 0 where parent_post in (select id from post where user = ? ) )
    • post ( delete from post where user = ? )
    • sent_email ( delete from sent_email where userid = ? )
    • subscriptions ( delete from subscriptions where userid = ? )
    • team_admin ( delete from team_admin where userid = ? )
    • team_delta ( delete from team_delta where userid = ? )
    • team ( update team set userid = 0 where userid = ? )
    • token ( delete from token where userid = ? )
    • user ( delete from user where id = ? )
  • Note that rows in the following tables are not deleted at this point because they are necessary for the technical operation of the system; they will be deleted in due course:
    • result
  • Questions.
    • private_messages ( delete from private_messages where userid = ? or senderid = ? ) # Jord had asserted that we should only do this for userid = ?
    • thread ( update thread set owner = 0 where owner = ? ) - note that the query to return the threads needs to be updated to use an outer join so that the thread still shows when the owner lookup misses
    • user_submit - don't know what to do with these
    • user_submit_app - don't know what to do with these
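The ordering in the list above matters: the user_deleted and host_deleted entries are recorded first, rows keyed by host ID are removed while the host rows still exist, and the user row goes last. A minimal Python sketch of that ordering follows, using SQLite and only a few of the tables. The function signature is an assumption, and the public cross-project IDs are passed in because this page does not describe how they are computed.

```python
import sqlite3
import time


def delete_user(conn, userid, user_cpid, host_cpids, now=None):
    """Delete one user; host_cpids maps hostid -> public cross-project id."""
    now = int(time.time()) if now is None else now
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute(
            "insert into user_deleted"
            " (userid, public_cross_project_id, delete_time) values (?, ?, ?)",
            (userid, user_cpid, now),
        )
        conn.executemany(
            "insert into host_deleted"
            " (hostid, public_cross_project_id, delete_time) values (?, ?, ?)",
            [(hostid, cpid, now) for hostid, cpid in host_cpids.items()],
        )
        # Rows keyed by host id go first, while the host rows still exist.
        conn.execute(
            "delete from msg_to_host where hostid in"
            " (select id from host where userid = ?)",
            (userid,),
        )
        conn.execute("delete from host where userid = ?", (userid,))
        conn.execute("delete from token where userid = ?", (userid,))
        conn.execute("delete from user where id = ?", (userid,))
```

Running every statement in one transaction means a failure partway through leaves the account fully intact rather than partially deleted.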

Final Removal

A script that runs once a day will be developed that removes entries from the user_deleted and host_deleted tables when delete_time indicates that they are over 60 days old. This provides sufficient time for consumers of the data export to receive notification of the deletion and to remove the data from their systems.
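A daily purge along these lines could look like the sketch below. It is written in Python against SQLite for illustration only, and it uses the delete_time column defined for the user_deleted and host_deleted tables; the function name is hypothetical.

```python
import sqlite3
import time

RETENTION_SECONDS = 60 * 24 * 60 * 60  # keep deletion records for 60 days


def purge_deleted_records(conn, now=None):
    """Remove user_deleted and host_deleted rows older than 60 days.

    Returns the number of rows removed from each table as (users, hosts).
    """
    now = int(time.time()) if now is None else now
    cutoff = now - RETENTION_SECONDS
    with conn:  # commit both deletes together
        users = conn.execute(
            "delete from user_deleted where delete_time < ?", (cutoff,)
        ).rowcount
        hosts = conn.execute(
            "delete from host_deleted where delete_time < ?", (cutoff,)
        ).rowcount
    return users, hosts
```

Returning the row counts gives the scheduled job something simple to log on each run.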

Last modified on 04/09/18 14:38:37