Using browsers as distributed computing nodes raises several issues:
- User approval: Will users allow you to run arbitrary scripts on their machine when they only wanted to visit a site?
- Reliability of the data: Since the processing is done by untrusted users, we must find ways to verify the results.
- User disconnection: The computation runs while users are browsing the web. If they navigate to another URL, reload the page, or close their browser, we won't get any result.
User Disconnection Over Time
We can identify three phases in this graph:
- Under 5 minutes, the user is very likely to disconnect.
- Between 5 and 15 minutes, the chance of disconnection decreases.
- After 15 minutes, very few users disconnect.
You can look at it another way:
- 50% of the users who have stayed 10 minutes stay for 1 hour.
- 10% of the users who have stayed 1 minute stay for 1 hour.
Chance of Script Completion
What we really want to know is whether or not our script will complete. To test that, I took the data we gathered and computed the percentage of users who would still be there X minutes later.
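The computation described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `completion_chance`, its parameters, and the sample durations are all hypothetical, and the real data set is not reproduced here. It assumes each visit is recorded as a total duration in minutes.

```python
from typing import Sequence


def completion_chance(
    durations: Sequence[float],
    already_stayed: float,
    script_minutes: float,
) -> float:
    """Fraction of users who, having already stayed `already_stayed`
    minutes, are still connected `script_minutes` minutes later."""
    # Users who were still connected when the script would start.
    still_here = [d for d in durations if d >= already_stayed]
    if not still_here:
        raise ValueError("no user stayed that long")
    # Among those, users who stayed long enough for the script to finish.
    finished = [d for d in still_here if d >= already_stayed + script_minutes]
    return len(finished) / len(still_here)


# Made-up visit durations in minutes, for illustration only.
sample_durations = [0.5, 1, 2, 3, 4, 8, 12, 20, 30, 45, 60, 90]

print(completion_chance(sample_durations, already_stayed=15, script_minutes=10))
```

Calling it with `already_stayed=0` gives the chance for a script started as soon as the page loads, which is the case most exposed to early disconnections.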
Once a user has stayed 15 minutes, a script that takes 1 to 10 minutes to complete has a 95% chance of finishing without being interrupted.
User disconnection is therefore not really an issue. If we run the computation only on users who have stayed more than 15 minutes, we get a 95% completion rate for a 10-minute script.
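One way to choose such a waiting threshold is to scan candidate waiting times and keep the first one that reaches the target completion rate. The sketch below assumes visits are recorded as total durations in minutes. The name `minimum_wait`, its parameters, and the sample data are hypothetical and are not the data behind the figures above.

```python
from typing import Optional, Sequence


def minimum_wait(
    durations: Sequence[float],
    script_minutes: float,
    target: float = 0.95,
    max_wait: int = 60,
) -> Optional[int]:
    """Smallest whole number of minutes to wait before starting a script
    so that at least `target` of the remaining users let it finish.
    Returns None if no waiting time up to `max_wait` reaches the target."""
    for wait in range(max_wait + 1):
        # Users still connected after waiting `wait` minutes.
        eligible = [d for d in durations if d >= wait]
        if not eligible:
            break  # nobody stays this long, so longer waits cannot help
        # Users who also stay for the whole script run.
        done = sum(1 for d in eligible if d >= wait + script_minutes)
        if done / len(eligible) >= target:
            return wait
    return None


# Made-up visit durations in minutes, for illustration only.
sample_durations = [1, 2, 3, 5, 8, 30, 40, 50, 60]

print(minimum_wait(sample_durations, script_minutes=10))
```

With few recorded visits, a long wait can reach the target simply because only a handful of users remain, so the result is only meaningful on a reasonably large data set.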