While looking for good reasons to use JavaScript for image processing, I thought of distributed computing, which would be extremely easy to set up: users have nothing to install, they just have to visit a webpage. And since we want many users to participate, we could embed the script in a popular webpage.
There are several issues with using browsers as distributed computing nodes:
- User approval: Will users allow you to run an arbitrary script on their machine when they only wanted to visit a site?
- Reliability of the data: Since the processing is done by untrusted users, we must find ways to verify the results.
- User disconnection: We compute the data while users are browsing the web. If they navigate to another URL, reload the page or close their browser, we won't get any result.
User Disconnection Over Time
To test the last point, I added a small JavaScript program to the popular website MMO-Champion.com. Every minute, it sent the time spent on the page to my server. I ran it for about 2 hours (then it DDoS'ed my server :(). I aggregated the results in the following chart.
We can extract 3 phases from this graph.
- Under 5 minutes, the user is very likely to disconnect.
- Between 5 and 15 minutes, the chance of disconnection decreases.
- After 15 minutes, very few users disconnect.
You can also look at it another way:
- 50% of the users who have stayed 10 minutes stay for 1 hour.
- 10% of the users who have stayed 1 minute stay for 1 hour.
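For reference, here is a minimal sketch of what such a measurement script could look like. It is not the exact code that ran on the site: the `REPORT_URL` endpoint and the `startHeartbeat` and `reportToServer` names are made up for illustration, and reporting through `navigator.sendBeacon` is just one possible choice.

```javascript
// Hypothetical endpoint: the post does not say where the data was sent.
const REPORT_URL = '/time-on-page';
const INTERVAL_MS = 60 * 1000; // report once per minute

// Calls `send(minutes)` every minute with the time spent on the page so far.
// `schedule` and `now` are injectable so the logic can run outside a browser.
function startHeartbeat(send, schedule = setInterval, now = Date.now) {
  const start = now();
  return schedule(() => {
    const minutes = Math.round((now() - start) / INTERVAL_MS);
    send(minutes);
  }, INTERVAL_MS);
}

// Fire-and-forget report, so the page never waits for the server.
function reportToServer(minutes) {
  navigator.sendBeacon(REPORT_URL + '?minutes=' + minutes);
}

// Only start automatically when running inside a page.
if (typeof window !== 'undefined') {
  startHeartbeat(reportToServer);
}
```

With one request per visitor per minute, the load on the server grows with the number of people on the page, which is probably what took mine down.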
Chance of Script Completion
What we really want to know is whether or not our script will complete. To test that, I took the data we gathered and computed the percentage of users that would still be there X minutes later.
Once a user has stayed 15 minutes, a script that takes 1-10 minutes to complete has a 95% chance of finishing without being interrupted.
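The computation behind this number can be sketched as follows, assuming the collected data has been reduced to one value per visitor: the total number of minutes they stayed. The function name `completionChance` and the sample values are made up for illustration; they are not the data from the chart.

```javascript
// Among visitors who have already stayed `stayed` minutes, returns the
// fraction who are still there `scriptMinutes` later.
function completionChance(durations, stayed, scriptMinutes) {
  const eligible = durations.filter((d) => d >= stayed);
  if (eligible.length === 0) return NaN; // nobody stayed that long
  const finished = eligible.filter((d) => d >= stayed + scriptMinutes);
  return finished.length / eligible.length;
}

// Made-up visit durations in minutes, for illustration only.
const sample = [1, 2, 3, 12, 20, 30, 45, 60];

// 4 visitors stayed at least 15 minutes, 3 of them at least 25 minutes.
console.log(completionChance(sample, 15, 10)); // 0.75
```

The same function answers the earlier question too: `completionChance(durations, 10, 50)` is the share of users who stay 1 hour among those who have stayed 10 minutes.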
Conclusion
User disconnection is not really an issue. If we run the computation only on users who have stayed more than 15 minutes, we get a 95% completion rate for a 10-minute script.
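In practice this could be as simple as delaying the start of the work. A minimal sketch, where `runJob` is a hypothetical function that fetches a work unit, processes it and posts the result back:

```javascript
// Threshold taken from the measurements above.
const MIN_STAY_MS = 15 * 60 * 1000;

// Starts `runJob` only once the visitor has been on the page for 15 minutes.
// `schedule` is injectable so the function can be tested without waiting.
function startWhenSettled(runJob, schedule = setTimeout) {
  return schedule(runJob, MIN_STAY_MS);
}
```

A long computation would probably have to run in a Web Worker so that the page stays responsive, but that is outside the scope of this test.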