Wednesday, April 26, 2006

[t] google accelerator: followup

I further notice that, for an "accelerated" page, browsing it in a Google-accelerator-enabled browser always triggers a subsequent visit to that page from Google. What is hard to judge is whether the Google visit occurs before or after the page loads in the browser. Let me hypothesize what's happening under both possibilities:

1) The Google visit occurs before the page loads: it could be that the accelerator sends the URL to Google, then some backend software agent retrieves the page from the site to be visited and compares the retrieved copy with the copy cached by Google. By using a digest algorithm (comparing hashes rather than full page contents), this comparison should be rather fast.
Then, if the software agent determines that the page has not been updated since it was last cached, it returns a flag to the accelerator agent embedded in the user's browser, so the accelerator agent can safely use a locally cached copy.
Recall that the assumption behind the Google accelerator is that the user's internet connection is much slower than Google's.
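
To make the digest idea concrete, here is a minimal sketch of the comparison step. It is only an illustration of my guess, not how Google actually does it: the function names (page_digest, can_use_local_copy) are made up, and SHA-256 is an assumption, since I have no idea which digest (if any) is really used.

```python
import hashlib


def page_digest(body: bytes) -> str:
    """Return a hex digest of the page body (SHA-256 is an assumed choice)."""
    return hashlib.sha256(body).hexdigest()


def can_use_local_copy(cached_digest: str, fresh_body: bytes) -> bool:
    """Hypothetical backend check: True if the freshly fetched page
    matches the digest stored for the cached copy, meaning the browser
    could be told to reuse its local copy."""
    return page_digest(fresh_body) == cached_digest
```

The point is that only a short digest has to be stored and compared, so the check is cheap once the backend has fetched the page over its fast connection.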

2) The Google visit occurs after the page loads: this is much safer than the previous possibility, because the user always gets the up-to-date copy of the webpage. After the user gets the page, the accelerator notifies Google to re-retrieve the page to make sure it is keeping the most recent copy. The accelerator could then record how many times the user visited the page and how many of those times the page was later found not to have been updated. If the (times_not_updated/times_visit) ratio is rather high, it could assume that the page is static and supply it directly to the user when the page is requested again.
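
The ratio heuristic in 2) could be sketched as follows. Again this is purely hypothetical: the class name PageStats, the 0.9 threshold, and the minimum of 5 visits are all values I picked for illustration; only the times_not_updated/times_visit ratio comes from the reasoning above.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Per-page counters the accelerator might keep (hypothetical)."""
    times_visit: int = 0
    times_not_updated: int = 0

    def record_visit(self, was_updated: bool) -> None:
        # Count every visit; count separately the visits after which
        # the re-retrieved page turned out to be unchanged.
        self.times_visit += 1
        if not was_updated:
            self.times_not_updated += 1

    def looks_static(self, threshold: float = 0.9, min_visits: int = 5) -> bool:
        # Too few observations: do not guess, always fetch the live page.
        if self.times_visit < min_visits:
            return False
        return self.times_not_updated / self.times_visit >= threshold
```

For example, a page visited ten times and never found changed would be treated as static, while one that changed on half of the visits would always be fetched fresh.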

Of course, there could be a hybrid of the two approaches above. Actually, I think a hybrid is the most probable.