Xun's Studyroom: 2006

Friday, December 01, 2006

[r] Scalable Computing Resources

IEEE Committees: TCSC, TCCC.
Conferences: List by DSRLab.

Sunday, November 05, 2006

[t] a bit of hacking into Google Maps' URL format

I have recently been writing a location-based application using J2ME. One of the tasks is to cache some maps locally on the phone, so that time-consuming map loading from the web can be avoided. The key requirement is, of course, a precise mapping between map coordinates and screen coordinates.

Taking advantage of Google Maps is a quick and easy way to get this, since its URL encodes the longitude and latitude values. One of the maps I used has a URL like this: ll=42.065957,-88.048339&spn=0.007678,0.027122

"spn" is for "span", which is easy to guess. But my guess that "ll" stands for "Lower Left" was wrong. It is actually the center of the map. Tricky! So, the decoded format of the URL is:

ll=latitude_of_map_center, longitude_of_map_center & spn= total_latitude_span, total_longitude_span.
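To make the decoded format concrete, here is a minimal sketch of the coordinate mapping (in Python for brevity, not J2ME). It assumes that the span covers exactly the cached image and that the mapping is linear, which ignores the map projection and is only a reasonable approximation for small spans like the one above. The function names and the 240x320 screen size are my own illustration, not anything from Google.

```python
from urllib.parse import parse_qs

def parse_view(query):
    """Parse 'll=lat,lon&spn=lat_span,lon_span' into four floats."""
    params = parse_qs(query)
    lat, lon = (float(v) for v in params["ll"][0].split(","))
    lat_span, lon_span = (float(v) for v in params["spn"][0].split(","))
    return lat, lon, lat_span, lon_span

def to_screen(lat, lon, view, width, height):
    """Map (lat, lon) to pixel (x, y), with (0, 0) at the top-left corner.

    Assumption: the span covers the whole image and the mapping is linear.
    """
    center_lat, center_lon, lat_span, lon_span = view
    west = center_lon - lon_span / 2    # longitude of the left edge
    north = center_lat + lat_span / 2   # latitude of the top edge
    x = (lon - west) / lon_span * width
    y = (north - lat) / lat_span * height  # screen y grows downwards
    return round(x), round(y)

view = parse_view("ll=42.065957,-88.048339&spn=0.007678,0.027122")
center_pixel = to_screen(view[0], view[1], view, 240, 320)
```

Since "ll" is the map center, it should land in the middle of the screen, which is a handy sanity check for a cached map.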

Thursday, June 01, 2006

[t] Automated Software Building under Windows

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcug98/html/_asug_building_a_project_from_the_command_line.asp

Wednesday, April 26, 2006

[t] google accelerator: followup

I further noticed that, for an "accelerated" page, browsing it in a Google-Accelerator-enabled browser always triggers a subsequent visit to that page from Google. What is hard to judge is whether the Google visit occurs before or after the page loads in the browser. Let me hypothesize what is happening under both possibilities:

1) the Google visit occurs before the page loads: it could be that the accelerator sends the URL to Google, then some backend software agent retrieves the page from the site to be visited and computes the difference between the retrieved copy and the copy cached by Google. By comparing digests of the two copies, this check should be rather fast.
Then, if the software agent determines that the page has not been updated since it was last cached, it returns a flag to the accelerator agent embedded in the user's browser, so the accelerator agent can safely use a locally cached copy.
Recall that the assumption behind Google Accelerator is that the user is on an internet connection much slower than Google's.

2) the Google visit occurs after the page loads: this is much safer than the previous one, because the user always gets the up-to-date copy of the webpage. After the user gets the page, the accelerator notifies Google to retrieve the page again to make sure it keeps the most recent copy. The accelerator could then record how many times the user visited the page and how many times it was later found that the page had not been updated. If the (times_not_updated/times_visit) ratio is rather high, it could assume that the page is static and supply it directly to the user when the page is requested again.

Of course there could be a hybrid solution between the two. Actually, a hybrid approach is the most probable, I think.
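Both hypotheses boil down to two small pieces of bookkeeping: a digest comparison, and the (times_not_updated/times_visit) ratio. Here is a minimal sketch of that bookkeeping, purely to illustrate my guess. The class name, the SHA-256 digest, and the 0.8 threshold are all assumptions of mine; I have no idea what the accelerator actually does.

```python
import hashlib

class PageRecord:
    """Per-URL bookkeeping for the hypothesized 'is this page static?' check."""

    def __init__(self):
        self.digest = None    # digest of the most recently fetched copy
        self.visits = 0       # times_visit
        self.unchanged = 0    # times_not_updated

    def observe(self, body: bytes) -> bool:
        """Record one fetch; return True if it matches the previous copy."""
        digest = hashlib.sha256(body).hexdigest()
        same = digest == self.digest
        self.visits += 1
        if same:
            self.unchanged += 1
        self.digest = digest
        return same

    def looks_static(self, threshold=0.8, min_visits=5) -> bool:
        """Guess that the page is static once the unchanged ratio is high."""
        if self.visits < min_visits:
            return False  # not enough evidence yet
        return self.unchanged / self.visits >= threshold
```

Under hypothesis 1 the result of observe() would be the flag sent back to the browser; under hypothesis 2 looks_static() would decide whether the cached copy can be served directly next time.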

Saturday, April 08, 2006

[t] look inside Google accelerator

I downloaded Google Accelerator (http://webaccelerator.google.com/). It is not a mainstream Google product at the moment, but it aroused my curiosity about web-page prefetching strategies. This article is mainly based on my own usage experience, as well as on my earlier reading of papers from Google on GFS and disk-access performance optimization.

The accelerator seems to use a mixed locality measure to determine the distance of each link on a page from the current page. One component seems to be PageRank, and the other is traditional spatial locality, in the context of page layout. When the accelerator agent (in the form of a browser plug-in) detects a page load, it sorts all the links on that page based on their PageRank. Pages with a higher PageRank tend to be prefetched at a higher priority. Spatial locality comes in when the reader's concentration seems to be on a certain block of the current page: the links around the link he chooses tend to be prefetched.

It is interesting that the accelerator tries to adapt to the user's reading pattern. I noticed this when browsing a news website whose article links are laid out row-wise: the accelerator tried to prefetch pages according to my reading gap, i.e. the vertical spacing between the news links I clicked. This simple heuristic may work very well on some occasions.

There is of course a lot of potential to improve the 'smartness' of the agent, such as using a Markov chain over the page browsing history, or discovering frequent sequences. But the problem is bounded by the fact that web pages differ greatly from each other in their intra-page link structure, and that is my guess as to why the accelerator is not useful enough to be a mainstream Google product.
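A mixed locality measure like the one I am guessing at could be sketched as a weighted score. Everything here is hypothetical: the Link fields, the rank value in [0, 1], the 50/50 weighting, and the use of vertical pixel distance as "spatial locality" are my own stand-ins, not the accelerator's real formula.

```python
from dataclasses import dataclass

@dataclass
class Link:
    url: str
    rank: float  # hypothetical importance score in [0, 1]
    y: int       # vertical pixel position of the link on the page

def prefetch_order(links, focus_y, rank_weight=0.5, page_height=1000):
    """Sort links so that the best prefetch candidates come first.

    The score mixes link importance with closeness to the place on the
    page (focus_y) where the reader last clicked.
    """
    def score(link):
        distance = min(abs(link.y - focus_y) / page_height, 1.0)
        closeness = 1.0 - distance
        return rank_weight * link.rank + (1 - rank_weight) * closeness

    return sorted(links, key=score, reverse=True)
```

Setting rank_weight to 1.0 gives a pure rank ordering, and 0.0 gives a pure layout ordering, so the two observed behaviours are just the two ends of one knob.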
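The Markov chain idea can be sketched in a few lines: count which page follows which in the browsing history, then prefetch the most frequent successors of the current page. This is a first-order model, and the class and method names are mine, only for illustration.

```python
from collections import Counter, defaultdict

class BrowseModel:
    """First-order Markov model over a page browsing history."""

    def __init__(self):
        # transitions[page][next_page] = number of times next_page followed page
        self.transitions = defaultdict(Counter)

    def train(self, history):
        """Count every consecutive (page, next page) pair in the history."""
        for current, nxt in zip(history, history[1:]):
            self.transitions[current][nxt] += 1

    def predict(self, page, k=1):
        """Return up to k most likely next pages, best first."""
        followers = self.transitions.get(page, Counter())
        return [p for p, _ in followers.most_common(k)]
```

A page that has never been seen simply yields an empty prediction, in which case the agent would have to fall back to a layout-based heuristic.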

Sunday, March 05, 2006

[t] AJAX is rising

I was buying a laptop from the Dell online store one day last week, and immediately noticed a change in the configuration interface. When a certain configuration item is chosen, the related prices of the other choices are adjusted instantly, together with the total price.

That's AJAX, which is not new to me. I've been using its predecessor, DHTML, since 1999 to update picture captions in a webpage. What's new to me is the booming trend of deploying AJAX in webpage interfaces. To name a few: Gmail, Google Maps, Yahoo Instant Search. Microsoft, once again, is holding the "wait and see how far you guys go" attitude.

Clearly AJAX is rising, although the techniques seem trivial. In the web market, the referees who really count are the common users, and the majority's preference for Gmail does indicate AJAX's popularity.

[update on 3/8/2006]
Microsoft launched Windows Live today, with AJAX floating windows and dynamic pagelets.

Wednesday, February 22, 2006

[t] mutex and semaphore

I have been asked about the difference between these two in quite a few interviews, and every time I gave incremental but incomplete answers :p. A summary:

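A mutex (mutual exclusion lock) is a lock that only one thread can hold at a time. It is often confused with a spin-lock, which busy-waits until the lock is free; a pthread mutex normally puts the waiting thread to sleep instead. Technically, it's supported by the mutex functions in the pthread library.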

A semaphore is a traditional UNIX synchronization mechanism that keeps a count of an available resource, and the entities being synchronized use the semaphore through the P() (wait) and V() (signal) operations.

Semaphores and mutexes are different in nature. A semaphore represents a certain amount of usable resource, and provides a method to use this resource properly. A typical example I can think of is a printer queue, which has an upper capacity limit and needs to be carefully managed. A mutex, on the other hand, is used to implement exclusiveness: the resource is protected from being accessed concurrently by more than one entity.
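The printer queue example could look roughly like this, using Python's threading module rather than the UNIX calls. The PrintQueue class is a hypothetical illustration: the counting semaphore tracks free slots, while a separate lock protects the job list itself, which shows the two mechanisms doing their two different jobs.

```python
import threading

class PrintQueue:
    """Bounded job queue: the semaphore counts the free slots."""

    def __init__(self, capacity):
        self.slots = threading.Semaphore(capacity)  # resource count
        self.lock = threading.Lock()                # exclusive access to jobs
        self.jobs = []

    def submit(self, job, timeout=None):
        """P(): take a free slot, waiting up to `timeout` seconds if full."""
        if not self.slots.acquire(timeout=timeout):
            return False  # queue stayed full, job rejected
        with self.lock:
            self.jobs.append(job)
        return True

    def finish_one(self):
        """Remove the oldest job (assumes one exists), then V() its slot."""
        with self.lock:
            job = self.jobs.pop(0)
        self.slots.release()
        return job
```

With a capacity of 2, a third submit has to wait until finish_one() gives a slot back.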

A semaphore is very flexible: when the counter is binary, it behaves very much like a mutex. Do recall the intrinsic difference, though: to avoid contention between two writers, we need a mutex, while to coordinate a writer and a reader, we need a binary semaphore.
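A small sketch of the two cases side by side, again with Python's threading module and names of my own choosing. Two writers contend for the same counter and are serialized with a lock (the mutex case), while a reader waits on a semaphore that starts at 0 until the producer signals that data is ready (the binary semaphore case).

```python
import threading

def run_demo(increments=10000):
    total = 0
    total_lock = threading.Lock()           # mutex: two writers, one counter
    data_ready = threading.Semaphore(0)     # binary semaphore: starts "empty"
    mailbox, received = [], []

    def writer():
        nonlocal total
        for _ in range(increments):
            with total_lock:                # only one writer updates at a time
                total += 1

    def reader():
        data_ready.acquire()                # P(): block until data is signalled
        received.append(mailbox.pop())

    def producer():
        mailbox.append("payload")
        data_ready.release()                # V(): wake the waiting reader

    workers = (writer, writer, reader, producer)
    threads = [threading.Thread(target=f) for f in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total, received
```

Note that the lock is always released by the same thread that took it, while the semaphore is released by a different thread than the one waiting on it. That is the practical form of the difference described above.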

Another interesting use of a semaphore is that the resource could be continuous instead of discrete, such as CPU time or memory. Of course, this kind of use is not supported by the Unix library, and you will need to build your own.
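Building your own is not hard. Here is a minimal sketch on top of a condition variable, where the "count" is a fractional amount instead of an integer. ContinuousSemaphore is a hypothetical name, and this version does nothing about fairness, so a large request can starve behind a stream of small ones.

```python
import threading

class ContinuousSemaphore:
    """Semaphore whose resource is a continuous quantity (e.g. a CPU share)."""

    def __init__(self, amount):
        self._available = float(amount)
        self._cond = threading.Condition()

    def acquire(self, amount, timeout=None):
        """P(): wait until `amount` is available, then take it.

        Returns False if the timeout expires first.
        """
        with self._cond:
            ok = self._cond.wait_for(lambda: self._available >= amount, timeout)
            if ok:
                self._available -= amount
            return ok

    def release(self, amount):
        """V(): give `amount` back and wake every waiter to re-check."""
        with self._cond:
            self._available += amount
            self._cond.notify_all()
```

For example, with 1.0 units available, a request for 0.75 succeeds and a following request for 0.5 has to wait until at least 0.25 is released.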
