Saturday, September 13, 2008

[r] Vision-based gesture interface for VR: literature list

1. [Bolt 1980] Bolt, R. Put-that-there: voice and gesture at the graphics interface. ACM SIGGRAPH Computer Graphics, 14(3). p.262-270. (using glove)
2. [Baudel 1993] Baudel, T. and Beaudouin-Lafon, M. Charade: remote control of objects using free-hand gestures. Communications of the ACM, 36(7). p.28-35. (using glove)
3. [Freeman 1995] Freeman, W. and Weissman, C. Television control by hand gestures. International Workshop on Automatic Face and Gesture Recognition. p.179-183.
4. [Segen 1998] Segen, J. and Kumar, S. Gesture VR: vision-based 3D hand interface for spatial interaction.
5. [Segen 2000] Segen, J. and Kumar, S. Look ma, no mouse! Communications of the ACM, 43(7). p.102-109.
6. [Crowley 2000] Crowley, J., Coutaz, J. and Berard, F. Perceptual user interfaces: things that see. Communications of the ACM, 43(3). p.54-64.
7. [Freeman 2000] Freeman, W., Beardsley, P., Kage, H., Tanaka, K.-I., Kyuma, K. and Weissman, C. Computer vision for computer interaction. ACM SIGGRAPH Computer Graphics, 33(4). p.64-68.
8. [Ringel 2001] Ringel, M., Berg, H., Jin, Y., and Winograd, T. Barehands: implement-free interaction with a wall-mounted display. ACM CHI Conference on Human Factors in Computing Systems. p.367-368.
9. [Cao 2003] Cao, X. and Balakrishnan, R. VisionWand: interaction techniques for large displays using a passive wand tracked in 3D. UIST '03.
10. [Malik 2005] Malik, S., Ranjan, A. and Balakrishnan, R. Interacting with large displays from a distance with vision-tracked multi-finger gestural input. UIST '05.

Saturday, September 06, 2008

[m] DVB related people and labs

I got to know the name of Ulrich Reimers through one of his former students, who is now studying at UBC.

Friday, September 05, 2008

[r] Notes from EMBC08 and Virtual Rehab08

EMBC08 and Virtual Rehab08 were held in Vancouver in August. I went to both conferences for a free peek and took some notes, mainly in the fields I am interested in: wireless, human-computer interaction, and human factors for personal systems.

EMBC08:
- There is not much new in wireless-enabled practice. The most popular radio interfaces are still Bluetooth and WiFi. One UK hospital showed some work using HSPA+ for ambulance-ER communication. Another group from Europe (IMEC) showed a miniaturized wireless EEG patch using a TI MSP430 processor and a 2.4 GHz transceiver built on the Nordic nRF2401 chip (the same one in the Apple-Nike shoe sensor). The transceiver was used to transfer one channel of simulated EEG data, although it is claimed to be capable of 8-channel transmission at a 100 Hz sampling rate.

Virtual Rehab08:
- The tutorial "using the Nintendo Wii for rehabilitation" was interesting. The speaker gave an anatomy of the Wii and related products, as well as some recent practice in a large rehabilitation center. One PT reported the observation that when stroke patients are trained in a VE, they tend to cheat to achieve higher scores and avoid frustration. A possible solution is to set personalized game difficulties to encourage patients' active participation.
- Greg Burdea raised two issues that need attention: 1. Besides exploring the potential benefits of virtual rehabilitation, are its side effects also clearly identified and monitored? All the PTs, when asked whether patients' heart rate and blood pressure were examined before and after the training sessions, answered "no". 2. Pervasive bio-signal monitoring, such as wearable sensors together with a home wireless gateway connected to medical centers, is important in ensuring the safety and good transfer of VE-based training sessions. The latter issue really sparked my thinking.

- Emily Keshner's keynote reviewed her endeavors in integrating VR into physiological research. My summary of her point: VR is first useful for investigating the mechanisms of the human body; as a further step, these findings can be used to design rehabilitation utilities.

- Assaf Dvorkin used Poser to draw a human figure superimposed on a Matlab 3D plot; this visualization is very intuitive for describing experiment settings.

Thursday, March 01, 2007

[r] some reference materials for employer selection

Two important factors for selecting an employer are its overall rating and the cost of living at the employer's location. Luckily, both can be found comprehensively online. Here is how:

For employer rating, Fortune's "100 Best Companies to Work For" is a good index. The survey has exhaustive company profiles and relative comparisons. CNN has the annual ranking data online at: http://money.cnn.com/magazines/fortune/bestcompanies/2007/

Also online from CNN is the "best places to live" list, good for town research:
http://money.cnn.com/magazines/moneymag/bplive/2007/

For cost-of-living comparison charts, there are many resources online; just Google "compare cost of living". You may be amazed how a lower offer can make more sense than a higher one in a different region. A sample website is:
http://www.bestplaces.net/col/

Based on calculations at the above website, $90k/yr in Chicago, IL is equivalent to $138,522/yr in San Diego, CA and $93,913/yr in Seattle, WA. Different cost-of-living sites give different conversion factors, so taking the average of several of them is a good idea.
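The conversion these sites perform is essentially a ratio of cost-of-living indices. A minimal sketch, with made-up index values (not real data from any particular site):

```python
def equivalent_salary(salary, index_from, index_to):
    # Scale the salary by the ratio of the two cost-of-living indices.
    return salary * (index_to / index_from)

# Placeholder indices (Chicago normalized to 100); real sites publish
# their own index tables and will give different conversion factors.
col_index = {"Chicago, IL": 100.0, "San Diego, CA": 153.9, "Seattle, WA": 104.3}

chicago_offer = 90_000
san_diego_equiv = equivalent_salary(chicago_offer,
                                    col_index["Chicago, IL"],
                                    col_index["San Diego, CA"])
```

Averaging several sites just means averaging the ratios before scaling.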

There are also great books on negotiation; one of my favorites is "Negotiating Your Salary: How To Make $1,000 A Minute" by Jack Chapman. The book is available through Amazon.


Friday, December 01, 2006

[r] Scalable Computing Resources

IEEE Committees: TCSC, TCCC.
Conferences: List by DSRLab.

Sunday, November 05, 2006

[t] a bit hacking into Google Map's url format

I have recently been writing a location-based application using J2ME. One of the tasks is to cache some maps locally on the phone, so that time-consuming map loading from the web can be avoided. The key, of course, is a precise mapping between map coordinates and screen coordinates.

Taking advantage of Google Maps is a quick and easy way to do this, as the URL encodes the longitude and latitude values. One of the maps I used has a URL like this: ll=42.065957,-88.048339&spn=0.007678,0.027122

"spn" stands for "span", which is easy to guess. My guess that "ll" stands for "lower left" turned out to be wrong: it is actually the center of the map. Tricky! So, the decoded format of the URL is:

ll=latitude_of_map_center, longitude_of_map_center & spn= total_latitude_span, total_longitude_span.
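The decoded parameters can be turned into the bounding box needed for the map-to-screen coordinate mapping. A small sketch of the parsing, using the example URL above:

```python
from urllib.parse import parse_qs

def decode_map_url(query):
    # "ll" is the map center (lat, lng); "spn" is the total lat/lng span.
    params = parse_qs(query)
    lat, lng = (float(v) for v in params["ll"][0].split(","))
    lat_span, lng_span = (float(v) for v in params["spn"][0].split(","))
    # Derive the bounding box: half the span on each side of the center.
    return {
        "center": (lat, lng),
        "top": lat + lat_span / 2, "bottom": lat - lat_span / 2,
        "left": lng - lng_span / 2, "right": lng + lng_span / 2,
    }

box = decode_map_url("ll=42.065957,-88.048339&spn=0.007678,0.027122")
```

With the box corners known, a pixel position is just a linear interpolation between them.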

Thursday, June 01, 2006

[t] Automated Software Building under Windows

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcug98/html/_asug_building_a_project_from_the_command_line.asp

Wednesday, April 26, 2006

[t] google accelerator: followup

I further noticed that for an "accelerated" page, browsing it in a Google-Accelerator-enabled browser will always trigger a subsequent visit to that page from Google. What is hard to judge is whether the Google visit occurs before or after the page loads in the browser. Let me hypothesize what happens under each possibility:

1) The Google visit occurs before the page loads: it could be that the accelerator sends the URL to Google, then some backend software agent retrieves the page to be visited and computes the difference between the retrieved copy and the copy cached by Google. By using a digest algorithm, this comparison should be rather fast.
Then, if the software agent determines that the page has not been updated since it was last cached, it returns some flag to the accelerator agent embedded in the user's browser, so the accelerator agent can safely use a locally cached copy.
Recall that the assumption behind Google Accelerator is that the user's Internet connection is much slower than Google's.
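If this first hypothesis is right, the fast comparison could be as simple as comparing message digests of the two copies. A sketch of the idea (the agent's internals are pure speculation on my part):

```python
import hashlib

def digest(page_bytes):
    # A digest collapses an arbitrarily long page into a short fingerprint,
    # so comparing two copies costs a few bytes instead of the whole page.
    return hashlib.sha1(page_bytes).hexdigest()

def page_unchanged(cached_copy, fresh_copy):
    # Equal digests => (with overwhelming probability) identical pages,
    # so the locally cached copy can be served safely.
    return digest(cached_copy) == digest(fresh_copy)
```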

2) The Google visit occurs after the page loads: this is much safer than the previous one, because the user always gets the up-to-date copy of the webpage. After the user gets the page, the accelerator notifies Google to re-retrieve the page to make sure it is keeping the most recent copy. The accelerator can then record how many times the user visited the page and how many times the page was later found not to have been updated. If the (times_not_updated/times_visited) ratio is rather high, it can assume that the page is static and supply it directly to the user when the page is requested again.
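The visit-ratio heuristic of this second hypothesis could be sketched like this; the threshold and bookkeeping are my invention, not anything observed from the actual product:

```python
class CacheHeuristic:
    """Track per-URL how often a re-retrieval found the page unchanged,
    and serve the cached copy directly once that ratio is high enough."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.stats = {}  # url -> [times_visited, times_not_updated]

    def record_visit(self, url, was_updated):
        visits, unchanged = self.stats.get(url, [0, 0])
        self.stats[url] = [visits + 1, unchanged + (0 if was_updated else 1)]

    def serve_from_cache(self, url):
        visits, unchanged = self.stats.get(url, [0, 0])
        return visits > 0 and unchanged / visits >= self.threshold

# Example: one update in ten visits -> the page looks static.
h = CacheHeuristic(threshold=0.9)
for i in range(10):
    h.record_visit("http://example.com/news", was_updated=(i == 0))
```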

Of course, there could be a hybrid of the above two. Actually, a hybrid approach is most probable, I think.

Saturday, April 08, 2006

[t] look inside Google accelerator

I downloaded Google Web Accelerator (http://webaccelerator.google.com/). It is not a mainstream Google product at the moment, but it piqued my curiosity about web-page prefetching strategies. This article is mainly based on my usage experience, as well as on previous reading of the GFS and disk-access performance optimization papers from Google.

The accelerator seems to use a mixed locality measure to determine the distance of each link on a page from the current page. One component seems to be page rank, and the other is traditional spatial locality, in the context of page layout. When the accelerator agent (in the form of a browser plug-in) detects a page load, it sorts all the links on that page by their page rank. Pages with a higher page rank tend to be prefetched at a higher priority. The use of spatial locality is that when the reader's concentration seems to be on a certain block of the current page, the links around the link he chooses tend to be prefetched.
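My guess at how the two measures might combine into a prefetch priority is sketched below; the scoring rule and all the numbers are invented for illustration:

```python
def prefetch_order(links, clicked_pos):
    # links: list of (url, page_rank, (x, y) layout position in pixels).
    # Order by page rank first (higher is better), then by pixel distance
    # from the link the user just clicked (nearer is better).
    def score(link):
        url, rank, (x, y) = link
        cx, cy = clicked_pos
        distance = ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
        return (-rank, distance)
    return [url for url, _, _ in sorted(links, key=score)]

# Toy page: two high-rank links (one near the click, one far) and a low-rank one.
links = [("a.html", 0.9, (0, 0)), ("b.html", 0.9, (10, 0)), ("c.html", 0.5, (1, 0))]
order = prefetch_order(links, clicked_pos=(0, 0))
```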

It is interesting that the accelerator tries to adapt to the user's reading pattern. I noticed that when I was browsing a news website whose article links are laid out row-wise, the accelerator tried to prefetch pages according to my reading gap, i.e., the vertical spacing between the news pages I read. This simple heuristic may work very well on some occasions.

There is of course lots of potential to improve the 'smartness' of the agent, such as using a Markov chain over page-browse history, or discovering frequent sequences. But the problem is bounded by the fact that webpages differ greatly from each other in their intra-page link structure, and that is my guess as to why the accelerator is not useful enough to be a mainstream Google product.

Sunday, March 05, 2006

[t] AJAX is rising

I was buying a laptop from the Dell online store one day last week, and immediately noticed a change in the configuration interface. When a certain configuration item is chosen, the prices of the other choices are adjusted instantly, together with the total price.

That's AJAX, which was not new to me. I have been using its predecessor, DHTML, since 1999, to update picture captions in a webpage. What is new to me is the bursting trend of deploying AJAX in webpage interfaces. To name a few: Gmail, Google Maps, Yahoo Instant Search. Microsoft, again, is holding the "wait and see how far you guys go" attitude.

Clearly AJAX is rising, although the techniques seem trivial. In the web market, the referees who really count are the common users, and the majority's favor for Gmail does indicate AJAX's popularity.

[update on 3/8/2006]
Microsoft launched Windows Live today, with AJAX floating windows and dynamic pagelets.

Wednesday, February 22, 2006

[t] mutex and semaphore

I have been asked about the difference between these two in quite a few interviews, and every time I give incremental but incomplete answers :p. A summary:

A mutex (mutual exclusion lock) enforces exclusive access. A spin-lock is its busy-waiting variant; a pthread mutex normally puts the waiting thread to sleep instead of spinning. Technically, it is supported by the mutex functions in the pthread library.

A semaphore is a traditional UNIX synchronization mechanism that defines a counted number of resources; the entities being synchronized use the semaphore through P() and V() operations.

Semaphores and mutexes are different in nature. A semaphore represents some usable resource and provides a method to use that resource properly. A typical example of semaphore use I can think of is the printer queue, which has an upper capacity limit and needs to be carefully managed. A mutex, on the other hand, is used to implement exclusiveness: a resource is protected from being accessed concurrently by more than one entity.

A semaphore is very flexible; when the counter is binary, it is very much like a mutex. But do recall the intrinsic difference: to avoid contention between two writers, we use a mutex, while to coordinate a writer and a reader, we use a binary semaphore.

Another interesting use of semaphores is that the resource can be continuous instead of discrete, such as CPU time or memory. Of course, that kind of use is not supported by the Unix library, and you will need to build your own.
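The contrast above can be illustrated with Python's threading module, whose Lock and Semaphore map onto the pthread mutex and the counting semaphore (the printer-queue capacity of 3 is an arbitrary example):

```python
import threading

counter = 0
counter_lock = threading.Lock()          # mutex: exclusive access

def writer(n):
    global counter
    for _ in range(n):
        with counter_lock:               # only one writer inside at a time
            counter += 1

printer_slots = threading.Semaphore(3)   # counted resource: 3 printer slots

def print_job():
    with printer_slots:                  # P(): take a slot (blocks if none left)
        pass                             # ... use the printer ...
                                         # V() on exit: release the slot

# Two writers contend for the counter; the mutex keeps the result exact.
threads = [threading.Thread(target=writer, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Five jobs share the three printer slots without oversubscribing them.
jobs = [threading.Thread(target=print_job) for _ in range(5)]
for t in jobs:
    t.start()
for t in jobs:
    t.join()
```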

Friday, December 09, 2005

[r] Freitas01, "A survey of evolutionary algorithms for data mining and knowledge discovery"


@misc{ freitas01survey,
author = "A. Freitas",
title = "A survey of evolutionary algorithms for data mining and knowledge discovery",
text = "Freitas, A.A. (2001). A survey of evolutionary algorithms for data mining
and knowledge discovery. To appear in: Ghosh, A.; Tsutsui, S. (Eds.) Advances
in evolutionary computation. Springer-Verlag.",
year = "2001",
url = "citeseer.ist.psu.edu/freitas01survey.html" }


This paper introduced in detail how the two variants of evolutionary computation, genetic algorithms (GA) and genetic programming (GP), can be used in the data mining and knowledge discovery process. It is the first time I have read such clear statements generalizing the knowledge discovery process. The paper illustrated GA's and GP's usage in the three stages of classification tasks in DM and KDD, namely preprocessing, rule generation, and postprocessing. Since gene expression programming (GEP) is the evolved offspring of GA and GP, and lots of other DM/KDD tasks can be mapped to classification tasks, it would be enlightening to solve a certain KDD question, such as one in web mining, with GEP.

[m] Supercomputing 2005

I first noticed a Blogspot edit-box change: arbitrary choice of publishing date is no longer allowed. Is this a measure to call for blog owners' timely input, or a feature change after some G engineers investigated MSN Spaces? (The M guys actually do allow flexible publish-time settings at present.) Never mind; I was lazy and should be solely responsible for the delay of posts, whether they were caught or not :P.

Supercomputing 2005 was great, partly because of, but definitely much more than, the Bill Gates show. Some people are laughing at Bill, yet we have to think about why supercomputing has become so attractive that Mr. Gates felt obligated to do an on-site promotion of the MS Windows 2003 cluster product.

In my opinion, during the past decade (or 14 years, counting from the birth of Linux in 1991), the two elements of computer science, computers and data, have undergone drastic changes. Free operating systems like Linux have become open to the public and of comparable quality to commercial ones, thus greatly relieving the budget concern of building up computation power. The most recent news I read is the Ultimate Linux Lunchbox. The accessibility of supercomputing to the general public is not a dream but a very real fact now.

The characteristics of data, in volume, dimension, and interestingness, are changing rapidly. The first annual KDD conference, in 1993, was still mainly concentrated on data manipulation in databases. Nowadays everybody has access to the world's biggest database, the web, and new "database" companies such as Google and Yahoo are taking the positions once occupied by Oracle and Sybase. Everyone has become a data consumer, more or less, aware of it or not.

Then, as a young computer scientist, how should I adapt to the changes? That is really an interesting topic to think about thoroughly.

Thursday, October 27, 2005

[t] video see-through equipments

This week I mainly spent time investigating video see-through techniques, for their possible use in our AR project. Before doing this work, many people (such as the audience of my talk at the Robotics Lab, even including myself!) were skeptical about how reliable a video see-through system would be. Well, here are the findings:
  • The University of North Carolina has been experimenting with video AR since 2000. Their prototypes include modified versions of the Proview HMD and the Sony Glasstron, with holders made on their own.
    The choice of the Proview HMD is mainly due to its wide field of view (Model SR80: 53° (V) x 63° (H), 80° diagonal) and high resolution (1024x768 full color at 60 Hz). The disadvantage of this product is that it is a bit bulky and heavy (1.7 lbs for the head-mounted part). Also, the price of the HMD is around $28k.
    The Sony Glasstron version is clearly a PLM700, just like the one we are using, without the eye shades (they are very easy to take off).
    Although I did not find a detailed description of the camera model they used, I am pretty sure it is one of the Panasonic stick cameras commonly used in surveillance systems. Those cameras are color models and support up to 800x600 resolution. The video format is analog, so there must be a framegrabber on the computer side.
  • Another interesting product series I found is the Trivisio HMDs. Trivisio is a German company selling ready-made stereo and monocular video see-through HMDs. The cameras support resolutions as high as 800x600, as does the display. The field of view is 40° diagonal, 32° horizontal, and 24° vertical. The heaviest model weighs 230 g, slightly heavier than the Sony Glasstron we are using (220 g); if camera weight is taken into account, it is actually lighter.
    The interface between these HMDs and the computer is USB 2.0, which can transfer video at 480 Mbps, far exceeding our requirements.


    I sent a price-quote request to them and am waiting for a reply. My guess is that the price will be between $15k and $30k.
  • If we are going to build our own, we can take advantage of customized cameras. A renowned vendor in this field is Point Grey Research (www.ptgrey.com), which manufactures a whole line of micro cameras. I have gotten quotes from them; the single-camera price is around $2-3k. Lenses can be bought from various optical vendors, and prices vary from under $1k to $4k.
  • I was also able to find a tracking alternative for wide-area tracking, called HiBall. It is basically a golf-ball-sized six-camera optical tracking system with environment-mounted infrared beacons. It can support a tracking area as big as 1600 sq ft while maintaining satisfying accuracy and precision. Details can be found at http://www.3rdtech.com/HiBall.htm. I have sent a price-quote request to them, but I guess the price will be considerable.


[Edited on 11/15/05]:
On my visit to the HITLab, I got to know another vendor providing a non-see-through HMD: eMagin. Their product is called the 3D-Visor. It is not a see-through product, but it is stereo, relatively low cost (around $800), can be fully USB powered, and has a built-in inertial sensor for orientation tracking. Thanks to Phillips Lamb for his valuable input.

Monday, August 08, 2005

[t] static functions in C

In C, the 'static' modifier, when used on a function, just means the function's visibility is limited to the file itself (internal linkage). This is not storage-related. However, when 'static' is used on a variable, it is indeed storage-related, in contrast to 'auto' and 'register'.

A good reference for scopes in C is here.

Another important and easily confused pair of concepts is const-volatile. A reference for the const and volatile qualifiers is here.

Friday, May 20, 2005

[t] Java Review

I checked out the book "Sams Teach Yourself Java 2 in 21 Days" for a Java review. The progress is fine. As of today I have finished:
  1. Day8: Putting Interactive Programs on the Web.
  2. Day 10: Adding Images, Animation and Sound.
  3. Day 15: Class Roles: Packages, Interfaces and Other Features
  4. Day 20: Designing a User Interface with Swing.
  5. Day 21: Handling User Events with Swing.
  6. Day 19: Java Beans and Other Advanced Features.
  7. Day 12: Arranging Components on a User Interface.
  8. Day 16: Exceptional Circumstances: Error Handling and Security.
  9. Day 17: Handling Data Through Java Streams.
  10. Day 18: Communicating Across the Internet.
  11. Day 9: Making Programs Look Good with Graphics, Fonts and Color.
  12. [... following studies will be put into here]
Good feeling to master something:).

Sunday, May 15, 2005

[t] Java Plotting

I tried to use Java for mote signal plotting, and it was successful. The plotting package I used is called the Graph Class Library. Setting up JBuilder took me quite some time, and there are still problems importing the TinyOS packages.
In spite of JBuilder's cumbersomeness, it is definitely a friendly and powerful development environment. The autocompletion and error checking are gorgeous.
No matter what language you are using, VC, VB, or Java (could be Python :P), the backstage scheme is the same: 1) prepare a buffer to store the plotting data, 2) update the buffer tail to reflect the passage of time, 3) replot.
All three steps offer opportunities for improvement in case the signal comes in at a high frequency. For step 1, a ring buffer or dynamic array can be used to avoid moving data in step 2; for step 2, updates can be done in batch mode or at fixed intervals; for step 3, replotting can be done on part of the window area, or other window-clipping techniques can be used.
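The three steps, together with the step-1 and step-2 improvements, can be sketched as follows; the class and parameter names are my own, and replot() is a placeholder for the real redraw call:

```python
from collections import deque

class SignalPlotBuffer:
    def __init__(self, window_size, batch_size=16):
        # Step 1: a fixed-size ring buffer (deque with maxlen) drops the
        # oldest sample automatically, so no data shuffling is needed.
        self.buffer = deque(maxlen=window_size)
        self.pending = []
        self.batch_size = batch_size

    def push(self, sample):
        # Step 2: batch updates so high-frequency input triggers fewer replots.
        self.pending.append(sample)
        if len(self.pending) >= self.batch_size:
            self.buffer.extend(self.pending)
            self.pending.clear()
            self.replot()

    def replot(self):
        # Step 3: placeholder; a real GUI would redraw only the dirty region.
        pass

# Example: 40 samples into a 32-sample window, flushed in batches of 16.
buf = SignalPlotBuffer(window_size=32, batch_size=16)
for i in range(40):
    buf.push(i)
```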

[edit on 05/16]
The Moteiv people keep warning that a clean Cygwin installation is a must, and for good reason. I found out that the TinyOS Java package path problem was actually caused by the Cygwin installation. After a complete reinstall, the problems were solved.

Monday, May 09, 2005

[t] Usage of MSChart Control

The MSChart control is a very useful ActiveX control that can fulfill virtually all Excel-style plotting. Prepare a data grid and assign it to the control's ChartData member, then call the control's Update() function; that's it.

One thing to note is that the first row and first column of the data grid are reserved for heading data, so the useful cells actually start from grid(1,1).

Tuesday, May 03, 2005

[r] liu99, "Mining Association Rules with Multiple Minimum Supports"

@inproceedings{312274,
author = {Bing Liu and Wynne Hsu and Yiming Ma},
title = {Mining association rules with multiple minimum supports},
booktitle = {KDD '99: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining},
year = {1999},
isbn = {1-58113-143-7},
pages = {337--341},
location = {San Diego, California, United States},
doi = {http://doi.acm.org/10.1145/312129.312274},
publisher = {ACM Press},
address = {New York, NY, USA},
}
Motivation:
To address the large-set finding problem caused by treating all items as having similar frequency.
Contributions:
1) Systematically stated why applying a universal minimum support value is problematic.
2) Proposed the MSapriori algorithm to address this problem.
Methods:
Follows Apriori, but the level-2 sets are generated differently; additionally, the pruning is more restricted. (I think the key condition throughout the paper is: if c(1) in s or c(2)=c(1)..... ).
Discussions:
MSapriori is a neat algorithm. Also, when applying it to table datasets, the form is a bit different.

[r] zhai05, "Web Data Extraction Based on Partial Tree Alignment"

@inproceedings{1060761,
author = {Yanhong Zhai and Bing Liu},
title = {Web data extraction based on partial tree alignment},
booktitle = {WWW '05: Proceedings of the 14th international conference on World Wide Web},
year = {2005},
isbn = {1-59593-046-9},
pages = {76--85},
location = {Chiba, Japan},
doi = {http://doi.acm.org/10.1145/1060745.1060761},
publisher = {ACM Press},
address = {New York, NY, USA},
}
Motivation:
To effectively: 1) find data records in a webpage, and 2) align the data fields across multiple data records.
Contributions:
Built the corresponding system DEPTA (or MDR-2);
Used visual cues to improve the accuracy of the found data regions;
Proposed an algorithm for data field alignment based on partial tree alignment.
Methods:
Used visual cues obtained from browser rendering; the advantage is improved accuracy and robustness;
Partial tree alignment.
Discussion:
This can be regarded as an alignment paper. The idea of using a seed tree as a matching baseline and growing it is simple yet quite neat. Also, this is an active line of ongoing research.