@inproceedings{ zhang96birch,
author = "Tian Zhang and Raghu Ramakrishnan and Miron Livny",
title = "{BIRCH}: an efficient data clustering method for very large databases",
pages = "103--114",
year = "1996",
url = "citeseer.ist.psu.edu/zhang96birch.html" }
Clustering in large databases is very expensive because it requires excessive database scans. This paper addresses that problem.
Proposed the idea of "local clustering" and proved that introducing it does not affect correctness. The idea closely resembles the B-tree from the database field: every cluster calculation and update is done locally.
Proved that the Cluster Feature (CF) has some interesting properties, and introduced algorithms for working with CFs.
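The CF property these notes most likely refer to is additivity: in the BIRCH paper a CF is the triple (N, LS, SS), that is, the point count, the per-dimension linear sum, and the sum of squared norms, and the CF of two merged clusters is the component-wise sum of their CFs. The following is a minimal Python sketch of that idea, not the authors' implementation; the class name `CF` and the methods `of_point`, `merge`, `centroid`, and `radius` are names assumed here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CF:
    """Cluster Feature summary (N, LS, SS) of a set of points (illustrative names)."""
    n: int        # N: number of points summarized
    ls: tuple     # LS: per-dimension linear sum of the points
    ss: float     # SS: sum of squared norms of the points

    @staticmethod
    def of_point(p):
        # The CF of a single point p.
        return CF(1, tuple(p), sum(x * x for x in p))

    def merge(self, other):
        # Additivity: the CF of the union is the component-wise sum.
        return CF(
            self.n + other.n,
            tuple(a + b for a, b in zip(self.ls, other.ls)),
            self.ss + other.ss,
        )

    def centroid(self):
        return tuple(x / self.n for x in self.ls)

    def radius(self):
        # Root-mean-square distance of the points to the centroid,
        # computed from the summary alone: sqrt(SS/N - ||LS/N||^2).
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))


# Two sub-clusters summarized independently, then merged without rescanning points.
a = CF.of_point((0.0, 0.0)).merge(CF.of_point((2.0, 0.0)))
b = CF.of_point((0.0, 2.0)).merge(CF.of_point((2.0, 2.0)))
merged = a.merge(b)
print(merged.n, merged.centroid(), merged.radius())
```

Because the centroid and radius come from the summary alone, an update touches only the affected CF and never the raw points, which is what makes the clustering "local".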
The idea in this paper is influential. Recently, J. Han's group proposed clustering of moving objects based on "microclusters". Their ideas are largely borrowed from BIRCH.