Content Preview

Lorem ipsum dolor sit amet, consectetur adipisicing elit. Odit molestiae mollitia laudantium assumenda nam eaque, excepturi, soluta, perspiciatis cupiditate sapiente, adipisci quaerat odio voluptates consectetur nulla eveniet iure vitae quibusdam? Excepturi aliquam in iure, repellat, fugiat illum voluptate repellendus blanditiis veritatis ducimus ad ipsa quisquam, commodi vel necessitatibus, harum quos a dignissimos.

Close Save changes

Keyboard Shortcuts

Help F1 or ? Previous Page ← + CTRL (Windows) ← + ⌘ (Mac) Next Page → + CTRL (Windows) → + ⌘ (Mac) Search Site CTRL + SHIFT + F (Windows) ⌘ + ⇧ + F (Mac) Close Message ESC

Lesson 14: Cluster Analysis

Overview Section

Cluster analysis is a data exploration (mining) tool for dividing a multivariate dataset into “natural” clusters (groups). We use the methods to explore whether previously undefined clusters (groups) exist in the dataset. For instance, a marketing department may wish to use survey results to sort its customers into categories (perhaps those likely to be most receptive to buying a product, those most likely to be against buying a product, and so forth).

Cluster Analysis is used when we believe that the sample units come from an unknown number of distinct populations or sub-populations. We also assume that the sample units come from a number of distinct populations, but there is no apriori definition of those populations. Our objective is to describe those populations with the observed data.

Cluster Analysis, until relatively recently, has had very little interest. This has changed because of the interest in bioinformatics and genome research. We will use an ecological example in our lesson.

Objectives

Upon completion of this lesson, you should be able to:

Carry out cluster analysis using SAS or Minitab;
Use a dendrogram to partition the data into clusters of known composition;
Carry out post hoc analyses to describe differences among clusters.