Tidy summarizes information about the components of a model. A model component might be a single term in a regression, a single hypothesis, a cluster, or a class. Exactly what tidy considers to be a model component varies across models but is usually self-evident. If a model has several distinct types of components, you will need to specify which components to return.

Usage

# S3 method for class 'kmeans'
tidy(x, col.names = colnames(x$centers), ...)

Arguments

x

A kmeans object created by stats::kmeans().

col.names

Names for the dimension (cluster center) columns. Defaults to the column names of x$centers, that is, the names of the variables the model was fit on. Set to NULL to get the names x1, x2, ....
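As a rough illustration of where the default comes from, the sketch below fits kmeans() to a named and an unnamed matrix and inspects colnames(x$centers). The built-in iris data and the object names (named, unnamed, fit_named, fit_unnamed) are assumptions chosen for illustration and are not part of the original page. The call to tidy() is guarded so the block still runs without broom installed.

```r
set.seed(1)  # kmeans() picks random starting centers

# Same data twice: once with column names, once without (illustrative names).
named <- as.matrix(iris[, c("Sepal.Length", "Petal.Length")])
unnamed <- unname(named)

fit_named <- stats::kmeans(named, centers = 3)
fit_unnamed <- stats::kmeans(unnamed, centers = 3)

# This is the value the col.names default evaluates to.
colnames(fit_named$centers)    # "Sepal.Length" "Petal.Length"
colnames(fit_unnamed$centers)  # NULL, so x1, x2, ... would be used

# With broom available, custom names can be supplied explicitly.
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::tidy(fit_named, col.names = c("sepal", "petal")))
}
```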

...

Additional arguments. Not used. Needed to match generic signature only. Cautionary note: Misspelled arguments will be absorbed in ..., where they will be ignored. If the misspelled argument has a default value, the default value will be used. For example, if you pass conf.lvel = 0.9, all computation will proceed using conf.level = 0.95. Two exceptions here are:

  • tidy() methods will warn when supplied an exponentiate argument if it will be ignored.

  • augment() methods will warn when supplied a newdata argument if it will be ignored.
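The cautionary note follows from how R matches arguments, and a small base-R sketch can make it concrete. The function summarise_fit() below is hypothetical and only mimics the shape of a method that has a defaulted argument plus an unused `...`; it is not part of broom.

```r
# Hypothetical function: a defaulted argument plus `...` that is never inspected.
summarise_fit <- function(x, conf.level = 0.95, ...) {
  conf.level
}

# Correctly spelled: the supplied value is used.
ok <- summarise_fit(1, conf.level = 0.9)

# Misspelled: conf.lvel is swallowed by `...`, so the default applies silently.
typo <- summarise_fit(1, conf.lvel = 0.9)

print(ok)    # 0.9
print(typo)  # 0.95
```

No error or warning is raised in the second call, which is why a misspelled argument can go unnoticed.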

Value

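A tibble::tibble() with one row per cluster. In addition to one column per dimension holding the cluster centers (named according to col.names), it has the columns: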

cluster

A factor describing the cluster from 1:k.

size

Number of points assigned to cluster.

withinss

The within-cluster sum of squares.
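To show how these columns relate to the fitted object, here is a minimal base-R sketch that assembles a comparable per-cluster table by hand from the centers, size, and withinss components of a kmeans fit. It is an approximation for illustration, not broom's implementation; the iris data and the name by_hand are assumptions. If broom is installed, the guarded call prints the output of tidy() for comparison.

```r
set.seed(1)  # kmeans() picks random starting centers

# Fit k-means on two numeric columns of the built-in iris data.
x <- as.matrix(iris[, c("Sepal.Length", "Petal.Length")])
fit <- stats::kmeans(x, centers = 3, nstart = 10)

# Hand-built approximation of the per-cluster summary (illustrative only).
by_hand <- as.data.frame(fit$centers)   # one column per dimension
by_hand$size <- fit$size                # points assigned to each cluster
by_hand$withinss <- fit$withinss        # within-cluster sum of squares
by_hand$cluster <- factor(seq_len(nrow(fit$centers)))
print(by_hand)

# If broom is installed, compare with the real method.
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::tidy(fit))
}
```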

Examples


library(broom)     # provides tidy(), glance(), and augment()
library(cluster)   # provides pam()
library(modeldata) # provides hpc_data
library(dplyr)

data(hpc_data)

x <- hpc_data[, 2:5]

fit <- pam(x, k = 4)

tidy(fit)
#> # A tibble: 4 × 11
#>    size max.diss avg.diss diameter separation avg.width cluster compounds
#>   <dbl>    <dbl>    <dbl>    <dbl>      <dbl>     <dbl> <fct>       <dbl>
#> 1  3544   13865.     576.   15128.       93.6    0.711  1             242
#> 2   412    3835.    1111.    5704.       93.2    0.398  2             317
#> 3   236    3882.    1317.    5852.       93.2    0.516  3             240
#> 4   139   42999.    5582.   46451.      151.     0.0843 4             724
#> # ℹ 3 more variables: input_fields <dbl>, iterations <dbl>,
#> #   num_pending <dbl>
glance(fit)
#> # A tibble: 1 × 1
#>   avg.silhouette.width
#>                  <dbl>
#> 1                0.650
augment(fit, x)
#> # A tibble: 4,331 × 5
#>    compounds input_fields iterations num_pending .cluster
#>        <dbl>        <dbl>      <dbl>       <dbl> <fct>   
#>  1       997          137         20           0 1       
#>  2        97          103         20           0 1       
#>  3       101           75         10           0 1       
#>  4        93           76         20           0 1       
#>  5       100           82         20           0 1       
#>  6       100           82         20           0 1       
#>  7       105           88         20           0 1       
#>  8        98           95         20           0 1       
#>  9       101           91         20           0 1       
#> 10        95           92         20           0 1       
#> # ℹ 4,321 more rows