
Tidy summarizes information about the components of a model. A model component might be a single term in a regression, a single hypothesis, a cluster, or a class. Exactly what tidy considers to be a model component varies across models but is usually self-evident. If a model has several distinct types of components, you will need to specify which components to return.

Usage

# S3 method for class 'Mclust'
tidy(x, ...)

Arguments

x

An Mclust object returned from mclust::Mclust().

...

Additional arguments. Not used. Needed to match generic signature only. Cautionary note: Misspelled arguments will be absorbed in ..., where they will be ignored. If the misspelled argument has a default value, the default value will be used. For example, if you pass conf.lvel = 0.9, all computation will proceed using conf.level = 0.95. Two exceptions here are:

  • tidy() methods will warn when supplied an exponentiate argument if it will be ignored.

  • augment() methods will warn when supplied a newdata argument if it will be ignored.
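The cautionary note above can be illustrated with a minimal base R sketch. The function summarise_level() is hypothetical and not part of broom; it only mimics a method that has a defaulted conf.level argument followed by `...`.

```r
# Hypothetical stand-in for a tidier: one defaulted argument plus `...`.
summarise_level <- function(x, conf.level = 0.95, ...) {
  # Anything that does not match a named argument lands in `...`
  # and is silently dropped here.
  conf.level
}

# Correctly spelled: the supplied value is used.
correct <- summarise_level(1, conf.level = 0.9)

# Misspelled: `conf.lvel` is absorbed by `...`, so the default is used.
misspelled <- summarise_level(1, conf.lvel = 0.9)

print(c(correct = correct, misspelled = misspelled))
```

No error or warning is raised for the misspelled call, which is why argument names are worth double-checking.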

See also

tidy(), mclust::Mclust()

Other mclust tidiers: augment.Mclust()

Value

A tibble::tibble() with columns:

proportion

The mixing proportion of each component

size

Number of points assigned to each cluster.

mean

The mean for each component. For models with two or more dimensions, one mean column is added per dimension (for example, mean.x1 and mean.x2). NA for the noise component.

variance

For one-dimensional and spherical models, the variance of each component; omitted otherwise. NA for the noise component.

component

Cluster id as a factor.

Examples


# load libraries for models, data, and tidiers
library(mclust)
library(broom)

# load data manipulation libraries
library(dplyr)
library(tibble)
library(purrr)
library(tidyr)

set.seed(27)

centers <- tibble(
  cluster = factor(1:3),
  # number of points in each cluster
  num_points = c(100, 150, 50),
  # x1 coordinate of cluster center
  x1 = c(5, 0, -3),
  # x2 coordinate of cluster center
  x2 = c(-1, 1, -2)
)

points <- centers %>%
  mutate(
    x1 = map2(num_points, x1, rnorm),
    x2 = map2(num_points, x2, rnorm)
  ) %>%
  select(-num_points, -cluster) %>%
  unnest(c(x1, x2))

# fit model
m <- Mclust(points)

# summarize model fit with tidiers
tidy(m)
#> # A tibble: 3 × 6
#>   component  size proportion variance mean.x1 mean.x2
#>       <int> <int>      <dbl>    <dbl>   <dbl>   <dbl>
#> 1         1   101      0.335     1.12  5.01     -1.04
#> 2         2   150      0.503     1.12  0.0594    1.00
#> 3         3    49      0.161     1.12 -3.20     -2.06
augment(m, points)
#> # A tibble: 300 × 4
#>       x1     x2 .class .uncertainty
#>    <dbl>  <dbl> <fct>         <dbl>
#>  1  6.91 -2.74  1          3.98e-11
#>  2  6.14 -2.45  1          1.99e- 9
#>  3  4.24 -0.946 1          1.47e- 4
#>  4  3.54  0.287 1          2.94e- 2
#>  5  3.91  0.408 1          7.48e- 3
#>  6  5.30 -1.58  1          4.22e- 7
#>  7  5.01 -1.77  1          1.06e- 6
#>  8  6.16 -1.68  1          7.64e- 9
#>  9  7.13 -2.17  1          4.16e-11
#> 10  5.24 -2.42  1          1.16e- 7
#> # ℹ 290 more rows
glance(m)
#> # A tibble: 1 × 7
#>   model     G    BIC logLik    df hypvol  nobs
#>   <chr> <int>  <dbl>  <dbl> <dbl>  <dbl> <int>
#> 1 EII       3 -2402. -1175.     9     NA   300
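The size and proportion columns measure related but different things: size counts points assigned to each cluster, while proportion is the mixing proportion of each component. As a rough sketch, the snippet below re-enters the rounded values printed by tidy(m) above into a plain data frame (rather than refitting the model) and compares the two. The column names size_share and gap are introduced here for illustration only.

```r
# Values copied from the tidy(m) output printed above (rounded as shown).
td <- data.frame(
  component  = 1:3,
  size       = c(101L, 150L, 49L),
  proportion = c(0.335, 0.503, 0.161)
)

# Share of points assigned to each component.
td$size_share <- td$size / sum(td$size)

# Gap between the assignment share and the mixing proportion.
td$gap <- td$size_share - td$proportion

print(td)
```

The sizes sum to 300, matching nobs in the glance(m) output, and the assignment shares sit close to, but not exactly on, the mixing proportions.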