Tidy summarizes information about the components of a model. A model component might be a single term in a regression, a single hypothesis, a cluster, or a class. Exactly what tidy considers to be a model component varies across models but is usually self-evident. If a model has several distinct types of components, you will need to specify which components to return.
Usage
# S3 method for class 'Mclust'
tidy(x, ...)
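Because the signature ends in `...`, arguments the method does not recognize are accepted without complaint. The following is a minimal base-R sketch of that behavior; `summarise_fit()` and its `conf.level` argument are hypothetical stand-ins for illustration, not part of broom or mclust.

```r
# Hypothetical stand-in for a function whose signature ends in `...`
# (not a real broom or mclust function).
summarise_fit <- function(x, conf.level = 0.95, ...) {
  # Anything captured by `...` is never inspected, so it is silently dropped.
  conf.level
}

# Correctly spelled argument: the supplied value is used.
summarise_fit(1, conf.level = 0.9)

# Misspelled argument: it falls into `...` and the default is used instead.
summarise_fit(1, conf.lvel = 0.9)
```

No error or warning is raised in the second call, which is why a typo in an argument name can go unnoticed.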
Arguments
- x
An Mclust object returned from mclust::Mclust().
- ...
Additional arguments. Not used. Needed to match generic signature only. Cautionary note: Misspelled arguments will be absorbed in ..., where they will be ignored. If the misspelled argument has a default value, the default value will be used. For example, if you pass conf.lvel = 0.9, all computation will proceed using conf.level = 0.95. Two exceptions here are:
See also
Other mclust tidiers:
augment.Mclust()
Value
A tibble::tibble() with columns:
- proportion
The mixing proportion of each component
- size
Number of points assigned to cluster.
- mean
The mean for each component. For models with two or more dimensions, one mean column is added per dimension. NA for the noise component.
- variance
For one-dimensional and spherical models, the variance for each component; omitted otherwise. NA for the noise component.
- component
Cluster id as a factor.
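The size and proportion columns are related but not interchangeable. As a rough sketch, the code below copies the rounded values printed by tidy(m) in the Examples section into a plain data frame (the names `fit` and `size_share` are made up for illustration) and compares each component's share of assigned points with its mixing proportion. The working assumption is that size counts hard assignments while proportion is the estimated mixing weight, so small differences are expected.

```r
# Values copied from the tidy(m) output in the Examples section (rounded as printed)
fit <- data.frame(
  component  = 1:3,
  size       = c(101, 150, 49),
  proportion = c(0.335, 0.503, 0.161)
)

# Share of points hard-assigned to each component
fit$size_share <- fit$size / sum(fit$size)

# The shares are close to, but not identical to, the mixing proportions
fit[, c("component", "proportion", "size_share")]
```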
Examples
# load libraries for models, data, and tidiers
library(mclust)
library(broom)
# load data manipulation libraries
library(dplyr)
library(tibble)
library(purrr)
library(tidyr)
set.seed(27)
centers <- tibble(
  cluster = factor(1:3),
  # number of points in each cluster
  num_points = c(100, 150, 50),
  # x1 coordinate of cluster center
  x1 = c(5, 0, -3),
  # x2 coordinate of cluster center
  x2 = c(-1, 1, -2)
)
points <- centers %>%
  mutate(
    x1 = map2(num_points, x1, rnorm),
    x2 = map2(num_points, x2, rnorm)
  ) %>%
  select(-num_points, -cluster) %>%
  unnest(c(x1, x2))
# fit model
m <- Mclust(points)
# summarize model fit with tidiers
tidy(m)
#> # A tibble: 3 × 6
#>   component  size proportion variance mean.x1 mean.x2
#>       <int> <int>      <dbl>    <dbl>   <dbl>   <dbl>
#> 1         1   101      0.335     1.12  5.01     -1.04
#> 2         2   150      0.503     1.12  0.0594    1.00
#> 3         3    49      0.161     1.12 -3.20     -2.06
augment(m, points)
#> # A tibble: 300 × 4
#>       x1     x2 .class .uncertainty
#>    <dbl>  <dbl> <fct>         <dbl>
#>  1  6.91 -2.74  1          3.98e-11
#>  2  6.14 -2.45  1          1.99e- 9
#>  3  4.24 -0.946 1          1.47e- 4
#>  4  3.54  0.287 1          2.94e- 2
#>  5  3.91  0.408 1          7.48e- 3
#>  6  5.30 -1.58  1          4.22e- 7
#>  7  5.01 -1.77  1          1.06e- 6
#>  8  6.16 -1.68  1          7.64e- 9
#>  9  7.13 -2.17  1          4.16e-11
#> 10  5.24 -2.42  1          1.16e- 7
#> # ℹ 290 more rows
glance(m)
#> # A tibble: 1 × 7
#>   model     G    BIC logLik    df hypvol  nobs
#>   <chr> <int>  <dbl>  <dbl> <dbl>  <dbl> <int>
#> 1 EII       3 -2402. -1175.     9     NA   300
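
For orientation, similar quantities can be read directly from a fitted Mclust object. The sketch below assumes the mclust package is installed and uses a small simulated matrix (`pts`, which is not the data from the example above). It inspects the `parameters$pro`, `parameters$mean`, and `classification` elements of the fit; whether tidy() builds its columns from exactly these elements is an assumption here, not something this page states.

```r
library(mclust)

set.seed(27)
# Two well-separated 2-D Gaussian blobs (illustrative data only)
pts <- rbind(
  cbind(x1 = rnorm(100, 5), x2 = rnorm(100, -1)),
  cbind(x1 = rnorm(150, 0), x2 = rnorm(150, 1))
)

# Fit a Gaussian mixture; the number of components is chosen by BIC
m2 <- Mclust(pts)

# Estimated mixing proportions, one per component
m2$parameters$pro

# Component means, transposed so each row is a component
t(m2$parameters$mean)

# Number of points hard-assigned to each component
table(m2$classification)
```

Looking at the raw object this way can help when checking a tidied summary against the underlying fit.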