Tidy a(n) rcorr object
Tidy summarizes information about the components of a model. A model component might be a single term in a regression, a single hypothesis, a cluster, or a class. Exactly what tidy considers to be a model component varies across models but is usually self-evident. If a model has several distinct types of components, you will need to specify which components to return.
# S3 method for rcorr
tidy(x, diagonal = FALSE, ...)
x: An rcorr object returned from Hmisc::rcorr().
diagonal: Logical indicating whether to include the diagonal elements of the correlation matrix, that is, the correlation of each column with itself. For these elements, estimate is always 1 and p.value is always NA. Defaults to FALSE.
...: Additional arguments. Not used. Needed to match generic signature only. Cautionary note: Misspelled arguments will be absorbed in ..., where they will be ignored. If the misspelled argument has a default value, the default value will be used. For example, if you pass conf.lvel = 0.9, all computation will proceed using conf.level = 0.95. Two exceptions here are:
- tidy() methods will warn when supplied an exponentiate argument if it will be ignored.
- augment() methods will warn when supplied a newdata argument if it will be ignored.
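The cautionary note about ... can be reproduced in plain R. The sketch below is only an illustration: summarize_interval() is a hypothetical function, not a broom tidier, and it assumes nothing beyond base R's argument matching rules.

```r
# Hypothetical function for illustration; it is not part of broom.
summarize_interval <- function(x, conf.level = 0.95, ...) {
  # Any argument that does not match a named parameter lands in `...`
  # and is never looked at, so no error or warning is raised.
  conf.level
}

# Correctly spelled: the supplied value is used.
summarize_interval(1, conf.level = 0.9)

# Misspelled: `conf.lvel` is absorbed by `...`, so the default applies.
summarize_interval(1, conf.lvel = 0.9)
```

The first call returns 0.9 and the second returns 0.95, which is why a typo in an argument name can pass unnoticed.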
Suppose the original data has columns A and B. In the correlation matrix from rcorr there may be entries for both cor(A, B) and cor(B, A). Only one of these pairs will ever be present in the tidy output.

A tibble::tibble() with columns:
column1: Name or index of the first column being described.
column2: Name or index of the second column being described.
estimate: The estimated correlation between the two columns.
p.value: The two-sided p-value associated with the observed statistic.
n: Number of observations used to compute the correlation.
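To illustrate why only one of cor(A, B) and cor(B, A) appears, and what diagonal = TRUE adds, here is a minimal base R sketch. It is not broom's implementation: pairs_from_corr() is a hypothetical helper, it works on a plain cor() matrix rather than an rcorr object, and its row order is not guaranteed to match that of tidy().

```r
# Hypothetical helper, not part of broom: flatten a symmetric correlation
# matrix into one row per unordered pair of columns.
pairs_from_corr <- function(r, diagonal = FALSE) {
  # lower.tri() selects the entries below the diagonal (and the diagonal
  # itself when diag = TRUE), so each unordered pair is picked exactly once.
  keep <- lower.tri(r, diag = diagonal)
  data.frame(
    column1 = rownames(r)[row(r)[keep]],
    column2 = colnames(r)[col(r)[keep]],
    estimate = r[keep],
    stringsAsFactors = FALSE
  )
}

set.seed(1)
dat <- matrix(rnorm(30), ncol = 3, dimnames = list(NULL, c("A", "B", "C")))
r <- cor(dat)

# Three rows: (B, A), (C, A), (C, B). The mirrored pairs are dropped.
pairs_from_corr(r)

# Six rows: the three pairs above plus (A, A), (B, B), (C, C).
pairs_from_corr(r, diagonal = TRUE)
```

Keeping a single triangle of the matrix is one simple way to avoid duplicated pairs; the diagonal rows carry no information beyond an estimate of 1, which is presumably why they are excluded by default.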
# load libraries for models and data
library(Hmisc)
# broom provides the tidy() generic and the rcorr method
library(broom)

mat <- replicate(52, rnorm(100))

# add some NAs
mat[sample(length(mat), 2000)] <- NA

# also, column names
colnames(mat) <- c(LETTERS, letters)

# fit model
rc <- rcorr(mat)

# summarize model fit with tidiers + visualization
td <- tidy(rc)
td
#> # A tibble: 1,326 × 5
#>    column1 column2 estimate     n p.value
#>    <chr>   <chr>      <dbl> <int>   <dbl>
#>  1 B       A        -0.0806    41 0.616
#>  2 C       A        -0.194     38 0.242
#>  3 C       B         0.0811    37 0.633
#>  4 D       A        -0.451     37 0.00505
#>  5 D       B        -0.258     35 0.134
#>  6 D       C        -0.183     35 0.292
#>  7 E       A        -0.0593    42 0.709
#>  8 E       B         0.0208    45 0.892
#>  9 E       C        -0.228     44 0.136
#> 10 E       D        -0.0134    34 0.940
#> # ℹ 1,316 more rows

library(ggplot2)

# distribution of p-values across all column pairs
ggplot(td, aes(p.value)) +
  geom_histogram(binwidth = .1)

# p-value against estimated correlation, with p-values on a log scale
ggplot(td, aes(estimate, p.value)) +
  geom_point() +
  scale_y_log10()