Tidy summarizes information about the components of a model. A model component might be a single term in a regression, a single hypothesis, a cluster, or a class. Exactly what tidy considers to be a model component varies across models but is usually self-evident. If a model has several distinct types of components, you will need to specify which components to return.

# S3 method for rcorr
tidy(x, diagonal = FALSE, ...)

## Arguments

x

An rcorr object returned from Hmisc::rcorr().

diagonal

Logical indicating whether or not to include diagonal elements of the correlation matrix, that is, the correlation of a column with itself. For these elements, estimate is always 1 and p.value is always NA. Defaults to FALSE.

...

Additional arguments. Not used. Needed to match generic signature only. Cautionary note: misspelled arguments will be absorbed in ..., where they will be ignored. If the misspelled argument has a default value, the default value will be used. For example, if you pass conf.lvel = 0.9, all computation will proceed using conf.level = 0.95. Additionally, if you pass newdata = my_tibble to an augment() method that does not accept a newdata argument, it will use the default value for the data argument.
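The cautionary note about misspelled arguments can be reproduced with a toy function. The sketch below is not broom code: `toy_tidy()` is a hypothetical helper that only returns the confidence level it would use, which is enough to show how `...` silently absorbs a misspelled name.

```r
# Hypothetical helper, not part of broom: it reports which confidence
# level it would use and accepts any extra arguments through `...`.
toy_tidy <- function(x, conf.level = 0.95, ...) {
  conf.level
}

# Correct spelling: the supplied value is used.
toy_tidy(1:10, conf.level = 0.9)

# Misspelled name: `conf.lvel` is swallowed by `...`, so the default applies
# and no error or warning is raised.
toy_tidy(1:10, conf.lvel = 0.9)
```

The first call returns 0.9 and the second returns 0.95, so it is worth checking argument names carefully whenever a result looks like it ignored a setting.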

## Details

Suppose the original data has columns A and B. The correlation matrix from rcorr is symmetric, so it contains entries for both cor(A, B) and cor(B, A). Only one entry from each such pair is ever present in the tidy output.
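The idea of keeping one entry per pair can be sketched in base R. This is an illustration under stated assumptions, not broom's implementation: it uses `cor()` in place of `Hmisc::rcorr()` and keeps the entries strictly below the diagonal, and the names `dat`, `r`, `idx`, and `pairs` are made up for the example.

```r
set.seed(42)
dat <- replicate(3, rnorm(20))
colnames(dat) <- c("A", "B", "C")

# Symmetric correlation matrix: r["A", "B"] equals r["B", "A"].
r <- cor(dat)

# Positions strictly below the diagonal: one entry per unordered pair.
idx <- which(lower.tri(r), arr.ind = TRUE)

pairs <- data.frame(
  column1  = rownames(r)[idx[, "row"]],
  column2  = colnames(r)[idx[, "col"]],
  estimate = r[idx]
)

# One row per unordered pair of columns.
nrow(pairs) == choose(ncol(dat), 2)
```

The same counting explains the 1,326 rows in the example output for 52 columns, since choose(52, 2) is 1,326.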

## See also

tidy(), Hmisc::rcorr()

## Value

A tibble::tibble() with columns:

column1

Name or index of the first column being described.

column2

Name or index of the second column being described.

estimate

The estimated correlation between column1 and column2.

p.value

The two-sided p-value associated with the observed statistic.

n

Number of observations used to compute the correlation.

## Examples


library(broom)
library(Hmisc)

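# 52 columns of 100 standard-normal draws each
mat <- replicate(52, rnorm(100))
# set 2,000 randomly chosen cells to NA
mat[sample(length(mat), 2000)] <- NA
# name the columns A-Z, then a-z
colnames(mat) <- c(LETTERS, letters)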

rc <- rcorr(mat)

td <- tidy(rc)
td
#> # A tibble: 1,326 x 5
#>    column1 column2 estimate     n p.value
#>    <chr>   <chr>      <dbl> <int>   <dbl>
#>  1 B       A        0.141      37  0.405
#>  2 C       A        0.0427     39  0.796
#>  3 C       B        0.0322     37  0.850
#>  4 D       A        0.00973    36  0.955
#>  5 D       B        0.131      31  0.482
#>  6 D       C        0.345      42  0.0251
#>  7 E       A       -0.219      36  0.199
#>  8 E       B       -0.00751    35  0.966
#>  9 E       C        0.185      46  0.218
#> 10 E       D       -0.0467     40  0.775
#> # … with 1,316 more rows
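
library(ggplot2)

# distribution of p-values across all column pairs
ggplot(td, aes(p.value)) +
  geom_histogram(binwidth = .1)

# correlation estimate against p-value, with p-values on a log scale
ggplot(td, aes(estimate, p.value)) +
  geom_point() +
  scale_y_log10()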