--- title: "Correlation" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Correlation} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(TangledFeatures) ``` # Introduction We attempt to link similar features on the basis of correlation. TangledFeatures automatically detects the type of correlation needed between two features. We currently generate the correlations between numeric, unordered categorical variables and ordered categorical variables You can set the correlation type based upon the data set you have within the package. You can also generate the correlation heat map for the entire data set as well as the interconnected network graph # Correlation types * Numeric-to-Numeric Correlation: Set by the cor1 metric. Currently it defaults to Pearson correlation * Numeric-to-Factor(Unordered): Set by the cor2 metric. Currently it defaults to PointBiserial correlation * Numeric-to-Factor(Ordered): Set by the cor3 metric. Currently it defaults to Kendall correlation * Factor(Unordered)-to-Factor(Ordered) Set by the cor3 metric. Currently it defaults to a chi Squared test * Factor(Unordered)-to-Factor(Unordered) Set by the cor3 metric. Currently it defaults to Cramer's V * Factor(Ordered)-to-Factor(Ordered) Set by the cor3 metric. Currently it defaults to Polychoric correlation # Correlation Visualization Set plot = TRUE in the initial function We can generate the heat map from the final TangledFeatures object ```{r eval = FALSE} plot_obj <- TangledFeatures(Data)$Correlation_heatmap plot(plot_obj) ``` We can generate the interconnected graph as well ```{r eval = FALSE} plot_obj <- TangledFeatures(Data)$Graph_plot plot(plot_obj) ```