Descriptive statistics for a dataset

This function provides a summary of a dataset, including both numeric and non-numeric variables. For numeric variables, it calculates basic descriptive statistics such as minimum, maximum, median, mean, and count of non-missing values. Additionally, users can pass custom functions via the fn argument to compute additional statistics for numeric variables. For non-numeric variables, it provides frequency counts and proportions for each unique value.

Usage

ct_describe_df(data, ..., fn = NULL)

Arguments

data: A data frame containing the dataset to be summarized.
...: (Optional) Columns to include in the summary. If no column is specified, all columns in data are included.
fn: A named list of functions to apply to numeric variables. Each function must accept x as a vector of numeric values and return a single value or a named vector. Additional arguments for a function can be supplied as a list under that function's name. For example: fn = list('sum' = list(na.rm = TRUE), 'sd').
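
One way to read the fn specification is as a set of function names, each paired with a list of extra arguments. The base R sketch below illustrates that reading with do.call(). It is an assumption about the mechanics, not the package's actual implementation; apply_fn_spec is a hypothetical helper name, and the sketch only handles entries that carry a name.

```r
# Hypothetical helper (not part of the package): call each function named
# in `spec` on `x`, passing the extra arguments listed under that name.
apply_fn_spec <- function(x, spec) {
  vapply(
    names(spec),
    function(f) as.numeric(do.call(f, c(list(x), spec[[f]]))),
    numeric(1)
  )
}

x <- c(1:3, NA)
spec <- list(sum = list(na.rm = TRUE), sd = list(na.rm = TRUE))

# Named numeric vector with one element per function in `spec`.
fn_result <- apply_fn_spec(x, spec)
print(fn_result)
```

Passing the function by name lets the same list carry both the output column label and the arguments, which matches the shape of the fn example above.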

Value

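A tibble with one row per numeric variable and one row per unique value of each non-numeric variable. In the example below, its columns are Variable, Group, Prop, N, Min, Max, Median, Mean, CI Left, and CI Right, followed by one column per function supplied in fn.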

Examples

ct_describe_df(data = data.frame(x = c(1:3, NA),
                                 y = c(3:4, NA, NA),
                                 z = c("A", "A", "B", "A")),
               y, x, z,
               fn = list('sum' = list(na.rm = TRUE), 'sd' = list(na.rm = TRUE))
              )
#> # A tibble: 4 × 12
#>   Variable Group  Prop     N   Min   Max Median  Mean `CI Left` `CI Right`
#>   <chr>    <chr> <dbl> <int> <dbl> <dbl>  <dbl> <dbl>     <dbl>      <dbl>
#> 1 y        NA       NA     2     3     4    3.5   3.5    -2.85        9.85
#> 2 x        NA       NA     3     1     3    2     2      -0.484       4.48
#> 3 z        A        75     3    NA    NA   NA    NA      NA          NA   
#> 4 z        B        25     1    NA    NA   NA    NA      NA          NA   
#> # ℹ 2 more variables: sd <dbl>, sum <int>
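
The numbers in this output can be cross-checked with base R. The sketch below is not the package's code: summarise_numeric is a hypothetical helper, and it assumes that CI Left and CI Right are a 95% t-interval for the mean and that Prop is a percentage. Both assumptions are consistent with the values printed above, but the documentation does not state them.

```r
# Hypothetical helper: summary statistics for one numeric vector.
# Assumption: the CI columns are a 95% t-interval for the mean.
summarise_numeric <- function(x) {
  x <- x[!is.na(x)]                      # drop missing values first
  n <- length(x)
  half_width <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
  c(N = n, Min = min(x), Max = max(x), Median = median(x), Mean = mean(x),
    CI_Left = mean(x) - half_width, CI_Right = mean(x) + half_width)
}

df <- data.frame(x = c(1:3, NA),
                 y = c(3:4, NA, NA),
                 z = c("A", "A", "B", "A"))

# One row per numeric variable, in the order y, x.
num_summary <- t(sapply(df[c("y", "x")], summarise_numeric))
print(round(num_summary, 3))

# Non-numeric column: counts and percentages per unique value.
z_counts <- table(df$z)
z_prop <- 100 * prop.table(z_counts)
print(z_prop)
```

Rounded to three significant digits, the interval bounds and the percentages for z agree with the y, x, and z rows shown above.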
