
This function provides a summary of a dataset, including both numeric and non-numeric variables. For numeric variables, it calculates basic descriptive statistics such as minimum, maximum, median, mean, and count of non-missing values. Additionally, users can pass custom functions via the fn argument to compute additional statistics for numeric variables. For non-numeric variables, it provides frequency counts and proportions for each unique value.
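As a rough illustration of the statistics described above, the following base R sketch computes them by hand on a small data frame. The helper names describe_numeric and describe_category are hypothetical and are not part of the package. Reporting proportions as percentages and dropping missing values before counting are assumptions here, not documented behavior.

```r
# Toy data with two numeric columns and one character column
df <- data.frame(x = c(1:3, NA),
                 y = c(3:4, NA, NA),
                 z = c("A", "A", "B", "A"))

# Hypothetical helper: basic descriptive statistics for a numeric vector,
# ignoring missing values
describe_numeric <- function(x) {
  c(N      = sum(!is.na(x)),
    Min    = min(x, na.rm = TRUE),
    Max    = max(x, na.rm = TRUE),
    Median = median(x, na.rm = TRUE),
    Mean   = mean(x, na.rm = TRUE))
}

# Hypothetical helper: frequency count and percentage for each unique value
describe_category <- function(x) {
  counts <- table(x)  # table() drops NA values by default
  data.frame(Group = names(counts),
             N     = as.integer(counts),
             Prop  = as.numeric(100 * prop.table(counts)))
}

num_x <- describe_numeric(df$x)
cat_z <- describe_category(df$z)
```

Here num_x holds the five statistics for x, and cat_z has one row per unique value of z.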

Usage

mm_describe_df(data, ..., fn = NULL)

Arguments

data

A data frame containing the dataset to be summarized.

...

(Optional) Columns to include in the summary. If no column is specified, all columns in the data will be included.

fn

A named list of functions to apply to numeric variables. Each function must accept x as a vector of numeric values and return a single value or a named vector. To pass additional arguments to a function, use the function name as the list name and a list of arguments as its value. For example: fn = list('sum' = list(na.rm = TRUE), 'sd').
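One way to read the fn format is sketched below in base R. The helper apply_fn_spec is hypothetical. It assumes that a named entry means "call the function with this name, passing the listed arguments" and that an unnamed entry means "call the function named by the value, with no extra arguments". The package's own parsing (see parse_list_fn) may differ.

```r
# Hypothetical helper, not part of the package: apply each entry of an
# fn-style list to a numeric vector x
apply_fn_spec <- function(x, fn) {
  out <- list()
  for (i in seq_along(fn)) {
    nm <- names(fn)[i]
    if (!is.null(nm) && nzchar(nm)) {
      # Named entry: the name is the function, the value holds extra arguments
      out[[nm]] <- do.call(nm, c(list(x), fn[[i]]))
    } else {
      # Unnamed entry: the value is the function name, with no extra arguments
      out[[fn[[i]]]] <- do.call(fn[[i]], list(x))
    }
  }
  out
}

res <- apply_fn_spec(c(3, 4, NA), list('sum' = list(na.rm = TRUE), 'sd'))
```

In this sketch the sum entry skips the missing value because na.rm = TRUE is passed, while the sd entry returns NA because no such argument is supplied.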

Value

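A tibble with one row per numeric variable and one row per unique value of each non-numeric variable. As the example output shows, the columns are Variable, Group, Prop, N, Min, Max, Median, Mean, CI Left, and CI Right, followed by one column for each function supplied in fn.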

See also

parse_list_fn

Examples

mm_describe_df(data = data.frame(x = c(1:3, NA),
                                 y = c(3:4, NA, NA),
                                 z = c("A", "A", "B", "A")),
               y, x, z,
               fn = list('sum' = list(na.rm = TRUE), 'sd' = list(na.rm = TRUE))
              )
#> # A tibble: 4 × 12
#>   Variable Group  Prop     N   Min   Max Median  Mean `CI Left` `CI Right`   sum
#>   <chr>    <chr> <dbl> <int> <dbl> <dbl>  <dbl> <dbl>     <dbl>      <dbl> <int>
#> 1 y        NA       NA     2     3     4    3.5   3.5    -2.85        9.85     7
#> 2 x        NA       NA     3     1     3    2     2      -0.484       4.48     6
#> 3 z        A        75     3    NA    NA   NA    NA      NA          NA       NA
#> 4 z        B        25     1    NA    NA   NA    NA      NA          NA       NA
#> # ℹ 1 more variable: sd <dbl>
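The CI Left and CI Right values printed above are numerically consistent with a 95% t-based confidence interval for the mean, computed on the non-missing values. This is inferred from the output rather than stated in the documentation, so the sketch below is only a plausible reconstruction. The helper ci_t is hypothetical.

```r
# Hypothetical helper: t-based confidence interval for the mean,
# computed on the non-missing values of x
ci_t <- function(x, level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  # Half-width: t quantile times the standard error of the mean
  margin <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(left = mean(x) - margin, right = mean(x) + margin)
}

ci_y <- ci_t(c(3:4, NA, NA))  # compare with the row for y
ci_x <- ci_t(c(1:3, NA))      # compare with the row for x
```

Rounded to three significant digits, ci_y and ci_x reproduce the interval bounds shown for y and x.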