| Title: | Construct and Audit Longitudinal Decision Paths |
|---|---|
| Description: | Tools for constructing and auditing longitudinal decision paths from panel data. Implements a decision infrastructure framework for representing institutional AI systems as generators of time-ordered binary decision sequences. Provides functions to build path objects from panel data, summarise per-unit descriptors (dosage, switching rate, onset, duration, longest run), compute the Decision Reliability Index (DRI) following Cronbach (1951) <doi:10.1007/BF02310555>, estimate Shannon decision-path entropy following Shannon (1948) <doi:10.1002/j.1538-7305.1948.tb01338.x>, classify systems by infrastructure type (static, periodic, continuous, human-in-the-loop), and evaluate subgroup disparities in decision exposure and stability. Applications include education, policy, health, and organisational research. |
| Authors: | Subir Hait [aut, cre] (ORCID: <https://orcid.org/0009-0004-9871-9677>) |
| Maintainer: | Subir Hait <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-13 08:59:03 UTC |
| Source: | https://github.com/causalfragility-lab/decisionpaths |
Produces an integrated audit summary including path descriptors, the Decision Reliability Index (DRI), Shannon entropy, and optional subgroup equity diagnostics. This is the flagship function of the decisionpaths package and implements the five-step decision infrastructure audit described in Hait (2025).

    dp_audit(x, group = NULL)

| Argument | Description |
|---|---|
| x | A decision_path object. |
| group | Optional character string naming a group variable for stratified DRI and equity diagnostics. The variable must exist in the original data passed to dp_build. |

An object of class dp_audit, a named list with components:
Output of dp_describe.
Output of dp_dri.
Output of dp_entropy.
Output of dp_equity, or NULL if no
group variable is supplied or found.
The group variable name used (or NULL).
Hait, S. (2025). Artificial intelligence as decision infrastructure: Rethinking institutional decision processes. Preprint.

    dat <- data.frame(
      id = c(1, 1, 1, 2, 2, 2),
      time = c(1, 2, 3, 1, 2, 3),
      decision = c(0, 1, 1, 1, 1, 0)
    )
    dp <- dp_build(dat, id, time, decision)
    aud <- dp_audit(dp)
    print(aud)

Converts a longitudinal (panel) data frame into a decision_path
object, the core data structure used by all other functions in the package.
Supports unbalanced panels and optional outcome and group variables.

    dp_build(
      data,
      id,
      time,
      decision,
      outcome = NULL,
      group = NULL,
      decision_labels = c("0", "1")
    )

| Argument | Description |
|---|---|
| data | A data frame in long format (one row per unit-wave). |
| id | Unquoted name of the unit identifier column. |
| time | Unquoted name of the time/wave column (numeric or integer). |
| decision | Unquoted name of the binary decision column (0/1). |
| outcome | Optional. Unquoted name of the outcome column. |
| group | Optional. Unquoted name of a grouping column for equity analysis. |
| decision_labels | Character vector of length 2 labelling decision values 0 and 1. Default c("0", "1"). |

An object of class decision_path, which is a list containing:
A tibble with one row per unit-wave (cleaned and sorted).
A named character vector of decision sequences per unit.
Unique unit identifiers.
Sorted unique time points.
Number of units.
Maximum number of observed waves.
Logical: TRUE if all units have the same number of waves.
Logical: TRUE if outcome was supplied.
Logical: TRUE if group was supplied.
Character name of the id column.
Character name of the time column.
Character name of the decision column.
Character or NULL name of the outcome column.
Character or NULL name of the group column.
Character vector of length 2.

    dat <- data.frame(
      id = c(1, 1, 2, 2),
      time = c(1, 2, 1, 2),
      decision = c(0, 1, 1, 0)
    )
    dp <- dp_build(dat, id, time, decision)
    print(dp)

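
The path strings and the balance flag described above can be illustrated without the package. The following is a minimal base-R sketch, not the package's implementation: it assumes that paths are formed by joining each unit's time-ordered decisions with "-", and that a panel counts as balanced when every unit has the same number of rows. The variable names are illustrative only.

```r
# Hypothetical base-R sketch; this is NOT how dp_build is implemented.
# Rows are deliberately out of order to show why sorting matters.
dat <- data.frame(
  id = c(2, 1, 2, 1),
  time = c(2, 2, 1, 1),
  decision = c(0, 1, 1, 0)
)

# Sort by unit, then by wave, so each unit's decisions are in time order.
dat <- dat[order(dat$id, dat$time), ]

# One path string per unit, named by unit id (assumed "-" separator).
paths <- vapply(
  split(dat$decision, dat$id),
  function(d) paste(d, collapse = "-"),
  character(1)
)

# Assumed definition: balanced when all units have the same number of waves.
waves_per_unit <- as.vector(table(dat$id))
balanced <- length(unique(waves_per_unit)) == 1

print(paths)
print(balanced)
```

Sorting before splitting is the step that matters here: without it, unit 2 would yield "0-1" instead of "1-0".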
Computes per-unit path descriptors from a decision_path object,
including dosage, switching rate, onset wave, duration, and longest run.
Returns a flat tibble — one row per unit — so that all descriptors are
directly accessible as columns (e.g. desc$dosage).

    dp_describe(x, by = NULL)

| Argument | Description |
|---|---|
| x | A decision_path object. |
| by | Optional character string naming a group variable for stratified summaries. Defaults to NULL. |

A tibble of class dp_describe with one row per unit and
columns:
Unit identifier (column name matches original data).
Number of observed waves for this unit.
Number of waves with decision = 1.
Proportion of waves with decision = 1.
Proportion of consecutive waves where decision changed.
First wave where decision = 1 (NA if never treated).
Total number of waves with decision = 1 (same as treatment_count).
Length of longest uninterrupted run of decision = 1.
Decision sequence as a string e.g. "0-1-1-0".
Group value (NA if no group variable supplied).

    dat <- data.frame(
      id = c(1, 1, 2, 2),
      time = c(1, 2, 1, 2),
      decision = c(0, 1, 1, 0)
    )
    dp <- dp_build(dat, id, time, decision)
    desc <- dp_describe(dp)
    desc$dosage
    desc$path

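
To make the descriptor definitions concrete, here is a minimal base-R sketch that computes them for a single unit's decision vector. It is an illustration under stated assumptions, not the package's code: the helper name `describe_path` and the list element names are hypothetical, dosage is taken to be the proportion of waves with decision = 1, and onset is reported as a position, which equals the wave only when waves are numbered 1, 2, 3, and so on.

```r
# Hypothetical helper; not part of the decisionpaths package.
describe_path <- function(d) {
  stopifnot(all(d %in% c(0, 1)))
  n <- length(d)
  runs <- rle(d)                            # run-length encoding of the path
  ones <- runs$lengths[runs$values == 1]    # lengths of runs of 1s
  list(
    n_waves = n,
    dosage = mean(d),                       # assumed: proportion of 1s
    # share of consecutive wave pairs where the decision changed
    switching_rate = if (n > 1) mean(diff(d) != 0) else NA_real_,
    # position of the first 1 (NA if the unit is never treated)
    onset = if (any(d == 1)) which(d == 1)[1] else NA_integer_,
    duration = sum(d),                      # total number of 1s
    longest_run = if (length(ones) > 0) max(ones) else 0L,
    path = paste(d, collapse = "-")
  )
}

desc <- describe_path(c(0, 1, 1, 0))
str(desc)
```

For the path 0-1-1-0, two of the three consecutive pairs differ, so the switching rate is 2/3, while the longest run of 1s has length 2.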
Computes the Decision Reliability Index (DRI), defined as one minus the mean switching rate across units. A DRI of 1 indicates perfectly consistent decisions; 0 indicates maximum instability.

    dp_dri(x, by = NULL)

| Argument | Description |
|---|---|
| x | A decision_path object. |
| by | Optional character string naming a group variable for stratified output. Defaults to NULL. |

A named list of class dp_dri with components:
Group variable name used (NA if none).
Mean switching rate across units.
Decision Reliability Index = 1 - mean_switching_rate.
Per-unit tibble with switching_rate column.
By-group summary tibble (NULL if no group).
Group variable name.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). McGraw-Hill.

    dat <- data.frame(
      id = c(1, 1, 1, 2, 2, 2),
      time = c(1, 2, 3, 1, 2, 3),
      decision = c(0, 1, 1, 1, 1, 0)
    )
    dp <- dp_build(dat, id, time, decision)
    dri <- dp_dri(dp)
    print(dri)

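
The DRI definition (one minus the mean per-unit switching rate) can be checked by hand on the same toy data. This is a base-R sketch of that arithmetic, assuming the switching rate is the share of consecutive wave pairs in which the decision changed; it does not call the package and may differ from its internals, for example in how units with a single wave are handled.

```r
# Base-R sketch of the DRI arithmetic; not the package's implementation.
dat <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  time = c(1, 2, 3, 1, 2, 3),
  decision = c(0, 1, 1, 1, 1, 0)
)
dat <- dat[order(dat$id, dat$time), ]

# Per-unit switching rate: proportion of consecutive waves that differ.
switching_rate <- tapply(
  dat$decision, dat$id,
  function(d) mean(diff(d) != 0)
)

mean_switching_rate <- mean(switching_rate)
dri <- 1 - mean_switching_rate

print(switching_rate)
print(dri)
```

Unit 1 (0-1-1) and unit 2 (1-1-0) each switch once in two transitions, so both switching rates are 0.5 and the DRI is 0.5.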
Computes Shannon entropy (H) of the decision-path distribution, grounded in information theory (Shannon, 1948). Entropy is measured in bits.

    dp_entropy(x, by = NULL, mutual_info = FALSE)

| Argument | Description |
|---|---|
| x | A decision_path object. |
| by | Optional character string naming a group variable for stratified entropy. Defaults to NULL. |
| mutual_info | Logical. Compute mutual information between path and group? Default FALSE. |

An object of class dp_entropy, a named list with:
Shannon entropy H in bits.
H divided by log2(number of unique paths).
Tibble of path strings, counts, and proportions.
Number of unique decision paths observed.
By-group entropy tibble (NULL if no group variable).
Mutual information in bits (NULL if not requested).
Group variable name used.
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379–423.

    dat <- data.frame(
      id = c(1, 1, 2, 2),
      time = c(1, 2, 1, 2),
      decision = c(0, 1, 1, 0)
    )
    dp <- dp_build(dat, id, time, decision)
    ent <- dp_entropy(dp)
    print(ent)

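
The entropy and normalised entropy described for dp_entropy can be reproduced from a vector of path strings. The sketch below is base R under stated assumptions: the path strings are made up for illustration, and returning NA for the normalised value when only one unique path is observed (where log2(1) = 0) is a choice made here, not documented package behaviour.

```r
# Base-R sketch of Shannon path entropy; not the package's implementation.
paths <- c("0-1", "1-0", "0-1", "1-1")    # hypothetical path strings

# Empirical distribution over the unique paths.
p <- as.vector(table(paths)) / length(paths)

# Shannon entropy in bits.
H <- -sum(p * log2(p))

# Normalised entropy: H divided by log2(number of unique paths).
n_unique <- length(p)
H_norm <- if (n_unique > 1) H / log2(n_unique) else NA_real_

print(H)
print(H_norm)
```

Here the proportions are 0.5, 0.25, and 0.25, giving H = 1.5 bits. In the two-unit example above, the paths 0-1 and 1-0 each have proportion 0.5, so H is 1 bit.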
Produces simple subgroup summaries for key decision-path descriptors.

    dp_equity(x, group)

| Argument | Description |
|---|---|
| x | A decision_path object. |
| group | Grouping variable name as a character string. |

A tibble of grouped summaries.
Produces a heatmap or spaghetti plot of sampled decision paths across units and time periods.

    ## S3 method for class 'decision_path'
    plot(x, type = "heatmap", sample_n = 50L, ...)

| Argument | Description |
|---|---|
| x | A decision_path object. |
| type | Character. Type of plot: a heatmap (the default, "heatmap") or a spaghetti plot. |
| sample_n | Integer. Maximum number of units to display. Default 50. |
| ... | Ignored. |

A ggplot2 object.
Produces a multi-panel summary figure combining the DRI distribution, prevalence over time, dosage distribution, and equity SMDs. The combined layout requires patchwork; without it, only the DRI panel is drawn.

    ## S3 method for class 'dp_audit'
    plot(x, ...)

| Argument | Description |
|---|---|
| x | A dp_audit object. |
| ... | Ignored. |

A ggplot2 or patchwork object.
Produces density or histogram plots of path descriptor distributions, optionally stratified by group.

    ## S3 method for class 'dp_describe'
    plot(x, metrics = c("dosage", "switching_rate", "onset"), ...)

| Argument | Description |
|---|---|
| x | A dp_describe object. |
| metrics | Character vector of metrics to plot. Defaults to c("dosage", "switching_rate", "onset"). |
| ... | Ignored. |

A ggplot2 object.
Produces a histogram or density plot of per-unit switching rates with the overall DRI marked.

    ## S3 method for class 'dp_dri'
    plot(x, ...)

| Argument | Description |
|---|---|
| x | A dp_dri object. |
| ... | Ignored. |

A ggplot2 object.
Produces a bar chart of the most frequent decision paths.

    ## S3 method for class 'dp_entropy'
    plot(x, top = 10L, ...)

| Argument | Description |
|---|---|
| x | A dp_entropy object. |
| top | Integer. Number of top paths to display. Default 10. |
| ... | Ignored. |

A ggplot2 object.
Produces a dot plot of standardized mean differences (SMDs) across path descriptor metrics and group comparisons.

    ## S3 method for class 'dp_equity'
    plot(x, ...)

| Argument | Description |
|---|---|
| x | A dp_equity object. |
| ... | Ignored. |

A ggplot2 object.