| Title: | Longitudinal Bias Auditing for Sequential Decision Systems |
|---|---|
| Description: | Provides tools for detecting, quantifying, and visualizing algorithmic bias as a longitudinal process in repeated decision systems. Existing fairness metrics treat bias as a single-period snapshot; this package operationalizes the view that bias in sequential systems must be measured over time. Implements group-specific decision-rate trajectories, standardized disparity measures analogous to the standardized mean difference (Cohen, 1988, ISBN:0-8058-0283-5), cumulative bias burden, Markov-based transition disparity (recovery and retention gaps), and a dynamic amplification index that quantifies whether prior decisions compound current group inequality. The amplification framework extends longitudinal causal inference ideas from Robins (1986) <doi:10.1016/0270-0255(86)90088-6> and the sequential decision-process perspective in the fairness literature (see <https://fairmlbook.org>) to the audit setting. Covariate-adjusted trajectories are estimated via logistic regression, generalized additive models (Wood, 2017, <doi:10.1201/9781315370279>), or generalized linear mixed models (Bates, 2015, <doi:10.18637/jss.v067.i01>). Uncertainty quantification uses the cluster bootstrap (Cameron, 2008, <doi:10.1162/rest.90.3.414>). |
| Authors: | Subir Hait [aut, cre] (ORCID: <https://orcid.org/0009-0004-9871-9677>) |
| Maintainer: | Subir Hait <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-06-05 10:55:33 UTC |
| Source: | https://github.com/causalfragility-lab/aibias |
Estimates covariate-adjusted bias trajectories by fitting a model for
the decision given group, time, and covariates, and computing marginal
predicted disparities by group and time.
aib_adjust(object, formula, method = c("glm", "gam", "mixed"), ref_group = NULL, verbose = TRUE)

| object | An aibias object. |
|---|---|
| formula | A one-sided formula specifying covariates, e.g. ~ income + credit_score. |
| method | One of "glm", "gam", or "mixed". |
| ref_group | Character. Reference group. |
| verbose | Logical. |

The aibias object with $adjusted populated, containing:
trajectory: adjusted bias trajectory
marginal_rates: marginal predicted rates by group and time
model: the fitted model object
formula_used: the full formula passed to the model
data(lending_panel)
obj <- aib_build(lending_panel, "applicant_id", "year", "race", "approved")
obj <- aib_adjust(obj, formula = ~ income + credit_score, method = "glm", ref_group = "White")
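The marginal-prediction step described for aib_adjust() can be illustrated in base R. The sketch below is not the package's implementation: it assumes a logistic model with group-by-time terms, and the toy data, the `marginal_rate()` helper, and all column names are hypothetical. Every row is set to one group and wave in turn, the observed covariates are kept, and the predicted probabilities are averaged.

```r
# Toy panel; every name and value here is illustrative, not taken from the package.
set.seed(1)
n <- 300
toy <- data.frame(
  id     = rep(seq_len(n), each = 3),
  wave   = factor(rep(1:3, times = n)),
  group  = factor(rep(sample(c("A", "B"), n, replace = TRUE), each = 3),
                  levels = c("A", "B")),
  income = rep(rnorm(n, mean = 50, sd = 10), each = 3)
)
# Group B gets lower log-odds of a favorable decision at any income level.
log_odds <- -2 + 0.05 * toy$income - 0.8 * (toy$group == "B")
toy$decision <- rbinom(nrow(toy), size = 1, prob = plogis(log_odds))

# Logistic regression with group-by-time terms plus the covariate.
fit <- glm(decision ~ group * wave + income, family = binomial(), data = toy)

# Marginal standardization: assign every row to group g at wave w, keep the
# observed covariates, and average the predicted probabilities.
marginal_rate <- function(g, w) {
  counterfactual <- toy
  counterfactual$group <- factor(g, levels = levels(toy$group))
  counterfactual$wave  <- factor(w, levels = levels(toy$wave))
  mean(predict(fit, newdata = counterfactual, type = "response"))
}

grid <- expand.grid(group = levels(toy$group), wave = levels(toy$wave),
                    stringsAsFactors = FALSE)
grid$rate <- unname(mapply(marginal_rate, grid$group, grid$wave))

# Adjusted disparity of B relative to the reference group A, one value per wave.
adjusted_disparity <- grid$rate[grid$group == "B"] - grid$rate[grid$group == "A"]
```

Because both groups are evaluated over the same covariate distribution, the resulting gap reflects the modeled group effect rather than differences in income between groups.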
Estimates the amplification index,
which measures how conditioning on the prior decision state changes the group
disparity at a given time point. Non-zero amplification indicates that prior
decisions are shaping current disparities, the hallmark of compounding bias.
aib_amplify(object, ref_group = NULL, sign = c("mechanism", "legacy"), verbose = TRUE)

| object | An aibias object. |
|---|---|
| ref_group | Character. Reference group. |
| sign | Character; |
| verbose | Logical. |

A decision system exhibits bias amplification if:
the disparity grows at some time point, AND
prior decisions drive current disparity (non-zero amplification index), OR
the group transition matrices are unequal
The aibias object with $amplification populated. Contains:
lagged_disparity: group disparity conditional on prior decision d, for d equal to 0 or 1
index: amplification index
cumulative: amplification index summed over time
matrix_norm:
data(lending_panel)
obj <- aib_build(lending_panel, "applicant_id", "year", "race", "approved")
obj <- aib_transition(obj, ref_group = "White")
obj <- aib_amplify(obj, ref_group = "White")
obj$amplification$index
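As a rough illustration of what conditioning on the prior decision state means, the base-R sketch below computes the group disparity at each time point, first unconditionally and then separately for units previously denied (d = 0) and previously approved (d = 1). The toy data, the `disparity()` helper, and the final `contrast` column are assumptions made for illustration; the package's exact index definition and sign convention are not reproduced here.

```r
# Hypothetical toy panel: 8 units, 3 periods, groups A (reference) and B.
toy <- data.frame(
  id       = rep(1:8, each = 3),
  time     = rep(1:3, times = 8),
  group    = rep(c("A", "B"), each = 12),
  decision = c(1, 1, 1,  1, 1, 0,  0, 1, 1,  0, 0, 1,
               1, 1, 0,  1, 0, 0,  0, 0, 1,  0, 0, 1)
)

# Attach each unit's previous decision; the first period has no predecessor.
toy <- toy[order(toy$id, toy$time), ]
toy$prev <- ave(toy$decision, toy$id, FUN = function(x) c(NA, head(x, -1)))
lagged <- toy[!is.na(toy$prev), ]

# Disparity = decision rate of B minus decision rate of A within a set of rows.
disparity <- function(rows) {
  mean(rows$decision[rows$group == "B"]) - mean(rows$decision[rows$group == "A"])
}

times <- sort(unique(lagged$time))
lagged_disparity <- data.frame(
  time          = times,
  unconditional = sapply(times, function(t) disparity(lagged[lagged$time == t, ])),
  prev_denied   = sapply(times, function(t)
    disparity(lagged[lagged$time == t & lagged$prev == 0, ])),
  prev_approved = sapply(times, function(t)
    disparity(lagged[lagged$time == t & lagged$prev == 1, ]))
)

# One possible contrast between the two conditional disparities (an assumption,
# not the package's definition of the amplification index).
lagged_disparity$contrast <-
  lagged_disparity$prev_approved - lagged_disparity$prev_denied
```

In this toy panel the two conditional disparities coincide at time 2 and diverge at time 3, which is the kind of pattern a state-dependent disparity measure is meant to surface.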
A convenience wrapper that runs the complete audit pipeline:
aib_describe() → aib_transition() → aib_amplify().
Optionally runs aib_bootstrap() for uncertainty quantification.
aib_audit(object, ref_group = NULL, bootstrap = FALSE, B = 500, seed = NULL, verbose = TRUE, ...)

| object | An |
|---|---|
| ref_group | Character. Reference group. |
| bootstrap | Logical. Run bootstrap CIs? Default FALSE. |
| B | Integer. Bootstrap replicates if bootstrap = TRUE. |
| seed | Integer. Random seed. |
| verbose | Logical. |
| ... | If |

A fully-populated aibias object.
data(lending_panel)
result <- aib_audit(lending_panel, id = "applicant_id", time = "year", group = "race", decision = "approved", ref_group = "White")
summary(result)
plot(result, type = "trajectory")
Computes bootstrap confidence intervals for the bias trajectory and cumulative burden by resampling units (cluster bootstrap).
aib_bootstrap(object, B = 500, conf = 0.95, ref_group = NULL, seed = NULL, verbose = TRUE)

| object | An aibias object. |
|---|---|
| B | Integer. Number of bootstrap replicates. Default 500. |
| conf | Numeric. Confidence level. Default 0.95. |
| ref_group | Character. Reference group. |
| seed | Integer. Random seed for reproducibility. |
| verbose | Logical. |

Uses the cluster (unit-level) bootstrap to preserve the panel structure. Units are resampled with replacement; all their time observations are retained.
The aibias object with $bootstrap populated.
data(lending_panel)
obj <- aib_build(lending_panel, "applicant_id", "year", "race", "approved")
obj <- aib_describe(obj, ref_group = "White")
obj <- aib_bootstrap(obj, B = 200, seed = 42)
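The unit-level resampling scheme can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's code: the toy data, the `disparity_by_time()` statistic, and the use of percentile intervals are all choices made here for the example.

```r
set.seed(42)
n_units <- 60
toy <- data.frame(
  id    = rep(seq_len(n_units), each = 4),
  time  = rep(1:4, times = n_units),
  group = rep(rep(c("A", "B"), each = n_units / 2), each = 4)
)
toy$decision <- rbinom(nrow(toy), size = 1,
                       prob = ifelse(toy$group == "A", 0.7, 0.5))

# Statistic of interest: raw disparity (B minus A) in decision rate per period.
disparity_by_time <- function(d) {
  rates <- tapply(d$decision, list(d$group, d$time), mean)
  rates["B", ] - rates["A", ]
}

# Row indices grouped by unit, so a drawn unit brings all of its periods along.
rows_by_unit <- split(seq_len(nrow(toy)), toy$id)

n_boot <- 200
boot <- replicate(n_boot, {
  drawn <- sample(names(rows_by_unit), replace = TRUE)   # resample units
  resampled <- toy[unlist(rows_by_unit[drawn], use.names = FALSE), ]
  disparity_by_time(resampled)
})

# Percentile 95% interval per period (rows: lower and upper, columns: periods).
ci <- apply(boot, 1, quantile, probs = c(0.025, 0.975))
```

Resampling whole units rather than individual rows keeps each unit's decision history intact, so within-unit dependence carries through to the interval width.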
Constructs the core aibias S3 object from a panel dataset. Validates
the panel structure and prepares internal data for downstream analysis.
aib_build(data, id, time, group, decision, outcome = NULL, verbose = TRUE)

| data | A data frame in long (panel) format. |
|---|---|
| id | Character. Name of the unit identifier column. |
| time | Character. Name of the time/wave column (integer or factor). |
| group | Character. Name of the protected group column. |
| decision | Character. Name of the binary decision column (0/1). |
| outcome | Character or NULL. Optional downstream outcome column. |
| verbose | Logical. Print validation messages. Default TRUE. |

The function expects a balanced or unbalanced panel where:
id indexes units observed over multiple periods
time is an ordered index (will be coerced to integer rank)
group is a categorical variable indicating protected group membership
decision is a binary 0/1 variable (1 = favorable decision)
An object of class "aibias".
data(lending_panel)
obj <- aib_build(lending_panel, id = "applicant_id", time = "year", group = "race", decision = "approved")
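The panel requirements listed for aib_build() can be checked by hand before building the object. The sketch below is a hypothetical stand-alone helper (`validate_panel()` is not part of the package) showing one way to test for a 0/1 decision column, detect duplicate unit-period rows, coerce time to an integer rank, and report whether the panel is balanced.

```r
# Hypothetical helper, written for illustration only.
validate_panel <- function(data, id, time, group, decision) {
  missing_cols <- setdiff(c(id, time, group, decision), names(data))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (!all(data[[decision]] %in% c(0, 1))) {
    stop("The decision column must be coded 0/1 with no missing values")
  }
  if (anyDuplicated(data[c(id, time)]) > 0) {
    stop("Each unit may appear at most once per period")
  }
  # Ordered time index as an integer rank, e.g. 2015, 2016, 2018 -> 1, 2, 3.
  data$time_rank <- match(data[[time]], sort(unique(data[[time]])))
  obs_per_unit <- table(data[[id]])
  list(
    data     = data,
    n_units  = length(obs_per_unit),
    balanced = length(unique(as.vector(obs_per_unit))) == 1
  )
}

# Usage on a small unbalanced panel with a gap year.
panel <- data.frame(
  unit = c("a", "a", "b", "b", "b"),
  year = c(2015, 2016, 2015, 2016, 2018),
  grp  = c("X", "X", "Y", "Y", "Y"),
  dec  = c(1, 0, 0, 1, 1)
)
checked <- validate_panel(panel, "unit", "year", "grp", "dec")
```

Note that ranking collapses the gap between 2016 and 2018, so periods are treated as consecutive steps rather than calendar distances.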
Computes group decision rate trajectories, raw and standardized bias trajectories, and cumulative bias burden.
aib_describe(object, ref_group = NULL, weights = NULL, verbose = TRUE)

| object | An aibias object. |
|---|---|
| ref_group | Character. Reference group label. If NULL, uses the first group level. |
| weights | Numeric vector of time weights for cumulative burden. Length must equal the number of time points. If NULL, equal weights. |
| verbose | Logical. Print summary output. |

The aibias object with $bias populated. The bias element is
a list with components:
trajectory: raw bias trajectory
trajectory_smd: standardized bias trajectory
cumulative: cumulative bias burden
slope: first differences
curvature: second differences
ref_group: reference group used
data(lending_panel)
obj <- aib_build(lending_panel, "applicant_id", "year", "race", "approved")
obj <- aib_describe(obj, ref_group = "White")
obj$bias$cumulative
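The components returned by aib_describe() can be approximated by hand on a small example. The base-R sketch below is illustrative only: the toy data and the `make_group()` helper are hypothetical, the standardized measure uses one common pooled-standard-deviation analogue of the standardized mean difference for proportions, and equal weights are taken to be 1 per period. The package may differ on each of these points.

```r
# Hypothetical toy panel: 10 units per group, 3 periods, fixed approval counts.
make_group <- function(g, approvals, n = 10) {
  do.call(rbind, lapply(seq_along(approvals), function(t) {
    data.frame(id = paste0(g, seq_len(n)), time = t, group = g,
               decision = as.integer(seq_len(n) <= approvals[t]))
  }))
}
toy <- rbind(make_group("A", c(8, 8, 8)), make_group("B", c(6, 5, 3)))

# Group decision-rate trajectories (rows: groups, columns: periods).
rates <- tapply(toy$decision, list(toy$group, toy$time), mean)

# Raw bias trajectory: rate of B minus rate of the reference group A.
trajectory <- rates["B", ] - rates["A", ]

# Standardized trajectory: gap divided by a pooled binomial standard deviation
# (assumed form).
pooled_sd <- sqrt((rates["B", ] * (1 - rates["B", ]) +
                   rates["A", ] * (1 - rates["A", ])) / 2)
trajectory_smd <- trajectory / pooled_sd

# Cumulative burden as a weighted running sum (assumed weights of 1 per period).
weights <- rep(1, length(trajectory))
cumulative <- cumsum(weights * trajectory)

# Slope and curvature as first and second differences of the trajectory.
slope     <- diff(trajectory)
curvature <- diff(trajectory, differences = 2)
```

Here the gap widens from -0.2 to -0.5, so both the slope and the curvature are negative: the disparity is growing, and growing faster.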
Copies the paper figures script to your working directory and optionally runs it. The script produces four publication-ready figures illustrating bias trajectory, transition asymmetry, amplification index, and cumulative burden from a toy simulation (N=20, T=3).
aib_figures(run = TRUE, dest = file.path(tempdir(), "paper_figures.R"))

| run | Logical. If TRUE (default), runs the script immediately. If FALSE, just copies the file for you to inspect and edit first. |
|---|---|
| dest | Character. Destination filename. Default file.path(tempdir(), "paper_figures.R"). |

The path to the copied script, invisibly.
# Copy and run immediately
aib_figures()
# Just copy to inspect first
aib_figures(run = FALSE, dest = file.path(tempdir(), "paper_figures.R"))
Check panel balance
aib_panel_info(object)

| object | An aibias object. |
|---|---|

A tibble summarizing observation counts per unit.
Compute bias persistence above a threshold
aib_persistence(object, threshold = 0.05)

| object | An aibias object. |
|---|---|
| threshold | Numeric. Minimum absolute disparity to count. Default 0.05. |

A tibble with group-level persistence counts.
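A threshold-based persistence count is simple to reproduce by hand. The sketch below assumes a disparity matrix with one row per non-reference group and one column per period; the numbers are invented, and the longest-run summary is an extra illustration rather than something the package is documented to return.

```r
# Hypothetical disparities relative to a reference group (rows: groups).
disparity <- rbind(
  B = c(-0.02, -0.06, -0.08, -0.03, -0.07),
  C = c(-0.01,  0.02, -0.04, -0.09, -0.10)
)
threshold <- 0.05

# A period counts when the absolute disparity reaches the threshold.
above <- abs(disparity) >= threshold

# Number of periods at or above the threshold, per group.
n_periods_above <- rowSums(above)

# Longest run of consecutive periods at or above the threshold, per group.
longest_run <- apply(above, 1, function(x) {
  runs <- rle(x)
  if (any(runs$values)) max(runs$lengths[runs$values]) else 0L
})
```

Group B crosses the threshold in three periods but never more than two in a row, while group C crosses it only in the last two periods.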
Estimates group-specific Markov transition probabilities
and derives transition disparities, advantage retention, and recovery gaps.
aib_transition(object, ref_group = NULL, verbose = TRUE)

| object | An aibias object. |
|---|---|
| ref_group | Character. Reference group. If NULL, uses first level. |
| verbose | Logical. Print summary. |

The aibias object with $transitions populated. Contains:
probs: Transition probabilities by group and time
pooled: Pooled transition probabilities (time-averaged)
disparity: Transition disparity
recovery_gap: Disparity in 0->1 transitions (recovery)
retention_gap: Disparity in 1->1 transitions (retention)
matrices: Named list of 2x2 transition matrices per group
data(lending_panel)
obj <- aib_build(lending_panel, "applicant_id", "year", "race", "approved")
obj <- aib_transition(obj, ref_group = "White")
obj$transitions$pooled
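To show what the pooled transition matrices and the two gaps represent, the base-R sketch below estimates them from a small hypothetical panel. It pools transitions over time and takes gaps as group B minus reference group A; both choices, like the toy data, are assumptions for illustration rather than the package's internals.

```r
# Hypothetical toy panel: 8 units, 3 periods, groups A (reference) and B.
toy <- data.frame(
  id       = rep(1:8, each = 3),
  time     = rep(1:3, times = 8),
  group    = rep(c("A", "B"), each = 12),
  decision = c(1, 1, 1,  1, 1, 0,  0, 1, 1,  0, 0, 1,
               1, 1, 0,  1, 0, 0,  0, 0, 1,  0, 0, 1)
)

# Pair each decision with the same unit's previous decision.
toy <- toy[order(toy$id, toy$time), ]
toy$prev <- ave(toy$decision, toy$id, FUN = function(x) c(NA, head(x, -1)))
moves <- toy[!is.na(toy$prev), ]

# One 2x2 row-stochastic matrix per group: rows = previous state, cols = current.
matrices <- lapply(split(moves, moves$group), function(g) {
  counts <- table(from = factor(g$prev, levels = 0:1),
                  to   = factor(g$decision, levels = 0:1))
  prop.table(counts, margin = 1)
})

# Recovery gap: difference in 0 -> 1 probabilities; retention gap: 1 -> 1.
recovery_gap  <- matrices$B["0", "1"] - matrices$A["0", "1"]
retention_gap <- matrices$B["1", "1"] - matrices$A["1", "1"]
```

In this example group B is less likely both to recover after a denial (0.40 versus about 0.67) and to retain an approval (about 0.33 versus 0.80).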
A synthetic panel dataset simulating loan application decisions over six years for applicants from three racial groups. Designed to illustrate longitudinal bias analysis with AIBias.
The data are generated so that Black and Hispanic applicants face lower approval rates, lower recovery probabilities after denial, and lower retention probabilities after approval — producing compounding disparities over time.
lending_panel
A data frame with 3,600 rows and 6 columns:

| applicant_id | Character. Unique applicant identifier. |
|---|---|
| year | Integer. Year of application (2015–2020). |
| race | Factor. Racial group: White, Black, Hispanic. |
| income | Numeric. Annual income (thousands USD). |
| credit_score | Numeric. Credit score (300–850). |
| approved | Integer. Loan approval decision (1 = approved, 0 = denied). |

Transition parameters used in data generation:

| Group | P(approve \| prev approved) | P(approve \| prev denied) |
|---|---|---|
| White | 0.82 | 0.65 |
| Black | 0.62 | 0.38 |
| Hispanic | 0.68 | 0.44 |

Synthetic data generated via data-raw/lending_panel.R.
data(lending_panel)
head(lending_panel)
table(lending_panel$race, lending_panel$year)
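The transition parameters listed for the data generation imply a long-run approval share for each group. As a worked illustration, treating each group as a two-state Markov chain (an interpretation assumed here, with hypothetical column names), the stationary share pi solves pi = pi * P(approve | prev approved) + (1 - pi) * P(approve | prev denied).

```r
# Transition parameters copied from the data-generation table.
params <- data.frame(
  group     = c("White", "Black", "Hispanic"),
  p_retain  = c(0.82, 0.62, 0.68),  # P(approve | prev approved)
  p_recover = c(0.65, 0.38, 0.44)   # P(approve | prev denied)
)

# Solving pi = pi * p_retain + (1 - pi) * p_recover for pi.
params$stationary <- with(params, p_recover / (1 - p_retain + p_recover))

# Gaps relative to the White reference group.
ref <- params[params$group == "White", ]
params$retention_gap <- params$p_retain  - ref$p_retain
params$recovery_gap  <- params$p_recover - ref$p_recover
```

Under this reading the long-run approval shares are roughly 0.78 for White, 0.50 for Black, and 0.58 for Hispanic applicants, so the modest per-period gaps in retention and recovery translate into a sizeable steady-state disparity.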
Visualizes audit results. Supports four plot types:
"trajectory": Bias trajectory over time
"heatmap": Group-time disparity surface
"transition": Group-specific transition probabilities
"amplification": Amplification index over time
## S3 method for class 'aibias'
plot(x, type = c("trajectory", "heatmap", "transition", "amplification"), show_ci = TRUE, color_palette = NULL, ...)

| x | An aibias object. |
|---|---|
| type | Character. Plot type. One of "trajectory", "heatmap", "transition", or "amplification". |
| show_ci | Logical. Show bootstrap CIs if available. Default TRUE. |
| color_palette | Character vector of colors for groups. If NULL, uses a sensible default. |
| ... | Ignored. |

A ggplot2 object.
data(lending_panel)
obj <- aib_audit(lending_panel, id = "applicant_id", time = "year", group = "race", decision = "approved", ref_group = "White", verbose = FALSE)
plot(obj, type = "trajectory")
plot(obj, type = "heatmap")
Print an aibias object
## S3 method for class 'aibias'
print(x, ...)

| x | An aibias object. |
|---|---|
| ... | Ignored. |

Invisibly returns x, called for its side effect of printing a concise summary of the audit object to the console.
Produces a comprehensive audit summary including trajectory statistics, transition gaps, amplification indices, and narrative interpretation.
## S3 method for class 'aibias'
summary(object, ...)

| object | An aibias object. |
|---|---|
| ... | Ignored. |

Invisibly returns object, called for its side effect of printing a comprehensive audit summary to the console.