| Title: | Temporal Auditing of Social Interaction Networks |
|---|---|
| Description: | Tools for constructing, auditing, and visualizing temporal social interaction networks from event-log data. Supports graph construction from raw user-to-user interaction logs and longitudinal tracking of network structure, community dynamics, user role trajectories, and concentration of engagement over time. Designed for computational social science, platform analytics, and digital community health monitoring. Includes four purpose-built composite indices for longitudinal platform auditing: the Network Drift Index ('NDI'), Community Fragmentation Index ('CFI'), Visibility Concentration Index ('VCI'), and Role Mobility Index ('RMI'). |
| Authors: | Subir Hait [aut, cre] (ORCID: <https://orcid.org/0009-0004-9871-9677>) |
| Maintainer: | Subir Hait <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.0 |
| Built: | 2026-05-22 07:45:20 UTC |
| Source: | https://github.com/causalfragility-lab/socialdrift |
Validates and standardizes a data frame of user-to-user interaction events for use in temporal social network analysis with socialdrift.
as_social_events(
  data,
  actor = "actor_id",
  target = "target_id",
  time = "timestamp",
  event_type = NULL,
  weight = NULL,
  actor_group = NULL,
  target_group = NULL
)
| data | A data frame or tibble containing interaction events. |
|---|---|
| actor | Column name (character) for the source actor. Default "actor_id". |
| target | Column name (character) for the target actor. Default "target_id". |
| time | Column name (character) for the event timestamp. Must be |
| event_type | Optional column name (character) for interaction type (e.g., |
| weight | Optional column name (character) for edge weight. If |
| actor_group | Optional column name (character) for actor group membership (used in group disparity analyses). |
| target_group | Optional column name (character) for target group membership. |
A tibble of class social_events with standardized columns:
actor_id, target_id, timestamp, event_type, weight, and
optionally actor_group, target_group.
events <- data.frame(
  from = c("u1", "u1", "u2", "u3"),
  to = c("u2", "u3", "u3", "u4"),
  when = as.POSIXct(c("2025-01-01", "2025-01-03", "2025-01-04", "2025-01-06"))
)
ev <- as_social_events(events, actor = "from", target = "to", time = "when")
ev
Computes per-group summaries of structural position (degree, betweenness) across all periods in a graph series. Useful for detecting systematic disparities in network access or visibility.
audit_group_disparities(
  data,
  graph_series,
  group_var = "actor_group",
  window = c("month", "week", "day", "quarter", "year")
)
| data | A standardized social event tibble with group membership columns. |
|---|---|
| graph_series | A social_graph_series object. |
| group_var | Column name identifying actor groups. Default "actor_group". |
| window | Aggregation window. Default "month". |
A tibble with one row per period x group, including:
mean_indegree, mean_outdegree, mean_betweenness, isolation_rate
(proportion of isolated nodes), and n_users.
data(sim_social_events)
ev <- as_social_events(
  sim_social_events,
  actor_group = "actor_group",
  target_group = "target_group"
)
gs <- build_graph_series(ev, window = "month")
audit_group_disparities(ev, gs)
Splits event data into non-overlapping time windows and builds one graph per window. Returns a named list of igraph objects.
build_graph_series(
  data,
  window = c("day", "week", "month", "quarter", "year"),
  directed = TRUE,
  weighted = TRUE,
  remove_self_loops = TRUE
)
| data | A standardized social event tibble (output of as_social_events()). |
|---|---|
| window | Aggregation window: one of "day", "week", "month", "quarter", or "year". |
| directed | Logical; if |
| weighted | Logical; if |
| remove_self_loops | Logical; if |
A named list of igraph objects of class social_graph_series.
Names are ISO date strings representing the start of each period.
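The windowing step can be sketched in base R. This is only an illustration of the idea for window = "month", under the assumption that each event belongs to the calendar month containing its timestamp and that the period label is the first day of that month. The helper month_start() is hypothetical and is not part of socialdrift.

```r
# Hypothetical helper (not a socialdrift function): label each timestamp
# with the first day of its calendar month, as an ISO date string.
month_start <- function(x) format(x, "%Y-%m-01")

events <- data.frame(
  actor_id  = c("u1", "u1", "u2", "u3"),
  target_id = c("u2", "u3", "u3", "u4"),
  timestamp = as.POSIXct(
    c("2025-01-01", "2025-01-20", "2025-02-04", "2025-03-06"),
    tz = "UTC"
  )
)

# Non-overlapping windows: every event lands in exactly one period.
by_period <- split(events, month_start(events$timestamp))
names(by_period)
```

Each element of by_period would then be turned into one graph, which is why the resulting list is named by period start.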
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
length(gs)
names(gs)
Aggregates event-level interactions into a directed or undirected igraph object for a specified time window.
build_graph_snapshot(
  data,
  start = NULL,
  end = NULL,
  directed = TRUE,
  weighted = TRUE,
  remove_self_loops = TRUE
)
| data | A standardized social event tibble (output of as_social_events()). |
|---|---|
| start | Optional start date/time (inclusive). Events before this are excluded. |
| end | Optional end date/time (exclusive). Events at or after this point are excluded. |
| directed | Logical; if |
| weighted | Logical; if |
| remove_self_loops | Logical; if |
An igraph object. Returns an empty graph if no events fall in the specified window.
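The inclusive start and exclusive end form a half-open interval, so adjacent windows never share an event. A small base-R sketch of that filter, using made-up data (it illustrates the boundary rule only, not the package's internal code):

```r
events <- data.frame(
  actor_id  = c("u1", "u2", "u3"),
  target_id = c("u2", "u3", "u1"),
  timestamp = as.POSIXct(
    c("2025-01-01", "2025-01-15", "2025-02-01"),
    tz = "UTC"
  )
)

start <- as.POSIXct("2025-01-01", tz = "UTC")
end   <- as.POSIXct("2025-02-01", tz = "UTC")

# start is inclusive (>=), end is exclusive (<).
in_window <- events[events$timestamp >= start & events$timestamp < end, ]
nrow(in_window)
```

The event stamped exactly at start is kept, while the event stamped exactly at end is left for the next window.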
data(sim_social_events)
ev <- as_social_events(sim_social_events)
g <- build_graph_snapshot(ev)
igraph::vcount(g)
igraph::ecount(g)
Assigns each node a structural role based on its in-degree, out-degree, and betweenness centrality relative to the rest of the network.
classify_user_roles(graph)
| graph | An igraph object. |
|---|---|
Role classification rules (applied in order):
isolated — degree = 0 in both directions.
bridge — betweenness in top quartile.
core — both in- and out-degree in top quartile.
popular — in-degree in top quartile, out-degree below.
broadcaster — out-degree in top quartile, in-degree below.
peripheral — all other nodes.
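The ordered rules above can be sketched in base R on precomputed degree and betweenness vectors. This is a rough illustration, not the package's implementation: it assumes "top quartile" means strictly above the 75th percentile as computed by stats::quantile(), and the function name classify_roles_sketch() is hypothetical.

```r
# Hypothetical sketch (not the socialdrift implementation).
# Assumption: "top quartile" = strictly greater than the 75th percentile.
classify_roles_sketch <- function(indegree, outdegree, betweenness) {
  top <- function(x) x > stats::quantile(x, 0.75, names = FALSE)
  hi_in  <- top(indegree)
  hi_out <- top(outdegree)
  hi_btw <- top(betweenness)

  # Assign in reverse priority so that earlier rules overwrite later ones.
  role <- rep("peripheral", length(indegree))
  role[hi_out & !hi_in] <- "broadcaster"
  role[hi_in & !hi_out] <- "popular"
  role[hi_in & hi_out]  <- "core"
  role[hi_btw]          <- "bridge"
  role[indegree == 0 & outdegree == 0] <- "isolated"
  role
}

# Made-up values for eight nodes.
roles <- classify_roles_sketch(
  indegree    = c(0, 5, 1, 4, 0, 1, 1, 1),
  outdegree   = c(0, 1, 5, 4, 1, 1, 1, 1),
  betweenness = c(0, 1, 1, 1, 0, 9, 1, 1)
)
roles
```

Because the rules are applied in order, a node that qualifies as both bridge and core is labelled bridge. How ties at the quartile boundary are handled is a design choice that this sketch settles arbitrarily.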
A tibble with columns:
node: Node name.
indegree: In-degree.
outdegree: Out-degree.
betweenness: Betweenness centrality.
role: Assigned structural role.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
g <- build_graph_snapshot(ev)
classify_user_roles(g)
The global clustering coefficient (transitivity) measures the tendency of nodes to form tightly connected triangles. High values indicate clique-like structure; low values indicate sparse or tree-like networks.
clustering_ts(graph_series)
| graph_series | A social_graph_series object. |
|---|---|
A tibble with columns period and clustering.
Returns NA for periods with fewer than 3 nodes.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
clustering_ts(gs)
Derives period-over-period changes in community structure metrics. Acts as a first-order approximation of how communities are evolving.
community_drift(community_tbl)
| community_tbl | Output of detect_communities_ts(). |
|---|---|
The input tibble augmented with columns:
Change in number of communities.
Change in modularity.
Change in size of largest community.
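Period-over-period change is a first difference. The sketch below shows the idea in base R on a made-up table. The input and output column names (n_communities, modularity, d_n_communities, d_modularity) are hypothetical placeholders, not the names the package emits.

```r
# Made-up community summary, one row per period (column names hypothetical).
community_tbl <- data.frame(
  period        = c("2025-01-01", "2025-02-01", "2025-03-01"),
  n_communities = c(4, 6, 5),
  modularity    = c(0.30, 0.45, 0.40)
)

# First difference; the first period has no predecessor, hence NA.
delta <- function(x) c(NA, diff(x))

community_tbl$d_n_communities <- delta(community_tbl$n_communities)
community_tbl$d_modularity    <- delta(community_tbl$modularity)
community_tbl
```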
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
ct <- detect_communities_ts(gs)
community_drift(ct)
Computes a composite Community Fragmentation Index for each period, combining modularity, number of communities, and singleton prevalence into a single bounded score between 0 and 1. Higher values indicate a more fragmented, siloed network.
community_fragmentation_index(community_tbl)
| community_tbl | Output of detect_communities_ts(). |
|---|---|
CFI combines three terms: min-max scaled modularity, scaled community count, and singleton proportion.
A tibble with columns period and cfi.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
ct <- detect_communities_ts(gs)
community_fragmentation_index(ct)
Computes how concentrated incoming interactions (in-degree) are across the network. High concentration indicates a few nodes receive most attention.
creator_concentration(graph_series, p = 0.1)
| graph_series | A social_graph_series object. |
|---|---|
| p | Proportion of top actors to include in concentration measures. Default 0.1. |
A tibble with columns:
Time period.
Gini coefficient of in-degree (0 = equal, 1 = monopoly).
Share of total in-degree held by top-p actors.
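The top-p share can be sketched in base R as follows. The function top_share() is a hypothetical illustration. It assumes the number of top actors is rounded up and is never fewer than one, which may differ from the package's rounding rule.

```r
# Hypothetical sketch: share of total in-degree held by the top-p actors.
# Assumption: the top group has ceiling(p * n) actors, and at least one.
top_share <- function(indegree, p = 0.1) {
  total <- sum(indegree)
  if (total == 0) return(NA_real_)
  k <- max(1, ceiling(p * length(indegree)))
  sum(sort(indegree, decreasing = TRUE)[seq_len(k)]) / total
}

# Made-up in-degrees for ten actors, summing to 100.
indeg <- c(50, 10, 10, 5, 5, 5, 5, 4, 3, 3)
top_share(indeg, p = 0.1)  # top 1 actor:  50 / 100 = 0.5
top_share(indeg, p = 0.2)  # top 2 actors: 60 / 100 = 0.6
```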
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
creator_concentration(gs)
Uses the Gini coefficient of node degree as a measure of how unequally connections are distributed across the network. A Gini of 0 means perfect equality; 1 means one node holds all connections.
degree_inequality_ts(graph_series, mode = c("all", "in", "out"))
| graph_series | A social_graph_series object. |
|---|---|
| mode | Degree mode: "all", "in", or "out". |
A tibble with columns period, degree_gini, and degree_mean.
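For reference, one standard way to compute a Gini coefficient from a degree vector is sketched below in base R. Whether socialdrift uses exactly this estimator is an assumption, and the function name gini() is illustrative only.

```r
# Sketch of a standard Gini estimator on a non-negative vector.
# Assumption: the package may use a different (e.g. bias-corrected) variant.
gini <- function(x) {
  n <- length(x)
  if (n == 0 || sum(x) == 0) return(NA_real_)
  x <- sort(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

gini(c(2, 2, 2, 2))  # 0: every node has the same degree
gini(c(0, 0, 0, 8))  # 0.75: one node holds all connections
```

With this estimator a single dominant node among n nodes yields (n - 1) / n rather than exactly 1, so the upper bound of 1 is only approached as the network grows.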
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
degree_inequality_ts(gs)
Applies the Louvain community detection algorithm to each snapshot in a graph series. Directed graphs are coerced to undirected for community detection (standard practice for modularity-based methods).
detect_communities_ts(graph_series)
| graph_series | A social_graph_series object. |
|---|---|
A tibble with columns:
Time period (character).
Number of communities detected.
Modularity score (higher = more modular community structure).
Size of the largest community.
Number of isolated single-node communities.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
detect_communities_ts(gs)
Compares average in-degree (received attention) and out-degree (initiated interactions) between two user groups defined in the original event data.
engagement_gap(
  data,
  graph_series,
  group_var = "actor_group",
  window = c("month", "week", "day", "quarter", "year")
)
| data | A standardized social event tibble with |
|---|---|
| graph_series | A social_graph_series object. |
| group_var | Column in |
| window | Aggregation window matching the one used in build_graph_series(). |
A tibble with columns period, group, mean_indegree,
mean_outdegree, and engagement_ratio (indegree / outdegree).
data(sim_social_events)
ev <- as_social_events(
  sim_social_events,
  actor_group = "actor_group",
  target_group = "target_group"
)
gs <- build_graph_series(ev, window = "month")
engagement_gap(ev, gs, group_var = "actor_group", window = "month")
Network density is the proportion of possible edges that are present. High density indicates a highly connected network; low density indicates sparse interaction.
network_density_ts(graph_series)
| graph_series | A social_graph_series object. |
|---|---|
A tibble with columns:
Time period (character, ISO date).
Number of active nodes.
Number of edges.
Network density (0–1).
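The density definition can be checked by hand with a few lines of base R. The sketch assumes a simple graph (no self-loops, no parallel edges), and density_sketch() is a hypothetical name, not a package function.

```r
# Sketch: density = observed edges / possible edges in a simple graph.
density_sketch <- function(n_nodes, n_edges, directed = TRUE) {
  if (n_nodes < 2) return(NA_real_)
  possible <- n_nodes * (n_nodes - 1)      # ordered pairs of distinct nodes
  if (!directed) possible <- possible / 2  # unordered pairs
  n_edges / possible
}

density_sketch(4, 6, directed = TRUE)   # 6 / 12 = 0.5
density_sketch(4, 6, directed = FALSE)  # 6 / 6  = 1
```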
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
network_density_ts(gs)
The Network Drift Index quantifies how much a network's structure changes between consecutive time periods. It combines changes in edge turnover, degree distribution, community structure, and centralization into a single composite score.
network_drift(graph_series, w1 = 0.3, w2 = 0.25, w3 = 0.25, w4 = 0.2)
| graph_series | A social_graph_series object. |
|---|---|
| w1 | Weight for edge turnover. Default 0.30. |
| w2 | Weight for degree distribution shift. Default 0.25. |
| w3 | Weight for community structure change. Default 0.25. |
| w4 | Weight for centralization change. Default 0.20. |
Default weights: w1 = 0.30, w2 = 0.25, w3 = 0.25, w4 = 0.20.
Component definitions:
edge_turnover: Jaccard distance between edge sets of adjacent snapshots.
degree_shift: Jensen-Shannon-like divergence of degree distributions.
modularity_change: Absolute change in community modularity.
centralization_change: Absolute change in degree centralization.
A tibble with columns period, edge_turnover,
degree_shift, modularity_change, centralization_change, and ndi.
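To make the composite concrete, the sketch below computes the edge-turnover component as a Jaccard distance between two made-up edge sets and then combines four components with the default weights. Treating the NDI as a plain weighted sum is an assumption, and the three non-turnover component values are invented for illustration.

```r
# Jaccard distance between two sets: 1 - |A and B| / |A or B|.
jaccard_distance <- function(a, b) {
  u <- union(a, b)
  if (length(u) == 0) return(0)
  1 - length(intersect(a, b)) / length(u)
}

# Made-up edge sets for two adjacent snapshots, encoded as "from->to".
edges_t1 <- c("u1->u2", "u1->u3", "u2->u3")
edges_t2 <- c("u1->u2", "u2->u3", "u3->u4", "u4->u1")
edge_turnover <- jaccard_distance(edges_t1, edges_t2)  # 1 - 2/5 = 0.6

# The other three component values are invented for this example.
components <- c(
  edge_turnover         = edge_turnover,
  degree_shift          = 0.20,
  modularity_change     = 0.10,
  centralization_change = 0.05
)
weights <- c(w1 = 0.30, w2 = 0.25, w3 = 0.25, w4 = 0.20)

# Assumption: the composite is a weighted sum of the four components.
ndi <- sum(weights * components)
ndi
```

Since the default weights sum to 1, a weighted sum of components that each lie in [0, 1] also stays in [0, 1].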
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
network_drift(gs)
Compute the Network Drift Index (alias for network_drift)
network_drift_index(graph_series, w1 = 0.3, w2 = 0.25, w3 = 0.25, w4 = 0.2)
| graph_series | A social_graph_series object. |
|---|---|
| w1 | Weight for edge turnover. Default 0.30. |
| w2 | Weight for degree distribution shift. Default 0.25. |
| w3 | Weight for community structure change. Default 0.25. |
| w4 | Weight for centralization change. Default 0.20. |
See network_drift().
Visualises the composite NDI and its four component scores across periods.
plot_network_drift(drift_tbl, show_components = TRUE)
| drift_tbl | Output of network_drift(). |
|---|---|
| show_components | Logical; if |
A ggplot object.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
dt <- network_drift(gs)
plot_network_drift(dt)
Produces a line chart of a metric (or faceted chart for multiple metrics)
from the output of metric functions such as network_density_ts(),
reciprocity_ts(), or summarize_network_series().
plot_network_metrics(
  data,
  metric = NULL,
  title = "Network Metrics Over Time",
  colour = "#2c7bb6"
)
| data | A tibble with a |
|---|---|
| metric | Character vector of column name(s) to plot. If more than one, a faceted chart is produced. If |
| title | Plot title. Default "Network Metrics Over Time". |
| colour | Line colour. Default "#2c7bb6". |
A ggplot object.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
tbl <- network_density_ts(gs)
plot_network_metrics(tbl, metric = "density")
Shows how the proportion of each structural role changes across periods, as a stacked bar chart (the default) or a line chart.
plot_role_trajectories(role_tbl, type = c("stacked", "line"))
| role_tbl | Output of role_trajectories(). |
|---|---|
| type | Chart type: "stacked" or "line". |
A ggplot object.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
rt <- role_trajectories(gs)
plot_role_trajectories(rt)
Reciprocity is the proportion of edges that have a mutual counterpart. Applies to directed graphs only. High reciprocity suggests mutual engagement; low reciprocity suggests broadcast-like interaction.
reciprocity_ts(graph_series)
| graph_series | A social_graph_series object. |
|---|---|
A tibble with columns period and reciprocity.
Returns NA for undirected graphs or periods with no edges.
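Reciprocity as defined here (the proportion of edges with a mutual counterpart) can be reproduced on a small edge list in base R. The sketch assumes a directed edge list with no duplicate edges and no self-loops, and the data are made up.

```r
# Made-up directed edge list: u1 <-> u2 is mutual, the other two are not.
edges <- data.frame(
  from = c("u1", "u2", "u1", "u3"),
  to   = c("u2", "u1", "u3", "u4")
)

key     <- paste(edges$from, edges$to)  # each edge as "from to"
rev_key <- paste(edges$to, edges$from)  # the same edge reversed

# An edge is reciprocated if its reverse is also in the edge list.
reciprocity <- mean(rev_key %in% key)
reciprocity  # 2 of 4 edges are reciprocated
```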
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
reciprocity_ts(gs)
Quantifies how much users move between structural roles across time periods. A high RMI indicates a dynamic network where role transitions are frequent; a low RMI indicates structural stability.
role_mobility_index(role_tbl)
| role_tbl | Output of role_trajectories(). |
|---|---|
For each user observed in at least two consecutive periods, the RMI counts the proportion of adjacent-period pairs where the role changed. The overall RMI is the mean across all such users.
A tibble with columns:
Overall Role Mobility Index (0–1).
Number of users with >= 2 observed periods.
Mean number of role transitions per user.
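The per-user proportion and its mean can be sketched in base R on a made-up role table shaped like the output of role_trajectories() (columns period, node, role). For simplicity the sketch assumes every user is observed in every period, so adjacent rows are adjacent periods. It illustrates the definition and is not the package's code.

```r
# Made-up role assignments for three users over three monthly periods.
role_tbl <- data.frame(
  period = rep(c("2025-01-01", "2025-02-01", "2025-03-01"), times = 3),
  node   = rep(c("u1", "u2", "u3"), each = 3),
  role   = c("core", "core", "core",
             "peripheral", "bridge", "peripheral",
             "popular", "popular", "core")
)

# Per user: share of adjacent-period pairs in which the role changed.
per_user <- sapply(split(role_tbl, role_tbl$node), function(d) {
  d <- d[order(d$period), ]
  if (nrow(d) < 2) return(NA_real_)
  mean(d$role[-1] != d$role[-nrow(d)])
})

# Overall index: mean over users with at least two observed periods.
rmi <- mean(per_user, na.rm = TRUE)
per_user  # u1 = 0, u2 = 1, u3 = 0.5
rmi       # 0.5
```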
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
rt <- role_trajectories(gs)
role_mobility_index(rt)
Applies classify_user_roles() to every snapshot in a graph series and
returns a longitudinal tibble of role assignments.
role_trajectories(graph_series)
| graph_series | A social_graph_series object. |
|---|---|
A tibble with columns period, node, indegree, outdegree,
betweenness, and role.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
role_trajectories(gs)
A synthetic dataset of 591 user-to-user interaction events spanning January through June 2025. Designed for demonstrating the socialdrift workflow. Includes group membership columns for disparity analyses.
A data frame with 591 rows and 7 variables:
actor_id: Source user ID (character, "u1" to "u60").
target_id: Target user ID (character).
timestamp: Event timestamp (POSIXct, UTC).
event_type: Type of interaction: "follow", "reply", "mention", "like", or "repost".
weight: Interaction weight (integer, always 1 in this dataset).
actor_group: Group membership of the actor: "A", "B", or "C".
target_group: Group membership of the target: "A", "B", or "C".
The dataset was generated with preferential attachment: earlier-indexed users have slightly higher probability of being selected as targets, creating realistic degree inequality. Event types are sampled with probabilities approximating typical social platform distributions.
Simulated data. See data-raw/sim_social_events.R for the
generation script.
data(sim_social_events)
head(sim_social_events)
table(sim_social_events$event_type)
A convenience wrapper that computes density, reciprocity, clustering, and degree inequality for every period and returns a single wide tibble.
summarize_network_series(graph_series)
| graph_series | A social_graph_series object. |
|---|---|
A tibble with one row per period and columns for all key metrics.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
summarize_network_series(gs)
Checks that a social events tibble contains required columns with valid
values. Called automatically by as_social_events().
validate_social_events(data)
| data | A standardized social event tibble (output of as_social_events()). |
|---|---|
The validated tibble, invisibly.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
validate_social_events(ev)
A composite index measuring whether visibility (incoming attention) is concentrated in a small fraction of users. Combines the Gini coefficient and top-share into a single interpretable score.
visibility_concentration_index(graph_series)
| graph_series | A social_graph_series object. |
|---|---|
Values near 1 indicate extreme concentration; values near 0 indicate roughly equal attention distribution.
A tibble with columns period and vci.
data(sim_social_events)
ev <- as_social_events(sim_social_events)
gs <- build_graph_series(ev, window = "month")
visibility_concentration_index(gs)