| Title: | A Tool for Semi-Automating the Statistical Disclosure Control of Research Outputs |
|---|---|
| Description: | A Tool for Semi-Automating the Statistical Disclosure Control of Research Outputs. |
| Authors: | Jim Smith [cre, ctb] (ORCID: <https://orcid.org/0000-0001-7908-1859>), Maha Albashir [aut, ctb], Richard John Preen [aut, ctb] (ORCID: <https://orcid.org/0000-0003-3351-8132>) |
| Maintainer: | Jim Smith <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.7 |
| Built: | 2026-05-20 14:05:46 UTC |
| Source: | https://github.com/ai-sdc/acro-r |

Add comments to outputs

acro_add_comments(name, comment)

| Argument | Description |
|---|---|
| name | The name of the output. |
| comment | The comment. |

No return value, called for side effects.

Adds an exception request to an output.

acro_add_exception(name, reason)

| Argument | Description |
|---|---|
| name | The name of the output. |
| reason | The reason for the exception request. |

No return value, called for side effects.

Compute a simple cross tabulation of two (or more) factors.

acro_crosstab(index, columns, values = NULL, aggfunc = NULL)

| Argument | Description |
|---|---|
| index | Values to group by in the rows. |
| columns | Values to group by in the columns. |
| values | Array of values to aggregate according to the factors. Requires |
| aggfunc | If specified, requires |

Cross tabulation of the data.
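
As an illustration of the call described above, the following is a minimal sketch that tabulates two columns of the bundled nursery_data dataset. The wrapper name tabulate_nursery is hypothetical, and selecting columns by position (1 for parents' occupation, 9 for the application ranking) assumes the data frame follows the order in which its columns are documented. Running it needs the acro package and the Python environment that acro_init sets up, so the call itself is left commented out.

```r
# Sketch only: requires the acro package and a working Python environment.
# `tabulate_nursery` is a hypothetical helper, not part of the package.
tabulate_nursery <- function(suppress = TRUE) {
  # Load the bundled dataset into this function's environment.
  utils::data("nursery_data", package = "acro", envir = environment())

  # Start an ACRO session; `suppress` controls automatic suppression.
  acro::acro_init(suppress = suppress)

  # Assumption: column 9 holds the application ranking and column 1 the
  # parents' occupation, following the documented column order.
  acro::acro_crosstab(
    index = nursery_data[[9]],
    columns = nursery_data[[1]]
  )
}

# Run by hand in an interactive session:
# tabulate_nursery()
```

Keeping the session start and the tabulation in one function makes it easy to rerun the same query with suppression switched on or off.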

Adds an unsupported output to the results dictionary.

acro_custom_output(filename, comment = NULL)

| Argument | Description |
|---|---|
| filename | The name of the file that will be added to the list of the outputs. |
| comment | An optional comment. |

No return value, called for side effects.

Turns suppression off during a session.

acro_disable_suppression()

No return value, called for side effects.

Turns suppression on during a session.

acro_enable_suppression()

No return value, called for side effects.

Creates a results file for checking.

acro_finalise(path, ext)

| Argument | Description |
|---|---|
| path | Name of a folder to save outputs. |
| ext | Extension of the results file. Valid extensions are json or xlsx. |

No return value, called for side effects.
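
To show how the output-management calls might fit together at the end of a session, here is a hedged sketch. The helper finalise_session, the folder name "outputs", and the comment text are illustrative assumptions. "output_0" is the example key used in the acro_remove_output documentation and is assumed to exist in the current session. The extension check runs before any acro call, so an invalid value fails early.

```r
# Sketch only: requires the acro package and an initialised ACRO session.
# `finalise_session` is a hypothetical helper, not part of the package.
finalise_session <- function(name = "output_0",
                             comment = "Checked by the researcher.",
                             path = "outputs",
                             ext = "json") {
  # Only the two documented extensions are accepted.
  ext <- match.arg(ext, c("json", "xlsx"))

  # Show what is currently held in the results dictionary.
  acro::acro_print_outputs()

  # Attach a note to one output for the output checker to read.
  acro::acro_add_comments(name, comment)

  # Write the results file into the chosen folder.
  acro::acro_finalise(path = path, ext = ext)
}

# Run by hand after creating at least one output:
# finalise_session(path = "outputs", ext = "xlsx")
```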

Fits a logit or probit model.

acro_glm(formula, data, family)

| Argument | Description |
|---|---|
| formula | The formula specifying the model. |
| data | The data for the model. |
| family | Whether to fit a logit or probit model. |

Regression Results Wrapper.

Histogram

acro_hist(data, column, breaks = 10, freq = TRUE, col = NULL, filename = "histogram.png")

| Argument | Description |
|---|---|
| data | The object holding the data. |
| column | The column that will be used to plot the histogram. |
| breaks | Number of histogram bins to be used. |
| freq | If FALSE, the result will contain the number of samples in each bin. If TRUE, the result is the value of the probability density function at the bin. |
| col | The color of the plot. |
| filename | The name of the file where the plot will be saved. |

The histogram.

Initialise an ACRO object

acro_init(config = "default", suppress = FALSE, envname = acro_venv, use_conda = NULL)

| Argument | Description |
|---|---|
| config | Name of a yaml configuration file with safe parameters. |
| suppress | Whether to automatically apply suppression. |
| envname | Name of the Python environment to use. |
| use_conda | Whether to use a Conda environment. If |

Invisibly returns the ACRO object, which is used internally.

Fits an ordinary least squares regression.

acro_lm(formula, data)

| Argument | Description |
|---|---|
| formula | The formula specifying the model. |
| data | The data for the model. |

Regression Results Wrapper.

Pivot table

acro_pivot_table(data, values = NULL, index = NULL, columns = NULL, aggfunc = "mean")

| Argument | Description |
|---|---|
| data | The data to operate on. |
| values | Column to aggregate, optional. |
| index | Keys to group by on the pivot table index. If an array is passed, it must be the same length as the data and is used in the same manner as column values. A list can contain any of the other types (except list). |
| columns | Keys to group by on the pivot table column. If an array is passed, it must be the same length as the data and is used in the same manner as column values. A list can contain any of the other types (except list). |
| aggfunc | If a list of strings is passed, the resulting pivot table will have hierarchical columns whose top level is the function names. |

Cross tabulation of the data.

Prints the current results dictionary.

acro_print_outputs()

No return value, called for side effects.

Remove outputs

acro_remove_output(name)

| Argument | Description |
|---|---|
| name | Key specifying which output to remove, e.g., 'output_0'. |

No return value, called for side effects.

Rename outputs

acro_rename_output(old, new)

| Argument | Description |
|---|---|
| old | The old name of the output. |
| new | The new name of the output. |

No return value, called for side effects.

Survival analysis

acro_surv_func(time, status, output, filename = "kaplan-meier.png")

| Argument | Description |
|---|---|
| time | An array of times (censoring times or event times). |
| status | Status at the event time. |
| output | A string that determines the type of output. Available options are table or plot. |
| filename | The name of the file where the plot will be saved. |

The survival table or plot.
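
As a rough sketch of how the survival function might be called on the bundled lung dataset, consider the helper below. The name survival_summary is hypothetical. The column names time and status are assumptions, since the dataset's column names are not shown in this listing. The output type is checked against the two documented options before any acro call is made.

```r
# Sketch only: requires the acro package and a working Python environment.
# `survival_summary` is a hypothetical helper, not part of the package.
survival_summary <- function(output = "table") {
  # Only the two documented output types are accepted.
  output <- match.arg(output, c("table", "plot"))

  # Load the bundled dataset into this function's environment.
  utils::data("lung", package = "acro", envir = environment())

  # Start an ACRO session with automatic suppression.
  acro::acro_init(suppress = TRUE)

  # Assumption: the survival time and status columns are named
  # `time` and `status`.
  acro::acro_surv_func(
    time = lung$time,
    status = lung$status,
    output = output
  )
}

# Run by hand in an interactive session:
# survival_summary("table")
```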

Compute a simple cross tabulation of two (or more) factors.

acro_table(index, columns, dnn = NULL, deparse.level = 0, ...)

| Argument | Description |
|---|---|
| index | Values to group by in the rows. |
| columns | Values to group by in the columns. |
| dnn | The names to be given to the dimensions in the result. |
| deparse.level | Controls how the default |
| ... | Any other parameters. |

Cross tabulation of the data.

The lung dataset contains information about lung cancer survival.

lung

A data frame with columns:

- Institutional identification.
- Survival time in months.
- Survival status (1 = death, 0 = censored).
- Age of the patient at the start of the study.
- Gender of the patient.
- Performance status (Eastern Cooperative Oncology Group).
- 'Karnofsky' performance status.
- 'Karnofsky' performance status as assessed by the patient.
- Daily caloric intake at the start of the study.
- Weight loss in the last six months.

data(lung)

This dataset originates from a hierarchical decision model created to evaluate applications for nursery schools.

nursery_data

A data frame with 12960 rows and 9 columns:

- Parents' occupation.
- Child's nursery.
- Form of the family.
- Number of children.
- Housing conditions.
- Financial standing of the family.
- Social conditions.
- Health conditions.
- The ranking of applications for nursery schools.

https://www.openml.org/search?type=data&status=active&id=26&sort=runs

data(nursery_data)