Toolbox

The groot.toolbox package exposes the Model class, which allows easy loading, converting and attacking of decision tree ensembles in different formats. The Model class supports loading tree ensembles from the following formats:

  • Scikit-learn: Model.from_sklearn
  • JSON file: Model.from_json_file
  • GROOT: Model.from_groot
  • TREANT: Model.from_treant
  • Provably robust boosting: Model.from_provably_robust_boosting

After loading, you can easily compute metrics such as accuracy and adversarial accuracy (against a given perturbation radius epsilon). You can also get more detailed information about adversarial robustness than a single metric; the Model class has three methods for this:

  • attack_feasibility: Compute for each sample whether an adversarial example exists within a given radius around it.
  • attack_distance: Compute for each sample the distance by which it needs to move to become an adversarial example.
  • adversarial_examples: Generate an adversarial example for each input sample.

These three methods are listed in order of increasing computational cost. That means that when you only need to know e.g. attack feasibility, calling attack_feasibility directly can be faster than calling attack_distance and deriving feasibility from the distances. For the default 'milp' attack, for example, attack_feasibility is orders of magnitude faster than attack_distance and adversarial_examples.
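This ordering also reflects that each method's output subsumes the previous one's: given per-sample attack distances, feasibility at any radius (and hence adversarial accuracy) follows by thresholding. A minimal sketch with a hypothetical distances array (in practice the output of model.attack_distance(X, y)):

```python
import numpy as np

# Hypothetical per-sample attack distances, e.g. from model.attack_distance(X, y).
# A distance of 0.0 means the sample is already misclassified.
distances = np.array([0.0, 0.12, 0.45, 0.80])

epsilon = 0.3
# A sample can be attacked within radius epsilon iff its distance is at most epsilon
# (whether the boundary counts as feasible is a convention; here it does).
feasible = distances <= epsilon

# Adversarial accuracy is the fraction of samples that cannot be attacked
adversarial_accuracy = np.mean(~feasible)

print(feasible)              # [ True  True False False]
print(adversarial_accuracy)  # 0.5
```

When feasibility is all you need, calling attack_feasibility directly avoids computing the exact distances in the first place.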

Example

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

from groot.toolbox import Model

# Train a regular scikit-learn decision tree on the iris dataset
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3)
tree.fit(X, y)

# Load the tree into the toolbox
model = Model.from_sklearn(tree)
print("Accuracy:", model.accuracy(X, y))

# Accuracy against adversarial examples within an L-infinity radius of 0.3
epsilon = 0.3
print("Adversarial accuracy:", model.adversarial_accuracy(X, y, epsilon=epsilon))

# Generate one adversarial example per input sample
X_adv = model.adversarial_examples(X, y)
print("Adversarial examples:")
print(X_adv)

Code reference

groot.toolbox

Model

__init__(self, json_model, n_classes) special

General model class that exposes a common API for evaluating decision tree (ensemble) models. Usually you will not have to call this constructor manually; instead, use from_json_file, from_sklearn, from_treant, from_provably_robust_boosting or from_groot.

Parameters:

  • json_model (list of dicts, required): List of decision trees encoded as dicts. See the XGBoost JSON format.
  • n_classes (int, required): Number of classes that this model predicts.

accuracy(self, X, y)

Determine the accuracy of the model on unperturbed samples.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Input samples.
  • y (array-like of shape (n_samples,), required): True labels.

Returns:

  • float: Accuracy on unperturbed samples.

adversarial_accuracy(self, X, y, attack='auto', order=inf, epsilon=0.0, options={})

Determine the accuracy against adversarial examples within maximum perturbation radius epsilon.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • epsilon (float, default 0.0): Maximum distance by which samples can move.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • float: Adversarial accuracy given the maximum perturbation radius epsilon.

adversarial_examples(self, X, y, attack='auto', order=inf, options={})

Create adversarial examples for each input sample.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • ndarray of shape (n_samples, n_features): Adversarial examples.

attack_distance(self, X, y, attack='auto', order=inf, options={})

Determine for each sample the perturbation distance needed to create an adversarial example.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • ndarray of shape (n_samples,) of floats: Distances needed to create adversarial examples.
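Because the distances are exact and per-sample, a single call suffices to evaluate adversarial accuracy at many perturbation radii. A sketch using a hypothetical distances array (in practice obtained once via model.attack_distance(X, y)):

```python
import numpy as np

# Hypothetical per-sample attack distances; in practice computed once via
# distances = model.attack_distance(X, y)
distances = np.array([0.05, 0.10, 0.20, 0.40, 0.60])

# Adversarial accuracy at radius eps = fraction of samples whose distance
# exceeds eps (samples exactly on the boundary count as attackable here)
epsilons = [0.0, 0.1, 0.3, 0.5]
curve = [float(np.mean(distances > eps)) for eps in epsilons]
print(dict(zip(epsilons, curve)))  # {0.0: 1.0, 0.1: 0.6, 0.3: 0.4, 0.5: 0.2}
```

This avoids re-running the attack for every radius, which matters when a single attack call is expensive.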

attack_feasibility(self, X, y, attack='auto', order=inf, epsilon=0.0, options={})

Determine whether an adversarial example is feasible for each sample given the maximum perturbation radius epsilon.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • epsilon (float, default 0.0): Maximum distance by which samples can move.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • ndarray of shape (n_samples,) of booleans: Whether an adversarial example is feasible for each sample.
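The returned boolean vector composes naturally with numpy masking, for instance to restrict the more expensive adversarial_examples call to samples that are actually attackable. A sketch with a hypothetical feasibility mask:

```python
import numpy as np

X = np.array([[0.1, 0.2], [0.5, 0.5], [0.9, 0.8]])
# Hypothetical output of model.attack_feasibility(X, y, epsilon=0.3)
feasible = np.array([True, False, True])

# Only generate (expensive) adversarial examples for attackable samples
X_attackable = X[feasible]
print(X_attackable.shape)  # (2, 2)
```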

decision_function(self, X)

Compute prediction values for some samples. For each sample, the value is the sum of the values of the leaves it ends up in.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to predict.

Returns:

  • ndarray of shape (n_samples,) or (n_samples, n_classes): Predicted values. Returns a 1-dimensional array if n_classes=2, else a 2-dimensional array.
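In the multiclass case the returned values have one column per class, so class labels follow by taking an argmax over the columns. A minimal sketch with hypothetical decision values (in practice the output of model.decision_function(X)):

```python
import numpy as np

# Hypothetical decision values for 4 samples and 3 classes,
# e.g. values = model.decision_function(X) with n_classes=3
values = np.array([
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])

# The predicted class is the one with the highest decision value
labels = np.argmax(values, axis=1)
print(labels)  # [1 0 2 1]
```

For everyday use, predict performs this conversion for you; decision_function is useful when you need the raw margins.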

from_groot(classifier) staticmethod

Create a Model instance from a GrootTree, GrootRandomForest or GROOT OneVsRestClassifier.

Parameters:

  • classifier (GrootTree, GrootRandomForest or OneVsRestClassifier (of GROOT models), required): GROOT model to load.

Returns:

  • Model: Instantiated Model object.

from_json_file(filename, n_classes) staticmethod

Create a Model instance from a JSON file.

Parameters:

  • filename (str, required): Path to a JSON file that contains a list of decision trees encoded as dicts. See the XGBoost JSON format.
  • n_classes (int, required): Number of classes that this model predicts.

Returns:

  • Model: Instantiated Model object.
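A sketch of what such a file can contain, assuming the XGBoost JSON dump conventions (split nodes with split, split_condition, yes, no and children fields, ending in leaf values). The field values below are illustrative, not from a trained model; consult a file produced by to_json for the exact schema:

```python
import json

# One decision stump in XGBoost JSON dump style: split on feature 0 at 0.5;
# the left child predicts leaf value 0.0, the right child 1.0.
trees = [
    {
        "nodeid": 0,
        "split": "f0",
        "split_condition": 0.5,
        "yes": 1,
        "no": 2,
        "children": [
            {"nodeid": 1, "leaf": 0.0},
            {"nodeid": 2, "leaf": 1.0},
        ],
    }
]

with open("stump.json", "w") as f:
    json.dump(trees, f, indent=2)

# The file could then be loaded with Model.from_json_file("stump.json", n_classes=2)
```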

from_provably_robust_boosting(classifier) staticmethod

Create a Model instance from a Provably Robust Boosting TreeEnsemble.

Parameters:

  • classifier (groot.provably_robust_boosting.TreeEnsemble, required): Provably Robust Boosting model to load.

Returns:

  • Model: Instantiated Model object.

from_sklearn(classifier) staticmethod

Create a Model instance from a Scikit-learn classifier.

Parameters:

  • classifier (DecisionTreeClassifier, RandomForestClassifier or GradientBoostingClassifier, required): Scikit-learn model to load.

Returns:

  • Model: Instantiated Model object.

from_treant(classifier) staticmethod

Create a Model instance from a TREANT decision tree.

Parameters:

  • classifier (groot.treant.RobustDecisionTree, required): TREANT model to load.

Returns:

  • Model: Instantiated Model object.

predict(self, X)

Predict classes for some samples. The raw prediction values are turned into class labels.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to predict.

Returns:

  • ndarray of shape (n_samples,): Predicted class labels.

to_json(self, filename, indent=2)

Export the model object to a JSON file.

Parameters:

  • filename (str, required): Name of the JSON file to export to.
  • indent (int, default 2): Number of spaces to use for indentation in the JSON file. Can be reduced to save storage.