Toolbox

The groot.toolbox package exposes the Model class, which allows easy loading, converting and attacking of decision tree ensembles in different formats. The Model class supports loading tree ensembles from the following formats:

  • Scikit-learn: Model.from_sklearn
  • JSON file: Model.from_json_file
  • GROOT: Model.from_groot
  • TREANT: Model.from_treant
  • Provably robust boosting: Model.from_provably_robust_boosting

After loading, you can easily compute metrics such as accuracy and adversarial accuracy (against a given perturbation radius epsilon). You can also get more detailed information about adversarial robustness than a single metric; the Model class has three methods for this:

  • attack_feasibility: Compute for each sample whether an adversarial example exists within a given radius around it.
  • attack_distance: Compute for each sample the distance by which it needs to move to become an adversarial example.
  • adversarial_examples: Generate an adversarial example for each input sample.

These three methods are listed in order of increasing computational cost. That means that when you only need to know e.g. attack feasibility, calling attack_feasibility directly can be faster than calling attack_distance and deriving feasibility from the distances. For the default 'milp' attack, for example, attack_feasibility is orders of magnitude faster than attack_distance and adversarial_examples.
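This ordering also reflects that each method's output subsumes the previous one's: given per-sample attack distances, feasibility at any radius (and hence adversarial accuracy) follows by thresholding. A minimal sketch with a hypothetical distances array (in practice the output of model.attack_distance(X, y)):

```python
import numpy as np

# Hypothetical per-sample attack distances, e.g. from model.attack_distance(X, y).
# A distance of 0.0 means the sample is already misclassified.
distances = np.array([0.0, 0.12, 0.45, 0.80])

epsilon = 0.3
# A sample can be attacked within radius epsilon iff its distance is at most epsilon
# (whether the boundary counts as feasible is a convention; here it does).
feasible = distances <= epsilon

# Adversarial accuracy is the fraction of samples that cannot be attacked
adversarial_accuracy = np.mean(~feasible)

print(feasible)              # [ True  True False False]
print(adversarial_accuracy)  # 0.5
```

When feasibility is all you need, calling attack_feasibility directly avoids computing the exact distances in the first place.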

Example

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

from groot.toolbox import Model

# Train a regular scikit-learn decision tree on the iris dataset
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3)
tree.fit(X, y)

# Load the tree into the toolbox
model = Model.from_sklearn(tree)
print("Accuracy:", model.accuracy(X, y))

# Accuracy against adversarial examples within an L-infinity radius of 0.3
epsilon = 0.3
print("Adversarial accuracy:", model.adversarial_accuracy(X, y, epsilon=epsilon))

# Generate one adversarial example per input sample
X_adv = model.adversarial_examples(X, y)
print("Adversarial examples:")
print(X_adv)

Code reference

groot.toolbox

Model

__init__(self, json_model, n_classes) special

General model class that exposes a common API for evaluating decision tree (ensemble) models. Usually you will not have to call this constructor manually; instead, use from_json_file, from_sklearn, from_treant, from_provably_robust_boosting or from_groot.

Parameters:

  • json_model (list of dicts, required): List of decision trees encoded as dicts. See the XGBoost JSON format.
  • n_classes (int, required): Number of classes that this model predicts.

accuracy(self, X, y)

Determine the accuracy of the model on unperturbed samples.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Input samples.
  • y (array-like of shape (n_samples,), required): True labels.

Returns:

  • float: Accuracy on unperturbed samples.

adversarial_accuracy(self, X, y, attack='auto', order=inf, epsilon=0.0, options={})

Determine the accuracy against adversarial examples within maximum perturbation radius epsilon.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • epsilon (float, default 0.0): Maximum distance by which samples can move.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • float: Adversarial accuracy given the maximum perturbation radius epsilon.

adversarial_examples(self, X, y, attack='auto', order=inf, options={})

Create adversarial examples for each input sample.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • ndarray of shape (n_samples, n_features): Adversarial examples.

attack_distance(self, X, y, attack='auto', order=inf, options={})

Determine for each sample the perturbation distance needed to create an adversarial example.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • ndarray of shape (n_samples,) of floats: Distances needed to create adversarial examples.
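Because the distances are exact and per-sample, a single call suffices to evaluate adversarial accuracy at many perturbation radii. A sketch using a hypothetical distances array (in practice obtained once via model.attack_distance(X, y)):

```python
import numpy as np

# Hypothetical per-sample attack distances; in practice computed once via
# distances = model.attack_distance(X, y)
distances = np.array([0.05, 0.10, 0.20, 0.40, 0.60])

# Adversarial accuracy at radius eps = fraction of samples whose distance
# exceeds eps (samples exactly on the boundary count as attackable here)
epsilons = [0.0, 0.1, 0.3, 0.5]
curve = [float(np.mean(distances > eps)) for eps in epsilons]
print(dict(zip(epsilons, curve)))  # {0.0: 1.0, 0.1: 0.6, 0.3: 0.4, 0.5: 0.2}
```

This avoids re-running the attack for every radius, which matters when a single attack call is expensive.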

attack_feasibility(self, X, y, attack='auto', order=inf, epsilon=0.0, options={})

Determine whether an adversarial example is feasible for each sample given the maximum perturbation radius epsilon.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to attack.
  • y (array-like of shape (n_samples,), required): True labels for the samples.
  • attack ({"auto", "milp", "tree"}, default "auto"): The attack to use. If "auto", the attack is chosen automatically: "milp" performs optimal attacks on tree ensembles using a Mixed-Integer Linear Programming formulation; "tree" performs optimal attacks on single decision trees by enumerating all possible paths through the tree.
  • order ({0, 1, 2, inf}, default inf): L-norm order to use. See the numpy documentation for more explanation.
  • epsilon (float, default 0.0): Maximum distance by which samples can move.
  • options (dict, default {}): Extra attack-specific options.

Returns:

  • ndarray of shape (n_samples,) of booleans: Whether an adversarial example is feasible for each sample.
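The returned boolean vector composes naturally with numpy masking, for instance to restrict the more expensive adversarial_examples call to samples that are actually attackable. A sketch with a hypothetical feasibility mask:

```python
import numpy as np

X = np.array([[0.1, 0.2], [0.5, 0.5], [0.9, 0.8]])
# Hypothetical output of model.attack_feasibility(X, y, epsilon=0.3)
feasible = np.array([True, False, True])

# Only generate (expensive) adversarial examples for attackable samples
X_attackable = X[feasible]
print(X_attackable.shape)  # (2, 2)
```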

decision_function(self, X)

Compute prediction values for some samples. For each sample, the value is the sum of the values of the leaves it ends up in.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to predict.

Returns:

  • ndarray of shape (n_samples,) or (n_samples, n_classes): Predicted values. Returns a 1-dimensional array if n_classes=2, else a 2-dimensional array.
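In the multiclass case the returned values have one column per class, so class labels follow by taking an argmax over the columns. A minimal sketch with hypothetical decision values (in practice the output of model.decision_function(X)):

```python
import numpy as np

# Hypothetical decision values for 4 samples and 3 classes,
# e.g. values = model.decision_function(X) with n_classes=3
values = np.array([
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])

# The predicted class is the one with the highest decision value
labels = np.argmax(values, axis=1)
print(labels)  # [1 0 2 1]
```

For everyday use, predict performs this conversion for you; decision_function is useful when you need the raw margins.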

from_groot(classifier) staticmethod

Create a Model instance from a GrootTree, GrootRandomForest or GROOT OneVsRestClassifier.

Parameters:

  • classifier (GrootTree, GrootRandomForest or OneVsRestClassifier (of GROOT models), required): GROOT model to load.

Returns:

  • Model: Instantiated Model object.

from_json_file(filename, n_classes) staticmethod

Create a Model instance from a JSON file.

Parameters:

  • filename (str, required): Path to a JSON file that contains a list of decision trees encoded as dicts. See the XGBoost JSON format.
  • n_classes (int, required): Number of classes that this model predicts.

Returns:

  • Model: Instantiated Model object.
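A sketch of what such a file can contain, assuming the XGBoost JSON dump conventions (split nodes with split, split_condition, yes, no and children fields, ending in leaf values). The field values below are illustrative, not from a trained model; consult a file produced by to_json for the exact schema:

```python
import json

# One decision stump in XGBoost JSON dump style: split on feature 0 at 0.5;
# the left child predicts leaf value 0.0, the right child 1.0.
trees = [
    {
        "nodeid": 0,
        "split": "f0",
        "split_condition": 0.5,
        "yes": 1,
        "no": 2,
        "children": [
            {"nodeid": 1, "leaf": 0.0},
            {"nodeid": 2, "leaf": 1.0},
        ],
    }
]

with open("stump.json", "w") as f:
    json.dump(trees, f, indent=2)

# The file could then be loaded with Model.from_json_file("stump.json", n_classes=2)
```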

from_provably_robust_boosting(classifier) staticmethod

Create a Model instance from a Provably Robust Boosting TreeEnsemble.

Parameters:

  • classifier (groot.provably_robust_boosting.TreeEnsemble, required): Provably Robust Boosting model to load.

Returns:

  • Model: Instantiated Model object.

from_sklearn(classifier) staticmethod

Create a Model instance from a Scikit-learn classifier.

Parameters:

  • classifier (DecisionTreeClassifier, RandomForestClassifier or GradientBoostingClassifier, required): Scikit-learn model to load.

Returns:

  • Model: Instantiated Model object.

from_treant(classifier) staticmethod

Create a Model instance from a TREANT decision tree.

Parameters:

  • classifier (groot.treant.RobustDecisionTree, required): TREANT model to load.

Returns:

  • Model: Instantiated Model object.

predict(self, X)

Predict classes for some samples. The raw prediction values are turned into class labels.

Parameters:

  • X (array-like of shape (n_samples, n_features), required): Samples to predict.

Returns:

  • ndarray of shape (n_samples,): Predicted class labels.

to_json(self, filename, indent=2)

Export the model object to a JSON file.

Parameters:

  • filename (str, required): Name of the JSON file to export to.
  • indent (int, default 2): Number of spaces to use for indentation in the JSON file. Can be reduced to save storage.