Decision forests options - 5.2 English - 68552

AOCL API Guide (68552)

Document ID: 68552
Release Date: 2025-12-29
Version: 5.2 English

The available Python options are detailed in the aoclda.decision_tree.decision_tree() and aoclda.decision_forest.decision_forest() class constructors.

The following options can be set using da_options_set_?:

Table 4.16 Table of Options for Decision Forests.

| Option name | Type | Default | Description | Constraints |
| --- | --- | --- | --- | --- |
| category split strategy | string | \(s=\) ordered | How to split categorical features: split one category from all others (one-vs-all) or treat the categories as ordered. | \(s=\) one-vs-all, or ordered. |
| maximum bins | integer | \(i=256\) | Maximum number of bins in histograms. | \(2 \le i \le 65535\) |
| histogram | string | \(s=\) no | Choose whether to use histograms constructed from the data matrix X. | \(s=\) no, or yes. |
| feature threshold | real | \(r=1e-05\) | Minimum difference in feature value required for splitting. | \(0 \le r\) |
| storage order | string | \(s=\) column-major | Whether data is supplied and returned in row- or column-major order. | \(s=\) c, column-major, f, fortran, or row-major. |
| check data | string | \(s=\) no | Check input data for NaNs prior to performing computation. | \(s=\) no, or yes. |
| minimum split score | real | \(r=1e-05\) | Minimum score needed for a node to be considered for splitting. | \(0 \le r \le 1\) |
| maximum features | integer | \(i=0\) | Set the number of features to consider when ‘features selection’ is set to ‘custom’. 0 means use all the features. | \(0 \le i\) |
| number of trees | integer | \(i=100\) | Set the number of trees to compute. | \(1 \le i\) |
| seed | integer | \(i=-1\) | Set the seed for the random number generator. If the value is -1, a random seed is generated automatically and the results are not reproducible between runs. | \(-1 \le i\) |
| node minimum samples | integer | \(i=2\) | Minimum number of samples a node must contain to be considered for splitting. | \(1 \le i\) |
| maximum depth | integer | \(i=29\) | Set the maximum depth of trees. | \(0 \le i \le 29\) |
| scoring function | string | \(s=\) gini | Select the scoring function to use. | \(s=\) cross-entropy, entropy, gini, misclass, misclassification, or misclassification-error. |
| minimum impurity decrease | real | \(r=0\) | Minimum score improvement over the parent node needed to consider a split. | \(0 \le r\) |
| block size | integer | \(i=256\) | Set the size of the blocks for parallel computations. | \(1 \le i \le 2147483647\) |
| features selection | string | \(s=\) sqrt | Select how many features to use for each split. ‘custom’ reads the ‘maximum features’ option; ‘proportion’ reads the ‘proportion features’ option. ‘all’, ‘sqrt’ and ‘log2’ select, respectively, all features, the square root of the total number of features, or its base-2 logarithm. | \(s=\) all, custom, log2, proportion, or sqrt. |
| bootstrap | string | \(s=\) yes | Select whether to bootstrap the samples in the trees. | \(s=\) no, or yes. |
| bootstrap samples factor | real | \(r=1\) | Proportion of samples to draw from the data set to build each tree when ‘bootstrap’ is set to ‘yes’. | \(0 < r \le 1\) |
| proportion features | real | \(r=0.1\) | Set the proportion of features to consider when ‘features selection’ is set to ‘proportion’. | \(0 < r \le 1\) |
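
The constraints in Table 4.16 can be checked before an option value is handed to the library. The following is a minimal Python sketch and not part of AOCL-DA: the `OPTIONS` dictionary and the `validate_option` helper are hypothetical names introduced here for illustration, and only a subset of the options is transcribed from the table.

```python
# Hypothetical helper, not an AOCL-DA API: a subset of Table 4.16
# transcribed as (type, default, constraint check).
OPTIONS = {
    "maximum bins": ("integer", 256, lambda i: 2 <= i <= 65535),
    "maximum depth": ("integer", 29, lambda i: 0 <= i <= 29),
    "number of trees": ("integer", 100, lambda i: i >= 1),
    "seed": ("integer", -1, lambda i: i >= -1),
    "minimum split score": ("real", 1e-05, lambda r: 0 <= r <= 1),
    "bootstrap samples factor": ("real", 1.0, lambda r: 0 < r <= 1),
    "bootstrap": ("string", "yes", lambda s: s in {"no", "yes"}),
    "features selection": (
        "string",
        "sqrt",
        lambda s: s in {"all", "custom", "log2", "proportion", "sqrt"},
    ),
}

# Python types accepted for each documented option type.
_PY_TYPES = {"integer": int, "real": (int, float), "string": str}


def validate_option(name, value):
    """Return True if value matches the documented type and constraint."""
    if name not in OPTIONS:
        raise KeyError(f"unknown option: {name!r}")
    kind, _default, check = OPTIONS[name]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, _PY_TYPES[kind]):
        return False
    return check(value)
```

For example, `validate_option("maximum depth", 30)` returns `False` because the table limits the depth to 29, while every default listed above passes its own check.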
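
The way ‘features selection’ interacts with ‘maximum features’ and ‘proportion features’ can be illustrated with a short Python sketch. `features_per_split` is a hypothetical helper, not a library function. The table does not say how fractional results are rounded or whether out-of-range values are clamped, so rounding down and clamping the result to between 1 and the total number of features are assumptions made here.

```python
import math


def features_per_split(n_features, selection="sqrt",
                       maximum_features=0, proportion_features=0.1):
    """Number of features examined at each split (illustrative only)."""
    if n_features < 1:
        raise ValueError("n_features must be at least 1")
    if selection == "all":
        k = n_features
    elif selection == "sqrt":
        # Assumption: round the square root down.
        k = math.isqrt(n_features)
    elif selection == "log2":
        # Assumption: round the base-2 logarithm down.
        k = int(math.log2(n_features))
    elif selection == "custom":
        # Per the table, 0 means take all the features.
        k = n_features if maximum_features == 0 else maximum_features
    elif selection == "proportion":
        # Assumption: round the product down.
        k = int(proportion_features * n_features)
    else:
        raise ValueError(f"unknown features selection: {selection!r}")
    # Assumption: always use at least one and at most n_features features.
    return max(1, min(k, n_features))


print(features_per_split(64))                                 # 8
print(features_per_split(64, "log2"))                         # 6
print(features_per_split(64, "custom", maximum_features=10))  # 10
```

Under these assumptions, a data set with 64 features and the default ‘sqrt’ setting considers 8 features at each split.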