The BestBinaryNumericSplit is a splitting function for decision trees that will exhaustively search a numeric dimension for the best binary split.
Classes
| class | AuxiliarySplitInfo |
Static Public Member Functions
| template<typename ElemType> |
| static size_t | CalculateDirection (const ElemType &point, const arma::Col< ElemType > &classProbabilities, const AuxiliarySplitInfo< ElemType > &) |
| | Given a point, calculate which child it should go to (left or right). |
| template<typename ElemType> |
| static size_t | NumChildren (const arma::Col< ElemType > &, const AuxiliarySplitInfo< ElemType > &) |
| | Returns 2, since the binary split always has two children. |
| template<bool UseWeights, typename VecType, typename WeightVecType> |
| static double | SplitIfBetter (const double bestGain, const VecType &data, const arma::Row< size_t > &labels, const size_t numClasses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, arma::Col< typename VecType::elem_type > &classProbabilities, AuxiliarySplitInfo< typename VecType::elem_type > &aux) |
| | Check if we can split a node. |
The BestBinaryNumericSplit is a splitting function for decision trees that will exhaustively search a numeric dimension for the best binary split.
| FitnessFunction | Fitness function to use to calculate gain. |
Definition at line 27 of file best_binary_numeric_split.hpp.
CalculateDirection()
| static |
Given a point, calculate which child it should go to (left or right).
| point | Point to calculate direction of. |
| classProbabilities | Split information for the node, as filled in by SplitIfBetter(). |
| aux | (Unused) auxiliary information for the split. |
Referenced by BestBinaryNumericSplit< FitnessFunction >::NumChildren().
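The routing decision can be pictured with the following minimal sketch. It assumes, without confirmation from this page, that the first element of the split information holds the numeric threshold and that values at or below it go to the left child. `DirectionSketch` is a hypothetical name, and `std::vector` stands in for `arma::Col`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a direction calculation: returns 0 for the left
// child and 1 for the right child. Assumes splitInfo[0] is the threshold
// chosen for this node and that splitInfo is non-empty.
template<typename ElemType>
std::size_t DirectionSketch(const ElemType& point,
                            const std::vector<ElemType>& splitInfo)
{
  return (point <= splitInfo[0]) ? 0 : 1;
}
```

With a threshold of 2.5, a point with value 1.0 would be sent left and a point with value 3.0 would be sent right under these assumptions.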
NumChildren()
| inline static |
Returns 2, since the binary split always has two children.
Definition at line 69 of file best_binary_numeric_split.hpp.
References BestBinaryNumericSplit< FitnessFunction >::CalculateDirection().
SplitIfBetter()
| static |
Check if we can split a node.
If we can split a node in a way that improves on 'bestGain', then we return the improved gain. Otherwise we return the value 'bestGain'. If a split is made, then classProbabilities and aux may be modified.
| bestGain | Best gain seen so far (we'll only split if we find gain better than this). |
| data | The dimension of data points to check for a split in. |
| labels | Labels for each point. |
| numClasses | Number of classes in the dataset. |
| weights | Weights for each point (relevant when UseWeights is true). |
| minimumLeafSize | Minimum number of points in a leaf node for splitting. |
| minimumGainSplit | Minimum gain required for a split to be accepted. |
| classProbabilities | Class probabilities vector, which may be filled with split information on a successful split. |
| aux | Auxiliary split information, which may be modified on a successful split. |