Full Embedding Search
Calculates Gamma for all combinations of a set of input predictors
A vector or matrix whose columns are proposed inputs to a predictive function
A vector of double, the output variable that is to be predicted
Logical; set this to FALSE if you don't want the progress bar displayed
Integer number of near neighbors to use in RANN search, passed to gamma_test
The error limit for the approximate near neighbor search. This will be passed to gamma_test, which will pass it on to the ANN near neighbor search. Setting this greater than zero can significantly reduce search time for large data sets.
An invisible data frame with two columns: mask, an integer mask representing a subset of the predictors, and Gamma, the value of Gamma computed using those predictors. The rows are sorted from lowest to highest Gamma. The return value also has an attribute named target_V, the target variance. To get the vratio (the estimated fraction of target variance due to noise), divide any of the Gammas by target_V.
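As a minimal sketch of the vratio calculation described above, the following builds a hypothetical stand-in for the returned data frame (the mask and Gamma values are made up for illustration and are not fe_search output) and divides the Gamma column by the target_V attribute.

```r
# Hypothetical stand-in for a value returned by fe_search();
# the numbers are illustrative only, not real results.
res <- data.frame(mask = c(5L, 7L, 1L), Gamma = c(0.02, 0.03, 0.40))
attr(res, "target_V") <- 0.5   # assumed target variance

# vratio: estimated fraction of target variance due to noise
vratio <- res$Gamma / attr(res, "target_V")
vratio
```

Because the rows are sorted by Gamma, the vratio values come out in the same increasing order.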
Given a set of predictors and a target that is to be predicted, this search
will run the gamma test on every combination of the inputs. It returns the
results in order of increasing Gamma, so the best combinations of inputs for
prediction will be at the beginning of the results. Because this is a fully
combinatorial search, the number of combinations doubles with each added input,
and it will start to get slow beyond about 16 inputs. By default,
fe_search will display a progress bar showing the time to completion.
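To give a feel for why the search slows down, here is a rough count of input combinations, assuming every non-empty subset of the inputs is evaluated (that is, 2^n - 1 gamma tests for n inputs). The variable names are illustrative only.

```r
# Number of non-empty subsets of n inputs, assuming each one
# costs one gamma test.
n_inputs  <- c(4, 8, 12, 16, 20)
n_subsets <- 2^n_inputs - 1

data.frame(n_inputs, n_subsets)
```

Under that assumption, 16 inputs already require 65535 gamma tests, and each additional input roughly doubles the count.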
fe_search() returns a data.frame with two columns: Gamma, a sorted vector of
Gamma values, and mask, an integer column containing the masks representing the inputs
used to calculate each Gamma. To reconstruct the predictor set for a Gamma,
use its mask with int_to_intMask and select_by_mask as shown in their examples.
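The idea behind an integer mask can be sketched in base R without the package helpers. The function mask_to_selector below is hypothetical, and it assumes that bit i of the mask (least significant bit first) corresponds to predictor column i; the package's own int_to_intMask and select_by_mask may use a different bit order, so treat their examples as the reference.

```r
# Hypothetical helper: decode an integer mask into a logical column selector.
# Assumes bit i (least significant first) maps to predictor column i.
mask_to_selector <- function(mask, n_inputs) {
  (mask %/% 2^(seq_len(n_inputs) - 1)) %% 2 == 1
}

# Toy predictor matrix with three named columns (illustrative only)
predictors <- matrix(1:12, ncol = 3,
                     dimnames = list(NULL, c("x1", "x2", "x3")))

sel    <- mask_to_selector(5L, ncol(predictors))  # 5 is binary 101
chosen <- predictors[, sel, drop = FALSE]         # keep the selected columns
colnames(chosen)
```

With this bit-order assumption, mask 5 selects the first and third columns.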