select_features

Description

Creates a data frame from the input data frame that contains only the features in the feature set corresponding to the ordered_covariates_or_features input. If the items in the list are covariate names, the corresponding feature set consists of those covariates and all of their historical values in the input data; otherwise, it consists of exactly the features named in the list.
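
The two selection modes described above can be sketched without the library. This is a hypothetical illustration, not stpredict's implementation: the helper name feature_set and its covariate_mode flag are made up for this example, and the sketch assumes that the historical values of a temporal covariate such as 'temperature t' are stored in columns named 'temperature t-1', 'temperature t-2', and so on. Futuristic covariates are left out of the sketch.

```python
def feature_set(columns, names, covariate_mode):
    """Return the column names to keep (hypothetical helper, not stpredict's code)."""
    if not covariate_mode:
        # Feature mode: keep exactly the listed features, in the listed order.
        return [name for name in names if name in columns]
    keep = []
    for name in names:
        for column in columns:
            # Covariate mode: keep the covariate itself plus its lagged copies,
            # assumed here to be named '<covariate> t-1', '<covariate> t-2', ...
            is_lag = name.endswith(" t") and column.startswith(name + "-")
            if (column == name or is_lag) and column not in keep:
                keep.append(column)
    return keep


columns = ["temperature t", "temperature t-1", "temperature t-2",
           "population", "cofirmed_cases t", "cofirmed_cases t-1"]

as_covariates = feature_set(columns, ["temperature t", "population"], covariate_mode=True)
as_features = feature_set(columns, ["temperature t-2", "population"], covariate_mode=False)
print(as_covariates)
print(as_features)
```

With a pandas data frame df, indexing with the returned list, for instance df[feature_set(list(df.columns), names, True)], would apply the same selection to the data.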

Usage

predict.select_features(data, ordered_covariates_or_features)

Parameters

#   Input Name   Input Description

1   data
    type: data frame or string
    default: -
    details: a data frame or address of a data frame containing preprocessed data. This data frame must have a column name format conforming to Fig. 5.
    example: ‘my_directory/my_data.csv’

2   ordered_covariates_or_features
    type: list<string>
    default: -
    details: a list of covariates or features which are selected from the input data frame. If the list contains covariate names, the selected feature set includes the covariates and all their historical values in the input data frame. To specify a covariate, its name must be written with a suffix ‘ t’ for temporal covariates and with a suffix ‘ t+’ for futuristic covariates.
    example: [‘temperature t’, ‘population’, ‘cofirmed_cases t’, ‘social distancing t+’] or [‘temperature t-2’, ‘population’, ‘cofirmed_cases t’, ‘temperature t-1’]
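
The suffix convention for covariate names can be made concrete with a small parser. This is only a sketch: split_covariate_name is a hypothetical helper that is not part of stpredict, and the label 'unsuffixed' for entries such as 'population' is an illustrative name rather than a library term.

```python
def split_covariate_name(name):
    """Split a covariate entry into (base name, kind); hypothetical helper."""
    if name.endswith(" t+"):
        # Futuristic covariate, e.g. 'social distancing t+'.
        return name[:-len(" t+")], "futuristic"
    if name.endswith(" t"):
        # Temporal covariate, e.g. 'temperature t'.
        return name[:-len(" t")], "temporal"
    # No suffix, e.g. 'population'.
    return name, "unsuffixed"


entries = ["temperature t", "population", "social distancing t+"]
parsed = [split_covariate_name(entry) for entry in entries]
print(parsed)
```

Note that the space before the suffix is part of the convention, so 'temperature t' is recognised as temporal while a name that merely ends in the letter t, such as 'percent', is not.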

Returns

#   Output Name   Output Description

1   data
    type: data frame
    default: -
    details: a data frame containing only covariates (or features) which are selected by feature selection process based on ordered_covariates_or_features.

Example

import pandas as pd
from stpredict.predict import select_features

# Load the preprocessed historical data.
df = pd.read_csv('./historical_data h=1.csv')

# Keep only the listed features.
data = select_features(data=df,
                       ordered_covariates_or_features=['temperature t', 'population',
                                                       'cofirmed_cases t', 'temperature t-1'])