From ed0e04221f7f10fe876702173d0fb18aaa892c65 Mon Sep 17 00:00:00 2001
From: AKSHAYA MADHURI <76612327+akshaya-hub@users.noreply.github.com>
Date: Sat, 1 Apr 2023 20:58:26 +0530
Subject: [PATCH] Update Feature Selection.md

---
 .../notes/Data Processing/Feature Selection.md | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/data-science-notes/notes/Data Processing/Feature Selection.md b/data-science-notes/notes/Data Processing/Feature Selection.md
index 36b870443..e16c1e4f1 100644
--- a/data-science-notes/notes/Data Processing/Feature Selection.md
+++ b/data-science-notes/notes/Data Processing/Feature Selection.md
@@ -9,8 +9,17 @@ updated: 2023-01-20T23:04:48-08:00
 
 # Feature Selection
 
 ## Summary
+Feature selection is a critical step in the data science process, as it involves identifying the most important variables or features that are relevant to predicting a particular outcome. Here is a summary of the key points to keep in mind about feature selection:
+
+- Feature selection involves choosing a subset of the available features that are most relevant to the outcome variable.
+- The goal of feature selection is to reduce the dimensionality of the data, which can improve model performance, reduce overfitting, and simplify interpretation.
+- There are three main approaches to feature selection: filter methods, wrapper methods, and embedded methods.
+- Filter methods involve ranking features based on statistical tests or other metrics and selecting the top features.
+- Wrapper methods involve evaluating different feature subsets using a machine learning algorithm and selecting the best performing subset.
+- Embedded methods incorporate feature selection into the model building process, by optimizing the feature subset during training.
+- It is important to carefully evaluate the performance of the selected features using appropriate validation techniques, as overfitting can occur if the feature selection process is not properly validated.
+- Different feature selection methods may be appropriate for different types of data and modeling tasks, and there is often a trade-off between model complexity and performance.
 
----
 ## Related Topics
 
@@ -22,4 +31,4 @@ updated: 2023-01-20T23:04:48-08:00
 
 ## Footnotes
 
----
\ No newline at end of file
+---
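
The filter and wrapper approaches summarized in the patch can be sketched in plain Python with no third-party libraries. Everything below is illustrative and not part of the patch: the function names (`filter_select`, `wrapper_select`, `loo_accuracy`), the toy data, the use of absolute Pearson correlation as the filter score, and the 1-nearest-neighbour classifier scored by leave-one-out accuracy are all assumptions chosen to keep the example small.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filter_select(rows, target, k):
    """Filter method: score each feature on its own, keep the k best."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    # Highest score first; ties go to the lower column index.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return sorted(j for _, j in scores[:k])


def loo_accuracy(rows, target, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the given columns."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for m, other in enumerate(rows):
            if m == i:
                continue  # hold out the row being predicted
            dist = sum((row[j] - other[j]) ** 2 for j in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, target[m]
        correct += best_label == target[i]
    return correct / len(rows)


def wrapper_select(rows, target, max_features):
    """Wrapper method: greedy forward selection, scored by the model itself."""
    selected, best_score = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        round_best, round_feature = best_score, None
        for c in remaining:
            score = loo_accuracy(rows, target, selected + [c])
            if score > round_best:  # strict: stop when nothing improves
                round_best, round_feature = score, c
        if round_feature is None:
            break
        selected.append(round_feature)
        remaining.remove(round_feature)
        best_score = round_best
    return sorted(selected), best_score


# Toy data (hypothetical): column 0 tracks the label, column 1 is noise,
# column 2 is constant.
rows = [
    [1.0, 5.0, 3.0],
    [2.0, 1.0, 3.0],
    [3.0, 4.0, 3.0],
    [7.0, 2.0, 3.0],
    [8.0, 5.0, 3.0],
    [9.0, 1.0, 3.0],
]
target = [0, 0, 0, 1, 1, 1]

print(filter_select(rows, target, 1))
print(wrapper_select(rows, target, 2))
```

On this toy data both approaches keep only column 0. Note that the wrapper's leave-one-out score is also the number used to choose the features, so it is optimistic; in line with the validation point in the summary, a final estimate would need data that played no part in the selection.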