Introduction to Feature Engineering in Data Science

Feature engineering is a critical task that data scientists must perform before training AI/ML models.

As a data scientist, you may need to:
- Highlight important information in the data
- Remove/isolate unnecessary information (e.g., outliers)
- Add your own expertise and domain knowledge to alter the data
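
As a rough illustration of the "remove/isolate" step, here is a minimal sketch that separates outliers using the interquartile range (IQR) rule. The IQR rule, the multiplier `k=1.5`, the function name `split_outliers`, and the income figures are all assumptions chosen for the example; the video does not prescribe a specific method.

```python
from statistics import quantiles

def split_outliers(values, k=1.5):
    """Separate values into (inliers, outliers) using the IQR rule."""
    # Quartiles of the data; "inclusive" treats the data as the full population
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    inliers = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return inliers, outliers

# Hypothetical monthly incomes; the last entry is far from the rest
incomes = [3200, 3400, 3100, 3600, 3300, 3500, 98000]
inliers, outliers = split_outliers(incomes)
print(outliers)
```

Isolating outliers into their own list, rather than deleting them, leaves room to inspect them later, since an extreme value can be an error or a genuinely important case.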

Feature engineering is the art of introducing new features that did not exist before.

It is often said that data scientists spend around 80% of their time on feature engineering. The remaining 20% is the easier part: training the model and optimizing hyperparameters.

Proper feature engineering is crucial for improving AI/ML model performance.

As a data scientist, you need to answer the following questions:
- What are the capabilities of the ML model I have?
- Which features should I select?
- Can I use my domain knowledge to get by with fewer features?
- Can I come up with new features from the data I have at hand?
- How should I fill in the missing values?
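
Two of the questions above, filling in missing values and deriving new features, can be sketched in a few lines. This is only an illustration under stated assumptions: median imputation is one of several possible strategies, and the helper names (`impute_median`, `add_bmi`), the column names, and the patient records are hypothetical.

```python
from statistics import median

def impute_median(rows, key):
    """Return copies of rows with missing (None) values of `key` set to the column median."""
    observed = [r[key] for r in rows if r[key] is not None]
    fill = median(observed)
    return [{**r, key: fill if r[key] is None else r[key]} for r in rows]

def add_bmi(rows):
    """Derive a new feature (body mass index) from two existing ones."""
    return [{**r, "bmi": r["weight_kg"] / r["height_m"] ** 2} for r in rows]

# Hypothetical records; the second one is missing a weight
patients = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": None, "height_m": 1.60},
    {"weight_kg": 90.0, "height_m": 1.80},
]
features = add_bmi(impute_median(patients, "weight_kg"))
print(features[1])
```

Both helpers return new rows instead of modifying the input, so the raw data stays available for trying a different imputation strategy.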

It is important to choose features that are most relevant to the problem.

Adding unnecessary features increases the computational cost of training the model and, as the number of dimensions grows, makes the data sparser and harder to learn from (the curse of dimensionality).
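
One simple way to drop features that carry little information is a variance filter. The sketch below is an assumption for illustration, not a method named in the video: the function `drop_low_variance`, the threshold of `0.0`, and the column data are all hypothetical.

```python
from statistics import pvariance

def drop_low_variance(columns, threshold=0.0):
    """Keep only the columns whose population variance exceeds `threshold`."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }

# Hypothetical dataset stored column by column
columns = {
    "age": [23, 35, 41, 29],
    "country_code": [1, 1, 1, 1],  # constant column: no information for the model
    "income": [3200, 4100, 3900, 3500],
}
selected = drop_low_variance(columns)
print(list(selected))
```

A variance filter only catches features that barely change. It says nothing about whether the remaining features are relevant to the target, so it is a first pass rather than a full feature selection procedure.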

I hope you guys enjoyed this video. Please subscribe for more videos.

Happy learning!

#FeatureEngineering #DataScience #MachineLearning
Comments

I wouldn't call imputing nulls, removing dups, etc. feature engineering, per se, because no new features are being created. I'd consider those steps data prep or cleaning or wrangling.

jta

Sir u r awesome we want more videos on this topic

apoorvshrivastava