Random Forest in Python - Machine Learning From Scratch 10 - Python Tutorial

Get my Free NumPy Handbook:

In this Machine Learning from Scratch Tutorial, we are going to implement a Random Forest algorithm using only built-in Python modules and numpy. We will also learn about the concept and the math behind this popular ML algorithm.

~~~~~~~~~~~~~~ GREAT PLUGINS FOR YOUR CODE EDITOR ~~~~~~~~~~~~~~

📓 Notebooks available on Patreon:

If you enjoyed this video, please subscribe to the channel!

The code can be found here:

You can find me here:

#Python #MachineLearning

----------------------------------------------------------------------------------------------------------
* This is a sponsored link. By clicking on it you will not have any additional costs, instead you will support me and my project. Thank you so much for the support! 🙏
Comments

I thoroughly enjoyed learning (1) Decision Tree and (2) Random Forest from your videos. Thanks a lot. The decision tree program is sleek and modular, and easy to understand and remember. If you add some points from your lecture as comments, it will be a great learning tool.

airesearch

Again, great work, as in the others. I learned how to code ML algorithms from you. Thanks a lot. I added random feature selection and random row selection to your algorithm. Interested friends can try it:

import numpy as np
from collections import Counter

from dt import DecisionTree  # the DecisionTree class from the previous video

def bootstrap_sample(X, y):
    n_samples, n_columns = X.shape
    # random row selection
    n_samples_row = int(n_samples / 10)
    row_idxs = np.random.choice(n_samples, size=n_samples_row, replace=False)
    # random feature selection
    n_samp_col = int(np.sqrt(n_columns))
    col_idxs = np.random.choice(n_columns, size=n_samp_col, replace=False)
    # create the sub-dataset from the selected rows and columns
    newX = X[row_idxs][:, col_idxs]
    return newX, y[row_idxs], col_idxs

def most_common_label(y):
    counter = Counter(y)
    most_common = counter.most_common(1)[0][0]
    return most_common

class RandomForest:

    def __init__(self, n_trees=100, min_samples_split=2, max_depth=10, n_feats=None):
        self.n_trees = n_trees
        self.min_samples_split = min_samples_split
        self.max_depth = max_depth
        self.n_feats = n_feats
        self.trees = []

    def fit(self, X, y):
        self.trees = []
        self.rand_feats = []
        for _ in range(self.n_trees):
            X_sample, y_sample, rand_feat = bootstrap_sample(X, y)
            tree = DecisionTree(min_samples_split=self.min_samples_split,
                                max_depth=self.max_depth, n_feats=self.n_feats)
            tree.fit(X_sample, y_sample)
            self.trees.append(tree)
            # remember which feature columns this tree was trained on
            self.rand_feats.append(rand_feat)

    def predict(self, X):
        tree_preds = []
        for j, tree in enumerate(self.trees):
            # select the feature columns the j-th tree was trained on
            new_X = X[:, self.rand_feats[j]]
            tree_preds.append(tree.predict(new_X))
        # majority vote: swap to one row of tree predictions per sample
        tree_preds = np.swapaxes(np.array(tree_preds), 0, 1)
        y_pred_final = [most_common_label(tree_pred) for tree_pred in tree_preds]
        return np.array(y_pred_final)

muratsahin

Your code looks like bagging, but in a random forest we randomly choose a subset of features at each node of the decision tree.

The process of building a tree is randomized: at the stage of choosing the optimal feature on which to split, the search is not over the entire set of features but over a random subset of size q.

Special attention should be paid to the fact that a random subset of size q is selected anew every time another node needs to be split. This is the main difference between this approach and the random subspace method, where a random subset of features is selected once, before constructing the base learner.

Am I wrong?
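To illustrate the distinction, here is a minimal sketch of per-split feature subsampling, as a random forest's tree builder would do it (the greedy Gini splitter and all identifiers here are illustrative, not the video's code):

```python
import numpy as np

def best_split(X, y, n_feats):
    # draw a FRESH random subset of features for this split; a random
    # forest repeats this at every node, whereas the random-subspace
    # method draws the subset once per tree
    feat_idxs = np.random.choice(X.shape[1], size=n_feats, replace=False)
    best_feat, best_thresh, best_gini = None, None, float("inf")
    for feat in feat_idxs:
        for thresh in np.unique(X[:, feat]):
            left = y[X[:, feat] <= thresh]
            right = y[X[:, feat] > thresh]
            if len(left) == 0 or len(right) == 0:
                continue
            # weighted Gini impurity of the two children
            gini = sum(
                len(part) / len(y) * (1.0 - np.sum((np.bincount(part) / len(part)) ** 2))
                for part in (left, right)
            )
            if gini < best_gini:
                best_feat, best_thresh, best_gini = feat, thresh, gini
    return best_feat, best_thresh
```

Calling this inside the tree's recursive split (with n_feats around sqrt of the total) is what makes it a random forest rather than plain bagging.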

fedorlaputin

Hi, nice tutorial. What about plotting each tree in a notebook?

abseenahabeeb

For regression, instead of most_common_label we can calculate the mean of the predictions, right?
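That is the usual approach: the forest stays the same and only the aggregation step changes. A small sketch (the function names are illustrative, assuming predictions arrive as one row per tree):

```python
import numpy as np
from collections import Counter

def aggregate_classification(tree_preds):
    # majority vote over the trees, one column of votes per sample
    return np.array([Counter(col).most_common(1)[0][0] for col in tree_preds.T])

def aggregate_regression(tree_preds):
    # for regression, average the trees' numeric predictions instead
    return tree_preds.mean(axis=0)
```

Both expect an array of shape (n_trees, n_samples), which is what collecting each tree's predict output produces.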

dheerajkumark

What about making a playlist like this for deep learning? It would be very helpful!

fedorlaputin

You referred to your previous video to better understand this one, but you did not include a link to it anywhere. This is not good. You should put the previous video's link in the description or a pinned comment.

MdMainuddincse

Shouldn't it be
np.random.choice(n_samples, size=subset_sample_value, replace=True)
so that it chooses a subset of the data with replacement?
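For reference, a classic bootstrap does sample with replacement, drawing a resample the same size as the dataset so that rows can repeat; a sketch (the function name here is illustrative):

```python
import numpy as np

def bootstrap_rows(X, y):
    # draw n_samples row indices WITH replacement: on average only
    # ~63.2% of the original rows appear, many of them duplicated
    n_samples = X.shape[0]
    idxs = np.random.choice(n_samples, size=n_samples, replace=True)
    return X[idxs], y[idxs]
```

With replace=False and a smaller size, as in the code above, each tree instead sees a strict subsample without duplicates, which is subsampling (pasting) rather than bootstrapping.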

BTW, you should make a paid course; I would surely purchase it.
I have bought lots of Udemy courses; yours is the best, and it's free lol.

All the other courses just give a slight overview (theory), then import scikit-learn:
1. import data: do feature scaling, feature engineering, and other necessary data-preparation stuff, then import the model.
2. train_test_split.
3. fit.
4. predict.
5. accuracy.
done lol.

yours is the best.

jacjacl

Here I see the RandomForest class uses almost all the __init__ parameters from DecisionTree. Can I use super() to get the __init__ from the DecisionTree class instead of manually copying the parameters?
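Mechanically, yes: subclassing DecisionTree and calling super().__init__() would work. But a forest has trees rather than being one, so a common alternative is to forward keyword arguments instead of inheriting; a sketch (the stub classes here are illustrative, not the video's full code):

```python
class DecisionTree:
    def __init__(self, min_samples_split=2, max_depth=10, n_feats=None):
        self.min_samples_split = min_samples_split
        self.max_depth = max_depth
        self.n_feats = n_feats

class RandomForest:
    # keep the has-a relationship: store the tree settings once and
    # forward them to every DecisionTree instead of inheriting them
    def __init__(self, n_trees=100, **tree_kwargs):
        self.n_trees = n_trees
        self.tree_kwargs = tree_kwargs

    def _make_tree(self):
        return DecisionTree(**self.tree_kwargs)
```

This avoids re-listing the tree parameters in two places, and new DecisionTree parameters become available on the forest with no changes.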

jasonyam

Hi, np.swapaxes is giving this error:
"numpy.AxisError: axis2: axis 1 is out of bounds for array of dimension 1"
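That error means the array reaching np.swapaxes is 1-D (for example, an empty list converted to an array because no per-tree predictions were collected) rather than the expected 2-D (n_trees, n_samples) array; a small repro:

```python
import numpy as np

# np.swapaxes(a, 0, 1) needs at least a 2-D array
empty = np.array([])  # 1-D: what an empty prediction list becomes
try:
    np.swapaxes(empty, 0, 1)
except Exception as e:  # AxisError (numpy.exceptions.AxisError on NumPy 2.x)
    print(type(e).__name__, e)

# with one row of predictions per tree, the swap works
preds = np.array([[0, 1], [1, 1], [0, 1]])  # 3 trees x 2 samples
per_sample = np.swapaxes(preds, 0, 1)       # 2 samples x 3 trees
```

So the fix is to make sure each tree's predictions are actually appended before the list is converted and swapped.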

prashantsharmastunning

Can you please make one more AdaBoost video to complete this series?

jasony