While immersing myself in machine learning, I have seen that regression and classification problems can be solved with a variety of algorithms. This week the focus is on:
What new skills have you learned?
📦 K Nearest Neighbors
📦 Decision Trees
📦 Random Forests
K Nearest Neighbors
KNN is a classification algorithm that labels a data point based on the classes of the closest (nearest) points in the dataset. K sets the number of nearest neighbors used to classify a point.
Key components used in creating the classifier are:
📭 Distance Metric
🔎 Number of nearest neighbors to look at
⛲ Optional weighting function
💥 Method of aggregating the neighboring points (usually defaults to a simple majority vote)
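To make the four components above concrete, here is a minimal sketch in plain Python (it is not the code from the notebook). It assumes Euclidean distance as the metric, an optional inverse-distance weighting, and a majority vote as the aggregation step. The function name `knn_predict` and the toy fruit measurements are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3, weighted=False):
    """Classify `query` by a vote among its k nearest training points."""
    # Distance metric: Euclidean distance between the query and every training point.
    distances = [(math.dist(point, query), label)
                 for point, label in zip(train_points, train_labels)]
    # Number of neighbors: keep only the k closest points.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Optional weighting + aggregation: each neighbor votes for its label,
    # with weight 1 (simple majority) or 1/distance (closer points count more).
    votes = Counter()
    for distance, label in nearest:
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (mass in grams, width in cm) for two kinds of fruit.
train_points = [(150, 7.0), (160, 7.5), (170, 7.4), (80, 5.8), (85, 6.0), (90, 6.1)]
train_labels = ["apple", "apple", "apple", "mandarin", "mandarin", "mandarin"]

prediction = knn_predict(train_points, train_labels, query=(155, 7.2), k=3)
print(prediction)
```

In practice the features would usually be scaled first, since a feature with a large range (mass here) otherwise dominates the distance.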
Here is a notebook on fruit classification 🍎 using K Nearest Neighbors.
Decision Trees
Decision trees are widely used models for classification and regression tasks. A set of splitting rules segments the predictor space via a hierarchy of “if-else” questions, leading to a decision.
🎋 Nodes: Split on the value of an attribute.
🌴 Edges: Outcomes of a split, leading to the next node.
🌲 Root: The node that performs the first split.
🍃 Leaves: Terminal nodes that predict the outcome.
Each node in the tree is either a question or a terminal node (also called a leaf), which contains the answer.
The edges connect the answers to a question with the next question you would ask.
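The "hierarchy of if-else questions" can be sketched as a small hand-built tree in plain Python. This is only an illustration of the structure, not a trained model: the feature names (`mass`, `width`), thresholds, and labels are invented, and a real tree would learn its splits from data.

```python
# Internal nodes ask a question (feature <= threshold?); leaves hold the answer.
# All features, thresholds, and labels below are hypothetical.
tree = {
    "feature": "mass", "threshold": 100,        # root: the first split
    "yes": {"label": "mandarin"},               # leaf (terminal node)
    "no": {                                     # edge leading to the next question
        "feature": "width", "threshold": 7.2,
        "yes": {"label": "apple"},
        "no": {"label": "orange"},
    },
}

def predict(node, sample):
    """Walk from the root to a leaf, following one edge per answered question."""
    while "label" not in node:
        answer = "yes" if sample[node["feature"]] <= node["threshold"] else "no"
        node = node[answer]
    return node["label"]

print(predict(tree, {"mass": 85, "width": 6.0}))
print(predict(tree, {"mass": 160, "width": 7.0}))
```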
Random Forests
Random forests build many trees, using a random sample of features at every split of every tree.
Each time a split in a tree is considered, a random sample of m predictors is chosen as split candidates from the full set of p predictors. The split is allowed to use only one of those m predictors.
m is typically chosen so that m ≈ √p; that is, the number of predictors considered at each split is approximately equal to the square root of the total number of predictors.
By randomly leaving out features, random forests decorrelate the trees, which provides an improvement over individual trees.
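The split-candidate sampling described above can be sketched in a few lines of plain Python. This shows only the "pick m of p predictors" step, not a full random forest. The helper name `split_candidates` and the predictor names `x1` to `x9` are placeholders.

```python
import math
import random

def split_candidates(predictors, rng):
    """Pick m ≈ sqrt(p) predictors at random as candidates for one split."""
    p = len(predictors)
    m = max(1, round(math.sqrt(p)))   # m ≈ sqrt(p), and at least one predictor
    return rng.sample(predictors, m)  # sampled without replacement

predictors = [f"x{i}" for i in range(1, 10)]  # p = 9 placeholder predictors, so m = 3
rng = random.Random(0)                        # fixed seed so runs are repeatable

# A fresh random subset is drawn every time a split is considered.
for split in range(3):
    print(split, split_candidates(predictors, rng))
```

Because each split sees a different subset, a single strong predictor cannot dominate every tree, which is what makes the trees less correlated.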
Check out this Decision Tree and Random Forests notebook, which works on a sample kyphosis dataset (kyphosis is an excessive outward curvature of the spine) among patients.
So that was the seventh week. 🔏