Features & Labels

Ingredients are features, the dish name is the label — teach your model what to look at and what to predict

features: Input columns · what the model sees
labels: Output column · what the model predicts
feature engineering: Critical · often matters more than the algorithm

Imagine you're looking at a recipe card. On one side, you've got the ingredients: flour, sugar, eggs, butter, vanilla extract. On the other side, you've got the dish name: chocolate cake.

In machine learning, the ingredients are called features and the dish name is called the label.

Features = the information the model uses to make a prediction (the inputs).
Label = the thing the model is trying to predict (the output).

That's it. Every supervised ML problem boils down to: "Given these features, predict this label."

A concrete example

Say you're predicting whether a student will pass or fail an exam. Here's your data:

Hours Studied	Hours Slept	Attended Review?	Result
6	8	Yes	Pass
2	4	No	Fail
7	7	Yes	Pass
1	5	No	Fail

The first three columns — Hours Studied, Hours Slept, Attended Review — are features. They're the clues.

The last column — Result — is the label. That's the answer the model learns to predict.

In code, features are usually called X (capital, because it's a matrix of many columns), and labels are called y (lowercase, because it's a single column).

Extracting Features & Labels

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Raw data as a DataFrame
data = pd.DataFrame({
    'hours_studied': [6, 2, 7, 1, 5, 8, 3, 4],
    'hours_slept':   [8, 4, 7, 5, 6, 9, 3, 7],
    'attended_review': [1, 0, 1, 0, 1, 1, 0, 0],
    'result':        [1, 0, 1, 0, 1, 1, 0, 0],  # 1=pass, 0=fail
})

# Split into features (X) and label (y)
X = data[['hours_studied', 'hours_slept', 'attended_review']]
y = data['result']

print("Features (X):")
print(X.head(3))
print("\nLabels (y):")
print(y.head(3))

# Train a model
model = DecisionTreeClassifier(random_state=0)  # fixed seed so reruns give the same tree
model.fit(X, y)

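# Predict: studied 5hrs, slept 7hrs, attended review
# (a DataFrame with the same column names as X avoids a feature-name warning)
new_student = pd.DataFrame([[5, 7, 1]], columns=X.columns)
print("\nPrediction:", model.predict(new_student))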

Output

Features (X):
   hours_studied  hours_slept  attended_review
0              6            8                1
1              2            4                0
2              7            7                1

Labels (y):
0    1
1    0
2    1
Name: result, dtype: int64

Prediction: [1]

Good features vs. bad features

Not all features are created equal. A good feature is relevant to the prediction. A bad feature is noise that confuses the model.

Predicting house price?

Good features: square footage, number of bedrooms, neighborhood, age of house
Bad features: the color of the mailbox, the owner's favorite movie, what day you scraped the listing
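
One rough way to sanity-check relevance is to see how strongly each feature moves with the label. The sketch below uses made-up listings: the `houses` table, the `mailbox_is_red` column, and every number in it are invented for illustration. Correlation only catches straight-line relationships, so treat this as a first look, not a verdict.

```python
import pandas as pd

# Hypothetical listings; all values are made up for illustration.
houses = pd.DataFrame({
    'sqft':           [1100, 1400, 1800, 2200, 2600, 3000],
    'mailbox_is_red': [1, 0, 0, 1, 1, 0],
    'price':          [150_000, 190_000, 240_000, 300_000, 350_000, 400_000],
})

# Absolute Pearson correlation between each feature and the label (price).
features = houses.drop(columns='price')
relevance = features.corrwith(houses['price']).abs()

# Higher means the feature moves more closely with the price.
print(relevance.sort_values(ascending=False))
```

In this toy data, square footage tracks price closely while the mailbox color barely does, which matches the intuition above.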

The process of choosing, creating, and transforming features is called feature engineering — and experienced ML practitioners will tell you it's often more important than which algorithm you pick.

Types of features

Numerical: numbers like age, salary, temperature (usually usable as-is, though some models work better when they're scaled)
Categorical: categories like color, country, "yes/no" (need to be converted to numbers)
Text: raw text like reviews or tweets (need heavy processing)
Derived: new features you create, like "age of house" from "current year" minus "year built"
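
To make "need to be converted to numbers" concrete, here is a minimal sketch of two common conversions for categorical features: mapping a yes/no column to 1/0, and one-hot encoding a column with several categories using `pd.get_dummies`. The `homes` table and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical data with one numerical and two categorical columns.
homes = pd.DataFrame({
    'sqft':         [1400, 2200, 1800],
    'neighborhood': ['north', 'south', 'north'],
    'has_garage':   ['yes', 'no', 'yes'],
})

# Yes/no category: map each value straight to 1 or 0.
homes['has_garage'] = homes['has_garage'].map({'yes': 1, 'no': 0})

# Multi-valued category: one new 0/1 column per category (one-hot encoding).
encoded = pd.get_dummies(homes, columns=['neighborhood'], dtype=int)

print(encoded)
```

One-hot encoding avoids implying an order between categories, which a simple 0/1/2 numbering would do. Text features need more work than this and are not covered in the sketch.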

Feature Engineering: Creating Better Features

import pandas as pd

data = pd.DataFrame({
    'year_built': [1990, 2005, 2018, 1975],
    'sqft': [1400, 2200, 1800, 1100],
    'bedrooms': [3, 4, 3, 2],
    'bathrooms': [2, 3, 2, 1],
})

# Derived feature: age of house
data['age'] = 2026 - data['year_built']

# Derived feature: sqft per bedroom
data['sqft_per_bed'] = data['sqft'] / data['bedrooms']

# Derived feature: bathroom-to-bedroom ratio
data['bath_ratio'] = data['bathrooms'] / data['bedrooms']

print(data[['age', 'sqft_per_bed', 'bath_ratio']])

Output

   age  sqft_per_bed  bath_ratio
0   36    466.666667    0.666667
1   21    550.000000    0.750000
2    8    600.000000    0.666667
3   51    550.000000    0.500000

Note: A common beginner mistake is accidentally including the label (or information derived from the label) as a feature. If you're predicting whether someone will buy a product and you include "receipt amount" as a feature, that's cheating: a receipt only exists after the purchase, so the model won't have it when it needs to make a real prediction. The model will get near-perfect scores during training (and often on your test set too) but learn nothing useful. This is called data leakage.
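
Here is a small sketch of what spotting and removing a leaky column might look like. The `visits` table, its column names, and its values are invented for illustration. A feature that reproduces the label exactly is a red flag worth investigating, though real leakage is often subtler than this.

```python
import pandas as pd

# Hypothetical site visits; receipt_amount is only known AFTER a purchase.
visits = pd.DataFrame({
    'pages_viewed':    [3, 12, 5, 8, 2, 15],
    'minutes_on_site': [1, 9, 4, 6, 1, 11],
    'receipt_amount':  [0.0, 49.99, 0.0, 19.50, 0.0, 120.00],
    'bought':          [0, 1, 0, 1, 0, 1],
})

# Red flag: "receipt_amount > 0" matches the label on every single row.
leak_matches_label = bool(
    ((visits['receipt_amount'] > 0).astype(int) == visits['bought']).all()
)
print("receipt_amount gives away the label:", leak_matches_label)

# Keep only the features that exist at the moment you need a prediction.
X = visits.drop(columns=['bought', 'receipt_amount'])
y = visits['bought']
print("Safe features:", list(X.columns))
```

A useful question to ask of every column: "Would I actually know this value at the time I need the prediction?"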

Quick check

You're building a model to predict whether it will rain tomorrow. Which of these is a FEATURE?
