Scalars, Vectors, and Matrices: Definitions
What you'll learn: The three fundamental mathematical objects that underlie most data representation in machine learning.
Understanding the Building Blocks
In machine learning, we work with data in structured mathematical forms. Think of these as different containers for numbers, each serving a specific purpose:
Scalars
A scalar is simply a single number. It's the most basic unit of data.
Real-world analogy: The temperature outside (72 degrees), your age (25), or the price of an item ($49.99) are all scalars: each is a single value with no direction and no internal structure.
Vectors
A vector is an ordered list of numbers. Imagine it as a single row or column of values.
Real-world analogy: Your daily step count for a week [5000, 7200, 6500, 8000, 5500, 9000, 4500] is a vector. Each position has meaning (Monday through Sunday), and together they represent a complete data point.
In ML, a vector often represents a single data sample. For instance, a house might be represented as [1500, 3, 2], meaning 1500 square feet, 3 bedrooms, 2 bathrooms.
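As a minimal sketch, the house vector above can be written as a plain Python list. The variable names and the fixed feature order are assumptions made for illustration; real projects typically use an array library rather than bare lists.

```python
# Assumed feature order: [square_feet, bedrooms, bathrooms]
house = [1500, 3, 2]

# Position carries meaning, so indexing retrieves a specific feature.
square_feet = house[0]
bedrooms = house[1]
bathrooms = house[2]

# The number of entries is the vector's dimension.
dimension = len(house)

print(square_feet, bedrooms, bathrooms, dimension)  # 1500 3 2 3
```

Swapping two entries would describe a different house, which is why a vector is an ordered list rather than just a collection of numbers.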
Matrices
A matrix is a rectangular grid of numbers arranged in rows and columns. Think of it as multiple vectors of the same length stacked together.
Real-world analogy: A spreadsheet of student test scores, with rows for students and columns for subjects. Each row is one student's complete score vector; together the rows form a matrix.
In ML, matrices hold entire datasets. If you have 100 houses, each with 3 features, you'd store them in a 100×3 matrix.
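The same idea can be sketched at a smaller scale with a list of lists. The four houses below are made-up values used only to show the shape; a 100×3 dataset would follow the same pattern with 100 rows.

```python
# Hypothetical data: each row is one house, [square_feet, bedrooms, bathrooms].
houses = [
    [1500, 3, 2],
    [2100, 4, 3],
    [900, 2, 1],
    [1750, 3, 2],
]

n_rows = len(houses)     # number of samples (houses)
n_cols = len(houses[0])  # number of features per house

# A matrix is rectangular: every row must have the same length.
is_rectangular = all(len(row) == n_cols for row in houses)

# One row is one house's feature vector;
# one column is a single feature across all houses.
second_house = houses[1]
bedrooms_column = [row[1] for row in houses]

print((n_rows, n_cols))  # (4, 3)
print(second_house)      # [2100, 4, 3]
print(bedrooms_column)   # [3, 4, 2, 3]
```

Reading along a row answers "what does this house look like?", while reading down a column answers "how does this feature vary across houses?".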
Why This Matters
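These structures let computers process data efficiently, because one operation can be applied across a whole vector or matrix instead of one number at a time. A dataset of thousands of images or millions of text samples can be organized as matrices once each sample is encoded as a vector of numbers, enabling the mathematical operations that power learning algorithms.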
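One such operation is the matrix-vector product, sketched below in plain Python. The house data, the weights, and the helper name dot are all invented for illustration; this shows only how one computation runs over every row, not how a model arrives at its weights.

```python
# Hypothetical data: each row is one house, [square_feet, bedrooms, bathrooms].
houses = [
    [1500, 3, 2],
    [2100, 4, 3],
    [900, 2, 1],
]

# Made-up weights for illustration only, one per feature.
weights = [100, 5000, 3000]


def dot(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Matrix-vector product: one dot product per row gives one score per house.
scores = [dot(row, weights) for row in houses]

print(scores)  # [171000, 239000, 103000]
```

The result is itself a vector with one entry per row of the matrix, so the three kinds of object appear together: scalar weights, feature vectors, and a matrix of samples.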
Key Takeaway: Scalars are single numbers, vectors are ordered lists of numbers, and matrices are rectangular grids of numbers. These three structures are the basic building blocks for representing data in machine learning.