BiteSizedChunks.com: Learn one small thing at a time.

Course contents

Machine Learning and Deep Learning

1. Scalars, Vectors, and Matrices: Definitions
2. Vector Operations: Addition and Scalar Multiplication
3. Dot Product and Vector Similarity
4. Vector Norms and Distance Metrics
5. Matrix-Vector Multiplication
6. Matrix-Matrix Multiplication
7. Matrix Transpose and Symmetry
8. Identity Matrix and Matrix Inverse
9. Systems of Linear Equations
10. Linear Independence and Span
11. Basis and Dimension
12. Column Space and Null Space
13. Rank of a Matrix
14. Determinants and Their Properties
15. Trace of a Matrix
16. Eigenvalues and Eigenvectors: Definitions
17. Computing Eigenvalues and Eigenvectors
18. Eigendecomposition of Matrices
19. Diagonalization and Its Applications
20. Orthogonality and Orthonormal Vectors
21. Orthogonal Matrices and Their Properties
22. Singular Value Decomposition (SVD): Concept
23. Computing and Interpreting SVD
24. Matrix Approximation with SVD
25. Positive Definite and Semidefinite Matrices
26. Quadratic Forms
27. Matrix Calculus: Gradients of Matrix Expressions
28. Numerical Stability in Linear Algebra
29. Functions and Continuity
30. Limits: The Foundation of Derivatives
31. The Derivative Definition
32. Geometric Interpretation of Derivatives
33. Basic Differentiation Rules
34. Product and Quotient Rules
35. The Chain Rule
36. Derivatives of Exponential Functions
37. Derivatives of Logarithmic Functions
38. Derivatives of Trigonometric Functions
39. Higher-Order Derivatives
40. Implicit Differentiation
41. Partial Derivatives: Introduction
42. The Gradient Vector
43. Directional Derivatives
44. The Multivariable Chain Rule
45. Critical Points and Extrema
46. The Hessian Matrix
47. Second Derivative Test in Multiple Dimensions
48. Taylor Series and Approximations
49. L'Hôpital's Rule
50. The Jacobian Matrix
51. Integration Fundamentals
52. Numerical Differentiation
53. Sample Spaces and Events
54. Probability Axioms and Basic Rules
55. Conditional Probability
56. Independence of Events
57. Bayes' Theorem
58. Random Variables: Discrete and Continuous
59. Probability Mass Functions
60. Probability Density Functions
61. Cumulative Distribution Functions
62. Expectation and Mean
63. Variance and Standard Deviation
64. Common Discrete Distributions: Bernoulli and Binomial
65. Poisson Distribution
66. Uniform Distribution
67. Normal (Gaussian) Distribution
68. Exponential and Gamma Distributions
69. Joint Probability Distributions
70. Marginal and Conditional Distributions
71. Covariance and Correlation
72. Independence of Random Variables
73. Law of Large Numbers
74. Central Limit Theorem
75. Population vs Sample
76. Descriptive Statistics: Central Tendency
77. Descriptive Statistics: Spread and Variability
78. Percentiles and Quantiles
79. Covariance and Correlation
80. The Law of Large Numbers
81. Central Limit Theorem
82. Sampling Distributions
83. Point Estimation Fundamentals
84. Bias and Variance of Estimators
85. Maximum Likelihood Estimation
86. Method of Moments
87. Confidence Intervals
88. Bootstrap Resampling
89. Hypothesis Testing Framework
90. Type I and Type II Errors
91. Common Statistical Tests
92. Multiple Testing Correction
93. What is Mathematical Optimization?
94. Unconstrained vs Constrained Optimization
95. Local vs Global Optima
96. Convex Sets
97. Convex Functions
98. First-Order Optimality Conditions
99. Second-Order Optimality Conditions
100. The Gradient Descent Algorithm
101. Learning Rate and Step Size
102. Convergence Guarantees for Gradient Descent
103. Lipschitz Continuity and Smoothness
104. Strong Convexity
105. Stochastic Gradient Descent Basics
106. Momentum Methods
107. Newton's Method
108. Quasi-Newton Methods
109. Coordinate Descent
110. Constrained Optimization and Lagrange Multipliers
111. KKT Conditions
112. Subgradients and Non-Smooth Optimization

Lesson 1 of 3,538 · 1. Mathematical Foundations for Machine Learning

Scalars, Vectors, and Matrices: Definitions

Learn the fundamental building blocks: what scalars, vectors, and matrices are and how they represent data in ML.

What you'll learn: The three fundamental mathematical objects that underpin how machine learning represents data.

Understanding the Building Blocks

In machine learning, we work with data in structured mathematical forms. Think of these as different containers for numbers, each serving a specific purpose:

Scalars

A scalar is simply a single number. It's the most basic unit of data.

Real-world analogy: The temperature outside (72 degrees), your age (25), or the price of an item ($49.99) are all scalars—just individual numbers with no direction or complexity.

Vectors

A vector is an ordered list of numbers. Imagine it as a single row or column of values.

Real-world analogy: Your daily step count for a week [5000, 7200, 6500, 8000, 5500, 9000, 4500] is a vector. Each position has meaning (Monday through Sunday), and together they represent a complete data point.

In ML, a vector often represents a single data sample. For instance, a house might be represented as [1500, 3, 2], meaning 1500 square feet, 3 bedrooms, 2 bathrooms.
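As a minimal sketch, the house example can be written with a plain Python list. The feature names and the variable names below are assumptions made for illustration; only the values 1500, 3, and 2 come from the text.

```python
# Hypothetical labels for each position, following the house example.
FEATURE_NAMES = ["square_feet", "bedrooms", "bathrooms"]

# A scalar: one number.
square_feet = 1500

# A vector: an ordered list of numbers describing one house.
house = [1500, 3, 2]

# Order matters: each position always refers to the same feature.
for name, value in zip(FEATURE_NAMES, house):
    print(f"{name}: {value}")

# The number of entries is the vector's dimension.
dimension = len(house)
print(dimension)
```

Swapping two entries would describe a different house, which is why a vector is an ordered list rather than just a collection of numbers.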

Matrices

A matrix is a rectangular grid of numbers arranged in rows and columns. Think of it as multiple vectors stacked together.

Real-world analogy: A spreadsheet with student test scores—rows for students, columns for different subjects. Each row is one student's complete score vector; together they form a matrix.

In ML, matrices hold entire datasets. If you have 100 houses, each with 3 features, you'd store them in a 100×3 matrix.
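The sketch below stacks house vectors into a matrix using a list of rows. The house values are made up for illustration, and `shape` is a hypothetical helper rather than a library function; numerical libraries offer their own array types for the same idea.

```python
def shape(matrix):
    """Return (rows, columns) for a non-empty list of equal-length rows."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    if any(len(row) != n_cols for row in matrix):
        raise ValueError("all rows must have the same length")
    return n_rows, n_cols

# Made-up data. Each row is one house: [square_feet, bedrooms, bathrooms].
houses = [
    [1500, 3, 2],
    [2100, 4, 3],
    [900, 2, 1],
]

print(shape(houses))  # (3, 3): 3 houses, 3 features

# A row is one sample (a vector).
first_house = houses[0]

# A column is one feature collected across all samples.
bedrooms = [row[1] for row in houses]

# A placeholder for 100 houses with 3 features each, filled with zeros.
dataset = [[0.0] * 3 for _ in range(100)]
print(shape(dataset))  # (100, 3)
```

By this convention, the first number in the shape counts samples and the second counts features, matching the 100×3 description.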

Why This Matters

These structures allow computers to efficiently process data. A dataset of thousands of images or millions of text samples can be organized as matrices, enabling mathematical operations that power learning algorithms.
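As one small illustration of an operation that the matrix layout makes easy, the sketch below averages each feature across all samples. The data and the `column_means` helper are assumptions made for this example, not part of the lesson.

```python
def column_means(matrix):
    """Average each column of a non-empty list of equal-length rows."""
    n_rows = len(matrix)
    # zip(*matrix) regroups the rows into columns.
    return [sum(column) / n_rows for column in zip(*matrix)]

# Made-up data. Each row is one house: [square_feet, bedrooms, bathrooms].
houses = [
    [1500, 3, 2],
    [2100, 4, 3],
    [900, 2, 1],
]

means = column_means(houses)
print(means)  # [1500.0, 3.0, 2.0]
```

The result is itself a vector with one entry per feature, so a single operation turns a matrix into a summary of the whole dataset.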

Key Takeaway: Scalars are single numbers, vectors are ordered lists of numbers, and matrices are rectangular grids of numbers. These three structures are the basic containers for data in machine learning.