Basic Mathematics of Data Science

Desi Ratna Ningsih
Mar 20, 2020 · 4 min read

The key topics to master to become a better data scientist

Mathematics is the bedrock of any contemporary discipline of science. Almost all the techniques of modern data science, including machine learning, have a deep mathematical underpinning. In this chapter, we will go over the basics of the following topics:
• Basic symbols/terminology
• Logarithms/exponents
• Set theory
• Calculus
• Matrix (linear) algebra

Mathematics as a discipline

Mathematics, as a science, is one of the oldest known forms of logical thinking by mankind. Since ancient Mesopotamia (around 3000 BCE), and likely earlier, humans have relied on arithmetic and more advanced forms of math to answer life’s biggest questions. Today, we rely on math in most aspects of our daily lives.

Basic symbols and terminology

First, let’s take a look at the most basic symbols used in mathematics, as well as some of the subtler notation used by data scientists.

Vectors and matrices

A vector is defined as an object with both magnitude and direction. This definition, however, is more complicated than we need. For our purposes, a vector is simply a one-dimensional array representing a series of numbers. Put another way, a vector is a list of numbers.

A matrix is a two-dimensional array of numbers, having a fixed number of rows and columns, and containing a number at the intersection of each row and each column. A matrix is usually delimited by square brackets.

If a matrix has only one row or only one column, it is called a vector. A matrix having only one row is called a row vector, and a matrix having only one column is called a column vector.
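
To make this concrete, here is a minimal sketch of vectors and matrices as arrays, using NumPy (my own example; the library isn’t mentioned in the original text, but it is the standard array package in Python):

```python
import numpy as np

# A vector: a 1-dimensional array, i.e. a simple list of numbers
v = np.array([2.0, -1.0, 3.5])

# A matrix: a 2-dimensional array with a fixed number of rows and columns
M = np.array([[1, 2, 3],
              [4, 5, 6]])                  # 2 rows, 3 columns

# A row vector is a matrix with a single row;
# a column vector is a matrix with a single column.
row_vector = np.array([[1, 2, 3]])         # shape (1, 3)
column_vector = np.array([[1], [2], [3]])  # shape (3, 1)

print(v.shape, M.shape, row_vector.shape, column_vector.shape)
# (3,) (2, 3) (1, 3) (3, 1)
```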

Logarithms/exponents

Key Points

  • An exponent of −1 denotes the inverse function. That is, f⁻¹(x) is the inverse of the function f(x).
  • An inverse function is a function that undoes another function: if an input x into the function f produces an output y, then inputting y into the inverse function g produces the output x, and vice versa (i.e., f(x) = y and g(y) = x).
  • The logarithm to base b is the inverse function of f(x) = b^x: log_b(b^x) = x.
  • The natural logarithm ln(x) is the inverse of the exponential function e^x: b = e^(ln b).
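
As a quick sanity check of these identities, here is a small numerical sketch using Python’s built-in math module (my own example, not from the original text):

```python
import math

b, x = 2.0, 5.0

# log_b(b^x) = x: the base-b logarithm undoes f(x) = b^x
print(math.log(b ** x, b))      # 5.0 (up to floating-point error)

# b = e^(ln b): the natural logarithm is the inverse of e^x
print(math.exp(math.log(b)))    # 2.0

# Inverse functions in general: if f(x) = y, then g(y) = x
f = lambda t: 3 * t + 1         # a simple invertible function
g = lambda t: (t - 1) / 3       # its inverse
y = f(x)
print(g(y))                     # 5.0
```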

Set theory

Set theory involves mathematical operations at the level of sets. It is sometimes thought of as a fundamental group of theorems that governs the rest of mathematics. For our purposes, we use set theory to manipulate groups of elements.

Set theory is a branch of mathematical logic that studies sets, which informally are collections of objects. Although any type of object can be collected into a set, set theory is applied most often to objects that are relevant to mathematics. The language of set theory can be used to define nearly all mathematical objects.
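
Python’s built-in set type mirrors these operations directly; the sketch below uses made-up customer data purely for illustration:

```python
# Two sets of (hypothetical) customer IDs
customers_2019 = {"alice", "bob", "carol", "dave"}
customers_2020 = {"carol", "dave", "erin"}

# Union: customers seen in either year (print order may vary)
print(customers_2019 | customers_2020)  # {'alice', 'bob', 'carol', 'dave', 'erin'}

# Intersection: customers retained across both years
print(customers_2019 & customers_2020)  # {'carol', 'dave'}

# Difference: customers lost after 2019
print(customers_2019 - customers_2020)  # {'alice', 'bob'}

# Subset test
print({"carol"} <= customers_2019)      # True
```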

Calculus

Whether you loved or hated it in college, calculus pops up in numerous places in data science and machine learning. It lurks behind the simple-looking analytical solution of the ordinary least squares problem in linear regression, and it is embedded in every back-propagation pass your neural network makes to learn a new pattern. It is an extremely valuable skill to add to your repertoire. Here are the topics to learn (a short code sketch follows the list):

  • Functions of a single variable, limit, continuity, differentiability
  • Mean value theorems, indeterminate forms, L’Hospital’s rule
  • Maxima and minima
  • Product and chain rule
  • Taylor’s series, infinite series summation/integration concepts
  • Fundamental theorem and mean value theorems of integral calculus, evaluation of definite and improper integrals
  • Beta and gamma functions
  • Functions of multiple variables, limit, continuity, partial derivatives
  • Basics of ordinary and partial differential equations
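
One low-friction way to practice a few of the items above is SymPy, a symbolic-math library for Python (my own suggestion, not something the post prescribes). A minimal sketch covering derivatives, limits, Taylor series, and integrals:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.sin(x ** 2)                       # a composite function

# Chain rule: d/dx sin(x^2) = 2*x*cos(x^2)
print(sp.diff(f, x))                     # 2*x*cos(x**2)

# A limit that resolves an indeterminate form (0/0)
print(sp.limit(sp.sin(x) / x, x, 0))     # 1

# Taylor series of e^x around 0, up to the x^4 term
print(sp.series(sp.exp(x), x, 0, 5))     # 1 + x + x**2/2 + x**3/6 + x**4/24 + O(x**5)

# A definite integral (fundamental theorem of calculus in action)
print(sp.integrate(x ** 2, (x, 0, 1)))   # 1/3
```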

Linear Algebra

This is an essential branch of mathematics for understanding how machine learning algorithms work on a stream of data to create insight. Everything from friend suggestions on Facebook, to song recommendations on Spotify, to transferring your selfie into a Salvador Dalí-style portrait using deep transfer learning involves matrices and matrix algebra. Here are the essential topics to learn (a short code sketch follows the list):

  • Basic properties of matrix and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant
  • Inner and outer products, matrix multiplication rule and various algorithms, matrix inverse
  • Special matrices: square matrix, identity matrix, triangular matrix, an idea about sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian and unitary matrices
  • Matrix factorization concept/LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system of equations Ax = b
  • Vector space, basis, span, orthogonality, orthonormality, linear least squares
  • Eigenvalues, eigenvectors, diagonalization, singular value decomposition
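
To tie several of these topics together, here is a minimal NumPy sketch (my own example) that solves Ax = b, inspects basic matrix properties, and computes the eigendecomposition and singular value decomposition:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])             # a square, symmetric matrix
b = np.array([9.0, 8.0])

# Solve the linear system Ax = b (preferred over forming the inverse)
x = np.linalg.solve(A, b)
print(x)                               # [2. 3.]

# Basic properties: transpose, determinant, rank, inverse
print(A.T)
print(np.linalg.det(A))                # ~5.0
print(np.linalg.matrix_rank(A))        # 2
print(np.linalg.inv(A) @ b)            # same solution, but less stable numerically

# Eigendecomposition and singular value decomposition
eigenvalues, eigenvectors = np.linalg.eig(A)
U, S, Vt = np.linalg.svd(A)
print(eigenvalues)                     # approx. 3.618 and 1.382
print(S)                               # singular values (equal to the eigenvalues here)
```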

I hope you learned something today. Feel free to leave a message if you have any feedback, and share this with anyone who might find it useful.

Reference:

  • Sinan Ozdemir, Principles of Data Science (Packt)


Desi Ratna Ningsih

Data Science Enthusiast, Remote Worker, Course Trainer, Archery Coach, Psychology and Philosophy Student