As PCA and linear autoencoder have a close relation, this post introduces again PCA as a powerful dimension reduction tool while skipping many mathematical proofs. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly...

## deep learning: Linear Autoencoder with Keras

This post introduces using linear autoencoder for dimensionality reduction using TensorFlow and Keras. What is a linear autoencoder An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network...

## Math for ML – Vector norms quick note

Vector norms are used in many machine learning and computer science problems. This article covers some common norms and related applications. From a high school entrance exam… Remember the day (?/?/1998) when I took an exam to a high school, there was a problem of finding the shortest path from A to B knowing that the person can only go left/right or up/down given the following grid of m x...

## Sparse Matrices for Machine Learning quick note

In machine learning, many matrices are sparse. It is essential to know how to handle this kind of matrix. Sparse vs Dense Matrix First, it is good to know that sparse matrix looks similar to a normal matrix, with rows, columns or other indexes. But a sparse matrix is comprised of mostly zero (0s) values. They are distinct from dense matrices with mostly non-zero values. A matrix is sparse if many...

## Recurrent neural network – predict monthly milk production

In part 1, we introduced a simple RNN for time-series data. To continue, this article applies a deep version of RNN on a real dataset to predict monthly milk production. The data Monthly milk production: pounds per cow. Jan 1962 – Dec 1975. You can download the data using this link. Download: CSV file The data contains the production of 168 months (14 years). We will use an RNN to predict...