By Kevin David Farinango

t-SNE is a machine learning technique for dimensionality reduction that helps you to identify relevant patterns. The main advantage of t-SNE is the ability to preserve the local structure This means, approximately, that the points that are close to each other in the High-dimensional data set will tend to be close to each other on the chart. t-SNE too It produces beautiful visualizations. t-SNE is executed in two steps: first build a probability distribution on pairs of samples in the original space, so that similar samples receive high probability of being chosen, while very different samples receive low probability If chosen. The concept of “similarity” is based on the distance between points and density in the Proximities of a point. As described by the authors: “The similarity between point xj and point xj is the conditional probability that xi would choose xj as your neighbor if the neighbors were chosen proportionally to their probability density under a Gaussian curve centered on xi ”

**Gaussian curve**

The bell curve or Gaussian curve is one of the most famous mathematical curves. We see that It appears in a large number of specific situations, statistics and probabilities. its importance is due primarily to the frequency with which different variables associated with Natural and everyday phenomena follow, approximately, this distribution. The distribution of a normal variable is completely determined by two parameters, its mean and its standard deviation, usually denoted by and. With this notation, the Normal density is given by the equation:

Second, t-SNE takes the points of the high dimensional space to the low space Dimensionality randomly, defines a probability distribution similar to the view in the target space (the low dimensional space), and minimizes the so-called divergence Kullback-Leibler between the two distributions with respect to the positions of the points in the map (the Kullback-Leibler divergence measures the similarity or difference between two functions of probability distribution). In other words: t-SNE tries to reproduce the distribution that existed in the original space in the final space.

The t-SNE algorithm models the probability distribution of neighbors around each point. Here, the term neighbors refers to the set of points closest to each point. In the space Original high dimension, this is modeled as a Gaussian distribution. In the space of Two-dimensional output, this is modeled as a t-distribution. The purpose of the procedure is find a mapping in two-dimensional space that minimizes the differences between these two distributions in all points. The thickest tails of a t distribution in comparison with a Gaussian help to spread the points more evenly in space two-dimensional.

Colab:

References:

https://www.displayr.com/using-t-sne-to-visualize-data-before-prediction/

https://distill.pub/2016/misread-tsne/

https://images.math.cnrs.fr/La-courbe-en-cloche.html?lang=fr

http://www.sc.ehu.es/ccwbayes/docencia/mmcc/docs/tema-infoestadistica-0506