Stacked Denoising Autoencoder
=============================

To use a stacked denoising autoencoder [1], we first create an ordinary stacked autoencoder (SAE).
For our example, we use the well-known MNIST [2] dataset.

.. code-block:: python

    from clustpy.deep.neural_networks import StackedAutoencoder
    from clustpy.data import load_mnist
    import torch

    data, labels = load_mnist(return_X_y=True)
    data = torch.from_numpy(data).float()
    SAE = StackedAutoencoder(layers=[data.shape[1], 256, 128, 64, 10])

In this example, the SAE has three hidden layers with sizes 256, 128, and 64.
The resulting embedding has 10 features.

We could already train the SAE with the default parameters.
However, the data is usually normalized beforehand.

.. code-block:: python

    data_mean = data.mean()
    data_std = data.std()
    data = (data - data_mean) / data_std

We also want to train the SAE with denoising in mind, so we need a suitable corruption function.
In our case, we choose simple salt and pepper noise.
Note that because the data has been normalized, it no longer lies within [0, 1] or [0, 255].
Our corruption function must take this into account.

.. code-block:: python

    data_min = data.min()
    data_max = data.max()

    def my_corruption(data, data_min, data_max, amount_noise=0.02):
        # Draw one uniform random number per entry, on the same device as the input
        apply_noise = torch.rand(data.shape, device=data.device)
        # Work on a copy so that the original data is not overwritten
        corrupted = data.clone()
        corrupted[apply_noise < amount_noise] = data_max      # salt
        corrupted[apply_noise > 1 - amount_noise] = data_min  # pepper
        return corrupted

    corruption_fn = lambda data: my_corruption(data, data_min=data_min, data_max=data_max)

Now that we have a suitable corruption function, let us look at its effect on a sample.

.. code-block:: python

    from clustpy.utils import plot_image

    # Keep the sample as a tensor, since the corruption function expects one
    sample = data[0].reshape(28, 28)
    plot_image(sample.cpu().numpy(), black_and_white=True)
    corrupted_sample = corruption_fn(sample)
    plot_image(corrupted_sample.cpu().numpy(), black_and_white=True)

Finally, we can start the actual training of our stacked denoising autoencoder.

.. code-block:: python

    SAE.fit(data=data, corruption_fn=corruption_fn)

[1] Vincent, Pascal, et al. "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion." Journal of Machine Learning Research 11.12 (2010).

[2] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
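As a quick sanity check of the salt and pepper corruption described above, the following minimal sketch uses only PyTorch. It runs on a synthetic, roughly normalized batch rather than MNIST. The helper name ``salt_and_pepper`` and the synthetic data are assumptions made for illustration and are not part of clustpy. With ``amount_noise=0.02``, about 4% of the entries should be replaced (2% salt, 2% pepper), and the input tensor should remain unchanged.

```python
import torch


def salt_and_pepper(x, low, high, amount_noise=0.02):
    """Return a corrupted copy of x; x itself is left unchanged."""
    # One uniform random number per entry, created on the input's device
    noise = torch.rand(x.shape, device=x.device)
    corrupted = x.clone()
    corrupted[noise < amount_noise] = high      # "salt": set to the maximum value
    corrupted[noise > 1 - amount_noise] = low   # "pepper": set to the minimum value
    return corrupted


torch.manual_seed(0)
# Synthetic stand-in for a normalized batch (hypothetical data, not MNIST)
x = torch.randn(1000, 784)
x_backup = x.clone()
low, high = x.min(), x.max()

x_noisy = salt_and_pepper(x, low, high, amount_noise=0.02)

# Entries that differ from the input are exactly the corrupted ones
changed = x_noisy != x
fraction_changed = changed.float().mean().item()
print(f"fraction of corrupted entries: {fraction_changed:.4f}")
```

Returning a copy matters when the same tensor is reused across epochs: an in-place version would accumulate noise in the training data itself.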