A Walk with Deep Learning
Introduction
What is Machine Learning? According to Arthur Samuel (1962): “Suppose we arrange for some automatic means of testing the effectiveness of any current weight assignment in terms of actual performance and provide a mechanism for altering the weight assignment so as to maximize the performance. We need not go into the details of such a procedure to see that it could be made entirely automatic and to see that a machine so programmed would ‘learn’ from its experience.”
There are several types of machine learning, designed to address different kinds of problems:
- Supervised learning (data with labels)
  - Regression
  - Classification
- Unsupervised learning (data without labels)
  - Clustering
  - Anomaly detection
- Reinforcement learning (an agent interacting with an environment through states and actions, receiving rewards)
In the space domain, these can be used for various tasks, including:
- Time series forecasting of space weather proxies (a regression task)
- Determining if a pair of space objects will, or will not, have a conjunction (a classification task)
- Grouping asteroids into families (a clustering task)
- Autonomous landing on asteroids (a reinforcement learning task)
Examples of applying these techniques to some of these problems are given in the research output.
Deep Learning Fundamentals
The term Deep Learning generally refers to machine learning with deep neural networks, i.e. networks with multiple hidden layers.
Fundamental Concepts
Neural networks consist of:
- Neurons: each input has an associated weight; an aggregation function combines the weighted inputs, and an activation function is applied to the result
- Matrix multiplication is all you need: training learns weights and biases so that the network's output is close to the desired target
- Hidden layers (“going deep” means stacking multiple hidden layers)
- Activation functions: non-linearities applied after the matrix multiplication. The rectified linear unit, or ReLU, has been the most popular over the past decade.
- Loss functions: used to frame the problem to be optimised in deep learning. The most popular are:
  - Cross-entropy loss (classification)
  - Mean squared error (regression)
- Stochastic optimisation: used to train neural networks. Optimisers iteratively take a mini-batch of data (hence “stochastic”) and perform a gradient-descent step on the loss for that batch. The most popular methods, combined in the sketch after this list, are:
  - SGD
  - Adam
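To tie these concepts together, here is a minimal sketch in PyTorch: a small feed-forward network with a ReLU non-linearity, trained for one step on a random mini-batch with cross-entropy loss and the Adam optimiser. All dimensions and data are made up for illustration.

```python
import torch
from torch import nn

# Hypothetical sizes: 20 input features, 10 output classes.
model = nn.Sequential(
    nn.Linear(20, 64),   # matrix multiplication plus bias
    nn.ReLU(),           # non-linearity applied after the matrix multiplication
    nn.Linear(64, 10),   # output layer: one logit per class
)

loss_fn = nn.CrossEntropyLoss()                   # classification loss
optimizer = torch.optim.Adam(model.parameters())  # stochastic optimiser

# One gradient-descent step on a random mini-batch (stand-in for real data).
x = torch.randn(32, 20)            # mini-batch of 32 examples
y = torch.randint(0, 10, (32,))    # integer class labels
loss = loss_fn(model(x), y)
optimizer.zero_grad()
loss.backward()                    # backpropagate gradients through the network
optimizer.step()                   # update weights and biases
```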
Architectures
Depending on the problem, different deep learning architectures are appropriate. The most typical and popular architectures are listed below; a toy CNN sketch follows.
- Feed Forward neural networks
- Convolutional neural networks (CNNs)
- Recurrent neural networks (RNNs)
- Long short-term memory networks (LSTMs)
- Generative Adversarial Networks (GANs)
- Transformers [1]
See an overview of neural network architectures for a more extensive list.
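To make one of these architectures concrete, here is a toy convolutional network in PyTorch for 28×28 grayscale images; all layer sizes are arbitrary choices for illustration, not a recommended design.

```python
import torch
from torch import nn

# A toy CNN for 28x28 grayscale images (e.g. MNIST-sized inputs).
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # learn 16 convolution filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # learn 32 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                    # classifier head: 10 classes
)

print(cnn(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 10])
```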
The Deep Learning Revolution
Why is DL booming?
- Decreasing GPU costs and increasing availability for researchers
- Open deep learning frameworks
- Free learning resources
- Kaggle competitions
Some DL milestones:
- Image classification: AlexNet
- Text generation: GPT-2
- Game playing: AlphaGo
Getting Started
Getting a GPU Deep Learning Server
Free online platforms with GPUs (a quick GPU-availability check follows the list):
- Google Colab
  - Storage on your Google Drive account
  - Lots of pre-installed ML libraries
- Paperspace Gradient
  - Full Jupyter Notebook instance
  - Provides some space to save notebooks and models (5 GB)
  - Only one free notebook can be run at a time
  - All free notebooks are public and cannot be made private
- BlazingSQL Notebooks
  - 26 GB of free HDD storage
  - JupyterLab instance
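Whichever platform you choose, it is worth confirming that the notebook actually sees a GPU. A minimal check using PyTorch (the other frameworks below offer equivalents):

```python
import torch

# Confirm that the notebook instance can see a CUDA GPU.
print(torch.cuda.is_available())           # True if a GPU is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. "Tesla T4" on Colab

# Pick the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```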
Deep Learning Libraries
See the Deep Learning 101 tutorials with PyTorch and fastai.
Deep learning frameworks (a minimal fastai example follows the list):
- TensorFlow
- Keras
- PyTorch
- Fastai
- MXNet
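As a taste of how high-level these libraries can be, here is a minimal image-classification sketch using recent versions of fastai; the Pets dataset and ResNet-34 backbone are just example choices, adapted from fastai's own quick-start material.

```python
from fastai.vision.all import *

def is_cat(fname):
    "In the Pets dataset, cat images have filenames starting with an uppercase letter."
    return fname[0].isupper()

path = untar_data(URLs.PETS) / "images"   # download the Oxford-IIIT Pets images
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224),
)
learn = vision_learner(dls, resnet34, metrics=error_rate)  # pretrained ResNet-34
learn.fine_tune(1)                                         # one epoch of transfer learning
```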
Development Tips
Experiment Tracking
Why should you track your experiments? Any experiment that isn’t tracked is doomed to be repeated…
For individual researchers:
- Coming back to old ideas
- Comparing and visualising runs
- Hyperparameter tuning
For teams:
- Sharing ideas and insights
- Storing experiment metadata
- Easier onboarding of new members
Tools for ML experiment tracking (see a comparison at https://neptune.ai/blog/best-ml-experiment-tracking-tools):
- Neptune
- Weights & Biases
- Comet
- Sacred
- MLFlow
- TensorBoard
Weights & Biases (wandb), sketched below:
- Created for deep learning experiment tracking
- Easy integration with the most popular ML libraries (TensorFlow, Keras, PyTorch, fastai, scikit-learn, …)
- Customisable visualisation and reporting tools
See wandb project for this talk: https://wandb.ai/vrodriguezf/TS-III
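The core wandb workflow is small enough to show in full. A minimal sketch, where the project name, config values, and simulated loss are all placeholders:

```python
import math
import wandb

# Hypothetical project and hyperparameters; substitute your own.
wandb.init(project="walk-with-deep-learning", config={"lr": 1e-3, "epochs": 5})

for epoch in range(wandb.config.epochs):
    fake_loss = math.exp(-epoch)   # stand-in for a real training loss
    wandb.log({"epoch": epoch, "train_loss": fake_loss})  # sent to the dashboard

wandb.finish()  # mark the run as complete
```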
Software development in Jupyter Notebooks
See an example of a nbdev project in: https://github.com/stardust-r/walk-with-deep-learning
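With nbdev you write your library as notebook cells tagged with export directives; running `nbdev_export` then writes them out as a regular Python module. A minimal sketch (nbdev 2 directive syntax; the module and function names are made up):

```python
#| default_exp core
# The directive above sends exported cells of this notebook to <your_lib>/core.py.

#| export
def greet(name: str) -> str:
    "A trivial exported function, only here to illustrate the directive."
    return f"Hello, {name}!"
```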
Free Resources
Free online courses:
- Practical deep learning for coders (fast.ai)
- Stanford CS229: Machine Learning
- Stanford CS231n: Convolutional Neural Networks for Visual Recognition
- Fundamentals of deep learning
Free books:
- Deep Learning, by Ian Goodfellow, Yoshua Bengio and Aaron Courville
- Fastbook: Deep Learning for Coders with fastai & PyTorch: Applications of AI without a PhD
- Approaching (Almost) Any Machine Learning Problem
References
[1]: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.