A Walk with Deep Learning
Introduction
What is Machine Learning? According to Arthur Samuel (1962): “Suppose we arrange for some automatic means of testing the effectiveness of any current weight assignment in terms of actual performance and provide a mechanism for altering the weight assignment so as to maximize the performance. We need not go into the details of such a procedure to see that it could be made entirely automatic and to see that a machine so programmed would ‘learn’ from its experience.”
There are several types of machine learning, designed to address different kinds of problems:
- Supervised learning (data with labels)
  - Regression
  - Classification
- Unsupervised learning (data without labels)
  - Clustering
  - Anomaly detection
- Reinforcement learning (an agent interacting with an environment through states and actions, receiving rewards)
In the space domain, these can be used for various tasks, including:
- Time series forecasting of space weather proxies (a regression task)
- Determining if a pair of space objects will, or will not, have a conjunction (a classification task)
- Grouping asteroids into families (a clustering task)
- Autonomous landing on asteroids (a reinforcement learning task)
Examples of applying these techniques to some of these problems are given in the research output.
Deep Learning Fundamentals
The term Deep Learning generally refers to machine learning with deep neural networks, i.e. networks with multiple hidden layers.
Fundamental Concepts
Neural networks consist of:
- Neurons: each input has an associated weight; an aggregation function combines the weighted inputs, and an activation function is applied to the result
- Matrix multiplication is all you need: training learns weights and biases so that the network's output is close to the desired target
- Hidden layers (“going deep” means stacking multiple hidden layers)
- Activation functions: non-linearities applied after the matrix multiplication. The rectified linear unit, or ReLU, has been the most popular over the past decade.
- Loss functions: used to frame the problem to be optimised in deep learning. The most popular are:
  - Cross-entropy loss (classification)
  - Mean squared error (regression)
- Stochastic optimisation: used to train neural networks. Optimisers iteratively take a mini-batch of data (hence “stochastic”) and perform a gradient-descent step on the loss for that batch. The most popular methods, combined in the sketch after this list, are:
  - SGD
  - Adam
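To tie these concepts together, here is a minimal sketch in PyTorch: a small feed-forward network with a ReLU non-linearity, trained for one step on a random mini-batch with cross-entropy loss and the Adam optimiser. All dimensions and data are made up for illustration.

```python
import torch
from torch import nn

# Hypothetical sizes: 20 input features, 10 output classes.
model = nn.Sequential(
    nn.Linear(20, 64),   # matrix multiplication plus bias
    nn.ReLU(),           # non-linearity applied after the matrix multiplication
    nn.Linear(64, 10),   # output layer: one logit per class
)

loss_fn = nn.CrossEntropyLoss()                   # classification loss
optimizer = torch.optim.Adam(model.parameters())  # stochastic optimiser

# One gradient-descent step on a random mini-batch (stand-in for real data).
x = torch.randn(32, 20)            # mini-batch of 32 examples
y = torch.randint(0, 10, (32,))    # integer class labels
loss = loss_fn(model(x), y)
optimizer.zero_grad()
loss.backward()                    # backpropagate gradients through the network
optimizer.step()                   # update weights and biases
```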
Architectures
Depending on the problem, different deep learning architectures are appropriate. The most typical and popular architectures are listed below; a toy CNN sketch follows.
- Feed Forward neural networks
- Convolutional neural networks (CNNs)
- Recurrent neural networks (RNNs)
- Long short-term memory networks (LSTMs)
- Generative Adversarial Networks (GANs)
- Transformers [1]
See an overview of neural network architectures for a more extensive list.
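To make one of these architectures concrete, here is a toy convolutional network in PyTorch for 28×28 grayscale images; all layer sizes are arbitrary choices for illustration, not a recommended design.

```python
import torch
from torch import nn

# A toy CNN for 28x28 grayscale images (e.g. MNIST-sized inputs).
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # learn 16 convolution filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # learn 32 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                    # classifier head: 10 classes
)

print(cnn(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 10])
```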
The Deep Learning Revolution
Why is DL booming?
- Decreasing GPU costs and increasing availability for researchers
- Open deep learning frameworks
- Free learning resources
- Kaggle competitions
Some DL milestones:
- Image classification: AlexNet
- Text generation: GPT-2
- Game playing: AlphaGo
Getting Started
Getting a GPU Deep Learning Server
Free online platforms with GPUs (a quick GPU-availability check follows the list):
- Google Colab
  - Storage on your Google Drive account
  - Lots of pre-installed ML libraries
- Paperspace Gradient
  - Full Jupyter Notebook instance
  - Provides some space to save notebooks and models (5 GB)
  - Only one free notebook can be run at a time
  - All free notebooks are public and cannot be made private
- BlazingSQL Notebooks
  - 26 GB of free HDD storage
  - JupyterLab instance
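Whichever platform you choose, it is worth confirming that the notebook actually sees a GPU. A minimal check using PyTorch (the other frameworks below offer equivalents):

```python
import torch

# Confirm that the notebook instance can see a CUDA GPU.
print(torch.cuda.is_available())           # True if a GPU is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. "Tesla T4" on Colab

# Pick the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```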
Deep Learning Libraries
See the Deep Learning 101 tutorials with PyTorch and fastai.
Deep learning frameworks (a minimal fastai example follows the list):
- TensorFlow
- Keras
- PyTorch
- Fastai
- MXNet
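As a taste of how high-level these libraries can be, here is a minimal image-classification sketch using recent versions of fastai; the Pets dataset and ResNet-34 backbone are just example choices, adapted from fastai's own quick-start material.

```python
from fastai.vision.all import *

def is_cat(fname):
    "In the Pets dataset, cat images have filenames starting with an uppercase letter."
    return fname[0].isupper()

path = untar_data(URLs.PETS) / "images"   # download the Oxford-IIIT Pets images
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224),
)
learn = vision_learner(dls, resnet34, metrics=error_rate)  # pretrained ResNet-34
learn.fine_tune(1)                                         # one epoch of transfer learning
```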
Development Tips
Experiment Tracking
Why should you track your experiments? Any experiment that isn’t tracked is doomed to be repeated…
For individual researchers:
- Coming back to old ideas
- Comparing and visualising runs
- Hyperparameter tuning
For teams:
- Sharing ideas and insights
- Storing experiment metadata
- Easier onboarding of new members
Tools for ML experiment tracking (see a comparison at https://neptune.ai/blog/best-ml-experiment-tracking-tools):
- Neptune
- Weights & Biases
- Comet
- Sacred
- MLFlow
- TensorBoard
Weights & Biases (wandb), sketched below:
- Created for deep learning experiment tracking
- Easy integration with the most popular ML libraries (TensorFlow, Keras, PyTorch, fastai, scikit-learn, …)
- Customisable visualisation and reporting tools
See wandb project for this talk: https://wandb.ai/vrodriguezf/TS-III
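The core wandb workflow is small enough to show in full. A minimal sketch, where the project name, config values, and simulated loss are all placeholders:

```python
import math
import wandb

# Hypothetical project and hyperparameters; substitute your own.
wandb.init(project="walk-with-deep-learning", config={"lr": 1e-3, "epochs": 5})

for epoch in range(wandb.config.epochs):
    fake_loss = math.exp(-epoch)   # stand-in for a real training loss
    wandb.log({"epoch": epoch, "train_loss": fake_loss})  # sent to the dashboard

wandb.finish()  # mark the run as complete
```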
Software development in Jupyter Notebooks
See an example of a nbdev project in: https://github.com/stardust-r/walk-with-deep-learning
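With nbdev you write your library as notebook cells tagged with export directives; running `nbdev_export` then writes them out as a regular Python module. A minimal sketch (nbdev 2 directive syntax; the module and function names are made up):

```python
#| default_exp core
# The directive above sends exported cells of this notebook to <your_lib>/core.py.

#| export
def greet(name: str) -> str:
    "A trivial exported function, only here to illustrate the directive."
    return f"Hello, {name}!"
```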
Free Resources
Free online courses:
- Practical deep learning for coders (fast.ai)
- Stanford CS229: Machine Learning
- Stanford CS231n: Convolutional Neural Networks for Visual Recognition
- Fundamentals of deep learning
Free books:
- Deep Learning, by Ian Goodfellow, Yoshua Bengio and Aaron Courville
- Fastbook: Deep Learning for Coders with fastai & PyTorch: Applications of AI without a PhD
- Approaching (Almost) Any Machine Learning Problem
References
[1]: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.