This tutorial provides a conceptual framework and practical insights for working in machine learning with the Python programming language. It combines theoretical concepts with programming examples that show how to apply the algorithms using Python's scikit-learn library. All the examples apply machine learning to the finance domain.
We give readers conceptual tools to understand the algorithms and the workflow of a machine learning project. Such a project involves many steps, and each of them needs care if the models are to produce good results.
These steps include ensuring the quality and quantity of the dataset, striking the right balance between the bias and variance of the model to avoid overfitting or underfitting, selecting the features used to train the model, and creating new features from the original data.
The tutorial starts by describing the first steps of a machine learning project, such as data preprocessing and exploration of the dataset's features to get a sense of the relationship between them and the variable they are meant to predict.
Then we explain the process of training and testing a model on a dataset for supervised learning problems, as well as the importance of defining and interpreting the loss function when evaluating a model's performance. Afterwards we explain some issues that arise in the training step, such as the bias-variance trade-off and the need for a good balance between the two.
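As a minimal sketch of the train/test workflow described above, the snippet below fits a linear regression on a training split and compares the mean squared error (one common choice of loss function) on the training and test splits. The data is synthetic and generated only for illustration; the coefficients, noise level, split ratio, and variable names are assumptions, not values from the tutorial.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only (not real financial data):
# 200 samples, 3 features, linear target plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([0.5, -0.2, 0.1])  # assumed coefficients
y = X @ true_coef + rng.normal(scale=0.1, size=200)

# Hold out 25% of the samples to estimate performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training split only.
model = LinearRegression().fit(X_train, y_train)

# Evaluate the loss on both splits.
train_mse = mean_squared_error(y_train, model.predict(X_train))
test_mse = mean_squared_error(y_test, model.predict(X_test))

print(f"train MSE: {train_mse:.4f}")
print(f"test MSE:  {test_mse:.4f}")
```

A test error far above the training error would suggest overfitting (high variance), while a large error on both splits would suggest underfitting (high bias).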
Once we have covered the main concepts of the machine learning workflow, the subsequent sections describe popular supervised learning algorithms, as well as useful techniques for detecting and limiting overfitting, such as K-fold cross-validation.
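K-fold cross-validation can be sketched with scikit-learn's `KFold` and `cross_val_score` as follows. This is an illustrative example on synthetic data, assuming five folds and a plain linear regression; the fold count, model, and data are assumptions rather than choices taken from the tutorial.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data for illustration only (not real financial data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.1, size=200)

# Split the data into 5 folds; each fold serves once as the validation set
# while the model is trained on the remaining 4 folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# scikit-learn scorers follow a "higher is better" convention,
# so the MSE is reported with a negative sign.
scores = cross_val_score(
    LinearRegression(), X, y, cv=cv, scoring="neg_mean_squared_error"
)

fold_mse = -scores            # one MSE value per fold
mean_mse = fold_mse.mean()    # average validation error across folds

print("MSE per fold:", np.round(fold_mse, 4))
print(f"mean MSE: {mean_mse:.4f}")
```

Averaging the error over several validation folds gives a steadier estimate of out-of-sample performance than a single train/test split, which makes overfitting easier to spot.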
In addition to the theoretical explanation of the algorithms, the tutorial provides worked examples of supervised learning models, covering multiple linear regression and classification problems. The following section describes unsupervised learning, with a complete example of cluster analysis.
Finally, we explain the basic concepts of neural networks and their structure. This is an advanced technique that is well suited to financial applications, as it is capable of capturing complex relationships in the data and of working with the different distributions of financial time series.