This tutorial provides a conceptual framework and practical insights for working in machine learning with the Python programming language. It combines theoretical concepts with programming examples that show how to use machine learning algorithms through Python's scikit-learn library. All the examples are related to the application of machine learning […]

# Machine Learning in Finance Using Python

## What is Machine Learning?

Machine learning is the field that combines statistical analysis and computer science to build algorithms that learn to perform tasks such as predicting or classifying a target variable, or grouping data. These algorithms learn from data and are highly diverse, ranging from traditional statistical models based on inference to […]

## Data Preprocessing in Data Science and Machine Learning

Data preprocessing is where data scientists spend most of their time. It involves selecting the appropriate features, then cleaning and preparing them to serve as the inputs (independent variables) of a machine learning model. Model performance depends heavily on how the features are selected and cleaned. Below we describe common tasks […]

## Feature Selection in Machine Learning

Feature selection is one of the core concepts in machine learning and has a high impact on model performance, because irrelevant or partially irrelevant features can degrade it. In this process, the features that contribute most to the predicted variable are selected. In order to get an idea about which […]
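
As one illustration of ranking features by their contribution, the sketch below scores each feature by the absolute Pearson correlation with the target. This is an assumed, simple filter method and not necessarily the one this tutorial goes on to use; the feature names and values are hypothetical, and correlation only captures linear relationships.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (spread_x * spread_y)

# Hypothetical feature columns and target (illustrative values only)
features = {
    "momentum": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noise": [2.0, -1.0, 2.0, -1.0, 2.0],
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1]

# Score each feature, then rank from most to least correlated with the target
scores = {name: abs(pearson(col, target)) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Features at the bottom of the ranking are candidates for removal, although a low linear correlation alone does not prove a feature is useless.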

## Train-Test Datasets in Machine Learning

Once we have cleaned the data and selected the features for building the model, the next step is to split the data into two separate sets: a training dataset and a testing dataset. The model will be built using the training set and then […]
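
The split described above can be sketched without any dependencies as follows. The helper name `train_test_split` mirrors the scikit-learn function of the same name, but this is a simplified stand-in written for illustration; the 80/20 ratio, the seed, and the toy data are assumptions.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data: 10 observations of (feature, target)
data = [(x, 2 * x + 1) for x in range(10)]
train, test = train_test_split(data, test_fraction=0.2)
print(len(train), len(test))
```

For financial time series, a chronological split (training on earlier observations and testing on later ones) is usually preferable to shuffling, since shuffling lets information from the future leak into the training set.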

## Evaluate Model Performance – Loss Function

The predictions that the model returns are compared with the actual observations of the target variable to obtain a measure of model performance. To evaluate performance, both classification and regression problems require a loss function that computes the model's error. This loss function is […]
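
To make the idea concrete, here is a minimal sketch of two common loss measures: mean squared error for regression and the misclassification rate for classification. These two are assumed as representative choices; the sample values and labels are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observations and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def misclassification_rate(y_true, y_pred):
    """Fraction of observations whose predicted label is wrong."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

# Regression: squared errors are 1, 0 and 4, so the MSE is 5/3
mse = mean_squared_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])

# Classification: 2 of the 4 labels are wrong, so the error rate is 0.5
err = misclassification_rate(["up", "down", "up", "up"],
                             ["up", "up", "up", "down"])
print(mse, err)
```

In both cases a lower value means the predictions are closer to the observed data.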

## Model Selection in Machine Learning

Model selection refers to choosing the best statistical machine learning model for a particular problem. This task requires comparing the relative performance of candidate models. The loss function, and the metric that represents it, are therefore fundamental for selecting a suitable model that does not overfit. We can state a supervised machine learning problem with […]

## Bias Variance Trade Off

The key property of a machine learning model is its capacity to predict or categorize new, unseen data (data that was not used in training the model). For this reason, the important measure is the mean squared error (MSE) computed on test data, known as the test MSE. The goal is to choose a model where the […]
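
The difference between training MSE and test MSE can be illustrated with a small experiment. The sketch below uses k-nearest-neighbours regression on synthetic data as an assumed example model: `k=1` is very flexible (low bias, high variance), while `k` equal to the training set size always predicts the training mean (high bias, low variance). The data-generating function, noise level, and sample sizes are all hypothetical.

```python
import math
import random

def mse(y_true, y_pred):
    """Mean squared error between observations and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def knn_predict(x_train, y_train, x_new, k):
    """Average the targets of the k training points closest to x_new."""
    nearest = sorted(range(len(x_train)),
                     key=lambda i: abs(x_train[i] - x_new))[:k]
    return sum(y_train[i] for i in nearest) / k

rng = random.Random(0)

def sample(n):
    """Draw n noisy observations of sin(x) on [0, 6] (synthetic data)."""
    xs = [rng.uniform(0, 6) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

x_train, y_train = sample(60)
x_test, y_test = sample(60)

results = {}
for k in (1, 5, 60):
    train_pred = [knn_predict(x_train, y_train, x, k) for x in x_train]
    test_pred = [knn_predict(x_train, y_train, x, k) for x in x_test]
    results[k] = (mse(y_train, train_pred), mse(y_test, test_pred))
    print(f"k={k:2d}  train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

With `k=1` the training MSE is zero because every training point is its own nearest neighbour, yet the test MSE is not, which is why training error alone is a misleading guide to how a model will perform on unseen data.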

## Supervised Learning Models

As we pointed out earlier, both classification and regression models belong to the field of supervised learning. These models have a set of features (independent variables) and a target variable, which is the variable the model aims to predict. This target variable is called the labelled data and is the […]

## Multiple Linear Regression

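The multiple linear regression algorithm states that a response $y$ can be estimated from a set of input features $x$ and an error term $\varepsilon$. The model can be expressed with the following mathematical equation:

$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon = \beta^T X + \varepsilon$$

$\beta^T X$ is the matrix notation of the equation, where $\beta^T, X \in \mathbb{R}^{p+1}$ and $\varepsilon \sim N(\mu, \sigma^2)$. $\beta^T$ (transpose of $\beta$) […]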
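
As a dependency-free sketch of how the coefficients of such a linear model can be estimated, the code below solves the least-squares normal equations $(X^T X)\beta = X^T y$ with Gaussian elimination. The function names and the toy data are hypothetical; in practice one would more likely use scikit-learn's `LinearRegression`, which is numerically more robust.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def fit_linear_regression(features, targets):
    """Return [beta_0, beta_1, ..., beta_p] minimising the squared error."""
    X = [[1.0] + list(row) for row in features]  # leading 1 for the intercept
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(X, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def predict(beta, row):
    """Evaluate beta_0 + beta_1 * x_1 + ... + beta_p * x_p for one observation."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

# Hypothetical noise-free data generated from y = 1 + 2*x1 - 3*x2
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
targets = [1 + 2 * x1 - 3 * x2 for x1, x2 in features]

beta = fit_linear_regression(features, targets)
print([round(b, 6) for b in beta])
```

Because the toy data contains no noise, the fitted coefficients recover the generating values up to floating-point error; with real data the estimates would only approximate the underlying relationship.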