Machine Learning

Definitions

  • LSTM (long short-term memory) models are a type of recurrent neural network (RNN) well-suited to time-series forecasting. Their effectiveness stems from their ability to capture long-term dependencies in data while mitigating the vanishing-gradient problem associated with traditional RNNs.

  • GBDTs (gradient boosted decision trees) are an ensemble learning method that combines multiple decision trees into a single, more powerful predictive model. Each new tree learns from the mistakes of the trees before it, iteratively improving accuracy; this makes GBDTs highly effective in financial forecasting applications.

Concrete

  • The two primary types of machine learning models Concrete employs are long short-term memory (LSTM) models and gradient boosted decision trees (GBDTs).

  • While incredibly powerful tools for financial applications, machine learning models are susceptible to overfitting and false positives. As such, they are not the primary basis of Concrete’s risk engine(s) but are one tool in Concrete’s toolkit.

Models

Long Short-Term Memory (LSTM) Models

  • [Figure: basic LSTM diagram]

  • LSTM models consist of a network of interconnected memory cells, each with an input gate, a forget gate, and an output gate. These gates regulate the flow of information within the model, allowing it to retain or discard information over time. The high-level functioning of LSTM models can be summarized as follows (a code sketch of one cell step appears after this list):

    • Input and forget gates: The input gate determines which information from the current input should be stored in the memory cell. The forget gate decides which information from the previous memory cell state should be discarded.

    • Memory cells: The memory cell stores and updates information over time. It incorporates the input gate and forget gate to determine the relevant information to be stored.

    • Output gates: The output gate decides what information from the memory cell should be exposed as the output of the LSTM model. It filters the information based on the current input and the memory cell state.
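
To make the gate mechanics concrete, below is a minimal sketch of one LSTM cell step in NumPy. The stacked weight layout (`W`, `U`, `b`) and the gate ordering are illustrative conventions for a standard LSTM, not details of Concrete’s production models.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # x      : current input, shape (n_in,)
    # h_prev : previous hidden state, shape (n_hid,)
    # c_prev : previous memory cell state, shape (n_hid,)
    # W, U, b: stacked gate weights/biases, shapes (4*n_hid, n_in),
    #          (4*n_hid, n_hid), and (4*n_hid,)
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b     # all four gate pre-activations at once
    i = sigmoid(z[0*n:1*n])        # input gate: what to write to the cell
    f = sigmoid(z[1*n:2*n])        # forget gate: what to discard from the cell
    g = np.tanh(z[2*n:3*n])        # candidate values for the cell update
    o = sigmoid(z[3*n:4*n])        # output gate: what to expose as output
    c = f * c_prev + i * g         # memory cell: retain old, write new
    h = o * np.tanh(c)             # filtered output / new hidden state
    return h, c
```

Running this step over a sequence of inputs, carrying `h` and `c` forward at each step, is what allows the cell to preserve information across long horizons.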

Gradient Boosted Decision Trees (GBDTs)

  • [Figure: basic GBDT diagram]

  • GBDT models consist of an ensemble of decision trees, where each tree makes predictions based on a subset of features. The trees are constructed recursively by splitting the data based on feature values to minimize the prediction error. This is where the “decision tree” part of their name comes from.

  • The predictions from multiple decision trees are combined to generate the final prediction of the GBDT model. Typically, the model uses a weighted sum of the predictions from each tree, where the weights are determined during the training process.

  • GBDTs iteratively train decision trees to correct the errors made by previous trees. In each iteration, the model calculates the gradient of the loss function with respect to the predictions made by the previous trees. It then fits a new tree to the residuals, seeking to minimize the loss function. This is where the “gradient boosted” part of their name comes from.
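
As an illustration of that loop, here is a minimal sketch of gradient boosting for regression under squared loss, where the negative gradient is simply the residual. The tree depth, learning rate, and use of scikit-learn’s `DecisionTreeRegressor` are illustrative assumptions, not Concrete’s implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbdt(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    base = float(np.mean(y))              # start from a constant prediction
    pred = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        residuals = y - pred              # negative gradient of squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)            # new tree corrects current errors
        pred = pred + learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees

def predict_gbdt(base, trees, X, learning_rate=0.1):
    # The final prediction is a weighted sum of every tree's output,
    # with the learning rate acting as a fixed per-tree weight.
    pred = np.full(X.shape[0], base)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred
```

Note how `predict_gbdt` realizes the weighted-sum combination described above, with the learning rate serving as the per-tree weight.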

  • GBDTs can handle both numerical and categorical features, capture nonlinear relationships, and automatically handle missing data. They are also robust to outliers and can effectively handle high-dimensional data, making them suitable for analyzing large financial datasets.
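
As a brief usage sketch of the missing-data point, scikit-learn’s `HistGradientBoostingRegressor` is one GBDT implementation that accepts `NaN` inputs directly, routing missing values at each split. The synthetic data here is purely illustrative.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
X[rng.random(X.shape) < 0.1] = np.nan    # inject ~10% missing values
y = np.nan_to_num(X[:, 0]) ** 2 + rng.normal(scale=0.1, size=1000)

# Fits directly on data containing NaNs; no imputation step is required.
model = HistGradientBoostingRegressor(max_iter=200)
model.fit(X, y)
print(model.predict(X[:5]))
```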
