Credit Scoring Using Machine Learning Algorithms and Blockchain Technology

TLDR

  • The paper proposes a gradient-boosted model to evaluate the creditworthiness of crypto borrowers. No code is provided, but the reported results suggest that gradient-boosted approaches are both simple to implement and highly effective.

Key learnings

  • Gradient-boosted models are a well-established choice for credit scoring and other financial applications, and they are easy to use out of the box.

  • Efficacy is best measured with three metrics together: accuracy, precision, and recall.
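As a minimal sketch of the three metrics, assuming binary labels where 1 marks the positive class (for example a hypothetical "bad borrower" flag), the function below computes them from scratch. The function name and the toy labels are illustrative assumptions, not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against empty inputs and empty denominators.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels for illustration only (not data from the paper).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Reporting all three matters most when classes are imbalanced: if positive cases are rare, a model that never predicts them can still score a high accuracy while its recall is zero.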

Details

  • It is unclear what the authors mean by “creditworthiness”: no example score or relative scale is given, and no code is provided.

  • Three of the five approaches tested (XGBoost, random forest, and LightGBM) are ensemble learning methods, which have proven highly effective in banking and credit scoring scenarios; k-nearest neighbors and logistic regression are not ensembles.

  • Five models are tested:

    • XGBoost

    • Random forest classification

    • LightGBM

    • K-nearest neighbors

    • Logistic regression

  • All data was sourced from AAVE borrower histories taken directly from the chain. The data appears to be incomplete or incorrectly categorized.

  • Events were classified via machine learning into five types (in order of frequency):

    • Unknown

    • Deposit

    • Borrow

    • Repay

    • Liquidation
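A frequency ranking like the one above can be checked with a simple tally. The sketch below uses made-up event labels purely for illustration (the counts are not from the paper) and also computes the share of events that ended up as "Unknown".

```python
from collections import Counter

# Hypothetical event labels, for illustration only; not data from the paper.
events = (
    ["Unknown"] * 5
    + ["Deposit"] * 4
    + ["Borrow"] * 3
    + ["Repay"] * 2
    + ["Liquidation"] * 1
)

counts = Counter(events)
ranked = counts.most_common()          # (label, count) pairs, most frequent first
unknown_share = counts["Unknown"] / sum(counts.values())
```

A large `unknown_share` is a warning sign that the labels feeding the downstream model carry little information.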

  • These events were classified in order to train the ML model. The model of interest was XGBoost, trained on AAVE data with little to no data transformation.
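Since the paper provides no code, the following is only a rough sketch of the idea behind gradient boosting for binary classification: each round fits a small tree (here a one-split "stump") to the residuals of the current prediction and adds it to the model. It is not the authors' implementation and it is much simpler than XGBoost, which adds second-order information and regularization. The feature names (a collateral ratio and a count of past liquidations), the toy data, and all function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the single-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = sum((r - left_mean) ** 2 for r in left)
            error += sum((r - right_mean) ** 2 for r in right)
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return None if best is None else best[1:]


def stump_value(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_gradient_boosting(X, y, n_rounds=20, learning_rate=0.5):
    """Boost stumps on the log-loss residuals (label minus predicted probability)."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [label - sigmoid(s) for label, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no valid split exists
            break
        stumps.append(stump)
        scores = [s + learning_rate * stump_value(stump, row)
                  for row, s in zip(X, scores)]
    return {"learning_rate": learning_rate, "stumps": stumps}


def predict_proba(model, row):
    score = sum(model["learning_rate"] * stump_value(s, row)
                for s in model["stumps"])
    return sigmoid(score)


# Toy data, illustration only: [collateral_ratio, past_liquidations] -> 1 = bad borrower
X = [[2.5, 0], [2.0, 0], [1.8, 0], [1.2, 2], [1.1, 1], [1.05, 3]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gradient_boosting(X, y)
probabilities = [predict_proba(model, row) for row in X]
```

On this separable toy set the predicted probabilities fall below 0.5 for the first three rows and above 0.5 for the last three. In practice one would use a maintained library rather than a hand-rolled loop like this.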

  • Performance was evaluated in two ways:

    • Cross validation with 5 folds

    • Ensemble modeling on top of base results
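The two evaluation steps above can be sketched as follows. The paper does not say how folds were drawn or how base results were combined, so the shuffled split and the majority vote here are assumptions, and the function names are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test_fold


def majority_vote(predictions_per_model):
    """Combine per-model label lists by taking the most common label per sample."""
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Each sample appears in exactly one test fold across the 5 splits.
splits = list(k_fold_indices(10, k=5))

# Three hypothetical base models voting on three samples.
ensemble_labels = majority_vote([
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
])
```

The reported mean accuracy would then be the average of the per-fold accuracies, each computed on a fold the model did not see during training.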

  • XGBoost proved the most effective, with a mean accuracy of 88.4%.

Concerns/challenges/comments

  • No code is provided, and the reputation of the publishing institution calls into question the validity of the presented results.

  • Classification of AAVE events was subpar - the largest classification group was unknown events.
