This article describes how to train Random Forest (RF) and Gradient Boosted Tree (GBT) models using the PySpark API in a Databricks notebook.
The data comes from the Kaggle Credit Card Fraud data set.
With over 280K rows, it is large enough to give a reasonable estimate of how the models perform.