Dual-Model Architecture

Our system ensembles two Random Forest models operating at distinct temporal resolutions: one uses intraday features calculated from hourly bars, and the other uses daily features calculated across days.

The intraday model processes hourly data, building metrics that quantify aggressive trading behavior, volume-price divergences, and cross-venue arbitrage signals.
The daily model operates on lower-frequency data, identifying persistent trends and structural shifts in market dynamics. Its feature set encompasses multi-period momentum indicators, volatility regime metrics, and cross-asset correlation structures.
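As an illustration of what multi-period momentum and volatility features can look like, here is a minimal sketch in plain Python. The function names, lookback lengths, and price series are hypothetical and chosen only for this example; they are not the features used in our models.

```python
import math
from statistics import pstdev


def momentum(prices, lookback):
    """Simple return over the last `lookback` bars; None until enough history exists."""
    out = []
    for t in range(len(prices)):
        if t < lookback:
            out.append(None)
        else:
            out.append(prices[t] / prices[t - lookback] - 1.0)
    return out


def realized_vol(prices, window):
    """Standard deviation of one-bar log returns over a trailing window ending at bar t."""
    # log_rets[k] is the log return from bar k to bar k + 1.
    log_rets = [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]
    out = [None] * len(prices)
    for t in range(window, len(prices)):
        # The `window` returns that end at bar t are log_rets[t - window : t].
        out[t] = pstdev(log_rets[t - window:t])
    return out


# Placeholder daily closes, not real market data.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0]

features = {
    "mom_1": momentum(prices, 1),      # one-day momentum
    "mom_3": momentum(prices, 3),      # three-day momentum
    "vol_3": realized_vol(prices, 3),  # three-day trailing volatility
}
```

Each feature uses only prices up to and including bar t, so it can be paired with a return realized after t without look-ahead.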

At the time of writing, we combine more than 20 variables across the intraday and daily frequencies. We will continue to update the list of predictors that enter the model.

The prediction target is the one-period-ahead simple return of asset i:

r_{i,t+1} = \frac{P_{i,t+1}}{P_{i,t}} - 1

where P_{i,t} is the price of asset i at time t. The Random Forest outputs continuous return predictions:

\hat{r}_{i,t+1} = f(\mathbf{X}_{i,t})

where X_{i,t} represents the feature vector for asset i at time t.
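The target construction and the regression step can be sketched as follows, assuming NumPy and scikit-learn are available. The price path, the three feature columns, and the hyperparameter values are placeholders invented for this example and do not reflect our production settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic price path for a single asset (placeholder data, not real prices).
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=301)))

# Target: the simple return from t to t + 1, stored at index t so that it
# lines up with the features observed at time t.
returns_next = prices[1:] / prices[:-1] - 1.0  # length 300

# Hypothetical feature matrix: three random columns standing in for real features.
X = rng.normal(size=(300, 3))

# Chronological split: fit on the first 250 rows, predict the remaining 50.
split = 250
model = RandomForestRegressor(
    n_estimators=100,     # illustrative values only
    max_depth=3,
    min_samples_leaf=20,
    random_state=0,
)
model.fit(X[:split], returns_next[:split])

# Continuous return predictions for the held-out period.
predicted = model.predict(X[split:])
```

The split is chronological rather than random so that the model is never fit on observations that come after the ones it predicts.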

Feature preprocessing employs rank transformation to mitigate the impact of extreme observations common in cryptocurrency data. Feature engineering emphasizes normalized measures that remain stationary across varying market conditions. The models are retrained frequently, so they adapt to evolving market conditions while retaining sufficient historical context for robust pattern recognition. We set the models' hyperparameters conservatively.
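To show why a rank transformation dampens extreme observations, here is a minimal sketch in plain Python. The scaling to the interval [-0.5, 0.5], the average-rank treatment of ties, and the function name `rank_transform` are assumptions made for this example; the text above does not specify whether ranks are taken across assets or through time.

```python
def rank_transform(values):
    """Replace each value by its average rank, scaled to the interval [-0.5, 0.5]."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    # Indices of the values in ascending order.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Extend `end` over a run of tied values so they share one average rank.
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    return [r / (n - 1) - 0.5 for r in ranks]


# One extreme observation (5.0) among otherwise small values.
raw = [0.01, -0.02, 5.0, 0.03, 0.01]
scaled = rank_transform(raw)
print(scaled)  # [-0.125, -0.5, 0.5, 0.25, -0.125]
```

The outlier 5.0 maps to 0.5, the same value any largest observation would receive, so its magnitude no longer dominates the feature.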

The two models are trained separately and then combined through a risk parity framework, which allocates weights so that the intraday and daily models contribute equally to portfolio risk. This prevents the more volatile predictions from dominating.
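For two components, equal risk contribution reduces to weighting each one by the inverse of its volatility, whatever the correlation between them. The sketch below illustrates that special case in plain Python. Measuring risk as the sample standard deviation of each model's output is an assumption of this example, and the function name and numbers are hypothetical.

```python
from statistics import stdev


def risk_parity_weights(intraday_series, daily_series):
    """Inverse-volatility weights for two components; the weights sum to 1."""
    inv_intraday = 1.0 / stdev(intraday_series)
    inv_daily = 1.0 / stdev(daily_series)
    total = inv_intraday + inv_daily
    return inv_intraday / total, inv_daily / total


# Placeholder outputs: the intraday series is four times as volatile as the daily one.
intraday = [0.04, -0.02, 0.03, -0.05, 0.01]
daily = [0.01, -0.005, 0.0075, -0.0125, 0.0025]

w_intraday, w_daily = risk_parity_weights(intraday, daily)

# Combined signal: a weighted sum of the two models' outputs.
combined = [w_intraday * a + w_daily * b for a, b in zip(intraday, daily)]
```

Here the intraday model receives a weight of about 0.2 and the daily model about 0.8, so that weight times volatility is the same for both.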

