# Dual-Model Architecture

Our system ensembles two Random Forest models that operate on distinct temporal resolutions: one trained on **intraday features calculated from hourly bars** and one trained on **daily features calculated across days**.

* The intraday model processes hourly data, developing metrics that quantify aggressive trading behavior, volume-price divergences, and cross-venue arbitrage signals.
* The daily model operates on lower-frequency data, identifying persistent trends and structural shifts in market dynamics. Its feature set encompasses multi-period momentum indicators, volatility regime metrics, and cross-asset correlation structures.

At the time of writing, the models combine more than 20 variables across the **intraday** and **daily** frequencies. We will continue to update the list of predictors that enter our model.

The prediction target is the one-period simple return of asset *i*:

$$
r_{i,t+1} = \frac{P_{i,t+1}}{P_{i,t}} - 1
$$

with the Random Forest outputting continuous return predictions:

$$
\hat{r}_{t+1} = f(\mathbf{X}_t)
$$

where ***X***<sub>*t*</sub> represents the feature vector at time *t*.
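The return target defined above can be computed directly from a price series. The following is a minimal sketch in plain Python; the function name `simple_returns` and the sample prices are illustrative assumptions and are not taken from our production code.

```python
def simple_returns(prices):
    """Return the one-period simple returns P[t+1] / P[t] - 1 for a price list."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    # Pair each price with its successor and apply the return formula.
    return [nxt / cur - 1.0 for cur, nxt in zip(prices, prices[1:])]


# Hypothetical prices for a single asset over four periods.
prices = [100.0, 110.0, 99.0, 99.0]
returns = simple_returns(prices)
```

A series of `n` prices yields `n - 1` returns, so the last feature vector in a training window has no target until the next bar closes.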

Feature preprocessing uses rank transformation and winsorization[^1] to mitigate the impact of extreme observations, which are common in cryptocurrency data. Feature engineering emphasizes normalized measures that remain stationary across varying market conditions. The models are retrained frequently, so they adapt to evolving market conditions while retaining enough historical context for robust pattern recognition. We set the model hyperparameters conservatively.
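To illustrate the two preprocessing steps named above, here is a hedged sketch in plain Python. The percentile cutoffs (10% and 90%), the scaling of ranks into [0, 1], the function names, and the sample data are all assumptions chosen for illustration; the actual cutoffs and the order in which the steps are applied are not specified here.

```python
def winsorize(values, lower_pct=0.10, upper_pct=0.90):
    """Clip values to the empirical [lower_pct, upper_pct] quantile range."""
    ordered = sorted(values)
    n = len(ordered)
    # Nearest-index quantiles: a simple choice for a sketch.
    lo = ordered[round(lower_pct * (n - 1))]
    hi = ordered[round(upper_pct * (n - 1))]
    return [min(max(v, lo), hi) for v in values]


def rank_transform(values):
    """Map values to average ranks scaled into the interval [0, 1]."""
    n = len(values)
    if n < 2:
        return [0.5] * n
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0  # average zero-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


# Hypothetical raw feature with two extreme observations (5.0 and -4.0).
raw = [0.01, -0.02, 0.03, 5.0, 0.00, -0.01, 0.02, -4.0, 0.015, -0.005, 0.005]
clipped = winsorize(raw)
ranked = rank_transform(raw)
```

Winsorization caps the two outliers at the chosen quantiles, while the rank transform discards magnitudes altogether and keeps only the ordering, which makes the feature insensitive to how extreme an outlier is.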

**After both models are trained separately, they are combined through a risk parity framework that allocates weights such that the intraday and daily models contribute equally to portfolio risk, preventing dominance by the more volatile predictions.**
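As a rough illustration of the combination step, note that with exactly two components, equal risk contribution reduces to weighting each component in proportion to the inverse of its volatility, whatever the correlation between them. The sketch below assumes that risk is measured as the population standard deviation of each model's output stream; the function names, the volatility estimator, and the sample numbers are hypothetical and not a description of our production weighting.

```python
import statistics


def risk_parity_weights(intraday_stream, daily_stream):
    """Return (w_intraday, w_daily) so that w * volatility is equal for both."""
    vol_intraday = statistics.pstdev(intraday_stream)
    vol_daily = statistics.pstdev(daily_stream)
    if vol_intraday == 0 or vol_daily == 0:
        raise ValueError("each stream needs non-zero volatility")
    inv_intraday = 1.0 / vol_intraday
    inv_daily = 1.0 / vol_daily
    total = inv_intraday + inv_daily
    return inv_intraday / total, inv_daily / total


def combine(intraday_stream, daily_stream):
    """Blend the two streams element by element using risk parity weights."""
    w_i, w_d = risk_parity_weights(intraday_stream, daily_stream)
    return [w_i * a + w_d * b for a, b in zip(intraday_stream, daily_stream)]


# Hypothetical streams: the intraday one is twice as volatile as the daily one.
intraday = [0.02, -0.02, 0.02, -0.02]
daily = [0.01, -0.01, 0.01, -0.01]
w_intraday, w_daily = risk_parity_weights(intraday, daily)
blended = combine(intraday, daily)
```

In this example the more volatile intraday stream receives roughly one third of the weight and the daily stream roughly two thirds, so each contributes the same weighted volatility to the blend.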

[^1]: <https://en.wikipedia.org/wiki/Winsorizing>
