About What We Actually Do
We've been teaching machine learning for streaming financial data since 2014. The work is technical and specific, and it requires understanding both statistics and market microstructure.
How This Started
Most educational programs treat financial data as if it were static. You download historical prices, run some regressions, maybe build a classifier. But that's not how markets work when you're actually trading or building real systems.
Streaming data behaves differently. Tick feeds arrive out of order. Timestamps get messy across exchanges. Your models need to update continuously without retraining from scratch every time. The statistical properties change throughout the trading day.
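To make the out-of-order problem concrete, here is a minimal sketch of a reorder buffer that holds ticks briefly and releases them in timestamp order. The names `Tick`, `ReorderBuffer`, and `max_delay_ns` are hypothetical and chosen for illustration; the sketch assumes every tick carries a comparable exchange timestamp and that lateness is bounded by a fixed delay.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Tick:
    ts_ns: int                           # exchange timestamp, nanoseconds (assumed field)
    seq: int                             # arrival counter, breaks timestamp ties
    price: float = field(compare=False)  # payload, ignored when ordering


class ReorderBuffer:
    """Release ticks in timestamp order once they are older than a fixed delay."""

    def __init__(self, max_delay_ns):
        self.max_delay_ns = max_delay_ns
        self._heap = []
        self._watermark = 0  # highest timestamp seen so far

    def push(self, tick):
        """Add a tick and return the ticks that are now safe to release."""
        heapq.heappush(self._heap, tick)
        self._watermark = max(self._watermark, tick.ts_ns)
        cutoff = self._watermark - self.max_delay_ns
        released = []
        while self._heap and self._heap[0].ts_ns <= cutoff:
            released.append(heapq.heappop(self._heap))
        return released

    def flush(self):
        """Drain whatever is left, still in timestamp order."""
        return [heapq.heappop(self._heap) for _ in range(len(self._heap))]


# Ticks arrive with the 200 ns tick after the 300 ns tick.
arrivals = [Tick(100, 0, 10.0), Tick(300, 1, 10.2), Tick(200, 2, 10.1), Tick(500, 3, 10.3)]
buf = ReorderBuffer(max_delay_ns=150)
ordered = []
for t in arrivals:
    ordered.extend(buf.push(t))
ordered.extend(buf.flush())
```

The trade-off is latency against correctness: a larger delay tolerates later arrivals but holds every tick longer. A tick that arrives later than the delay allows would still be emitted out of order, so a real pipeline would need a policy for those cases.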
We started because there wasn't good training for this. People came out of traditional ML courses knowing gradient descent and neural architectures, but with no idea how to handle sequence data that updates every millisecond or how to deal with the non-stationarity that makes financial streams unique.
- Order book reconstruction from L2/L3 feeds
- Feature engineering for microsecond-latency decisions
- Online learning algorithms that adapt to regime changes
- Handling market data anomalies and exchange outages
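As a small illustration of the first topic in the list above, the sketch below rebuilds a price-level (L2) book from incremental updates. It assumes a simplified, hypothetical message format of `(side, price, size)` where a size of zero deletes the level, and integer prices expressed in ticks. Real exchange feeds differ in format and need snapshot and sequence-gap handling that is left out here.

```python
class L2Book:
    """Price-level order book rebuilt from incremental (side, price, size) updates."""

    def __init__(self):
        self.bids = {}  # price in ticks -> aggregate size
        self.asks = {}

    def apply(self, side, price, size):
        """Set the size at a price level; a size of zero removes the level."""
        if side == "bid":
            levels = self.bids
        elif side == "ask":
            levels = self.asks
        else:
            raise ValueError(f"unknown side: {side!r}")
        if size == 0:
            levels.pop(price, None)
        else:
            levels[price] = size

    def best_bid(self):
        return max(self.bids) if self.bids else None

    def best_ask(self):
        return min(self.asks) if self.asks else None

    def mid(self):
        """Midpoint of best bid and ask, or None if either side is empty."""
        bid, ask = self.best_bid(), self.best_ask()
        if bid is None or ask is None:
            return None
        return (bid + ask) / 2


book = L2Book()
updates = [
    ("bid", 10000, 5),
    ("bid", 9990, 3),
    ("ask", 10010, 4),
    ("ask", 10020, 2),
    ("bid", 10000, 0),  # the best bid level is removed
]
for side, price, size in updates:
    book.apply(side, price, size)
```

Plain dictionaries keep the sketch short, at the cost of scanning all levels to find the best price. A latency-sensitive implementation would more likely use a sorted structure or a fixed price array.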
Two Ways to Learn
Some people learn better in groups where they can discuss problems with others working on similar challenges. Others need focused individual attention to work through specific technical obstacles in their own projects.
Structured Cohorts
Groups of 8-12 participants work through the same curriculum. You get exposure to how other people approach problems, which is valuable when debugging complex pipelines or understanding different modeling strategies.
Sessions run live at scheduled times. There's homework between sessions, and you submit code that gets reviewed. The pace is fixed, so you need to keep up, but that structure helps many people actually finish instead of drifting off.
Personalized Pacing
One-on-one sessions let you work on your specific problems. Maybe you're trying to implement a particular research paper, or you have proprietary data with unusual characteristics, or you're stuck on optimizing inference latency.
You schedule sessions when they fit your timezone and availability. The content adapts to what you already know and what you're trying to build. This costs more but makes sense when you have specific technical goals that don't match a standard curriculum.
Typical Learning Progression
Data Infrastructure
Setting up ingestion pipelines, handling tick data formats, understanding exchange protocols and timestamp alignment
Feature Development
Building stateful features from order books, calculating microstructure signals, dealing with sparse updates and irregular sampling
Model Implementation
Online learning algorithms, incremental updates, managing concept drift, backtesting with realistic latency constraints
Production Considerations
Performance optimization, error handling in live systems, monitoring model behavior, managing state across restarts
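To give a flavor of the incremental updates mentioned under Model Implementation, here is a minimal sketch of a linear model trained one observation at a time with stochastic gradient descent on squared error. The class name `OnlineLinearModel` and the synthetic data are made up for illustration. Keeping the learning rate constant is one simple way to let the model keep adapting after a regime change; it is not a drift detector.

```python
class OnlineLinearModel:
    """Linear regression updated per observation with a constant learning rate."""

    def __init__(self, n_features, lr=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return self.b + sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, y):
        """Take one gradient step on squared error; return the pre-update error."""
        err = self.predict(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err
        return err


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
model = OnlineLinearModel(n_features=1, lr=0.1)

# First regime: y = 2x + 1 (synthetic, noiseless).
for i in range(2000):
    x = xs[i % len(xs)]
    model.update([x], 2.0 * x + 1.0)
first_regime = (model.w[0], model.b)

# Regime change: y = -x + 3. No retraining from scratch, just more updates.
for i in range(2000):
    x = xs[i % len(xs)]
    model.update([x], -1.0 * x + 3.0)
second_regime = (model.w[0], model.b)
```

On this toy data the parameters settle near (2, 1) and then move to roughly (-1, 3) after the switch. With noisy market data a constant step size trades adaptation speed against variance in the estimates.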