A structured path from Python fundamentals to production ML: 18 modules, 5 phases, designed to build skills progressively with real-world projects at every step.
The essential building blocks every data scientist needs before anything else
Master Python from scratch: variables, control flow, functions, OOP, error handling, generators, async/await, and file I/O.
Numerical computing with N-dimensional arrays: vectorized operations, broadcasting, linear algebra, FFT, and random number generation.
Consume and build APIs: HTTP fundamentals, requests, JSON parsing, authentication, pagination, rate limiting, async calls, FastAPI, and web scraping.
Query relational databases: SELECT, JOINs, CTEs, window functions, subqueries, indexes, views, triggers, and SQLite with Python.
Master Git from basics to advanced: branching, merging, pull requests, rebasing, hooks, LFS for large files, and GitHub Actions CI/CD for data science teams.
Load, clean, transform, and explore real-world datasets at scale
The industry-standard data analysis library: DataFrames, groupby, merge/join, datetime, string ops, pivot tables, and cleaning pipelines.
Next-generation DataFrame library: multi-threaded execution, LazyFrame query optimizer, Arrow-native, and 10–100× faster than pandas on large data.
The mathematical backbone of ML: distributions, hypothesis tests, Bayesian inference, bootstrap methods, multiple testing correction, and survival analysis.
Handle missing data, outliers, inconsistencies, encoding, and scaling, and create powerful features: a skill that often improves model performance more than the choice of algorithm.
Think probabilistically: Bayes' theorem, priors & posteriors, Bayesian A/B testing, Monte Carlo simulation, MCMC, and Bayesian optimization.
Communicate insights through static, statistical, and interactive charts
Full control over every chart element: subplots, twin axes, custom colormaps, publication-quality figures, and GIF animations.
Beautiful statistical graphics: violin plots, pair grids, FacetGrid multi-panel figures, regression plots, and custom themes on top of Matplotlib.
Web-ready interactive charts: hover tooltips, zoom, animated scatter, geographic maps, 3D surfaces, and Dash for full web dashboards.
Build, evaluate, and interpret predictive models on structured and unstructured data
Complete ML toolkit: regression, classification, clustering, pipelines, hyperparameter tuning, stacking ensembles, SHAP explainability, and time-series CV.
Forecasting and temporal pattern analysis: ARIMA, Prophet, decomposition, wavelet analysis, VAR models, walk-forward CV, and anomaly detection.
From tokenization to transformers: NER, sentiment analysis, TF-IDF, topic modeling, BERT fine-tuning, zero-shot classification, and text summarization.
Neural networks, generative models, web apps, and production ML systems
PyTorch from scratch: autograd, CNNs with skip connections, Transformer attention, LSTMs, VAEs, GANs, and transfer learning with frozen backbones.
Turn Python scripts into shareable web apps: dashboards, ML predictors, live streaming, batch predictions with file upload, Plotly charts, and session state.
Ship models to production: FastAPI serving, Docker, MLflow tracking, data drift detection, feature stores, A/B testing, canary deployments, and multi-armed bandits.
What you can build and where these skills lead
Extract insights from business data, build dashboards, and drive decisions with statistics and visualization.
Design, train, and deploy machine learning models that power data-driven products at scale.
Push the boundaries of AI: develop new architectures, train large models, and publish research.
Build language-powered systems: chatbots, search engines, document extraction, and LLM applications.
Model financial markets, forecast demand, and predict supply chain outcomes with temporal ML.
Build the infrastructure that keeps models reliable, monitored, and continuously improving in production.
Get the most out of each guide
Every module ships with a study_guide.ipynb. Open it in Jupyter or VS Code and run each cell; muscle memory matters more than reading.
Each section ends with a real-world practice exercise. Try to solve it before reading the starter code; struggling productively is how you actually learn.