Production ML with Hugging Face

By Coursera on Coursera · Technology
Price
Free

About This Course

Learn to deploy ML models to production using the Sovereign Rust Stack, a pure Rust implementation with zero Python runtime dependencies. This hands-on course teaches you to work with three model formats (GGUF, SafeTensors, and APR), implement MLOps pipelines with CI/CD and observability, and deploy models across GPU, CPU, WebAssembly, and edge targets. Through real-world projects, including a Python-to-Rust transpiler (Depyler), browser-based speech recognition (Whisper.apr), and LLM inference benchmarking (Qwen), you'll practice format conversion, cryptographic model signing, and performance optimization. The course culminates in a capstone project that deploys Qwen2.5-Coder in all three formats and benchmarks the results.

What makes this course unique: instead of relying on Python frameworks, you'll build with production-grade Rust tooling that compiles to native binaries and WebAssembly. Learn to run sub-millisecond inference in browsers, bundle models into executables, and achieve 2x performance gains over standard tools. The course is ideal for ML engineers and software developers ready to move beyond notebooks into production deployment.

Instructor

Noah Gift

Frequently Asked Questions

How much does Production ML with Hugging Face cost?
Visit the Production ML with Hugging Face course page for current pricing and available discounts.
Who teaches Production ML with Hugging Face?
Production ML with Hugging Face is taught by Noah Gift, Duke University.
What skill level is Production ML with Hugging Face for?
This course is designed for learners at all levels.