About
I'm Michał Wojdylak, an AI Infrastructure Engineer who builds and operates the systems that take machine learning from notebooks to reliable, scalable production services. My work sits at the intersection of machine learning, distributed systems, and cloud infrastructure.
I focus on the parts of AI that have to work at 3 a.m.: serving large language models with predictable latency, designing inference platforms that scale with demand, optimizing GPU utilization and cost, and building the MLOps tooling that lets teams ship models safely and often.
I care about clean architecture, observability, reproducibility, and systems that are simple to reason about. This blog is where I share what I learn building production AI infrastructure.
Skills
AI Infrastructure
- GPU cluster orchestration
- Distributed training
- Model serving
- Autoscaling inference
AWS
- EKS
- SageMaker
- EC2 / GPU instances
- S3
- Lambda
- IAM
LLM Deployment
- vLLM
- TGI
- Triton Inference Server
- Quantization
- KV caching
MLOps
- MLflow
- Kubeflow
- CI/CD for models
- Feature stores
- Model registry
Computer Vision
- PyTorch
- ONNX
- TensorRT
- Real-time pipelines
- Edge deployment
Cloud Architecture
- Kubernetes
- Terraform
- Docker
- Service mesh
- Observability
Inference Optimization
- Batching & streaming
- Latency tuning
- Throughput scaling
- Cost optimization