Python · gRPC · Protobuf · PyTorch · Docker
Concurrent gRPC Model Inference Server
A model-inference service that serves concurrent clients, caches predictions, batches inputs, and supports live model updates.
Role
Backend + ML systems
Year
2025
Stack
Python · gRPC · Protobuf · PyTorch · Docker
The server defines a Protocol Buffers interface, handles remote prediction requests over gRPC, protects shared model/cache state with explicit locking, invalidates stale cache entries on model update, and serves concurrent clients through a thread pool.
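The locking and cache-invalidation behavior described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: `ModelStore`, `predict`, `update_model`, `model_v1`, and `model_v2` are hypothetical names, and plain Python callables stand in for PyTorch models. In a real gRPC server the same kind of thread pool would typically be handed to `grpc.server(...)`, with the servicer methods calling into a store like this one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ModelStore:
    """Hypothetical holder for the current model and its prediction cache."""

    def __init__(self, model, version=1):
        self._lock = threading.Lock()  # guards model, version, and cache together
        self._model = model
        self._version = version
        self._cache = {}  # maps an input tuple to its cached prediction

    def predict(self, features):
        """Return (prediction, model_version, cache_hit)."""
        key = tuple(features)
        with self._lock:
            if key in self._cache:
                return self._cache[key], self._version, True
            model, version = self._model, self._version
        # Run inference outside the lock so a slow model does not block other clients.
        result = model(features)
        with self._lock:
            # Cache only if the model was not swapped while we were computing.
            if version == self._version:
                self._cache[key] = result
        return result, version, False

    def update_model(self, model):
        """Swap in a new model and drop every cached prediction from the old one."""
        with self._lock:
            self._model = model
            self._version += 1
            self._cache.clear()
            return self._version


def model_v1(features):
    # Stand-in for a real model's forward pass (assumption for this sketch).
    return sum(features)


def model_v2(features):
    return 2 * sum(features)


store = ModelStore(model_v1)

# Eight concurrent requests for the same input, served by a pool of four workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    first = list(pool.map(store.predict, [[1, 2, 3]] * 8))

store.update_model(model_v2)
after_update = store.predict([1, 2, 3])  # cache was cleared, so this recomputes
```

Computing outside the lock keeps the critical sections short. The cost is that two clients may occasionally compute the same uncached input at once, which is harmless here because both write the same value. The version check before the cache write is what prevents a result from the old model being stored after an update.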