Multi-tenant Machine Learning Model Serving Systems on GPU Clusters