Premium Content
Sign in to see the full question
Get access to the full problem, solutions, follow-up questions, and discussion.
Get access to the full problem, solutions, follow-up questions, and discussion.
You are serving an LLM. The inference latency for a single request is a nonnegative random variable T, and you are told only:
E[T] = 1 minute
The interviewer asks a sequence of probability...