Offline · Serverless
Estimate the compute, memory, and bandwidth needed to hit a target tokens/sec and time to first token (TTFT) for an open-source LLM.
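
As a rough illustration of the kind of arithmetic involved, here is a minimal back-of-envelope sketch. It is not this tool's actual method. The function name `estimate_requirements` and its parameters are hypothetical. It assumes a single request stream (batch size 1), decode limited by reading every weight once per generated token, prefill costing about 2 FLOPs per parameter per prompt token, and it ignores KV cache, activations, and hardware utilization below 100%.

```python
def estimate_requirements(params_b, bytes_per_param, target_tok_s,
                          prompt_tokens, target_ttft_s):
    """Back-of-envelope hardware needs for one request stream (hypothetical helper)."""
    params = params_b * 1e9

    # Memory: weights only. KV cache and activations are ignored (assumption).
    weight_bytes = params * bytes_per_param

    # Bandwidth: assume each generated token reads all weights once (batch size 1).
    bandwidth_bytes_s = weight_bytes * target_tok_s

    # Compute: assume prefill costs ~2 FLOPs per parameter per prompt token,
    # and that the whole prefill must finish within the TTFT budget.
    prefill_flops = 2 * params * prompt_tokens
    compute_flops_s = prefill_flops / target_ttft_s

    return {
        "weight_memory_gb": weight_bytes / 1e9,
        "bandwidth_gb_s": bandwidth_bytes_s / 1e9,
        "compute_tflops": compute_flops_s / 1e12,
    }


# Illustrative inputs only: a 7B-parameter model in 16-bit weights,
# 20 tokens/sec, a 1000-token prompt, and a 0.5 s TTFT target.
result = estimate_requirements(7, 2, 20, 1000, 0.5)
print(result)
```

Real deployments usually need headroom above these figures, since achieved bandwidth and FLOPs are typically well below the hardware peak and the KV cache grows with context length.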