What are LLM output configurations?

LLMs have various configuration options that control the model’s output[1]. Effective prompt engineering requires setting these configurations optimally for your task[1].

The most common configuration settings are temperature, top-K, and top-P, which determine how the predicted token probabilities are processed to select a single output token[1]. Another important setting is the maximum number of tokens to generate in a response[1]. Generating more tokens requires more computation from the LLM, which can lead to slower response times and higher costs[1].
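As a rough illustration of how these settings interact, the sketch below applies temperature scaling, a top-K cutoff, and a top-P (nucleus) cutoff to a toy token distribution before sampling. The function name, default values, and vocabulary are hypothetical and not tied to any specific LLM API.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=40, top_p=0.95):
    """Toy sketch of temperature / top-K / top-P sampling.

    `logits` maps each candidate token to its raw model score.
    All names and defaults here are illustrative assumptions.
    """
    # Temperature scales the logits: lower values sharpen the
    # distribution (more deterministic), higher values flatten it.
    scaled = {tok: score / temperature for tok, score in logits.items()}

    # Softmax turns the scaled scores into probabilities.
    max_s = max(scaled.values())
    exps = {tok: math.exp(s - max_s) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Top-K: keep only the K most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Top-P (nucleus): keep the smallest prefix of the ranked tokens
    # whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens and sample one of them.
    total_kept = sum(p for _, p in kept)
    tokens = [tok for tok, _ in kept]
    weights = [p / total_kept for _, p in kept]
    return random.choices(tokens, weights=weights, k=1)[0]

# Example: a toy distribution over four candidate tokens.
logits = {"the": 2.0, "a": 1.5, "crystal": 0.5, "banana": -1.0}
print(sample_next_token(logits, temperature=0.7, top_k=3, top_p=0.9))
```

With a low temperature and tight top-K/top-P values, the surviving candidate set shrinks and the output becomes more deterministic; loosening any of the three settings admits more tokens into the sample and makes the output more varied.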