Hey Martin, I actually checked again today, and want to clarify something.
Whilst those models can have a context window of up to 128k tokens, the number of tokens they can output in a single response is capped separately. For gpt-4-0125-preview, that output limit was 4096 tokens, so your question is valid. The same limit is stated for gpt-3.5 on OpenAI's website. On Microsoft’s website, gpt-4o is confirmed to have the same limit. Claude 3 has it as well.
Gemini seems to have the highest output limit amongst the big models, for now.
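
To make the difference between the two limits concrete, here is a rough sketch in Python. It only does the arithmetic and doesn't call any API. The function name `usable_output_tokens` is made up for illustration, the numbers are the round figures from this post, and it assumes the prompt and the response share the same context window.

```python
# Round figures from the post above; check the provider's docs for exact values.
CONTEXT_WINDOW = 128_000   # total tokens the model can attend to (assumed shared by prompt + output)
MAX_OUTPUT_TOKENS = 4_096  # separate cap on tokens generated in one response


def usable_output_tokens(prompt_tokens, requested_output_tokens,
                         context_window=CONTEXT_WINDOW,
                         max_output_tokens=MAX_OUTPUT_TOKENS):
    """Hypothetical helper: how many output tokens can actually be generated."""
    if prompt_tokens >= context_window:
        raise ValueError("prompt alone fills or exceeds the context window")
    # Space left in the window after the prompt has been counted.
    room_left = context_window - prompt_tokens
    # The response is bounded by the request, the output cap, and the remaining room.
    return min(requested_output_tokens, max_output_tokens, room_left)


# A short prompt still cannot get more than the output cap.
print(usable_output_tokens(1_000, 10_000))
# A very long prompt leaves less room than the output cap allows.
print(usable_output_tokens(127_000, 4_096))
```

So even with a short prompt and lots of free context, the response tops out at the output cap, which is the point of your question.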