OpenAI's latest language model, GPT-4.1, has been benchmarked against competing models such as Claude Sonnet 3.7 and Gemini 2.5 on generating high-quality code reviews from GitHub pull requests. In a study of 200 pull requests, GPT-4.1 produced the better suggestions in 55% of cases, showing strengths in both precision and comprehensiveness.

Even so, some users found GPT-4.1's performance underwhelming, particularly given its earlier knowledge cutoff of June 2024 compared to Gemini 2.5's January 2025. Users are calling for more transparent benchmarking of models, especially around long-context behavior, since advertised maximum token limits often overstate practical utility. GPT-4.1 will be available primarily through the API rather than ChatGPT, which receives ongoing updates; a fixed API model allows for more stable business applications.
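Since API access is the primary route to the model, a minimal sketch of requesting a code review for a pull-request diff through the Chat Completions REST endpoint might look like the following. The review instructions, the `build_review_prompt` helper, and the exact model identifier are illustrative assumptions, not details taken from the study above:

```python
# Sketch: asking GPT-4.1 to review a PR diff via the Chat Completions API.
# Assumptions: model id "gpt-4.1", an OPENAI_API_KEY environment variable,
# and made-up review instructions; adapt all three to your setup.
import json
import os
import urllib.request

def build_review_prompt(diff: str) -> str:
    """Wrap a unified diff in code-review instructions (illustrative)."""
    return (
        "Review this pull request diff. Point out bugs, style issues, "
        "and missing tests.\n\n" + diff
    )

def review_pull_request(diff: str) -> str:
    """POST the prompt to the API and return the model's review text."""
    payload = {
        "model": "gpt-4.1",
        "messages": [{"role": "user", "content": build_review_prompt(diff)}],
    }
    request = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Because the model behind the API endpoint is pinned rather than continuously updated like ChatGPT, responses to a prompt like this stay comparatively stable, which is what makes the API route attractive for business workflows.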