The UC Berkeley crew has now shown the value of AI-based optimization work by having OpenEvolve work out a more efficient approach to load balancing across GPUs handling LLM inference.