The quick summary is that memory returned to the OS is often requested back very rapidly. Hence waiting a bit of time before returning unused memory improves performance without increasing practical memory footprint.
The performance improvements comes from not needing to break up hugepages in order to return memory to the OS. Additionally we don't spend time in the OS returning memory, or faulting it back into the memory of the process.
We don't make a practical increase in the RAM that an application needs because the memory is typically requested after a short interval, and there's insufficient time to be able to schedule anything in the interval.