Optimistically defer use of thread-local storage for the door reply buffer until a heap allocation is actually needed.
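
For illustration only, the pattern can be sketched roughly as below. This is not the code from this change: the names (reply_prepare, reply_heap_buf, reply_tls, REPLY_STACK_SIZE) and the 256-byte threshold are hypothetical, and the sketch assumes that small replies fit in a caller-provided stack buffer while larger ones fall back to a heap buffer cached per thread. It uses C11 _Thread_local for portability; real code would also need to release the cached buffer at thread exit (for example with a pthread key destructor), which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define REPLY_STACK_SIZE 256	/* hypothetical cutoff for the fast path */

enum reply_src { REPLY_FAIL = -1, REPLY_STACK = 0, REPLY_HEAP = 1 };

/* Per-thread cache of the heap-backed reply buffer (hypothetical). */
struct reply_buf {
	size_t cap;
	char *data;
};

static _Thread_local struct reply_buf reply_tls;

/*
 * Slow path: return a heap buffer of at least len bytes, reusing the one
 * cached in thread-local storage when it is already large enough.
 */
static char *
reply_heap_buf(size_t len)
{
	char *p;

	if (reply_tls.data != NULL && reply_tls.cap >= len)
		return (reply_tls.data);

	/* Old contents are scratch data, so allocate fresh and swap. */
	p = malloc(len);
	if (p == NULL)
		return (NULL);	/* previously cached buffer stays valid */

	free(reply_tls.data);
	reply_tls.data = p;
	reply_tls.cap = len;
	return (p);
}

/*
 * Pick the buffer for a reply of len bytes. The fast path hands back the
 * caller's stack buffer and never touches thread-local storage.
 */
static enum reply_src
reply_prepare(size_t len, char *stack, size_t stack_len, char **out)
{
	assert(stack != NULL && out != NULL);

	if (len <= stack_len) {
		*out = stack;
		return (REPLY_STACK);
	}

	*out = reply_heap_buf(len);
	return (*out != NULL ? REPLY_HEAP : REPLY_FAIL);
}
```

A door server procedure would call something like reply_prepare() with a local char array of REPLY_STACK_SIZE bytes, fill the returned buffer, and pass it to door_return(). Caching the heap buffer per thread, rather than freeing it after use, is a choice made in this sketch on the understanding that door_return() does not return to its caller on success, so there is no later point at which the server procedure could free it.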