Extra bytes in heapSTRUCT_SIZE?

Using v7.2.0, it looks like an extra 8 bytes are added to every pvPortMalloc call because of how heapSTRUCT_SIZE is computed. This is true for both heap_2 and heap_4. Currently the definition is:

    static const unsigned short heapSTRUCT_SIZE = ( sizeof( xBlockLink ) + portBYTE_ALIGNMENT - ( sizeof( xBlockLink ) % portBYTE_ALIGNMENT ) );

In my case (Cortex-M4, Keil compiler), portBYTE_ALIGNMENT is 8 and sizeof( xBlockLink ) is 8. This line gives a heapSTRUCT_SIZE of 16 (8 + 8 - (8 % 8) = 16), even though the structure size is already on an 8-byte boundary.

Would it be better to do something like:

    static const unsigned short heapSTRUCT_SIZE = (sizeof(xBlockLink) & portBYTE_ALIGNMENT_MASK)?
            (sizeof(xBlockLink) + (portBYTE_ALIGNMENT - (sizeof(xBlockLink) & portBYTE_ALIGNMENT_MASK))):
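
To make the arithmetic concrete, here is a minimal standalone sketch that compares the two formulas outside of FreeRTOS. ALIGNMENT and ALIGNMENT_MASK are stand-ins for portBYTE_ALIGNMENT and portBYTE_ALIGNMENT_MASK with the values from my setup (8 and 7), and the function names are made up for illustration. The proposed expression above is cut off after the colon, so the else branch here (returning the size unchanged) is my assumption about what was intended. The third function is a branch-free round-up that I believe is equivalent for power-of-two alignments.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for the Cortex-M4 / Keil case described above. */
#define ALIGNMENT       ( ( size_t ) 8 )       /* stand-in for portBYTE_ALIGNMENT */
#define ALIGNMENT_MASK  ( ALIGNMENT - 1 )      /* stand-in for portBYTE_ALIGNMENT_MASK */

/* Current formula: always adds padding, so an already-aligned size
   grows by a full ALIGNMENT bytes. */
size_t current_struct_size( size_t size )
{
    return size + ALIGNMENT - ( size % ALIGNMENT );
}

/* Proposed formula: pad only when the size is not already a multiple
   of ALIGNMENT. The else branch is an assumption (see lead-in). */
size_t proposed_struct_size( size_t size )
{
    return ( size & ALIGNMENT_MASK )
           ? size + ( ALIGNMENT - ( size & ALIGNMENT_MASK ) )
           : size;
}

/* Branch-free round-up to the next multiple of ALIGNMENT; only valid
   when ALIGNMENT is a power of two. */
size_t round_up_struct_size( size_t size )
{
    return ( size + ALIGNMENT_MASK ) & ~ALIGNMENT_MASK;
}

/* Sanity check for the 8-byte structure case from the post. */
void check_eight_byte_case( void )
{
    assert( current_struct_size( 8 ) == 16 );   /* 8 wasted bytes */
    assert( proposed_struct_size( 8 ) == 8 );   /* no padding needed */
    assert( round_up_struct_size( 8 ) == 8 );
}
```

For a size that is not already aligned (say 12), all three give 16, so the only difference is in the already-aligned case.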