tl;dr: There are two kinds of pixels: device/physical/hardware pixels (i.e. the actual dots) and logical/CSS pixels, each of which (on high-resolution devices) spans a multiple of the hardware ones, usually a whole multiple, because the hardware ones are too small to be useful. End of story.
But why is everybody acting like it’s complicated? It isn’t.
First of all, high-resolution devices have been around for a long time! As long ago as the previous century, you could have a computer with a 72 DPI screen and a 300 DPI printer, and if you printed something, one pixel on the screen came out roughly 4 dots wide on the paper (300 / 72 ≈ 4.2). This was considered normal; you didn’t expect your drawings to come out of the printer four times smaller.
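To make the arithmetic concrete, here is a minimal sketch of the screen-versus-printer case. The function names are made up for illustration and don't come from any library; the only inputs are the 72 and 300 DPI figures from the example.

```python
def dots_per_logical_pixel(device_dpi: float, logical_dpi: float) -> float:
    """How many device dots span one logical pixel along one axis."""
    return device_dpi / logical_dpi


def logical_to_device(length_px: float, ratio: float) -> int:
    """Convert a length in logical pixels to a whole number of device dots."""
    return round(length_px * ratio)


SCREEN_DPI = 72    # logical pixels per inch (the screen in the example)
PRINTER_DPI = 300  # device dots per inch (the printer in the example)

ratio = dots_per_logical_pixel(PRINTER_DPI, SCREEN_DPI)  # a bit over 4

# A line 72 pixels long is one inch on the screen...
line_px = 72
line_dots = logical_to_device(line_px, ratio)

# ...and it is still one inch on paper, just drawn with more dots.
inches_on_screen = line_px / SCREEN_DPI
inches_on_paper = line_dots / PRINTER_DPI
```

The drawing keeps its physical size because the extra dots are spent on each pixel, not on making the page hold more pixels.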
So, the concept of “logical” versus “device” pixels is not new! In modern-day terms, this printer had a ratio of roughly 4 dppx (device dots per logical pixel). And it didn’t confuse people. Why, then, are phones with a ratio of 4 dppx confusing?