This is imaging using a ton of math. They don't go into a lot of detail, but I would guess each "pixel" in the chip is tagged with time and position relative to all the others. How the light is spectrally segregated isn't clear.

Given that IC chips are built to submicron-scale tolerances, the optical phase across a single chip is uniform. But the light reaching it can be randomly phased due to atmospheric effects and the target line of sight, so some way is needed to reference the phase of the light at each pixel to determine relative tilt. A Shack-Hartmann sensor does this by grouping four detector elements into a 2x2 quad cell behind a tiny lenslet: the displacement of the focal spot from the cell's center gives the local wavefront tilt.

Light-gathering power (the ability to see dim targets) still must rely on size, so a huge number of these chips would be needed. For a very bright target you could get away with fewer, but for resolution the chips would need to be spread over a large area, and to stay phased to each other they would have to be bonded to a common optically flat substrate. If your physical envelope is thin, this might let you package an imaging system equivalent to a much larger sensor. Sounds messy, but electronics is the one area of science that consistently beats science fiction.
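For the quad-cell idea, here's a minimal sketch of the arithmetic, assuming intensities from the four elements behind one lenslet; the focal length, spot-scale calibration factor, and input values are made-up illustration numbers, not anything from the chip described above:

    def quad_cell_tilt(a, b, c, d, focal_length_m, spot_scale_m):
        """Estimate local wavefront tilt from a 2x2 quad cell behind one lenslet.

        a, b = top-left, top-right intensities; c, d = bottom-left, bottom-right.
        The normalized intensity imbalance approximates the focal-spot centroid
        offset; dividing by the lenslet focal length converts it to a slope.
        """
        total = a + b + c + d
        if total <= 0:
            raise ValueError("no light on this subaperture")
        # Centroid offset along each axis, scaled by a calibrated spot size.
        dx = spot_scale_m * ((b + d) - (a + c)) / total
        dy = spot_scale_m * ((a + b) - (c + d)) / total
        # Small-angle tilt in radians: spot displacement over focal length.
        return dx / focal_length_m, dy / focal_length_m

    # Illustrative numbers only: a spot pushed toward the right column.
    tx, ty = quad_cell_tilt(a=0.8, b=1.2, c=0.8, d=1.2,
                            focal_length_m=1e-3, spot_scale_m=5e-6)
    print(f"tilt_x = {tx:.2e} rad, tilt_y = {ty:.2e} rad")

An array of these slope measurements across the chip is what lets you reconstruct the incoming wavefront and undo the random phasing.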
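And to put a rough number on why the chips have to be spread out for resolution: angular resolution scales with the full aperture, theta ~ 1.22 * lambda / D (Rayleigh criterion). The chip and substrate sizes below are hypothetical, just to show the scaling:

    import math

    def diffraction_limit_rad(wavelength_m, aperture_m):
        """Rayleigh criterion for a circular aperture: theta ~ 1.22 * lambda / D."""
        return 1.22 * wavelength_m / aperture_m

    # Illustrative comparison: one 2 cm chip vs. chips phased across a
    # 50 cm substrate acting as a single synthetic aperture.
    wavelength = 550e-9  # visible green light
    for d in (0.02, 0.5):
        theta = diffraction_limit_rad(wavelength, d)
        print(f"D = {d*100:4.0f} cm -> {theta:.2e} rad "
              f"(~{math.degrees(theta)*3600:.2f} arcsec)")

Spreading the chips over 25x the distance buys 25x the resolution, but only if they hold phase with each other, hence the common flat substrate.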
https://www.youtube.com/watch?v=ryxt4gKt1vY