The site confuses the inference engine in the Edge TPU with the datacenter TPU. They are two unrelated projects. Judging by the paper they're borrowing from, I think they are either targeting a much older, inference-only datacenter TPU, or implementing only the inference capabilities of the datacenter TPU.
Are there recent papers on the datacenter TPU?
Yes.
David Patterson overview (2023), https://www.cs.ucla.edu/wp-content/uploads/cs/PATTERSON-10-L...
TPU v4 (2023), https://arxiv.org/abs/2304.01433
"Google Edge TPU devices", 100 comments (2019), https://www.hackerneue.com/item?id=19130896 & https://www.hackerneue.com/item?id=19313813
"Coral Edge TPU review", 100 comments (2020), https://www.hackerneue.com/item?id=24808755
"TPU transformation: 10 years of our AI-specialized chips", 60 comments (2024), https://www.hackerneue.com/item?id=41148532