Talk:AMD K5
?
I don't think it is fair to call K5's implementation of branch prediction inferior. It's actually a very innovative design.
The prediction information is stored together with each cache line of the L1 cache. There are 2 predictions per cache line, each stored as an offset to the jump address within the cache line (so it takes only a few bits instead of the whole 32-bit address).
This makes lookup exceptionally fast and space-efficient. Although the total number of entries is 4 times that of the P5, the table occupies less space, and a lookup consists of only 2 searches without having to go through the whole branch prediction table.
The downside is that it's limited to 2 branches per cache line, so code exceeding that will cause thrashing. The prediction counter is also less sophisticated than the P5's (a design choice, not an inherent problem). When a cache line is replaced, the counter obviously has to be reset, so branch information cannot survive very long. All in all, it allows a fast and huge branch prediction table that is less accurate. I believe they supplemented it in the K6 with an additional table to provide a very good prediction rate.
Interestingly, when they removed it from the K7, they said that it was overkill for that generation of CPU. So I guess the bottom line is whether it really improves performance if they have to put in a second "normal" branch prediction table anyway. (unsigned)
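For anyone trying to picture the scheme described in the comment above, here is a minimal Python sketch of per-cache-line prediction storage. It is only an illustration of the idea, not the K5's actual logic: the 16-byte line size, the 1-bit "taken" state, the round-robin slot replacement, the direct-mapped layout, and all class and method names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LINE_SIZE = 16       # bytes per cache line (assumed for illustration)
SLOTS_PER_LINE = 2   # two predictions per line, as described in the comment


@dataclass
class Slot:
    offset: int      # branch position inside the line: a few bits, not 32
    taken: bool      # 1-bit predictor state (assumed; real state may differ)


@dataclass
class CacheLine:
    tag: Optional[int] = None
    slots: List[Slot] = field(default_factory=list)
    next_victim: int = 0   # round-robin slot replacement (assumed policy)

    def fill(self, tag: int) -> None:
        # Replacing the line throws away its prediction state.
        self.tag = tag
        self.slots = []
        self.next_victim = 0

    def predict(self, offset: int) -> Optional[bool]:
        # Lookup compares at most SLOTS_PER_LINE small offsets.
        for slot in self.slots:
            if slot.offset == offset:
                return slot.taken
        return None   # no prediction stored for this branch

    def update(self, offset: int, taken: bool) -> None:
        for slot in self.slots:
            if slot.offset == offset:
                slot.taken = taken
                return
        new = Slot(offset, taken)
        if len(self.slots) < SLOTS_PER_LINE:
            self.slots.append(new)
        else:
            # A third branch in the same line evicts an existing entry.
            self.slots[self.next_victim] = new
            self.next_victim = (self.next_victim + 1) % SLOTS_PER_LINE


class ICache:
    """Hypothetical direct-mapped instruction cache holding the slots."""

    def __init__(self, num_lines: int = 4) -> None:
        self.lines = [CacheLine() for _ in range(num_lines)]

    def _line(self, addr: int):
        line_no = addr // LINE_SIZE
        line = self.lines[line_no % len(self.lines)]
        if line.tag != line_no:
            line.fill(line_no)   # miss: the old line and its predictions go
        return line, addr % LINE_SIZE

    def predict(self, addr: int) -> Optional[bool]:
        line, offset = self._line(addr)
        return line.predict(offset)

    def update(self, addr: int, taken: bool) -> None:
        line, offset = self._line(addr)
        line.update(offset, taken)
```

With this toy model, recording a third branch in one line pushes out one of the first two (the thrashing case), and touching an address that maps to the same line index wipes both slots (the short lifetime of branch history mentioned above).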
The text of this article appears to have been pulled from here, or vice versa: http://www.cpu-collection.de/?tn=0&l0=co&l1=AMD&l2=K5
Aluvus 04:52, 11 March 2006 (UTC)

