GeForce GF100 Fermi - Architecture preview - Part 8
Each SM integrates 4 texture units, for a total of 64 units in a GF100 GPU with 512 CUDA cores. The texture units have a dedicated cache inside their own SM. Completing the cache hierarchy, the GPU also has a 768 Kbytes L2 cache, unified across all the SMs. Within each SM, threads work with 64 Kbytes of integrated on-chip memory, divided into a shared block and an L1 cache block; the split can be either 16K/48K or 48K/16K.
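The figures above can be cross-checked with a little arithmetic. The following is only a minimal sketch: the SM count and the cores per SM are derived from the quoted totals rather than stated in the text, and all names (`sm_count`, `SPLITS_KB`, `total_onchip_kb`) are hypothetical, chosen for illustration.

```python
# Figures quoted in the text.
TEXTURE_UNITS_PER_SM = 4
TOTAL_TEXTURE_UNITS = 64
TOTAL_CUDA_CORES = 512
SM_ONCHIP_KB = 64   # shared block + L1 block inside each SM
L2_KB = 768         # unified L2, shared by all SMs

# Derived values (assumption: units and cores are spread evenly across SMs).
sm_count = TOTAL_TEXTURE_UNITS // TEXTURE_UNITS_PER_SM
cores_per_sm = TOTAL_CUDA_CORES // sm_count

# The two possible (shared, L1) splits of the per-SM memory, in Kbytes.
SPLITS_KB = [(16, 48), (48, 16)]


def total_onchip_kb(sms, per_sm_kb, l2_kb):
    """Sum the per-SM configurable memory over all SMs and add the L2."""
    return sms * per_sm_kb + l2_kb


total_kb = total_onchip_kb(sm_count, SM_ONCHIP_KB, L2_KB)

print("SMs:", sm_count)
print("CUDA cores per SM:", cores_per_sm)
print("Configurable memory + L2 (Kbytes):", total_kb)
```

Both splits add up to the same 64 Kbytes per SM, so choosing one over the other only changes how that memory is used, not how much of it there is. The total excludes the texture caches, whose size the text does not give.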
nVIDIA did not implement an L1 cache dedicated to load and store operations in the GT200 architecture; in the GF100, this cache improves performance, especially in physics and ray tracing workloads. The shared block was already present on GT200 solutions, but fixed at 16 Kbytes; in the GF100 it can be either 16 Kbytes or 48 Kbytes, trading space with the dedicated L1, so the system can set aside more memory for data that is reused across many threads.
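To make the trade-off concrete, here is a toy model of how one might choose between the two splits for a given workload. It is a sketch under stated assumptions, not an nVIDIA API: the labels `prefer_shared` and `prefer_l1` and the helper `pick_split` are hypothetical, and the rule used (give L1 as much space as possible once the shared block is large enough) is just one plausible policy.

```python
GT200_SHARED_KB = 16  # fixed shared block on GT200, no dedicated L1

# GF100 splits as (shared Kbytes, L1 Kbytes); labels are hypothetical.
GF100_SPLITS_KB = {
    "prefer_shared": (48, 16),
    "prefer_l1": (16, 48),
}


def pick_split(shared_kb_needed):
    """Return the label of the split that fits the requested shared
    memory while leaving the largest possible L1 block."""
    if shared_kb_needed < 0:
        raise ValueError("shared memory requirement cannot be negative")
    candidates = [
        (label, shared, l1)
        for label, (shared, l1) in GF100_SPLITS_KB.items()
        if shared >= shared_kb_needed
    ]
    if not candidates:
        raise ValueError("request exceeds the largest shared block")
    # Among the splits that fit, keep the one with the most L1.
    return max(candidates, key=lambda c: c[2])[0]


# A workload that fit in GT200's 16 Kbytes can keep 48 Kbytes of L1.
print(pick_split(GT200_SHARED_KB))
# A workload that reuses more data across threads needs the larger shared block.
print(pick_split(32))
```

In this model, code sized for the GT200 shared block gains a 48 Kbytes L1 at no cost, while code that wants more than 16 Kbytes of shared data gives up part of the L1 to get it.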

