Inside a modern GPU: from architecture to trillions of calculations
We often speak about the performance of modern GPUs in abstract terms — trillions of operations, ever-increasing compute power. What remains less visible is the structure that makes it possible.
That is where a video by Branch Education comes in, taking the GA102 chip from NVIDIA’s RTX 3090 as a reference point to show what happens inside a modern graphics card.
At the core of the video is a simple but revealing question: how can a single device perform tens of trillions of calculations per second? According to the video, some high-end GPUs are capable of exceeding 36 trillion operations per second, with newer generations pushing that figure even further.
The answer lies in the way GPUs are designed. Where CPUs are optimized to run a handful of instruction streams very quickly, GPUs are built for massive parallelism, executing thousands of smaller operations at the same time. This is typically described as a Single Instruction, Multiple Data (SIMD) approach: the same instruction is applied across many data points simultaneously.
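The SIMD idea described above can be sketched on an ordinary CPU with NumPy. This is not GPU code; it is a minimal illustration of the pattern — one instruction expressed once, applied to a whole array of data — that GPU hardware scales up to thousands of parallel lanes.

```python
import numpy as np

# Eight data points; a GPU would operate on many thousands at once.
data = np.arange(8, dtype=np.float32)  # [0, 1, ..., 7]

# Sequential, CPU-loop style: one element at a time.
sequential = np.array([x * 2 + 1 for x in data], dtype=np.float32)

# SIMD style: the same "multiply by 2, add 1" instruction expressed
# once and applied across the entire array simultaneously.
vectorized = data * 2 + 1

# Both routes compute the same result; only the execution model differs.
assert np.array_equal(sequential, vectorized)
print(vectorized)  # [ 1.  3.  5.  7.  9. 11. 13. 15.]
```

The payoff is that the vectorized form tells the hardware the operation is independent per element, which is exactly what lets a GPU dispatch it across thousands of cores.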
This architecture, combined with high-bandwidth memory such as GDDR6X, enables GPUs to handle data-intensive workloads efficiently — from rendering and object transformations to increasingly complex AI tasks.
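To give a sense of what "high-bandwidth memory" means in numbers, here is a back-of-the-envelope calculation. The specific figures (a 384-bit memory bus and 19.5 Gbps per pin) are NVIDIA's publicly listed GDDR6X specs for the RTX 3090, added here as an assumption rather than taken from the video.

```python
# Rough peak memory bandwidth for a GDDR6X card.
# Assumed specs (RTX 3090 as publicly listed, not from the video):
bus_width_bits = 384    # width of the memory interface
data_rate_gbps = 19.5   # effective data rate per pin, in Gbit/s

# Bandwidth = bytes transferred per cycle across the bus x data rate.
bandwidth_gb_s = (bus_width_bits / 8) * data_rate_gbps

print(f"{bandwidth_gb_s:.0f} GB/s")  # 936 GB/s
```

Nearly a terabyte per second of memory traffic is what keeps thousands of cores fed with data instead of idling.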
Rather than pointing to a single breakthrough, the Branch Education video shows that performance is the result of how these elements work together within a tightly integrated system, reflecting the broader role GPUs now play across high-performance computing.
