Cracking the Code: Highlights from the CPU-GPU Security Workshop

On December 13, 2024, IITB Trust Lab sponsored a hands-on CPU-GPU Security Workshop, hosted by the CASPER group at IIT Bombay’s CSE Department. Designed for students, researchers, and tech enthusiasts, the workshop took a deep dive into hardware security challenges—a fast-evolving field that’s critical yet often overlooked.

From side-channel attacks that exploit tiny hardware-level leaks to fault injection techniques that can break cryptographic defenses, the sessions explored real-world threats and the hidden risks lurking in modern CPU and GPU architectures. The hands-on activities gave participants a chance to experiment with security tools and understand why hardware security matters now more than ever.

Catch the key takeaways, expert insights, and why securing the hardware layer is just as crucial as software in today’s digital world. Read on!

In cybersecurity, network security and hardware security address different aspects of protecting systems. Network security focuses on securing data in transit between devices, servers, and users, employing techniques like encryption to prevent threats such as data breaches and unauthorised access. In contrast, hardware security is concerned with safeguarding the physical components of computing devices, such as Central Processing Units (CPUs) and memory.

Hardware security vulnerabilities are more challenging to exploit than software flaws. Many require physical access to the device, or specialised knowledge of microarchitecture and advanced tooling. While network attacks can often be executed remotely, hardware exploits demand a deep understanding of hardware behaviour and precise execution.

However, it is also true that hardware vulnerabilities pose significant risks as they can bypass software defences and persist across updates, potentially compromising the entire system.

CPU Security

Over the past few decades, CPU design has largely prioritised performance improvements, due to an ever-increasing demand for faster computation. However, this focus on performance comes at a price: it introduces trade-offs in security.

CPU security is fundamentally about ensuring three key properties:

  1. Confidentiality, which refers to preventing unauthorised access to data
  2. Integrity, which ensures that unauthorised modifications do not occur
  3. Availability, which ensures that legitimate processes are not delayed or denied service

Key architectural advancements, such as the introduction of cache hierarchies (L1, L2, L3), have significantly enhanced speed by reducing latency. Caches work by storing frequently accessed data closer to the processor, thereby avoiding a time-consuming journey all the way to the Random Access Memory (RAM) module.

But this very cache hierarchy, while being supremely beneficial for performance, creates opportunities for information leakage, which attackers can exploit.

Side-channel Attacks

One such method is the side-channel attack. In this, the victim remains unaware that their data is being targeted by a malicious agent. Instead of exploiting direct software vulnerabilities, attackers analyse indirect “leaks” from the system, such as power usage or memory access patterns, and then use this information to their advantage. These leakages are an inevitable byproduct of the way algorithms and hardware operate.

Flush+Reload is one such attack. The attacker first flushes a cache line backing memory shared with the victim, so that the next access to it must be fetched from RAM. After waiting for the victim to run, the attacker reloads that location and times the access. If the victim has accessed the shared memory location in the meantime, it will be back in the cache and the reload will be fast. Conversely, if the victim hasn’t touched it, the data must be fetched from main memory, resulting in a noticeably longer access time. By carefully measuring these timing differences, the attacker can infer which memory the victim accessed and gain crucial information.
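To make this concrete, here is a minimal sketch of the attacker’s side of one Flush+Reload round, assuming an x86 machine with GCC/Clang intrinsics and a page shared with the victim (for example, a shared library). The threshold value is only a placeholder and would have to be calibrated on the target machine.

```c
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>   /* _mm_clflush, _mm_mfence, __rdtscp (x86, GCC/Clang) */

/* Time a single access to addr, in CPU cycles. */
static uint64_t time_access(const uint8_t *addr)
{
    unsigned int aux;
    _mm_mfence();
    uint64_t start = __rdtscp(&aux);           /* serialising timestamp read */
    (void)*(volatile const uint8_t *)addr;     /* the "reload"               */
    uint64_t end = __rdtscp(&aux);
    _mm_mfence();
    return end - start;
}

/* Cycle count separating cache hits from misses; must be calibrated
 * for the target machine. 120 is only a placeholder. */
#define THRESHOLD 120

/* One Flush+Reload round on a cache line shared with the victim. */
static int probe(const uint8_t *shared)
{
    _mm_clflush(shared);                        /* flush: evict the shared line */
    /* ... in a real attack, wait here while the victim runs ... */
    return time_access(shared) < THRESHOLD;     /* fast reload => victim touched it */
}

int main(void)
{
    static uint8_t shared_page[4096];           /* stand-in for memory shared with a victim */
    /* With no victim running between flush and reload, this prints 0 (a miss). */
    printf("line cached after flush? %d\n", probe(shared_page));
    return 0;
}
```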

Covert-channel Attack

The other kind of attack is a covert-channel attack, which occurs when two entities that are not supposed to be able to communicate bypass established security policies and establish a connection. These attacks exploit unintended or unmonitored pathways to transmit information.

For instance, an attacker could exploit a shared file system to create a covert channel. The sender could modify file attributes, such as timestamps, to encode data. The receiver could then monitor these attributes for changes and decode the hidden information.
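As a toy illustration of this idea (not taken from the workshop material), the C sketch below hides one bit in the parity of a hypothetical shared file’s modification time using POSIX calls: the sender sets the timestamp, and the receiver reads it back. The file path is purely illustrative.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <utime.h>

/* Hypothetical file visible to both processes; the path is purely illustrative. */
#define CHANNEL_FILE "/tmp/covert_channel_demo"

/* Sender: encode one bit in the parity of the file's modification time. */
static int send_bit(int bit)
{
    struct stat st;
    if (stat(CHANNEL_FILE, &st) != 0)
        return -1;
    struct utimbuf times;
    times.actime  = st.st_atime;
    times.modtime = (st.st_mtime & ~(time_t)1) | (time_t)(bit & 1);
    return utime(CHANNEL_FILE, &times);
}

/* Receiver: decode the bit by reading the modification-time parity back. */
static int recv_bit(void)
{
    struct stat st;
    if (stat(CHANNEL_FILE, &st) != 0)
        return -1;
    return (int)(st.st_mtime & 1);
}

int main(void)
{
    FILE *f = fopen(CHANNEL_FILE, "a");   /* make sure the channel file exists */
    if (f) fclose(f);

    /* Demo in a single process; a real covert channel would have the sender
     * and receiver running as two separate, nominally isolated processes. */
    if (send_bit(1) != 0) { perror("send_bit"); return 1; }
    printf("received bit: %d\n", recv_bit());
    return 0;
}
```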

One notable high-profile example is Spectre, a serious security flaw affecting modern processors that use speculative execution. It allows attackers to potentially steal sensitive data from other programs running on the same system.

Breakdown of Spectre

  1. Speculative Execution: Modern processors use speculative execution to optimise performance. They predict the outcome of certain instructions and execute them ahead of time, assuming the prediction is correct.   
  2. Branch Prediction: A key part of speculative execution is branch prediction, where the processor guesses which path the program will take at a branch (e.g., if/else statements).   
  3. Exploiting Speculation: Spectre exploits this speculative execution by tricking the processor into loading data from memory locations that it shouldn’t have access to. Even if the speculation is later found to be incorrect and the data is discarded, the attacker can still extract information through subtle timing differences or other side channels (a simplified gadget is sketched after this list).
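The heart of a Spectre variant 1 attack is a “gadget” like the one sketched below: a bounds check followed by two dependent array accesses. The array names and sizes here are placeholders for real program state, and a full proof of concept would also need branch-predictor mistraining, cache flushing, and the timing measurement described earlier.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative victim code containing a classic Spectre variant 1 gadget.
 * array1, array1_size and probe_array stand in for real program state. */
size_t  array1_size = 16;
uint8_t array1[16];
uint8_t probe_array[256 * 4096];        /* one cache line per possible byte value */

void victim_function(size_t x)
{
    if (x < array1_size) {                  /* bounds check the CPU may speculate past */
        uint8_t secret_byte = array1[x];    /* speculative out-of-bounds read          */
        /* The dependent access below leaves a cache footprint indexed by the
         * secret byte, even if the speculation is later rolled back. */
        volatile uint8_t sink = probe_array[secret_byte * 4096];
        (void)sink;
    }
}

int main(void)
{
    /* Benign in-bounds calls like this one are what an attacker uses to
     * mistrain the branch predictor before supplying an out-of-bounds x.
     * Recovering the secret afterwards uses Flush+Reload on probe_array. */
    victim_function(0);
    return 0;
}
```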

Graphics Processing Unit (GPU) Security

In recent years, GPUs have expanded far beyond their traditional role in rendering graphics. With applications in artificial intelligence, genomics, and autonomous vehicles, they are now a cornerstone of general-purpose computing.

This rapid adoption and technological advancement have even outpaced CPU growth. Unlike CPUs, which prioritise low latency for sequential operations, GPUs are optimised for throughput, enabling them to process massive workloads concurrently across their many cores. This makes them ideal for tasks that can be parallelised, that is, tasks where many similar calculations can happen concurrently.

Despite their advantages, GPUs face several challenges, particularly as they scale for increasingly demanding workloads. The growth in peak memory bandwidth has not kept pace with the surge in peak throughput, limiting scalability. Additionally, physical die-size limitations constrain further growth. Security and reliability issues also arise due to the heavy use of shared memory in GPU architectures, which introduces new vulnerabilities.

Memory Coalescing in GPUs

A key feature of GPUs is their ability to save bandwidth through memory access coalescing. GPUs handle numerous threads accessing memory simultaneously, which could lead to overwhelming memory bandwidth demands. 

To mitigate this, GPUs employ a coalescing unit that groups memory requests targeting the same cache line. This is very similar to how carpooling reduces the load on a transportation system by allowing individuals heading towards the same destination to travel together. This optimisation reduces the number of memory accesses, lowering execution time and improving performance. However, this very mechanism can be exploited by attackers who analyse memory access patterns to extract sensitive information or manipulate workloads.
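A rough back-of-the-envelope model in plain C shows why coalescing matters. The sketch below counts how many distinct cache-line-sized transactions a “warp” of 32 threads would generate for a contiguous versus a strided access pattern; the warp size and the 128-byte transaction size are typical figures used purely for illustration.

```c
#include <stdio.h>
#include <stdint.h>

#define WARP_SIZE  32
#define LINE_BYTES 128   /* illustrative cache-line / transaction size */

/* Count how many distinct cache lines a warp's 4-byte loads touch.
 * Fewer distinct lines means fewer memory transactions after coalescing. */
static int transactions(const uint32_t *byte_addr)
{
    uint32_t seen[WARP_SIZE];
    int n = 0;
    for (int t = 0; t < WARP_SIZE; t++) {
        uint32_t line = byte_addr[t] / LINE_BYTES;
        int dup = 0;
        for (int i = 0; i < n; i++)
            if (seen[i] == line) { dup = 1; break; }
        if (!dup)
            seen[n++] = line;
    }
    return n;
}

int main(void)
{
    uint32_t coalesced[WARP_SIZE], strided[WARP_SIZE];
    for (uint32_t t = 0; t < WARP_SIZE; t++) {
        coalesced[t] = t * 4;        /* thread t reads element t: contiguous   */
        strided[t]   = t * 4 * 64;   /* thread t reads element 64*t: scattered */
    }
    printf("coalesced pattern: %d transactions\n", transactions(coalesced)); /* 1  */
    printf("strided pattern:   %d transactions\n", transactions(strided));   /* 32 */
    return 0;
}
```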

Faults and Cryptography

In theory, cryptography operates within the confines of mathematical abstractions: its theorems come with proofs, and the guarantees those proofs promise hold within the model. In practice, however, cryptographic algorithms must run on physical devices, where real-world factors introduce new vulnerabilities.

Cryptographers must therefore rely on engineers to bridge this gap, because physical devices give away unintended information through characteristics such as execution time and power usage. These side-channel leaks provide malicious agents with opportunities to compromise cryptographic systems in ways not accounted for in purely theoretical models.

One critical threat to cryptographic systems is fault injection, a technique used to introduce errors or disturbances into a device or the software running on it. By causing faults during cryptographic computations, attackers can extract sensitive information and compromise the system. Fault injection can be carried out using local or remote means.

Local Fault Injection Techniques

Voltage-Glitching

Voltage-glitching manipulates the voltage supplied to a chip during its operation, creating errors in execution. For example, by briefly reducing or increasing voltage during critical moments, attackers can cause incorrect computations, potentially bypassing security checks, or revealing cryptographic keys.

Clock-Glitching

In clock-glitching, attackers modify the clock signal driving the processor to disrupt its timing. By introducing irregularities, such as speeding up or delaying clock cycles, errors are induced during execution, leading to exploitable vulnerabilities in system operations.

Remote Fault Injection: Rowhammer

Unlike local techniques, remote fault injection can be executed over a network, making it particularly dangerous for shared systems and servers. A notable example is Rowhammer, a software-induced hardware fault technique that targets vulnerabilities in DRAM (Dynamic Random Access Memory). DRAM differs from other memory types due to its reliance on capacitors to store data, which must be periodically refreshed to retain information.

The Rowhammer attack involves running code that repeatedly accesses (hammers) a specific row in DRAM, causing electrical interference that leaks into adjacent rows. This interference can induce bit flips, where stored values are unintentionally altered. In cryptographic systems, such bit flips can compromise keys or sensitive data, exposing them to attackers. The ability to execute Rowhammer remotely makes it a potent and widespread threat.
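The core of the hammering code is surprisingly small. The sketch below shows an illustrative x86 loop in C; the hard parts, which are omitted, are finding two addresses that really map to different rows of the same DRAM bank and scanning neighbouring victim rows for flipped bits afterwards.

```c
#include <stdint.h>
#include <stdlib.h>
#include <emmintrin.h>   /* _mm_clflush (x86, GCC/Clang) */

/* Core of a Rowhammer loop: repeatedly read (activate) two aggressor
 * addresses, flushing them each time so every read really goes to DRAM.
 * addr_a and addr_b are assumed to map to different rows of the same
 * DRAM bank; finding such a pair, and later scanning neighbouring
 * "victim" rows for flipped bits, is the hard part and is omitted here. */
static void hammer(uint8_t *addr_a, uint8_t *addr_b, long iterations)
{
    for (long i = 0; i < iterations; i++) {
        (void)*(volatile uint8_t *)addr_a;   /* read forces a row activation */
        (void)*(volatile uint8_t *)addr_b;
        _mm_clflush(addr_a);                 /* evict so the next read hits DRAM again */
        _mm_clflush(addr_b);
    }
}

int main(void)
{
    /* For illustration only: two addresses far apart in a large buffer.
     * Nothing guarantees they are same-bank, different-row aggressors. */
    uint8_t *buf = malloc(1 << 22);
    if (!buf)
        return 1;
    hammer(buf, buf + (1 << 21), 1000000);
    free(buf);
    return 0;
}
```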

Fault Analysis

The aim of fault analysis is to exploit errors or disturbances introduced into a cryptographic system’s execution in order to gain insight into its inner workings. The purpose is both offensive (adversaries attempting to break the system) and defensive (researchers and engineers aiming to strengthen cryptographic implementations). Fault analysis techniques fall into two broad types.

Differential Fault Analysis (DFA)

DFA involves inducing specific faults at known stages of cryptographic computation and analysing the resulting faulty outputs. By carefully controlling when and where faults occur, attackers can use the discrepancies between correct and faulty outputs to deduce secret keys.
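A toy example helps show the principle. Suppose the final step of a cipher is c = S[x] ⊕ k on 4-bit values (the PRESENT S-box is borrowed below purely as a convenient 4-bit permutation), and a fault flips exactly one bit of the unknown intermediate x. Each pair of correct and faulty outputs lets the attacker discard every key guess that is inconsistent with a single-bit fault, as in the sketch below; real DFA on ciphers such as AES follows the same idea with more bookkeeping.

```c
#include <stdio.h>
#include <stdint.h>

/* Toy DFA: the cipher's final step is c = S[x] ^ k on 4-bit values, the
 * fault model flips exactly one bit of the unknown intermediate x, and the
 * attacker sees both the correct and the faulty ciphertext.  Key guesses
 * whose implied fault is not a single bit are discarded.  The S-box is
 * borrowed from PRESENT purely as a convenient 4-bit permutation. */
static const uint8_t S[16] = {0xC,5,6,0xB,9,0,0xA,0xD,3,0xE,0xF,8,4,7,1,2};
static uint8_t S_inv[16];

static int is_single_bit(uint8_t v) { return v != 0 && (v & (v - 1)) == 0; }

int main(void)
{
    for (int i = 0; i < 16; i++)
        S_inv[S[i]] = (uint8_t)i;

    const uint8_t key = 0xB;          /* the secret the attacker wants */
    int candidate[16];
    for (int k = 0; k < 16; k++)
        candidate[k] = 1;

    /* Simulate a few faulty encryptions with different intermediates x. */
    for (uint8_t x = 0; x < 3; x++) {
        uint8_t c       = S[x] ^ key;          /* correct ciphertext        */
        uint8_t c_fault = S[x ^ 0x4] ^ key;    /* same x, but bit 2 flipped */

        /* Attacker's step: keep only keys consistent with a one-bit fault. */
        for (int k = 0; k < 16; k++) {
            uint8_t diff = S_inv[c ^ k] ^ S_inv[c_fault ^ k];
            if (!is_single_bit(diff))
                candidate[k] = 0;
        }
    }

    /* The true key (0xB) always survives; a few trials usually eliminate
     * most wrong guesses, and more faulty pairs narrow it down further. */
    printf("surviving key candidates:");
    for (int k = 0; k < 16; k++)
        if (candidate[k])
            printf(" 0x%X", k);
    printf("\n");
    return 0;
}
```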

Statistical Fault Analysis (SFA)

SFA takes a more probabilistic approach, inducing random faults and using statistical methods to extract information. Instead of requiring precise control, SFA relies on patterns in the distribution of faults to uncover cryptographic secrets.

Privacy-preserving ML

As the demand for privacy-preserving technologies grows, confidential computing has become a focal point in the field of cybersecurity and privacy. Traditional encryption techniques protect data at rest (when stored) and in transit (when transferred across networks). Comparatively little research has gone into securing data in use, that is, ensuring that data remains protected even during computation, so that information is safeguarded at every stage.

One approach to privacy-preserving machine learning is Secure Multi-Party Computation (SMPC). SMPC allows multiple parties, who may not trust each other, to collaborate on training a machine learning model without disclosing their individual datasets. For example, two hospitals may want to combine their patient data to train a model for disease diagnosis. However, each hospital is hesitant to share sensitive patient information with the other, as this would be a breach of their patients’ trust.

Thus, SMPC allows parties to pool their data together, train the model, and learn the resulting model weights, all while keeping each party’s data private. The data itself never leaves its original location, and no single party gains access to the other’s private data.

SMPC operates in two phases: 

Pre-processing/Offline Phase: In this phase, the parties prepare their data, performing necessary transformations or encryptions without revealing any sensitive information. This phase ensures that all data remains secure until the online phase begins.

Online Phase: During this phase, the parties securely collaborate to compute the model, exchanging intermediate results in a way that no party learns anything about the other’s data. The final model weights are learned collaboratively, but individual data points remain private throughout the process.
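As a minimal, self-contained illustration of the online idea (not a full protocol, and far simpler than model training), the C sketch below has two parties additively share their private values modulo a prime and exchange only shares, so the sum is computed without either raw input being revealed. The values, the modulus, and the single-process “parties” are all illustrative.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Toy illustration of the online idea behind SMPC: two inputs are split
 * into additive shares modulo a prime, only shares are exchanged, and the
 * sum is recombined without either raw input being revealed.  Everything
 * runs in one process here; a real protocol spans machines and channels. */
#define P 2147483647ULL   /* prime modulus 2^31 - 1 */

typedef struct { uint64_t s0, s1; } shares_t;

/* Split value (assumed < P) so that s0 + s1 = value (mod P). */
static shares_t share(uint64_t value)
{
    shares_t sh;
    sh.s0 = (((uint64_t)rand() << 16) ^ (uint64_t)rand()) % P;  /* toy randomness */
    sh.s1 = (value + P - sh.s0) % P;
    return sh;
}

int main(void)
{
    srand((unsigned)time(NULL));

    uint64_t hospital_a = 1200;   /* party A's private input */
    uint64_t hospital_b = 3400;   /* party B's private input */

    shares_t a = share(hospital_a);
    shares_t b = share(hospital_b);

    /* Party 0 holds a.s0 and b.s0; party 1 holds a.s1 and b.s1.  Each adds
     * its shares locally, and only the two partial sums are then combined. */
    uint64_t party0 = (a.s0 + b.s0) % P;
    uint64_t party1 = (a.s1 + b.s1) % P;
    uint64_t total  = (party0 + party1) % P;

    printf("secure sum = %llu (expected %llu)\n",
           (unsigned long long)total,
           (unsigned long long)(hospital_a + hospital_b));
    return 0;
}
```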

Secret Sharing

A central concept in SMPC is secret sharing, where a secret value is divided into multiple “shares” distributed across different parties. The key idea is that no individual share reveals any useful information about the secret itself. For example, a secret S can be split into n shares such that:

  1. Each share, on its own, does not reveal any information about the secret
  2. A sufficient number of shares, denoted by t (threshold), can be combined to reconstruct the secret

The security of this scheme ensures that anyone with fewer than t shares cannot gain any more information about the secret than someone with zero shares. In the context of SMPC, this method helps to ensure that no single party can learn more than they are supposed to, even while contributing to the collaborative computation.
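To make the threshold idea concrete, here is a toy sketch of Shamir-style secret sharing in C over a small prime field, with toy randomness: a random polynomial with f(0) = S is evaluated at n points to produce the shares, and any t of them reconstruct S by Lagrange interpolation. The field size, share count, and threshold are illustrative.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy threshold secret sharing in the style of Shamir's scheme, over the
 * prime field GF(P): a random polynomial f of degree t-1 with f(0) = secret
 * is evaluated at x = 1..n to produce n shares, and any t shares recover
 * f(0) by Lagrange interpolation.  Parameters and randomness are toy-sized. */
#define P 2147483647ULL   /* prime modulus 2^31 - 1 */

static uint64_t mulmod(uint64_t a, uint64_t b) { return (a * b) % P; }

/* Modular exponentiation; inverses come from Fermat's little theorem. */
static uint64_t powmod(uint64_t base, uint64_t exp)
{
    uint64_t result = 1;
    base %= P;
    while (exp) {
        if (exp & 1) result = mulmod(result, base);
        base = mulmod(base, base);
        exp >>= 1;
    }
    return result;
}
static uint64_t invmod(uint64_t a) { return powmod(a, P - 2); }

/* Evaluate f(x) = coeff[0] + coeff[1]*x + ... + coeff[t-1]*x^(t-1) mod P. */
static uint64_t eval_poly(const uint64_t *coeff, int t, uint64_t x)
{
    uint64_t y = 0;
    for (int i = t - 1; i >= 0; i--)
        y = (mulmod(y, x) + coeff[i]) % P;
    return y;
}

/* Reconstruct f(0) from t shares (xs[i], ys[i]) by Lagrange interpolation:
 * f(0) = sum_i ys[i] * prod_{j != i} xs[j] / (xs[j] - xs[i])  (mod P). */
static uint64_t reconstruct(const uint64_t *xs, const uint64_t *ys, int t)
{
    uint64_t secret = 0;
    for (int i = 0; i < t; i++) {
        uint64_t num = 1, den = 1;
        for (int j = 0; j < t; j++) {
            if (j == i) continue;
            num = mulmod(num, xs[j]);
            den = mulmod(den, (xs[j] + P - xs[i]) % P);
        }
        secret = (secret + mulmod(ys[i], mulmod(num, invmod(den)))) % P;
    }
    return secret;
}

int main(void)
{
    enum { N = 5, T = 3 };             /* n = 5 shares, threshold t = 3 */
    uint64_t secret = 123456789;

    uint64_t coeff[T];
    coeff[0] = secret;                 /* f(0) = secret */
    for (int i = 1; i < T; i++)        /* toy randomness; use a CSPRNG in practice */
        coeff[i] = (((uint64_t)rand() << 16) ^ (uint64_t)rand()) % P;

    uint64_t xs[N], ys[N];
    for (int i = 0; i < N; i++) {
        xs[i] = (uint64_t)(i + 1);
        ys[i] = eval_poly(coeff, T, xs[i]);
    }

    /* Any T of the N shares suffice; here parties 1, 3 and 4 cooperate. */
    uint64_t sel_x[T] = { xs[0], xs[2], xs[3] };
    uint64_t sel_y[T] = { ys[0], ys[2], ys[3] };
    printf("recovered secret: %llu\n",
           (unsigned long long)reconstruct(sel_x, sel_y, T));
    return 0;
}
```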