Posts

Showing posts from May, 2021

RIDL: Rogue In-Flight Data Load

This post is based on the attack covered in the paper: Stephan van Schaik, Alyssa Milburn, Sebastian Österlund, Pietro Frigo, Giorgi Maisuradze, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. 2019. "RIDL: Rogue In-flight Data Load". In S&P. (Link opens a new tab with PDF ~ 2.2MB)

Source (link opens a new tab): https://mdsattacks.com/

This post covers the attack in brief; however, I recommend reading the paper to get full insight into the attack and the various small details that made the exploit possible.

Alt: An illustrative diagram of RIDL where data is wrongly forwarded based on speculation

The picture is made using https://excalidraw.com/ (link will open a new tab)

Value Prediction

As an aggressive method to increase instruction-level parallelism, modern processors have started speculating on the value for a long-latency cache miss, using the program counter and the data address. There are competitions around value prediction - the most notable one being the C

Spectre: Exploiting speculative execution

This article is based on the paper: Kocher, P., Horn, J., Fogh, A., Genkin, D., Gruss, D., Haas, W., Hamburg, M., Lipp, M., Mangard, S., Prescher, T., Schwarz, M., and Yarom, Y. "Spectre attacks: Exploiting speculative execution". In S&P (2019) (link opens a new tab with PDF ~ 294kB)

Source (link opens a new tab): https://meltdownattack.com

In this article we'll take a brief look at the Spectre attack; however, I highly recommend reading the paper mentioned above, which goes into more depth about the cause and the methods of exploitation.

Spectre attacks come in two variants. The first is similar to Meltdown in that it exploits out-of-order speculative execution to leak secrets. This post looks at the second variant, which uses indirect branches to launch a device that leaks data.

Alt: A diagram showing the general idea behind the Spectre attack.

The picture is made using https://excalidraw.com/ (link will open a new tab)

Cause for Spectre

Spectre attacks are caused as a result

M1RACLES: M1ssing Register Access Controls Leak EL0 State

This bug in the Apple M1 was discovered by Hector Martin (link opens a new tab) during his research for adding GNU/Linux support for the Apple M1 through his Asahi Linux Project (link opens a new tab). His website discussing this bug in detail is titled M1RACLES: M1ssing Register Access Controls Leak EL0 State (link opens a new tab). If you are familiar with privilege levels and register accesses, you can head directly over to Hector's blog and read his much more detailed and in-depth review of the bug he found. Please do go check out his awesome work after reading this article.

Alt: A schematic diagram of M1RACLES showing two processes transmitting data to each other over a covert channel

The picture is made using https://excalidraw.com/ (link will open a new tab)

Privilege Levels in ARMv8

Hector's blog goes into the details of the bug, but I'm writing this from my own perspective, that of a complete beginner, and summarizing the concepts on the way up. First I would like t

Branch Prediction

In this post, we'll take a brief look at branch prediction. To learn more about branch prediction and the hardware implementation of branch predictors, you can watch the lectures on branch prediction by Onur Mutlu - Branch Prediction I (link opens a new tab) and Branch Prediction II (link opens a new tab) from the Digital Design and Computer Architecture playlist (link opens a new tab) of Spring Semester, 2020 at ETH Zürich.

Alt: An illustration showing the need for branch prediction to speculate the direction of conditional branches for speculative execution

The picture is made using https://excalidraw.com/ (link will open a new tab)

Need for branch prediction

When a processor encounters a branch instruction, it has to decode the instruction to find the target of the branch. In the case of a conditional branch, the direction of the branch also depends on the values in the Flag Registers - you can read more about these registers in the Wikipedia article titled FLAGS register (link opens a

Meltdown: Reading Kernel Memory from User Space

This article is based on the paper: M. Lipp, M. Schwarz, D. Gruss, T. Prescher, W. Haas, A. Fogh, J. Horn, S. Mangard, P. Kocher, D. Genkin, Y. Yarom, and M. Hamburg, “Meltdown: Reading Kernel Memory from User Space,” Tech. Rep., 2018. (link opens a new tab with pdf of paper ~ 258kB)

Source (link opens a new tab): https://meltdownattack.com/

In this article we'll take a brief look at the Meltdown attack; however, I highly recommend reading the paper mentioned above, which goes into more depth about the cause and the methods of exploitation.

Alt: An illustration of Meltdown showing how data at the target can be captured by a location in a device

The picture is made using https://excalidraw.com/ (link will open a new tab)

Memory layout

Note: This was the memory layout for programs before operating systems implemented mitigations for Meltdown. Now every access into kernel space is trapped and checked for validity.

Before Meltdown became a known threat, the virtual memory layout for any program looked as foll

Speculative Execution

Most of the recently discovered CPU vulnerabilities depend on a property of modern processors known as Speculative Execution. In this post, we'll take a deeper look at where and why modern processors speculate.

Alt: An illustration that shows speculative execution and how the results are committed or discarded depending on whether the speculation was correct or not, respectively.

The picture is made using https://excalidraw.com/ (link will open a new tab)

Pipelined Execution

Most modern processors are pipelined and are often superscalar, running instructions out of order with multiple stages of instruction execution running concurrently.

Alt: Pipeline of a superscalar processor with 2 instructions fetched every clock cycle and a 5-stage pipeline.

Source (link opens a new tab): https://en.wikipedia.org/wiki/Superscalar_processor

In the case of a long-latency cache miss, a permission check for an access, or a conditional branch, the processor generally has to stall waiting for

Power Analysis

In our last few posts we looked at Timing Analysis and techniques such as Flush and Reload and Prime and Probe that exploit timing differences to deduce the operations of a victim. Timing Analysis shows how the lack of uniformity within a system can be exploited to observe the victim: we use the timing difference between loading data from cache and loading data from main memory as a device to observe which bits of the program the victim is executing. If this execution pattern depends on some secret held by the victim, we can infer details about the secret from the execution pattern of the victim thread.

Alt: An icon representing power

Source: icons made by Freepik (link opens a new tab) from Flaticon (link opens a new tab)

Energy and Execution

Modern microprocessors are complex, with multiple physical cores and multiple different functional units - Arithmetic and Logic Unit (ALU) to process integers, Floating Point Unit, Single Instruction Multiple Data (SIMD) un

Prime and Probe

This is the third post in our series on Timing Analysis and I highly recommend reading the first two posts - Timing Analysis (link opens a new tab), and Flush and Reload (link opens a new tab). That said, if you are already familiar with cache hierarchy, shared libraries, shared pages, and cache placement policies, you can continue reading this article. In this post we look at Prime and Probe.

Alt: An illustration of prime and probe

Made with https://excalidraw.com/ (link opens a new tab)

Cache Replacement Strategies

The placement of data in cache is discussed briefly in the Flush and Reload post. Here we will take a look at how data is replaced in cache. There are many different strategies for evicting data from cache when new data has to be stored in a full cache set. Some of them are (some more practical than others):

First In First Out (FIFO) - The logic behind First In First Out is that the data that came first is probably the data that won't be used again for a long time and henc

Flush and Reload

This is the second post in the series on timing analysis. If you haven't read the first post - Timing Analysis (link opens a new tab), I highly recommend reading it to understand why these attacks are possible. If you have an idea of inclusive cache hierarchy and how the OS loads a shared library into virtual memory, you can continue reading this article. In this post, we will look at Flush and Reload.

Alt: An illustration depicting the flush and reload attack using a timing diagram

Made with https://excalidraw.com/ (link opens a new tab)

Cache Placement

A cache can be designed in a few different ways. It has to store an address and the data at that address. We have to map an address to a particular cache line and, depending on the design, there is a tradeoff between data retention in case of conflict and access time. The following are the popular schemes (taught at an undergraduate level) for mapping addresses to cache lines:

Direct Mapping (1:1 associative) - Each address can be uniquely map