2020. 7. 1. 13:08ㆍComputer Architecture/Cache security
author: Yuval Yarom, Katrina Falkner, The University of Adelaide
writer: Yu-gyoung Yun
searchien@dgist.ac.kr
[Abstract]
Page sharing exposes processes to information leaks.
We demonstrate the efficacy of the FLUSH+RELOAD attack by using it to extract the private encryption keys from a victim program running GnuPG 1.4.13.
[1.Introduction]
To reduce the memory footprint of a system, the system software shares identical memory pages.
Unlike the original attack, FLUSH+RELOAD is a cross-core attack, allowing the spy and the victim to execute in parallel on different execution cores.
Two properties of the FLUSH+RELOAD attack make it more powerful, and hence more dangerous, than prior micro-architectural side-channel attacks.
1. The attack identifies access to specific memory lines, whereas most prior attacks identify access to larger classes of locations, such as specific cache sets.
2. It focuses on the LLC, which is the cache level furthest from the processor cores (i.e., L2 in processors with two cache levels and L3 in processors with three).
By observing a single signing or decryption round, the attack extracts 98.7% of the bits on average in the same-OS scenario and 96.7% in the cross-VM scenario, with a worst case of 95% and 90%, respectively.
[2. Preliminaries]
[2.1 Page Sharing]
Sharing memory can be used as an inter-process communication mechanism between two co-operating processes, and it can be used to reduce the memory footprint by avoiding replicated copies of identical contents.
As memory pages can be shared between non co-operating processes, the system must protect the contents of the pages to prevent malicious processes from modifying the shared contents.
[2.2 Cache Architecture]
Retrieving data from memory or from cache levels closer to memory takes longer than retrieving it from cache levels closer to the core. This difference in timing has been exploited for side-channel attacks.
Most prior work on cache side-channel attacks relies on the victim and spy executing within the same processing core.
One reason for that is that many of the attacks suggested require the victim to be stopped while the spy performs the attack. To that aim, the attack is combined with an attack on the scheduler that allows the spy process to interrupt and block the victim.
[2.3 RSA]
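RSA is a public-key encryption scheme that also supports digital signatures. It relies on the difficulty of factoring large numbers.
RSA has the security property that the private key cannot be derived from the public key alone.
The security of RSA rests on 1. the difficulty of factoring large numbers, 2. the difficulty of inverting the modular (remainder) operation.
Symmetric-key encryption (AES, DES): the same key is used for encryption and decryption. The sender and the receiver must share the key beforehand, which is a problem.
Symmetric key (secret key): key distribution is a problem, digital signatures are not possible, and security is weaker.
It is about 1000 times faster than asymmetric-key (public key, private key) encryption.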
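As a rough illustration of the public/private key relationship described above, here is a minimal toy RSA sketch. The primes, the exponent `e = 17`, and the helper names `make_toy_keypair` and `apply_key` are assumptions chosen for readability; real RSA uses primes of a thousand bits or more plus padding, and this is not how GnuPG implements it.

```python
from math import gcd


def make_toy_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q                      # public modulus; hard to factor when p, q are large
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)*(q-1)")
    d = pow(e, -1, phi)            # private exponent: modular inverse (Python 3.8+)
    return (n, e), (n, d)


def apply_key(key, value: int) -> int:
    """Raise value to the key's exponent modulo n (used for both directions)."""
    n, exponent = key
    return pow(value, exponent, n)


public, private = make_toy_keypair(61, 53)
message = 65

ciphertext = apply_key(public, message)      # encrypt with the public key
recovered = apply_key(private, ciphertext)   # decrypt with the private key

signature = apply_key(private, message)      # sign with the private key
verified = apply_key(public, signature) == message
```

The private exponent `d` is what the attack in the later sections tries to recover, one bit at a time.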
[3. The FLUSH+RELOAD Technique]
A round of the attack consists of three phases.
1. During the first phase, the monitored memory line is flushed from the cache hierarchy.
2. The spy, then, waits to allow the victim time to access the memory line before the third phase.
3. In the third phase, the spy reloads the memory line, measuring the time to load it.
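The three phases can be sketched with a toy cache model. This is only a simulation: `ToyCache`, `attack_round`, and the two latency constants are hypothetical and stand in for the real cache and for real cycle counts, which differ per machine.

```python
HIT_CYCLES = 40     # assumed latency of a load served from the cache
MISS_CYCLES = 250   # assumed latency of a load served from memory


class ToyCache:
    """A cache reduced to the set of memory lines it currently holds."""

    def __init__(self):
        self.lines = set()

    def flush(self, addr):
        """Evict one line, as the flush phase does."""
        self.lines.discard(addr)

    def load(self, addr):
        """Return the simulated load latency and leave the line cached."""
        latency = HIT_CYCLES if addr in self.lines else MISS_CYCLES
        self.lines.add(addr)
        return latency


def attack_round(cache, addr, victim_accesses):
    """Run one flush / wait / reload round and return the reload time."""
    cache.flush(addr)            # phase 1: flush the monitored line
    if victim_accesses:          # phase 2: wait; the victim may touch the line
        cache.load(addr)
    return cache.load(addr)      # phase 3: reload and time the access


cache = ToyCache()
time_with_access = attack_round(cache, 0x1000, victim_accesses=True)
time_without_access = attack_round(cache, 0x1000, victim_accesses=False)
```

A short reload time means the victim brought the line back into the cache during the wait; a long one means it did not.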
(A), (B): timing of the attack phases without and with a victim access.
(C): the victim access can overlap the reload phase of the spy.
The victim access will not trigger a cache fill.
Instead, the victim will use the cached data from the reload phase. --> The spy will miss the access.
(D): the reload operation partially overlaps the victim access.
The reload phase starts while the victim is waiting for the data.
The reload benefits from the victim access and terminates faster than if the data had to be loaded from memory.
However, the timing may still be longer than a load from the cache.
(E): when the monitored line is accessed frequently, for example inside a loop body, the likelihood of missing the loop is small.
The paper's probe code measures the time to read the data at a memory address and then evicts the memory line from the cache.
* It exploits the timing difference between cache hits and cache misses.
The crux of the technique is the ability to evict specific memory lines from the cache.
The clflush instruction on line 14 of that code evicts the specified memory line from the entire cache hierarchy, including the L1 and L2 caches of all cores.
Evicting the line from all cores ensures that the next time the victim accesses the memory line it will be loaded into L3.
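To turn measured reload times into hit/miss decisions, the spy needs a threshold. The sketch below is one simple way to pick it, assuming calibration samples for known hits and known misses are available. The sample values are made up for illustration, and `pick_threshold` and `classify` are hypothetical helper names, not the paper's code.

```python
from statistics import median


def pick_threshold(hit_samples, miss_samples):
    """Place the threshold halfway between the median hit and median miss time."""
    return (median(hit_samples) + median(miss_samples)) / 2


def classify(probe_times, threshold):
    """Return True for each probe served from the cache (victim accessed the line)."""
    return [t < threshold for t in probe_times]


# Illustrative calibration data in cycles; real values are machine dependent.
hit_samples = [38, 41, 44, 40, 39]
miss_samples = [240, 255, 262, 248, 251]

threshold = pick_threshold(hit_samples, miss_samples)
trace = classify([42, 250, 39, 247], threshold)
```

Using medians rather than means keeps occasional outliers, such as a probe delayed by an interrupt, from shifting the threshold.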
[4. Attacking GnuPG]
The goal is to extract the components of the private key from the GnuPG implementation of RSA.
The approach we take is to trace the execution of the victim program.
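What tracing reveals can be illustrated with a square-and-multiply exponentiation that records each operation it performs. This is a simplified sketch, not GnuPG's code: `traced_modexp` and the one-letter labels (`S` square, `R` reduce, `M` multiply) are assumptions for illustration, and the loop processes every exponent bit from a starting value of 1.

```python
def traced_modexp(base, exponent, modulus):
    """Left-to-right square-and-multiply that logs the operations it executes."""
    trace = []
    result = 1
    for bit in bin(exponent)[2:]:      # exponent bits, most significant first
        result = result * result       # square: executed for every bit
        trace.append("S")
        result %= modulus              # reduce after the square
        trace.append("R")
        if bit == "1":
            result = result * base     # multiply: executed only for set bits
            trace.append("M")
            result %= modulus          # reduce after the multiply
            trace.append("R")
    return result, trace


value, trace = traced_modexp(7, 0b1011, 1000)
```

Because the multiply step runs only for set bits, the order of operations depends on the secret exponent, which is what a spy monitoring the code of these functions can observe.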
In Fig. 6, the displayed area is below the threshold, so the diagram only displays the memory lines that were retrieved from the cache, showing the activity of the GnuPG encryption.
Between time slots 3,903 and 3,906 the calculated sequence is Square-Reduce, which is followed by a Square, indicating that in these time slots the victim was processing a clear bit.
Figure 7 also demonstrates the effects of speculative execution.
By recognising sequences of operations, an attacker can recover the bits of the exponent.
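The decoding step can be sketched as follows, assuming a noise-free trace in which `S`, `R`, and `M` stand for Square, Reduce, and Multiply. The function name `recover_bits` and the trace encoding are hypothetical; a real trace has missed and spurious observations that this sketch does not handle.

```python
def recover_bits(ops):
    """Decode a Square/Reduce/Multiply trace into exponent bits.

    Square-Reduce-Multiply-Reduce is read as a set bit; Square-Reduce
    followed directly by the next Square (or the end) is a clear bit.
    """
    sequence = "".join(ops)            # accept a string or a list of labels
    bits = []
    i = 0
    while i < len(sequence):
        if sequence[i:i + 2] != "SR":
            raise ValueError(f"unexpected operation at position {i}")
        if sequence[i + 2:i + 4] == "MR":
            bits.append(1)             # a multiply followed the square
            i += 4
        else:
            bits.append(0)             # no multiply before the next square
            i += 2
    return bits


bits = recover_bits("SRMRSRSRMRSRMR")
```

For the example trace, `bits` is `[1, 0, 1, 1]`: only the second group lacks a multiply.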
Table 1 shows the time slots corresponding to each bit.
The shell script overestimates the number of errors.
On the HP machine we observe better results and significantly less noise than on the Dell machine.
[7. Conclusions]
The paper shows how to use the FLUSH+RELOAD technique to extract GnuPG private keys across multiple processor cores and across virtual machine boundaries.
GnuPG is a very popular cryptographic package.
Hence, vulnerable versions of GnuPG are not safe for multi-tenant systems or for any system that may run untrusted code.
The FLUSH+RELOAD technique exploits the lack of restrictions on the use of the clflush instruction.
Preventing page sharing also blocks the FLUSH+RELOAD technique.
We, therefore, recommend that memory de-duplication be switched off.