Abstract

HIGHT is a lightweight block cipher that has been adopted as a standard block cipher. In this paper, we present a bit-level algebraic fault analysis (AFA) of HIGHT, where the faults are injected by a stealthy hardware Trojan (HT). The fault model in our attack assumes that the adversary is able to insert a HT that flips a specific bit of a certain intermediate word of the cipher once the HT is activated. The HT is realized with merely 4 registers and has an extremely low activation rate of about 0.000025. We show that the optimal location for inserting the designed HT can be efficiently determined by AFA in advance. Finally, a method is proposed to represent the cipher and the injected faults with a merged set of algebraic equations, and the master key can be recovered by solving the merged equation system with an SAT solver. Our attack, which fully recovers the secret master key of the cipher in 12572.26 seconds, requires three activations of the designed HT. To the best of our knowledge, this is the first Trojan attack on HIGHT.

1. Introduction

Resource-constrained devices such as RFID tags and smart cards are pervasively used in the daily activities of human society, such as intelligent transportation, modern logistics, and food safety [1, 2]. As these devices have inherent constraints in storage space, computation ability, and power supply, modern cryptographic primitives like DES, AES, or RSA are difficult to deploy on them. Hence, lightweight cryptography, which aims at designing and implementing security primitives that fit the needs of low-resource devices, has been researched on a large scale [3]. In particular, the lightweight block cipher is one of the most studied primitives and has been extensively explored in numerous prior papers. Many lightweight block ciphers exist, such as PRESENT [4], LED [5], SIMON [6], mCrypton [7], and HIGHT [8, 9].

A hardware Trojan is a circuit maliciously inserted into an integrated circuit (IC) that typically functions to deactivate the host circuit, change its functionality, or provide covert channels through which sensitive information can be leaked [10, 11]. Hardware Trojans can be implemented as hardware modifications to ASICs, commercial-off-the-shelf (COTS) parts, microprocessors, microcontrollers, network processors, or digital-signal processors (DSPs) and can also be implemented as firmware modifications to, for example, FPGA bitstreams [12]. An adversary is expected to make a Trojan stealthy in nature, that is, to evade detection by methods such as postmanufacturing test, optical inspection, or side-channel analysis [13–15]. Due to the outsourcing trend in semiconductor design and fabrication, hardware Trojan attacks have emerged as a major security concern for ICs [13].

Differential Fault Analysis (DFA) [16] was one of the earliest techniques invented to attack block ciphers by provoking a computational error. DFA retrieves the secret key based on information about the characteristics of the injected faults and the difference between the correct and faulty ciphertexts. However, since DFA relies on manual analysis, it has inherent limitations in scenarios of very high complexity, for example, when faults are located in deeper rounds of the cipher or when the exact location of the injected faults in a deep round is unknown.

At eSmart 2010, Courtois and Pieprzyk combined algebraic cryptanalysis [17] with fault analysis to propose a more powerful fault analysis technique called algebraic fault analysis (AFA) [18]. The basic idea of AFA is to convert both the cipher and the injected faults into algebraic equations and to recover the secret key with automated solvers such as SAT solvers, instead of the manual analysis of fault propagations used in DFA. This makes it easier to extend AFA to deep rounds and to different ciphers and fault models. AFA has been successfully used to improve DFA on stream ciphers such as Trivium [19] and Grain [20] and on block ciphers such as AES [21], LED [22, 23], KASUMI [24], and Piccolo [25].

1.1. Motivation

HIGHT is a lightweight block cipher that has attracted a lot of attention because it is constructed by only ARX operations (modular addition, bitwise rotation, bitwise shift, and XOR), which exhibits high performance in terms of hardware compared to other block ciphers. HIGHT has been selected as a standardized block cipher by Telecommunications Technology Association (TTA) of Korea and ISO/IEC 18033-3 [9].

Both DFA and AFA require high precision in the fault injection in terms of location and timing. In practice, low-cost fault injection techniques like reduction of the feeding voltage or clock manipulation do not achieve the required accuracy, while highly precise methods such as pinpointed irradiation of desired fault sites by intensive laser light are difficult to perform and require costly equipment [26]. However, if the adversary is able to insert a hardware Trojan (HT) into the underlying cryptographic hardware [10], AFA can be easily achieved. A well-designed HT can precisely inject any type of fault to enable AFA and can evade detection by having low cost and a low activation rate.

In addition, since the design of lightweight block ciphers is compact, especially for HIGHT, whose construction is based only on ARX operations, it is simple to represent the cipher as a set of algebraic equations. It is also easier to implant hardware Trojans into devices that adopt such lightweight algorithms, because these devices are normally used in RFID systems and composed of various IPs, and they are typically designed and manufactured by offshore design houses or foundries. In theory, any party involved in the design or manufacturing stages can make alterations in the circuits for malicious purposes [15], and thus these circuits are more vulnerable to algebraic fault attacks that inject faults by triggering a HT.

1.2. Contribution

In this paper, we show that the lightweight block cipher HIGHT is prone to algebraic fault analysis, which becomes feasible with a stealthy HT. The proposed analysis of HIGHT is implemented on a SASEBO-GII board equipped with a 65 nm Virtex-5 FPGA [27] and recovers the 128-bit secret master key with only 3 faults. The main contributions of the paper are summarized as follows:

(1) We design a stealthy FSM-based HT with an overhead of 4 flip-flops, which is a 1.63% additional cost in flip-flops for HIGHT implemented on the SASEBO-GII board, and with an extremely low activation rate of about 0.000025. The HT enables the adversary to induce a single-bit fault precisely in both location and time when it is activated and thus makes the bit-level AFA efficient.

(2) Some properties of faults are given to maximize the utilization of the fault leakages and show that the adversary can predetermine the optimal location for the HT by AFA to maximize the attack efficiency.

(3) A very simple and efficient method is proposed to describe HIGHT and the injected faults as a merged set of algebraic equations and transform the problem of searching for the secret master key into solving the merged equation system with an SAT solver.

(4) It is proven that the lower bound for the number of the required faults is 3 and an efficient distinguisher is proposed to uniquely determine the secret master key.

1.3. Organization

The rest of this paper is organized as follows. Section 2 introduces the related works. Section 3 lists the notations used in the paper and briefly describes the HIGHT algorithm and the overview of the attack. Section 4 presents some important properties of the faults and the details of the HT are given in Section 5. Then, Section 6 describes our attack on HIGHT and the experimental results are shown in Section 7. Finally, Section 8 concludes the paper.

2. Related Work

Since the proposal of HIGHT, there have been many studies on its security. The preliminary security analysis [8], conducted during the HIGHT design process, assesses the cipher with respect to different cryptanalytic attacks such as differential cryptanalysis, related-key attack, saturation attack, and algebraic attack, and the designers claim that at least 20 rounds of HIGHT are secure against these attacks. In 2007, however, Lu [28] presented the first public cryptanalysis of reduced versions of HIGHT, which indicates that the reduced versions of HIGHT are less secure than the designers claimed. In 2009, Lu's results were improved by Özen et al. [29], who presented an impossible differential attack on 26-round HIGHT and a related-key impossible differential attack on 31-round HIGHT. At CANS 2009, Zhang et al. [30] presented a 22-round saturation attack on HIGHT including full whitening keys with $2^{62.04}$ chosen plaintexts and $2^{118.71}$ 22-round encryptions. The first attack on full HIGHT was proposed by Koo et al. at ICISC 2010 [31], using a related-key rectangle attack based on a 24-round related-key distinguisher with a data complexity of $2^{57.84}$ chosen plaintexts and a time complexity of $2^{123.17}$ encryptions. The second attack on full HIGHT was proposed by Hong et al. at ICISC 2011 [32]: a biclique cryptanalysis of the full HIGHT that recovers the 128-bit secret master key with a computational complexity of $2^{126.4}$, faster than exhaustive search. In [33], Lee et al. present the first DFA against HIGHT. In this attack, the authors claim that the full secret master key of HIGHT can be recovered in a few minutes or seconds with a success rate of 96%, computational complexity of , and memory complexity of by injecting 12 faults based on a random byte fault model.

The main idea of this attack is to collect pairs of correct and faulty ciphertexts by injecting adequate faults and use them to distinguish where the faults are injected. Once the fault locations are determined, a number of equations can be built based on manual analysis of the fault propagations to filter out the wrong subkey candidates and thus to recover the secret master key. However, since the adversary analyzes fault propagations and filters out wrong subkey candidates manually, the fault leakages are not maximally utilized and the attack can be further improved.

In this paper, we elaborate an algebraic fault analysis of HIGHT with a stealthy HT. The fault model we choose in this attack is the one in which the adversary is assumed to inject a single-bit fault precisely in both location and the time of the disturbance by a HT which is activated just by choosing certain plaintexts. The attack converts both the cipher and the injected faults into algebraic equations automatically and recovers the secret master key with an SAT solver. The attack recovers the secret master key with a success rate of 96% within 12,572.26 seconds and requires only 3 faults. We summarize our results as well as the major previous results in Table 1.

3. Preliminaries

In this section, the notations used in the paper are listed in Section 3.1. Then, we briefly describe the HIGHT algorithm in Section 3.2 and the overview of the attack is given in Section 3.3.

3.1. Notations

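In the rest of the paper, the following notations are used:
1. $x \boxplus y$: means $(x + y) \bmod 2^8$, where $0 \le x, y < 2^8$.
2. $\oplus$, $\|$: bitwise XOR and concatenation operations.
3. $A^{\lll s}$: $s$-bit left rotation of an 8-bit value $A$.
4. $*$: sign for denoting faulty ciphertext or intermediate values.
5. $P$, $C$, $C^*$: the 64-bit plaintext, ciphertext, and faulty ciphertext.
6. $MK$: the 16 bytes master key.
7. $WK_i$: the whitening keys, $0 \le i \le 7$.
8. $SK_i$: the round keys, $0 \le i \le 127$.
9. $X_i$: the 64-bit input of the $(i+1)$th round, $0 \le i \le 31$.
10. $X_{i,j}^{k}$: the $k$th bit of $X_{i,j}$, $0 \le j, k \le 7$.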

3.2. Brief Description of HIGHT Cipher

HIGHT is a lightweight block cipher with 64-bit block length and 128-bit key length. The encryption process of HIGHT is as follows.

(1) The KeySchedule is performed to generate the 8 whitening key bytes $WK_i$ and the 128 subkey bytes $SK_i$ from the master key $MK$.

(2) The InitialTransformation is performed to transform the 64-bit plaintext $P$ into the input $X_0$ of the first round by using the four whitening key bytes $WK_0$, $WK_1$, $WK_2$, and $WK_3$.

(3) For $i = 0, \ldots, 31$, the RoundFunction is performed to transform $X_i$ into $X_{i+1}$ with the help of the two auxiliary functions $F_0$ and $F_1$.

(4) The FinalTransformation transforms $X_{32}$ into the ciphertext $C$.

For the complete description of HIGHT, including the defining equations of each step, the reader is referred to [8, 9].
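As a rough illustration of step (3), the following Python sketch implements the two auxiliary functions as specified in [8] and one round of the generalized Feistel structure. The helper names (`rotl8`, `f0`, `f1`, `round_function`) are hypothetical, and the pairing of the four subkey bytes with the four branches is an assumption that should be checked against [8, 9].

```python
def rotl8(x, s):
    """Rotate the 8-bit value x left by s bits (1 <= s <= 7)."""
    return ((x << s) | (x >> (8 - s))) & 0xFF

def f0(x):
    """Auxiliary function F0: XOR of the rotations by 1, 2, and 7 bits."""
    return rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 7)

def f1(x):
    """Auxiliary function F1: XOR of the rotations by 3, 4, and 6 bits."""
    return rotl8(x, 3) ^ rotl8(x, 4) ^ rotl8(x, 6)

def round_function(x, sk):
    """One round: x[j] is byte X_{i,j}, sk holds the four subkey bytes of the round.

    Assumption: sk[0] goes with F1(X_{i,0}), sk[1] with F0(X_{i,2}),
    sk[2] with F1(X_{i,4}), and sk[3] with F0(X_{i,6}).
    """
    y = [0] * 8
    # Even-indexed bytes are copied to the next odd position.
    y[1], y[3], y[5], y[7] = x[0], x[2], x[4], x[6]
    # Odd-indexed bytes are mixed with F0/F1 of their neighbour and a subkey.
    y[0] = x[7] ^ ((f0(x[6]) + sk[3]) & 0xFF)
    y[2] = (x[1] + (f1(x[0]) ^ sk[0])) & 0xFF
    y[4] = x[3] ^ ((f0(x[2]) + sk[1]) & 0xFF)
    y[6] = (x[5] + (f1(x[4]) ^ sk[2])) & 0xFF
    return y
```

With all-zero subkeys, a state whose only nonzero byte is $X_{i,0}$ affects exactly two bytes of the next state, which is the behaviour exploited later when tracing fault propagation.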

3.3. Overview of the Attack

As illustrated in Figure 1, our attack consists of four steps.

(1) Inducing the Designed HT in a Selected Location. The task of this step is to design a HT and insert it in the cipher chip. The optimal location of inserting a HT should be in a deeper round to enable the fault to involve the whole master key bytes during its propagation. It also should ensure that the injected HT escapes detections by having low cost and with extremely low activation rate.

(2) Constructing Boolean Equations for the Cipher. In this step, the target cipher and its key schedule are described by a set of Boolean equations, which contain unknowns (master key bits, whitening key bits, subkey bits, and intermediate variables) and constants (plaintext and ciphertext bits). The most important and difficult part of this step for HIGHT is to describe nonlinear operations like addition mod $2^8$ and complicated linear functions like $F_0$ and $F_1$.

(3) Constructing Boolean Equations for the Faults. After the fault injections, the faults are also represented with a set of Boolean equations. It is obvious that the more secret variables this set contains, the more master key bits can be recovered. Therefore, the key point of this step is how to make the fault equations contain as many of the secret variables involved in the fault propagation as possible, in an efficient and simple way.

(4) Solving the Algebraic Equation System. The problem of searching for the secret master key is now transformed into solving the merged system of cipher equations and fault equations. Many automatic tools [25, 34–37] can be leveraged.

4. Some Properties of the Faults

This section is devoted to presenting the fault properties, which are helpful to our attack. For the sake of simplicity, we denote the deduction of from by equation by

Property 1. Assume that a fault was induced to , then define as the set of subkey bytes and whitening key bytes that were involved by the fault during its propagation from round to FinalTransformation, where , , , , and . Then we have

Proof. Without loss of generality, we assume that the fault was induced to .
(1) For , , and , the fault will propagate to and in the next round as shown in Figure 2; thus we have .
When the fault was injected in , the fault will propagate to and in the final round. Then, we have .
(2) For , , and , the fault will only propagate to in the next round as shown in Figure 3; thus we have .
In the similar way, the fault will propagate to and in the next round for . Then, we have .
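The byte-level propagation pattern used in this proof can be sketched structurally. The snippet below is only an illustration under the round structure of [8]: it tracks which byte positions may differ after each round and ignores the actual values and subkeys. The function names are hypothetical.

```python
def next_active(active):
    """Byte positions that may differ after one round, given those that differ before it."""
    out = set()
    for j in active:
        # Every byte moves (possibly mixed with other data) to position j + 1.
        out.add((j + 1) % 8)
        if j % 2 == 0:
            # Even-indexed bytes also enter F0 or F1 and disturb position j + 2.
            out.add((j + 2) % 8)
    return out

def rounds_to_full_diffusion(start_byte):
    """Number of rounds until all eight byte positions may be affected."""
    active, rounds = {start_byte}, 0
    while len(active) < 8:
        active = next_active(active)
        rounds += 1
    return rounds

print(sorted(next_active({0})))  # a fault in byte 0 reaches bytes 1 and 2
print(sorted(next_active({1})))  # a fault in byte 1 reaches byte 2 only
```

This is an over-approximation of the true propagation (differences could cancel), which is the usual conservative choice when deciding which variables to include in the fault equations.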

Property 2. Assume that a fault was induced to , then define as the set of the master key bytes that were involved during the propagation of the fault. Then we have the following conclusion.

Proof. From Section 3.2, for the FinalTransformation of HIGHT, we have the following formula: For the KeySchedule of HIGHT, we have the following formula: For (14)~(15), we have Thus, we have and . Moreover, according to Property 1, we have the desired conclusions, which are shown in Table 2.
Note that, to fully recover the master key, the entire master key bytes must be included in the merged equation system. That is, the master key bytes can possibly be recovered only for the case that .

Property 3. Given that a single-bit fault is inserted in , the fault propagation paths are shown in Figure 4. The intermediate words , , , , , and are all corrupted that , , , , and . Then the intermediate words are included in More generally, if we use 8-bit words and to denote the inputs of modular addition, and to denote the difference of the inputs, and to denote the corresponding output difference in (18), then the above two equations can be simplified as That is, the intermediate words, whitening keys and subkeys can be recovered by solving the two equations.

5. The Proposed Trojan Circuit

In this section, we give the details of the HT. In general, a hardware Trojan consists of two parts: trigger logic (TL) and payload logic (PL). The TL judges whether the values of signal lines and states meet the activation condition, which is set by the adversary in advance. Once the activation condition is satisfied, the PL executes the attack. A Trojan circuit may deactivate the host circuit (denial of service), change its functionality, or provide covert channels through which the protected secret information can be leaked.

5.1. Assumption of Trojan Circuits

In this paper, we make three assumptions about the design of hardware Trojan circuits.

(1) HIGHT is implemented in a cryptographic intellectual property (IP) with advanced protections like sensors from an untrusted IP vendor or system integrator. The prototype is on a Xilinx FPGA device implementing a cryptographic IP. In fact, it is a common practice to deploy physical sensors alongside cryptographic IP in industrial designs.

(2) The adversary is assumed to be able to choose the plaintext to be encrypted. He is also assumed to be able to insert a smart but functional hardware Trojan at the Register Transfer Level (RTL) by modifying either the RTL or the corresponding logic elements in the post-place-and-route netlist. However, he only has access to the Xilinx Design Language (XDL) file and no access to the design stage.

(3) The hardware Trojan is designed to introduce a fault by flipping only one bit of a certain intermediate word of the cipher when it is activated.

5.2. Trigger Design

The FSM-based Trojans [38] have two prominent advantages over many other Trojans: one is that they can be designed to be arbitrarily complicated with the same amount of resources and can reuse both combinational logic and flip-flops of the original circuit, and the other is that the FSM-based Trojans are bidirectional which means they can have state transitions leading back to the previous or initial state, thus causing the final Trojan state to be reached only if the entire state sequence is satisfied in consecutive clock cycles. The above two advantages both make the FSM-based Trojans harder-to-detect than other Trojans.

As shown in Figure 6, to design a hard-to-detect Trojan circuit, the TL of the proposed HT is based on a finite state machine (FSM). In this FSM, a 3-bit register is used to store the current state. The Trojan circuit undergoes state transitions according to a state transition diagram that is defined by the adversary in advance and shown in Figure 5. Only the adversary knows the predefined state transition diagram. The 3-bit input is derived randomly from any three of the four 8-bit intermediate words $X_{0,1}$, $X_{0,3}$, $X_{0,5}$, and $X_{0,7}$, and it serves as the transition condition of the FSM. If the input agrees with the current state, the FSM transitions to the next state; otherwise the FSM goes back to the previous state. When the FSM reaches the final state, the Trojan output is activated (the signal act is "1") and the PL causes a single-bit fault in the original circuit. In the next clock cycle, the Trojan automatically goes back to the initial state; thus the Trojan can be disguised as a random fault.
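The trigger behaviour described above can be modelled in a few lines. The following Python sketch is a behavioural model only, not the RTL of the HT: the class name, the `sequence` of transition conditions, and the handling of the initial state are assumptions, since the actual state transition diagram (Figure 5) is known only to the adversary.

```python
class TrojanTrigger:
    """Behavioural model of an 8-state FSM trigger with a 3-bit state register."""

    FINAL = 7  # last of the 8 states that a 3-bit register can hold

    def __init__(self, sequence):
        # Hypothetical secret: the 3-bit input required to leave each of states 0..6.
        assert len(sequence) == self.FINAL
        self.sequence = list(sequence)
        self.state = 0

    def clock(self, value):
        """Advance one clock cycle with a 3-bit input; return True when act is '1'."""
        if self.state == self.FINAL:
            # After firing, fall back to the initial state in the next cycle.
            self.state = 0
            return False
        if value == self.sequence[self.state]:
            self.state += 1
        else:
            # A wrong input sends the FSM back to the previous state.
            self.state = max(self.state - 1, 0)
        return self.state == self.FINAL
```

A trigger built with `TrojanTrigger([3, 1, 4, 1, 5, 2, 6])` fires only on the cycle in which the seventh matching input arrives, and any mismatch along the way pushes the state back by one step.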

Since a 3-bit register is able to store 8 different states, the test space for activating the trigger logic is 8! (> $2^{15}$); that is, the probability of activating the HT is Pr = 1/8! ≈ 0.000025, which is extremely low. However, according to InitialTransformation (see (5)), the four intermediate words $X_{0,1}$, $X_{0,3}$, $X_{0,5}$, and $X_{0,7}$ are directly determined by the plaintext bytes $P_1$, $P_3$, $P_5$, and $P_7$. Hence, the adversary can trigger the HT by carefully choosing $P_1$, $P_3$, $P_5$, and $P_7$. The total logic overhead of the implemented trigger logic is three flip-flops and four 3-input LUTs.
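The size of the test space and the resulting activation probability can be checked with a short calculation. The variable names are illustrative only.

```python
import math

# Size of the test space claimed for the 8-state trigger: 8! orderings.
test_space = math.factorial(8)

# Probability of hitting the activation condition by chance.
activation_probability = 1 / test_space

print(test_space)                # 40320
print(test_space > 2 ** 15)      # True, since 2**15 = 32768
print(f"{activation_probability:.6f}")
```

The printed probability rounds to 0.000025, matching the activation rate quoted for the HT.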

5.3. Payload Design

For clarity, the th encryption and plaintext are denoted as and , respectively. A pair of correct and faulty ciphertexts () is required to be collected for the same plaintext Pm. The payload component PL(A) is designed to inject a single-bit fault in round during . When the HT is triggered by carefully choosing some certain plaintexts, a “1” is stored in the flip-flop which waits for the target round . A signal Rflag, derived from state machine, indicates whether the current round is the target round or not. The value of () is determined by AFA which will be described in detail in Section 7.2.1. Once the Trojan is triggered, the th bit of , that is, , is flipped due to PL(A) in . This is realized by function as shown in Figure 6. The total costs of implementing the payload logic are a flip-flop and a 3-input payload gate that can be implemented by 1 LUT in both 4-input and 6-input FPGA series.
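A 3-input payload gate of this kind can be modelled as an XOR of the target bit with the AND of the two control signals. This is an assumed reading of the gate in Figure 6, and the function and parameter names below are hypothetical.

```python
def payload(word, k, act, rflag):
    """Flip bit k of an 8-bit word only when both act and rflag are 1.

    Assumption: the payload gate computes bit XOR (act AND rflag), where act is
    the stored trigger output and rflag marks the target round.
    """
    return word ^ ((act & rflag) << k)

print(hex(payload(0x00, 7, 1, 1)))  # fault injected into the most significant bit
print(hex(payload(0x5A, 7, 1, 0)))  # not the target round: word unchanged
```

When either control signal is 0 the gate is transparent, so the circuit behaves exactly like the original outside the target round.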

6. The AFA with a HT of HIGHT

6.1. The Optimal Location Selection

Let be the location where the HT is inserted, , , and . In order to search the optimal location, four properties are desired:

(1) Note that the secret master key can be recovered only for the case that they are involved during the fault propagation; thus the number of elements in should be equal to 16.

(2) The required number of faults to recover the secret master key and the reduced key search space after the injection to should be both minimized to make the attack more practical.

(3) The average time of the solver to solve the merged equation system should be minimized to increase the effectiveness of the attack.

(4) The location should be in a deeper round to maximally utilize the fault leakages and to evade detection.

In order to search for the optimal bit location for the HT, AFA is used to enumerate every possible candidate location. These attempts are conducted in advance, which can guide the logic design of the HT and reduce costs. Since AFA is executed as machine-based automation, all possible key candidates are eventually checked along the fault propagation paths, so the utilization of fault leakages is maximized. The automation shows its advantage over traditional manual analysis, such as DFA, especially when the analysis goes into deeper rounds.

6.2. Constructing Algebraic Equations for Encryption of HIGHT

The task of this stage is to represent HIGHT cipher with a large system of low degree Boolean equations. Suppose and are the 64-bit input of round and ciphertext, respectively. Since the key schedule of HIGHT is very simple, we mainly focus on the encryption of HIGHT which is shown in Algorithm 1. From Algorithm 1, the most important yet difficult problem is to construct the equations for ARX operations.

[Algorithm 1: the encryption of HIGHT. The listing is not available.]

It is stressed that in general the adversary will not choose a very deep round as the target round. That is, the number of rounds between the target round and the FinalTransformation is not very large. Therefore, instead of constructing equations for the full rounds of the cipher, we only construct equations for the rounds from the target round to the FinalTransformation, which results in a smaller equation script and thus accelerates the solving procedure.

According to Algorithm 1, for every fault that is injected in , there are variables and ANF equations were introduced to the equation system . In addition, variables and ANF equations are required for round keys, 64 variables and ANF equations are for the whitening keys, and 128 variables and ANF equations are for the master keys.

6.2.1. The Equations for Addition mod $2^8$

Assume , , are the two inputs and output of addition modulo , where , , and with , , and being the least significant bit, respectively. Then addition modulo can be described as Boolean equations as follows:
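One standard way to write addition modulo $2^8$ as low-degree Boolean equations is a ripple-carry chain with auxiliary carry variables: $z_0 = x_0 \oplus y_0$, $c_0 = x_0 y_0$, and, for $1 \le i \le 7$, $z_i = x_i \oplus y_i \oplus c_{i-1}$ and $c_i = x_i y_i \oplus c_{i-1}(x_i \oplus y_i)$. The symbols $x_i$, $y_i$, $z_i$, and $c_i$ are assumptions for illustration and the formulation may differ from the one intended here. The sketch below checks these equations exhaustively against integer addition.

```python
def to_bits(v):
    """Little-endian bit list of an 8-bit value (index 0 is the least significant bit)."""
    return [(v >> i) & 1 for i in range(8)]

def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))

def add_mod256_bits(x_bits, y_bits):
    """Evaluate the ripple-carry Boolean equations; the final carry is discarded."""
    z_bits, carry = [], 0
    for x, y in zip(x_bits, y_bits):
        z_bits.append(x ^ y ^ carry)           # z_i = x_i + y_i + c_{i-1} over GF(2)
        carry = (x & y) ^ (carry & (x ^ y))    # c_i = x_i*y_i + c_{i-1}*(x_i + y_i)
    return z_bits

# Exhaustive check over all 65536 input pairs.
all_ok = all(
    from_bits(add_mod256_bits(to_bits(a), to_bits(b))) == (a + b) % 256
    for a in range(256)
    for b in range(256)
)
print(all_ok)
```

Each equation has degree at most 2, and the last carry $c_7$ never appears in an output bit.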

6.2.2. The Equations for $F_0$ and $F_1$

Given that the input and output of and are and , respectively, then and can be described as the following Boolean equations:
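Because $F_0$ and $F_1$ are XORs of rotations, each output bit is the XOR of three input bits. The sketch below derives these linear equations from the rotation amounts given in [8] and checks them against a word-level computation. The helper names and the bit indexing (bit 0 is the least significant bit) are assumptions.

```python
def rotl8(x, s):
    """Rotate the 8-bit value x left by s bits (1 <= s <= 7)."""
    return ((x << s) | (x >> (8 - s))) & 0xFF

F0_ROT = (1, 2, 7)  # rotation amounts of F0
F1_ROT = (3, 4, 6)  # rotation amounts of F1

def linear_equations(rotations):
    """For each output bit i, the input bit indices that are XORed together."""
    # A left rotation by s moves input bit (i - s) mod 8 to output bit i.
    return [sorted((i - s) % 8 for s in rotations) for i in range(8)]

def apply_equations(equations, x):
    y = 0
    for i, taps in enumerate(equations):
        bit = 0
        for t in taps:
            bit ^= (x >> t) & 1
        y |= bit << i
    return y

def word_level(rotations, x):
    y = 0
    for s in rotations:
        y ^= rotl8(x, s)
    return y

for i, taps in enumerate(linear_equations(F0_ROT)):
    print(f"y{i} = " + " + ".join(f"x{t}" for t in taps))
```

The printed lines are the eight linear equations of $F_0$ over GF(2); calling `linear_equations(F1_ROT)` gives those of $F_1$.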

6.3. Constructing Equations for the Injected Faults

This stage illustrates the method of constructing equations for the injected faults. To clarify the method, the example is shown in Figure 7.

Given that every time the HT was activated, a single-bit fault was introduced to flip the most significant bit of . The fault propagation paths are shown by bold line in Figure 7. The correct and faulty 64-bit inputs to the th round are denoted by and , respectively. Then, the complex fault propagation paths can be described as a set of algebraic equations with the variables that were involved. Since the fault flips the most significant bit of , we havewhere . For the fault propagation paths that from round 25 to the FinalTransformation, they can be described by equations as Algorithm 2.

[Algorithm 2: constructing the equations for the injected faults. The listing is not available.]

Algorithm 2 constructs the equations for the injected faults. The main idea is that every time a fault was induced, the intermediate variables from round to round 32 were viewed as new variables . Then, we reconstruct the equations for the encryption by replacing with . Furthermore, for variables that were not involved along the fault propagation paths which can be deduced by the function SearchFaultyInterVal(), we have . Thus, there are variables and ANF equations were introduced to the equation system for every fault that was injected in .

The function SearchFaultyInterVal() searches the faulty intermediate variables automatically according to the fault location $X_{r,i}$ and finally returns them. The main idea is explained earlier in Section 3. Algorithm 3 describes the procedure.

[Algorithm 3: the function SearchFaultyInterVal(). The listing is not available.]

6.4. Solving the Equations System

After activating the inserted HT to introduce single-bit faults and constructing the merged equation system for both the cipher and the faults, the whole secret master key can be fully recovered by solving the merged equation system with an automatic solver. SAT-based solvers [39, 40] have a prominent advantage in memory usage when solving large equation systems over many other automatic tools, such as the mutantXL algorithm [35, 37] and Gröbner basis-based solvers [41], and significant improvements have recently been made to them. We have therefore chosen CryptoMiniSAT v4.4, a DPLL-based SAT solver developed from MiniSAT, to solve the equation system. Readers can refer to [25, 35, 40, 42] for details of how to generate equations and how to feed them to the solvers.
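To feed XOR-heavy equations to a SAT solver, each XOR constraint has to be expressed as clauses. The following sketch shows the textbook direct encoding of a short XOR constraint into CNF with DIMACS-style signed integers. It is an illustration, not the conversion used in the cited works, and the function names are hypothetical.

```python
from itertools import product

def xor_to_cnf(variables, rhs):
    """Clauses equivalent to XOR(variables) == rhs, with rhs in {0, 1}."""
    clauses = []
    for values in product([0, 1], repeat=len(variables)):
        if sum(values) % 2 != rhs:
            # This assignment violates the constraint, so forbid it with one clause.
            clauses.append([-v if val else v for v, val in zip(variables, values)])
    return clauses

def satisfies(clauses, assignment):
    """Check a CNF against an assignment given as {variable: bool}."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )

print(xor_to_cnf([1, 2, 3], 0))
```

The encoding needs $2^{n-1}$ clauses for an $n$-variable XOR, so long XORs are usually split into short ones with auxiliary variables before conversion.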

7. Theoretical and Simulation Results

In order to verify the effectiveness of the proposed attack on HIGHT and optimize the implementation of HT, we conduct many experiments and report the results in this section.

In the phase of searching for the optimal location for the HT, we conduct the fault injection with software-level simulations. The HIGHT software implementation is written in C, and the CryptoMiniSAT 4.4 solver runs on a PC with an Intel Core i7-4790 at 3.60 GHz, 12 GB of memory, and a Windows 7 64-bit OS. An instance refers to one run of our attack on a set of (P, MK, C). The instance fails if the solver does not give an output within 48 hours (172800 seconds). In the online phase, the HIGHT hardware implementation and the HT both run on a SASEBO-GII board equipped with a 65 nm Virtex-5 FPGA.

7.1. Data Complexity Analysis

Our aim is to fully recover the entire master key, which is mainly depending on solving the two equations (19) and (20) in Section 4. Our task is to investigate the number of queries () and () to solve the equations. We notice that this issue was already explored from a theoretical point of view in [41]. And a worst case lower bound on the number of queries () to solve (16) is a constant 3, and the corresponding number of queries () to solve (20) in the worst case is (), where t is the position of the least significant “1” of and .

Additionally, we use N to denote the amount of faults required to recover the entire master key bits. In our case, every encryption the master key bits are assumed to be fixed while the plaintext was chosen randomly by the adversary. And since queries () and () are introduced by activating the HT to flip one fixed bit of a certain intermediate word, the lower bound on the number of HT activated in the worst case is .

7.2. Experimental Results
7.2.1. Cost-Optimization Implementation of the HT

6-input LUT is the mainstream look-up-table (LUT) architecture widely used from the 65 nm Virtex-5 FPGAs to the 20 nm Ultrascale FPGAs. In these devices, slice is the fundamental logic unit and each slice contains four 6-input LUTs. A single 6-input LUT is able to implement either one Boolean equation up to 6 inputs or two Boolean equations with no more than 5 different input signals in total. The structure of the 6-input LUT is shown in Figure 8.

According to Section 5, both the payload gate which is illustrated in Figure 6 and the LUTs required to implement the trigger logic have 3 inputs. Moreover, occupied LUTs with 3 or less used inputs can be found by searching the XDL. In this stage, the payload gate and the required four 3-input LUTs can be implemented by five arbitrarily occupied LUTs with no more than 3 used inputs, by just modifying the corresponding slice instances on XDL. Since the Trojan LUTs are implemented with existing logic, the eventual cost is 4 extra flip-flops. The experiment result is shown in Table 3, which reports a 1.63% additional cost in flip-flops for the HT implemented on a 65 nm Virtex-5 FPGA.

7.2.2. The Optimal Location Selection for Inserting the HT

(1) Determining k. According to Section 5.1, to make the designed HT stealthy and reduce the costs, the HT is designed to flip only one bit of the 8-bit intermediate word when it is activated. Since the lower bound on the number of queries () to solve (16) in the worst case is inversely proportional to the size of , we choose the most significant bit of the 8-bit intermediate word to be flipped. That is, . And in that case, the resulting is minimum.

(2) Determining (). According to Property 3 of the faults (Section 4), only when the entire master key bytes can be involved during the fault propagation. Thus, the HT should be inserted in to ensure the entire secret master key variables are included in the algebraic equations of the injected faults (Section 6.3). And there are 5 candidate locations and each of them is tested by AFA to get the optimal location for the HT.

According to the equations for addition mod $2^8$ in Section 6.2.1, the most significant bits of the two inputs of the modular addition in (19) and (20) can never be recovered, because no observation can be made about the most significant bit of $\gamma$. Thus, there are multiple solutions for (19) and (20); that is, multiple candidates for MK will be collected by solving the merged equation system with an SAT solver. To determine MK uniquely, a distinguisher is required to further filter the MK candidates.
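The ambiguity in the most significant bits can be checked numerically. The sketch below assumes that the differential equation of addition has the form $(x \boxplus y) \oplus ((x \oplus \alpha) \boxplus (y \oplus \beta)) = \gamma$, where the symbols $x$, $y$, $\alpha$, and $\beta$ are chosen for illustration. It verifies that flipping the most significant bit of either addend never changes $\gamma$.

```python
def gamma(x, y, alpha, beta):
    """Output difference of 8-bit modular addition for input differences alpha, beta."""
    return ((x + y) & 0xFF) ^ (((x ^ alpha) + (y ^ beta)) & 0xFF)

def msb_is_hidden(differences):
    """True if flipping bit 7 of x or of y never changes gamma for the given differences."""
    for alpha, beta in differences:
        for x in range(256):
            for y in range(256):
                g = gamma(x, y, alpha, beta)
                if g != gamma(x ^ 0x80, y, alpha, beta):
                    return False
                if g != gamma(x, y ^ 0x80, alpha, beta):
                    return False
    return True

print(msb_is_hidden([(0x80, 0x00), (0x01, 0x00), (0x3C, 0xA5)]))
```

Flipping bit 7 of an addend adds 128 modulo 256 to both sums, which flips bit 7 of each and cancels in the XOR. Any solution therefore comes with sibling solutions that differ only in these top bits, which is why an additional filter on the key candidates is needed.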

The CryptoMiniSAT solver searches for the given amount of solutions, which means all candidates for MK will be checked if the given amount of solutions is set large enough. With this property, we can build a distinguisher. Note the fact that the unknown intermediate words Xr and the known ciphertext are both depending on (); we can filter the MK candidates against by constructing equations for the full rounds of HIGHT. For every MK candidate, will be automatically deduced by the solver based on the MK candidate and the known P. If does not match , the candidate will be eliminated. Thus, with this property of the solver, an efficient distinguisher can be built just by constructing equations for the full rounds of HIGHT. Since the equations for round to FinalTransformation has been constructed in Algorithm 1, we only need to construct equations for InitialTransformation to round of HIGHT to build the distinguisher which is shown in Algorithm 4. Hence, there are additional variables and ANF equations are required for the intermediate words, variables and ANF equations are required for round keys, and 64 variables and ANF equations are required for the whitening keys.

In order to verify the effectiveness of the distinguisher, that is, whether the entire master key bits can be uniquely determined, we set the number of solutions requested from the solver to 2^128. The simulations are conducted under two different modes: mode A, which uses the distinguisher, and mode B, which does not. We use the method in Section 6 to build the merged algebraic equation system for the cipher and the injected faults. The results in Table 4, which are derived statistically from 100 instances, show the statistics of solutions corresponding to different () under mode A and mode B, with the number of faults varying from 3 to 9. We can see that the cases conducted under mode A have a unique solution for ; that is, the entire 128-bit master key can be uniquely determined for these cases. However, when these attacks are conducted under mode B, the CryptoMiniSAT solver always outputs multiple solutions, and the number of solutions seems to be inversely proportional to . Thus, the experimental results indicate that the distinguisher is feasible and effective, and they also show that the lower bound in the worst case for is 3.

To accelerate the experiments, five PCs with the same configuration are employed to run the CryptoMiniSAT solver in parallel. Each PC runs 20 instances. Figure 9 shows the average solving time of the CryptoMiniSAT solver corresponding to the cases where the HT is inserted in under mode A. The figure shows that when , the secret master key cannot be recovered within 48 hours. For the cases where the HT is inserted in () and , the minimum value for N is 4, while for the cases (), (), and () the minimum value for is 3. The figure also clearly shows that the distributions corresponding to the case () have a lower average than the others. Thus the optimal location for inserting the HT is (), and the corresponding average solving time when N = 3 is 12572.26 seconds (≈3.49 hours).

7.2.3. Success Rate of the Attack

To evaluate the success rate of the attack under mode A where the HT is inserted in the optimal location determined in Section 7.2.2, 100 instances are tested with different (). Figure 10 shows the success rate of the attack. When N is lower than 3, the success rate of the attack remains 0%. Once N is greater than or equal to 3, the success rate increases as N grows, and it reaches 100% when N is increased to 7. It can also be seen in the lower part of Figure 10 that when N is equal to 3, only 4 instances fail to recover the secret master key within 48 hours; thus the success rate of the attack is 96%.

8. Conclusions

In this paper, an algebraic fault analysis (AFA) against the HIGHT cipher, relying on a stealthy hardware Trojan, has been proposed. To facilitate a bit-level AFA of HIGHT, an FSM-based stealthy HT is designed with an extremely low activation rate of around 0.000025. The optimal location for inserting the HT is determined by AFA in advance. Experiments report a 1.63% additional cost in flip-flops for the HT implemented on a 65 nm Virtex-5 FPGA. As for the HIGHT implementation, a single-bit flip on the most significant bit of when the HT is activated requires only 3 injections to recover the secret master key with a success rate of 96%. We showed that even with a very limited number of faults from a lightweight Trojan, modern ciphers are still vulnerable to algebraic attacks. This work confirms the severity of lightweight HTs for security-critical ciphers in ICs, and hence extensive security investigations must be carried out throughout the entire design and manufacturing process of security chips.

In future work, we aim to explore effective solutions for detecting stealthy Trojans inserted into cryptographic circuits.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is sponsored in part by the National Natural Science Foundation of China (no. 61272491, no. 61309021, no. 61472357, and no. 61571063), by the China Scholarship Council (Grant no. CSC201606325012), by the Science and Technology on Communication Security Laboratory (9140C110602150C11053), and by the Major State Basic Research Development Program (973 Plan) of China under Grant 2013CB338004.