Database Padding for Dynamic Symmetric Searchable Encryption

Du, Ruizhong; Zhang, Yuqing; Li, Mingyue

doi:https://doi.org/10.1155/2021/9703969

Security and Communication Networks


Special Issue: Security and Privacy for Edge-Assisted Internet of Things

Research Article | Open Access

Volume 2021 | Article ID 9703969 | https://doi.org/10.1155/2021/9703969


Ruizhong Du,^1,2 Yuqing Zhang,^1 and Mingyue Li^3

Academic Editor: Qi Jiang

Received 16 Sept 2021

Accepted 06 Dec 2021

Published 31 Dec 2021

Abstract

Dynamic symmetric searchable encryption (DSSE), which enables the search and update of encrypted databases outsourced to cloud servers, has recently received widespread attention, as have leakage-abuse attacks against DSSE. In this paper, we propose a dynamic database padding method to mitigate the threat of data leakage during update operations on outsourced data. First, we introduce an outlier detection technique in which bogus files are generated for padding according to their outlier factors, hiding the information about the documents currently matching the search keywords. Furthermore, we design a new index structure suitable for the padded database using the bitmap index to simplify the update operation of the encrypted index. Finally, we present an application scenario of the padding method and realize a DSSE scheme with forward and backward privacy (named PDB-DSSE). The security analysis and simulation results show that our dynamic padding algorithm is suitable for DSSE schemes and that PDB-DSSE maintains the security and the search and update efficiency of the underlying DSSE scheme.

1. Introduction

With the rapid development of Internet technology, the concept of the Internet of Things (IoT) has attracted widespread attention since it was proposed. Owing to the limited computing power of IoT terminals, their users must rely on cloud platforms to store and process private data. As a result, the client loses control of its private data (such as medical records and identity information), which exposes the data to the risk of leakage. The most intuitive way to protect private data is to encrypt it before outsourcing it to the cloud platform. However, it is difficult to search over ciphertext, which impairs data availability. To solve this problem, Song et al. [1] first proposed searchable encryption (SE) technology.

Early symmetric searchable encryption (SSE) solutions were static and could not support dynamic database updates, such as the addition or deletion of outsourced files, which limited the application of SSE. Therefore, dynamic symmetric searchable encryption (DSSE) was proposed [2]. However, the update process leaks information that existing attacks can exploit [3]. For instance, file injection attacks can recover the content of a user’s queries by injecting a small number of files [4].

To solve the above problems, Stefanov et al. [5] informally introduced the security notions of forward privacy and backward privacy. Forward privacy concerns the addition of a document/keyword pair: it ensures that a newly added document cannot be matched by a previously issued search token. Backward privacy concerns deletion: it guarantees that deleted documents are not revealed by later searches. In other words, if a document/keyword pair is added after a search on the keyword and deleted before the next query, then a later search for that keyword will not reveal the deleted document. Bost et al. [7] gave the formal definition of backward privacy according to different degrees of leakage. In recent years, many excellent forward and backward privacy DSSE schemes have been designed [6–17]. However, most of these schemes have low search efficiency and involve a complex index update process. Furthermore, access-pattern leakage can be used to recover the keywords underlying the search tokens, and while many schemes use oblivious RAM (ORAM) to solve this problem, ORAM is inefficient.

Database padding technology is a simple and efficient measure to solve the above problems that is suitable for real large data sets. Nevertheless, studies of padding schemes mainly focus on how to design efficient padding methods [3,18,19] that are not suitable for dynamic databases.

1.1. Our Contributions

In this paper, we design a dynamic database padding algorithm that can be applied to DSSE schemes and give the application scenarios. We have implemented a DSSE scheme with forward privacy and backward privacy (named PDB-DSSE) to prove that the dynamic database padding algorithm proposed in this paper is suitable for the DSSE scheme. In summary, our contributions are as follows:

We introduce outlier detection technology to design a dynamic database padding algorithm that is suitable for DSSE schemes. Specifically, each bogus file is sampled and verified by the outlier detection algorithm that pads the database, so that the generated bogus file and the real file are indistinguishable and finally realize result hiding.

We design a new index structure suitable for the padding database using the bitmap index that consists of two parts that represent real files and bogus files, respectively. In the updating process, the index can be modified by homomorphic addition operations, simplifying the update operation of the index.

We instantiated the application scenario of the dynamic padding algorithm, implemented PDB-DSSE, and tested its performance. Experimental results and security analysis show that the dynamic database padding algorithm proposed in this paper is suitable for the DSSE scheme, and PDB-DSSE maintains the security and search and update efficiency of the DSSE scheme.

1.2. Related Works

Song et al. [1] first proposed SSE. After the keywords are encrypted, a search for the files matching a keyword $w$ sends the server the keyword together with the key for each position at which it may appear. The server compares it with the encrypted keywords one by one to obtain the result. For a file of length $n$, the encryption and search algorithms of this scheme require stream cipher and block cipher operations, and the time complexity is $O(n)$. Subsequently, research on SSE has been extended in many directions, such as conjunctive search [15], ranked search [11], range queries [14], SSE with small client storage [16], and verifiable schemes [11,12].

To support dynamic updates, DSSE was proposed in [2]. The early DSSE schemes were vulnerable to potential attacks [3], such as file injection attacks [4]. To reduce the threat posed by this attack, the concepts of forward privacy and backward privacy were proposed in [5]. In 2016, Bost [6] gave the formal definition of forward privacy and a forward privacy DSSE scheme, Σoφoς. This scheme achieves certain improvements in efficiency and security by using simple cryptographic tools (i.e., pseudorandom functions and trapdoor primitives) without relying on the ORAM structure. In the next year, Bost et al. [7] gave a formal definition of backward privacy according to different levels of leakage and then gave four forward and backward privacy schemes. Among them, the Fides scheme achieves the second level of backward privacy. The second scheme, Diana, is very efficient but only implements Type-III backward privacy; Diana is modified into a backward privacy scheme that only requires two roundtrips. The third scheme, Janus, is the first backward privacy scheme with only a single roundtrip, but it also only implements Type-III backward privacy. The last scheme, Moneta, implements Type-I backward privacy. However, the computational overhead and communication overhead of this scheme are very large due to the use of the TWORAM structure. In 2018, Sun et al. [8] constructed a new primitive called symmetric puncturable encryption (SPE). They further used this primitive to propose a noninteractive backward privacy DSSE scheme, Janus++. This scheme is more efficient than Janus. However, to achieve backward privacy, it hands over much work to clients, greatly increasing their storage and computing overhead.

In 2018, Chamani et al. [9] proposed several dynamic symmetric searchable encryption schemes, namely Mitra, Orion, and Horus. Among them, Orion achieves the highest level of backward privacy, Type-I. It requires $O(\log N)$ rounds of interaction, and the search process requires $O(n_w \log^2 N)$ steps, where $N$ is the number of keyword/file-identifier pairs in the database and $n_w$ is the number of files currently matching the keyword $w$. Horus improves the efficiency of Orion. However, it only reaches the third level of backward privacy. In 2019, Zuo et al. [13] designed a forward and backward privacy DSSE scheme that implements Type-I⁻ backward privacy, which is somewhat stronger than Type-I. To support a larger database, they further extended it to a multiblock setting. Moreover, they extended the first scheme to support range queries in [14].

The early studies on database padding focused on the design of efficient padding methods. Cash et al. [3] proposed a heuristic padding method, which pads the number of files corresponding to each keyword up to a multiple of a chosen integer. However, this padding method introduces unnecessary padding. In response to the problem of padding overhead, Bost and Fouque [18] divide the keywords into clusters according to frequency and pad keywords with similar frequencies as one cluster. The algorithm achieves the smallest padding cost while preventing counting attacks. Xu et al. [19] proposed using relative entropy to measure the distance between the original keyword distribution and the padded keyword distribution and proposed a bogus file generation algorithm to strengthen database padding. As a result, it is almost impossible to distinguish a bogus document from a real document.

1.3. Organization

The content structure in the rest of this article is as follows: We give the background knowledge, including the bitmap index structure that we used, the definition of the DSSE and its security model, and the notation used in the work in Section 2. We describe the dynamic padding algorithm in Section 3. We describe the PDB-DSSE scheme in Section 4. The security analysis is given in Section 5. The performance analysis and simulation experiment results are described in Section 6. Finally, Section 7 summarizes the paper.

2. Preliminaries

2.1. Bitmap Index

The index structure of PDB-DSSE is based on the bitmap index [14]. Assuming that the database can store up to $\ell$ files, each keyword corresponds to an $\ell$-bit string $bs$. If the keyword exists in file $f_i$, the $i$-th bit of $bs$ is set to 1; otherwise, it is set to 0. In Figure 1(a), the maximum number of files that the database can store is 6, that is, $\ell = 6$, and the bit string shown marks the file that currently contains the keyword. As shown in Figure 1(b), if a file $f_i$ is added, the bit string $2^i$ is generated and added to the initial $bs$. To delete a file $f_j$, as shown in Figure 1(c), we generate $-2^j = 2^\ell - 2^j \bmod 2^\ell$ and add it to the initial bit string. In our PDB-DSSE scheme, we use the database padding algorithm to pad the database with bogus files and generate the corresponding bitmap index. Assuming that the index length is $\ell$, the first $m$ bits represent real files ($m$ denotes the maximum number of files that the real database can store), and the remaining $\ell - m$ bits represent the padded bogus files. In the padded version of the example in Figure 1(a), the real database can store up to 4 files, that is, $m = 4$, and the calculation rules are the same as for the bitmap index. The detailed algorithm and calculation process are described in Section 3.
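The addition and deletion rules above can be illustrated with a few lines of Python. This is a minimal sketch, assuming that bit i (counted from the least significant bit) stands for file i and that the index is held as an integer modulo 2^ELL; the names ELL, add_file, delete_file, and files_in are illustrative and do not come from the paper.

```python
ELL = 6                 # assumed maximum number of files
MOD = 1 << ELL          # all index arithmetic is done modulo 2^ELL


def add_file(bs, i):
    """Add file i by adding the bit string 2^i to the index."""
    return (bs + (1 << i)) % MOD


def delete_file(bs, i):
    """Delete file i by adding -2^i mod 2^ELL, that is, 2^ELL - 2^i."""
    return (bs + (MOD - (1 << i))) % MOD


def files_in(bs):
    """List the file numbers whose bit is set in the index."""
    return [i for i in range(ELL) if (bs >> i) & 1]


bs = add_file(0, 2)       # index is now 000100
bs = add_file(bs, 1)      # index is now 000110
bs = delete_file(bs, 2)   # adding 111100 mod 2^6 leaves 000010
```

Both operations are plain modular additions, which is what later allows them to be applied under additively homomorphic encryption. The sketch is only correct when a file is added while its bit is 0 and deleted while its bit is 1; otherwise the addition carries into neighbouring bits.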

Figure 1: Bitmap index examples: (a) initial index, (b) adding a file, (c) deleting a file.

2.2. DSSE Definition

In this section, we define the DSSE scheme. We also give its security model and formally define forward privacy and backward privacy.

The database is $DB = \{(ind_i, W_i) : 1 \le i \le D\}$, where $ind_i$ is a file identifier, $W_i$ is the keyword set of that file, and $D$ is the number of real files stored in $DB$. $N = \sum_{i=1}^{D} |W_i|$ is the total number of keyword/file-identifier pairs, and $W = \bigcup_{i=1}^{D} W_i$ is the collection of all distinct keywords in $DB$.

A DSSE scheme DSSE = (Setup, Search, Update) consists of the following.

Setup($1^\lambda$): this algorithm takes a security parameter $\lambda$ as input. It outputs the client’s state $\sigma$ and an encrypted database $EDB$, which is uploaded to the cloud server.

Search($q$, $\sigma$; $EDB$): this protocol requires interaction between the client and the server. The client holds the state $\sigma$ and inputs the query $q$. The server searches $EDB$ using the search token and returns the matching result to the client.

Update($\sigma$, $op$, $in$; $EDB$): this protocol requires interaction between the client and the server. The update operations $op$ it supports are add and delete. If the client wants to add (or delete) a batch of keyword/file-identifier pairs $(w, ind)$, the server adds (or deletes) the corresponding ciphertexts. At the same time, the client updates its local state $\sigma$. Finally, the server updates $EDB$.

2.3. Security Model

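Given a DSSE scheme as defined in Section 2.2, we define the real game REAL and the ideal game IDEAL to give its security model. REAL reflects the behavior of the original DSSE scheme, and IDEAL reflects the behavior of a simulator $\mathcal{S}$, which takes the leakage function of the scheme as input. For an adversary $\mathcal{A}$, the leakage function is defined as $\mathcal{L} = (\mathcal{L}^{Stp}, \mathcal{L}^{Srch}, \mathcal{L}^{Updt})$, with $\mathcal{L}^{Stp}$ being the information that $\mathcal{A}$ can learn when the setup algorithm is executed, $\mathcal{L}^{Srch}$ being the information that $\mathcal{A}$ can learn when executing the search protocol, and $\mathcal{L}^{Updt}$ being the information that $\mathcal{A}$ can learn when executing the update protocol. Games REAL and IDEAL are defined as follows.

$\mathrm{REAL}^{DSSE}_{\mathcal{A}}(\lambda, q)$: on input a database $DB$ chosen by the adversary $\mathcal{A}$, the game runs Setup to obtain $EDB$. $\mathcal{A}$ then issues a series of search queries $q$ (or update queries $(op, in)$). Finally, $\mathcal{A}$ returns the experimental result $b$, $b \in \{0, 1\}$.

$\mathrm{IDEAL}^{DSSE}_{\mathcal{A},\mathcal{S}}(\lambda, q)$: the simulator $\mathcal{S}$ evaluates the leakage function on the series of search queries $q$ or update queries $(op, in)$ issued by $\mathcal{A}$; that is, $\mathcal{S}$ takes $\mathcal{L}^{Srch}(q)$ and $\mathcal{L}^{Updt}(op, in)$ as input, respectively, and returns its output to $\mathcal{A}$. Finally, $\mathcal{A}$ returns the experimental result $b$, $b \in \{0, 1\}$.

Definition 1 (see [14] (adaptive security of DSSE schemes)). A DSSE scheme is $\mathcal{L}$-adaptively secure with respect to the leakage function $\mathcal{L}$ iff, for any probabilistic polynomial time (PPT) adversary $\mathcal{A}$ issuing a polynomial number of queries $q$, there exists a stateful PPT simulator $\mathcal{S}$ such that

$$\left| \Pr\left[\mathrm{REAL}^{DSSE}_{\mathcal{A}}(\lambda, q) = 1\right] - \Pr\left[\mathrm{IDEAL}^{DSSE}_{\mathcal{A},\mathcal{S}}(\lambda, q) = 1\right] \right| \le \mathrm{negl}(\lambda).$$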

2.4. Forward Privacy

Forward privacy guarantees that an update cannot be associated with the operations that occurred before it.

Definition 2 (see [6] (forward privacy)). If the update leakage function $\mathcal{L}^{Updt}$ of a dynamic searchable encryption scheme D can be written as $\mathcal{L}^{Updt}(op, w, bs) = \mathcal{L}'(op, bs)$, then D is adaptively forward private, where $w$ is the set of keywords of the update operation and $bs$ is the update file index.

2.5. Backward Privacy

Zuo et al. [13] defined Type-I⁻ backward privacy. Type-I⁻ reveals the files that currently contain the keyword $w$, the total number of updates on $w$, and the time of each update on $w$. Our PDB-DSSE scheme uses a bit string of “0” and “1” to indicate whether each file exists, and its addition and deletion operations use the same modular addition, so it does not reveal when a given file was inserted. In summary, our scheme achieves Type-I⁻ backward privacy.

For example, consider the following sequence of operations in a single-keyword query system: search for the files corresponding to the keyword $w$ at time 1, add one file for $w$ at each of times 2, 3, and 4, delete one of these files at time 5, and finally search again for $w$ at time 6. Type-I⁻ backward privacy reveals that the search at time 6 returns the two files that remain and that a total of 4 updates on $w$ occurred, at times 2, 3, 4, and 5.
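The example can be replayed in code to make explicit what a Type-I⁻ leakage profile contains. The following is a minimal sketch under stated assumptions: the file names f1, f2, f3, the choice of f1 as the deleted file, and the function name type_i_minus_leakage are all hypothetical and only serve the illustration.

```python
def type_i_minus_leakage(history, w):
    """Return (files currently matching w, timestamps of all updates on w)."""
    current, times = set(), []
    for t, op, keyword, file_id in history:
        if keyword != w or op == "search":
            continue
        times.append(t)                 # the time of every update is leaked
        if op == "add":
            current.add(file_id)
        else:
            current.discard(file_id)
    return sorted(current), times


# Assumed instantiation of the example: f1 is the file deleted at time 5.
history = [
    (1, "search", "w", None),
    (2, "add", "w", "f1"),
    (3, "add", "w", "f2"),
    (4, "add", "w", "f3"),
    (5, "del", "w", "f1"),
    (6, "search", "w", None),
]
files, times = type_i_minus_leakage(history, "w")
```

The returned profile contains the surviving files and the four update times, but it records neither the type of each update nor which update touched which file, which is the information that Type-I⁻ keeps hidden.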

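To formally define Type-I⁻, for the list $Q$ of search queries $(t, w)$ with timestamp $t$, the search pattern is defined as $sp(w) = \{t : (t, w) \in Q\}$. The search pattern reveals whether a search keyword is repeated. It is also necessary to define a new leakage function TimePDB. For the keyword $w$, TimePDB($w$) lists all the timestamps of updates on $w$. For the update query list $Q'$, $TimePDB(w) = \{t : (t, op, (w, bs)) \in Q'\}$.

Definition 3 (see [13] (backward privacy)). If the search and update leakage functions $\mathcal{L}^{Srch}$ and $\mathcal{L}^{Updt}$ of a dynamic searchable encryption scheme D can be written as $\mathcal{L}^{Updt}(op, w, bs) = \mathcal{L}'(op)$ and $\mathcal{L}^{Srch}(w) = \mathcal{L}''(sp(w), rp(w), TimePDB(w))$, where $rp(w)$ denotes the files currently matching $w$, then D is Type-I⁻ adaptively backward private.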

2.6. Notation

The notation used in the work is given in Table 1.

3. Dynamic Database Padding Algorithm

To solve the problem of data leakage during the update of DSSE schemes, we introduced outlier detection to design a dynamic database padding algorithm that can be used in the DSSE scheme. The padding algorithm can hide the information of the files currently matching the query keyword. In this scheme, real files and bogus files are represented by bitmap indexes, simplifying the process of update operations.

The dynamic database padding algorithm includes the padding database generation algorithm PDB-Gen and the padding database update algorithm PDB-Update.

3.1. PDB-Gen(DB)

Upon input of a database DB, the algorithm outputs a padding database PDB. The algorithm is described in Algorithm 1. It initializes two empty sets, which are used to store the padding database and the number of bogus files that need to be padded for each keyword. For a database DB, we use the same clustering method as [19]. Within each cluster, the keywords are sorted by their frequency in ascending order. We pad the keyword counts in a cluster to the same size according to the largest one, calculate the number of bogus files that need to be padded for each keyword, and record the result. Subsequently, we randomly generate bogus files, select those that meet the conditions, and add them to PDB. The bogus file generation algorithm is as follows: we use a $|W|$-dimensional bit vector to represent a bogus file $f$, where $|W|$ is the size of the keyword set $W$. If the bogus file is filled with the $i$-th keyword, the $i$-th bit is set to 1; otherwise, it is set to 0. To generate a candidate, we randomly select $k$ keywords to fill in the bogus file, where $k$ does not exceed the maximum file size that the dataset can hold. After generating the bogus file, we use the local outlier factor (LOF) algorithm to compute its outlier factor [19]. As shown in [19], we should choose bogus files whose LOF value is approximately equal to 1 for padding, and we use this threshold to filter samples: if LOF($f$) < 1, we add $f$ to the padding database PDB. Otherwise, we roll back the padding counts and reselect a bogus file for detection. This repeats until all keywords are padded.

(1)
(2)	for all cluster :
(3)	for do
(4)
(5)	while do
(7)
(8)	select different keywords randomly
(9)	for do
(10)	if then
(11)
(12)
(13)	else if then
(14)	select a new from
(15)	else
(16)
(17)	//validation via LOF detection
(20)	if then
(21)	//put bogus files to
(22)	else
(23)	rollback
(24)	return
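The first phase of the padding database generation, computing how many bogus files each keyword needs, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the clusters are assumed to be given, the LOF validation and rollback step is left out, and the names padding_counts and draw_bogus_file are hypothetical.

```python
import random


def padding_counts(freq, clusters):
    """Pad every keyword in a cluster up to the largest frequency in that cluster."""
    counts = {}
    for cluster in clusters:
        ordered = sorted(cluster, key=lambda w: freq[w])   # ascending frequency
        target = freq[ordered[-1]]                         # largest count in the cluster
        for w in ordered:
            counts[w] = target - freq[w]
    return counts


def draw_bogus_file(counts, k, rng):
    """Pick up to k keywords that still need padding and consume one unit of each."""
    pending = [w for w, c in counts.items() if c > 0]
    chosen = rng.sample(pending, min(k, len(pending)))
    for w in chosen:
        counts[w] -= 1
    return set(chosen)


freq = {"a": 3, "b": 5, "c": 5, "d": 9, "e": 10}   # assumed keyword frequencies
clusters = [["a", "b", "c"], ["d", "e"]]           # assumed clustering result
counts = padding_counts(freq, clusters)
needed = dict(counts)                              # keep a copy of the initial counts

rng = random.Random(0)
bogus_files = []
while any(counts.values()):
    bogus_files.append(draw_bogus_file(counts, k=2, rng=rng))
```

In this toy run, keyword a needs two bogus files and keyword d needs one, so two bogus files are produced. In the scheme described above, each candidate would additionally be checked by the LOF detector before being accepted, and rejected candidates would have their counts restored.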

3.2. PDB-Update

Upon input of the keyword $w$ that needs to be updated, the bit string $bs$ of the update operation, the number $m$ of bits in the index that belong to the real database, and the padding database PDB, the algorithm outputs the updated PDB and the adjusted index. The algorithm is described in Algorithm 2. For the update operation of the padded database, our scheme uses an $\ell$-bit bitmap index (Section 2.1) to represent the real files and bogus files, where the first $m$ bits indicate whether each real file exists. If real file $f_i$ exists, then the $i$-th bit is set to 1; otherwise, it is set to 0. For a bogus file, if the file is padded with the keyword, its bit in the index is set to 1; otherwise, it is set to 0.

(1)
(2)
(3)	(1)
(4)	for do
(5)	if then
(6)
(7)	for and do
(8)	if then
(9)	//try to modify bogus file
(10)	if then
(11)
(12)	Add corresponding bit string to
(13)	else
(14)	//roll back the modify
(15)	(2)
(16)	for do
(17)	if then
(18)
(19)	for and do
(20)	if then
(21)	//try to modify bogus file
(22)	if then
(23)
(24)	Add corresponding bit string to
(25)	else
(26)	//roll back the modify
(27)	index
(28)	return

Subsequently, we convert the index into an array and initialize a counter that records the number of files to be added or deleted. If the operation is an addition, some of the bogus files padded for the keyword $w$ in PDB must have $w$ removed. Specifically, we try to change the position of $w$ in a bogus file $f$ from 1 to 0. Before committing the change, we perform outlier detection on the changed bogus file $f'$. If LOF($f'$) < 1, we keep the modification and update the index at the same time. Otherwise, we roll back to the state before the modification and try the next bogus file.

If the operation is a deletion, we try to pad additional bogus files with the keyword $w$. First, we count the files that need to be deleted, find the bogus files in PDB that are not padded with $w$, and modify them one by one: we change the position of $w$ in the file from 0 to 1 and perform outlier detection. If LOF($f'$) < 1, we keep the modified bogus file and update the index. Otherwise, we roll back to the state before the modification and modify the next file. Finally, the modified index is converted back into a bit string and returned.
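The outlier check used by both update branches can be sketched in plain Python. This is a simplified version of the standard local outlier factor computation (exactly k neighbours, ties broken by index, no handling of duplicate points), not the authors' code; the function names, the one-dimensional toy data, and the threshold parameter are assumptions for illustration. In the scheme itself, the points would be the bit vectors of the bogus files.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, using its k nearest neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point (ties broken by index).
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
           for i in range(n)]
    k_dist = [dist[i][knn[i][-1]] for i in range(n)]

    def lrd(i):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(k_dist[j], dist[i][j]) for j in knn[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    return [sum(lrds[j] for j in knn[i]) / (k * lrds[i]) for i in range(n)]


def passes_lof(existing, candidate, k, threshold=1.0):
    """Accept a candidate only if its LOF among the existing points is below the threshold."""
    return lof_scores(existing + [candidate], k)[-1] < threshold


files = [(0.0,), (1.0,), (2.0,), (3.0,)]           # toy stand-ins for file vectors
scores = lof_scores(files + [(10.0,)], k=2)        # the last point lies far away
accepted = passes_lof(files, (10.0,), k=2)
```

Points that sit inside the evenly spaced group score about 1, while the distant candidate scores about 5 and is rejected, after which the update procedure would roll back and try the next bogus file. The threshold is kept as a parameter because the acceptance rule is the only part taken from the text above.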

Figure 2 shows the index during the execution of the dynamic padding algorithm, which we explain here. Assume that the maximum capacity of the database is 10 and the maximum capacity of the real database is 6, so that the maximum capacity of the PDB is 4. The initial real data consist of four keyword/file pairs inserted in order, and the resulting database index is shown in Figure 2(a). After the PDB-Gen(DB) algorithm is executed, the index is as shown in Figure 2(b), where two bogus keyword/file pairs have been padded. To insert a new real keyword/file pair, we add the corresponding bit string according to the addition rule of the bitmap index. We then search the PDB for a bogus file padded with the same keyword and delete it, which means subtracting its bit string; according to the deletion rule of the bitmap index, this is equivalent to adding the complementary bit string, which yields the new index shown in Figure 2(c). To delete a real keyword/file pair, we subtract its bit string from the index of the keyword, which under the deletion rule is again equivalent to adding a bit string. We then retrieve the bogus files whose position for this keyword is 0 and try to change that position from 0 to 1, one file at a time. After each attempted modification, we perform outlier detection on the modified file; if it passes, we keep the modification; otherwise, we try the next file. In this example, one bogus file is padded with the keyword, and its bit string is added to obtain the final index shown in Figure 2(d). All of the above additions are performed modulo $2^{10}$, since the index is 10 bits long.
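A small numeric run shows how an insertion of a real file and the removal of a bogus file collapse into one modular addition. This is a sketch under assumed conventions: bits 0 to 5 (least significant first) are taken to be the real files and bits 6 to 9 the bogus files, and the starting index and the names update_delta and strip_padding are hypothetical.

```python
ELL, REAL_BITS = 10, 6          # index length and number of real-file bits
MOD = 1 << ELL


def update_delta(add_bits, del_bits):
    """Fold several bit changes into one bit string to be added modulo 2^ELL."""
    delta = 0
    for i in add_bits:
        delta += 1 << i
    for i in del_bits:
        delta -= 1 << i
    return delta % MOD


def strip_padding(bs):
    """Keep only the bits that belong to real files."""
    return bs & ((1 << REAL_BITS) - 1)


index = 0b0100000011                                 # real files 0 and 1, bogus file at bit 8
delta = update_delta(add_bits=[2], del_bits=[8])     # insert real file 2, retire bogus bit 8
new_index = (index + delta) % MOD
```

Before and after the update the index has three bits set, so the number of files that appear to match the keyword stays the same even though a real file was inserted. Removing the padding bits on the client side recovers the real result.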

4. PDB-DSSE Scheme

In this section, we construct the PDB-DSSE scheme using the dynamic database padding algorithm proposed in Section 3, where the real files and bogus files are represented by bitmap indexes and symmetric encryption with additive homomorphism [20] is used to encrypt the indexes. We use the framework of [13] combined with our proposed dynamic padding algorithm to prove the feasibility of the dynamic padding algorithm. It specifically consists of the algorithms Setup, Update, and Search.

4.1. Setup

The detailed design of this algorithm is given in Algorithm 3. On input the security parameter $\lambda$, the secret key is selected at random and a security certificate is generated. In addition, the algorithm sets the maximum file number of the database, initializes an empty map on the client side, calls the padding database generation algorithm PDB-Gen to generate the padding database PDB, and then initializes a map $EDB$. Here, $EDB$ stores the encrypted databases (including the database DB and the padding database PDB), and the client-side map stores the search token and the number of update operations for each keyword in the keyword space. Finally, $EDB$ is sent to the server, and the client secretly stores the state $\sigma$.

4.2. Update

This protocol requires interaction between the client and the server. Algorithm 4 gives the detailed design. First, the client inputs the keyword $w$ to be updated, the state $\sigma$, and the bit string $bs$. It then obtains the search token and the number of updates on $w$ from its local map and uses the pseudorandom function to derive the key for $w$. If this is the first update on $w$, the search token and the update count are initialized. The hash function $H_1$ generates the update token, the hash function $H_2$ is used to mask the previous search token, and the hash function $H_3$ is used to generate a one-time key. According to the bit string that needs to be updated, the algorithm PDB-Update is called, and it returns the bit string used to update the index. This bit string is then encrypted with a simple symmetric encryption algorithm with additive homomorphism to obtain the ciphertext. The client sends the update token, the ciphertext, and the masked search token to the server. Finally, the server stores them in $EDB$.

(1)	,
(2)	emptymap
(3)
(4)
(5)	set the max size of to
(6)	return

Client:
(1)
(2)
(3)	if then
(4)	,
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)	send to server
	Server:
(13)
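The encryption step can be illustrated with a one-time-key additive scheme in the style of [20], where a message is masked by adding a fresh key modulo 2^ELL. This is a sketch under assumptions: SHA-256 stands in for the hash function that derives the one-time keys, and the names one_time_key, enc, dec, add, and k_w, as well as the sample bit strings, are hypothetical.

```python
import hashlib

ELL = 10
N = 1 << ELL                      # message space: ELL-bit strings as integers


def one_time_key(k_w, c):
    """Derive the one-time key for the c-th update on a keyword (stand-in for a hash)."""
    digest = hashlib.sha256(k_w + c.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % N


def enc(sk, m):
    return (m + sk) % N


def dec(sk, e):
    return (e - sk) % N


def add(e1, e2):
    """Homomorphic addition: the sum decrypts under the sum of the keys."""
    return (e1 + e2) % N


k_w = b"per-keyword key (assumed)"
updates = [0b0000000011, 772, 0b0000001000]          # bit strings produced by updates
ciphertexts = [enc(one_time_key(k_w, c), bs) for c, bs in enumerate(updates)]

total = 0
for e in ciphertexts:                                 # what the server would compute
    total = add(total, e)
key_sum = sum(one_time_key(k_w, c) for c in range(len(updates))) % N
index = dec(key_sum, total)                           # what the client would recover
```

Because every key is used once, each ciphertext on its own looks uniformly random, while the server can still add ciphertexts without learning the bit strings. The client only needs the per-keyword key and the update counter to rebuild the key sum.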

4.3. Search

This protocol requires interaction between the client and the server. Algorithm 5 gives the detailed design. First, the client inputs the keyword $w$ that it wants to search, obtains the current search token and the key for $w$, and sends them to the server. The server uses the hash function $H_1$ to obtain the corresponding update token and accesses the database to obtain the encrypted bit string. It then uses the hash function $H_2$ to recover the previous search token and repeats this step for all earlier updates, using simple homomorphic addition to accumulate the encrypted bit strings into the final result, which it sends to the client. The client uses the hash function $H_3$ to obtain the one-time keys, calculates the key sum, and uses it to decrypt the result and obtain the bit string. After removing the padding bits, it outputs the final result bit string.

Client:
(1)
(2)	if then
(3)	return
(4)	send to server
	Server:
(5)
(6)	for to 0 do
(7)
(8)
(9)
(10)	if then
(11)	Break
(12)
(13)
(14)	send to client
	Client:
(15)
(16)	for to 0 do
(17)
(18)
(19)
(20)	select first bits of
(21)	return
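The way the server walks back through earlier updates can be sketched as a chain of masked search tokens. The construction below follows the general pattern of the framework of [13] as described in the text and is an assumption-laden illustration: truncated SHA-256 stands in for the hash functions, the stored values are plain integers standing in for ciphertexts, and all function and variable names are hypothetical.

```python
import hashlib
import secrets

TOKEN_LEN = 16                    # assumed token length in bytes


def H(tag, k_w, st):
    """Domain-separated hash standing in for the random oracles."""
    return hashlib.sha256(tag + k_w + st).digest()[:TOKEN_LEN]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def update(ct, edb, k_w, w, e):
    """Client side: create a fresh search token and mask the previous one with it."""
    st_prev, c = ct.get(w, (secrets.token_bytes(TOKEN_LEN), -1))
    st_new = secrets.token_bytes(TOKEN_LEN)
    ct[w] = (st_new, c + 1)
    update_token = H(b"1", k_w, st_new)
    edb[update_token] = (e, xor(H(b"2", k_w, st_new), st_prev))


def server_search(edb, k_w, st, c, n):
    """Server side: walk from the newest token back to the first, summing entries."""
    total = 0
    for _ in range(c + 1):
        e, masked_prev = edb[H(b"1", k_w, st)]
        total = (total + e) % n
        st = xor(H(b"2", k_w, st), masked_prev)      # unmask the previous token
    return total


ct, edb, k_w = {}, {}, b"per-keyword key (assumed)"
for e in (5, 7, 9):                                   # stand-ins for three ciphertexts
    update(ct, edb, k_w, "w", e)
st_c, c = ct["w"]
result = server_search(edb, k_w, st_c, c, 1 << 10)
```

Only the newest token lets the server reach the older entries; a token handed out during an earlier search says nothing about tokens generated afterwards, which is the intuition behind forward privacy. The client would then decrypt the accumulated value and keep only the bits that belong to real files.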

5. Security Analysis

In this section, we give the security analysis of our schemes.

PDB-DSSE is forward and backward private. Because inverting the hash function is computationally infeasible, the adversary cannot associate a future update with the updates before the search. This guarantees forward privacy. For backward privacy, due to the bitmap index, the number of bit strings returned is constant, and the adversary cannot obtain the number of documents currently matching the keyword. Besides, since addition and deletion are performed by the same algorithm, our scheme does not leak the update type. This guarantees backward privacy.

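Theorem 1 (adaptive forward and Type-I⁻ backward privacy of PDB-DSSE). Let $F$ be a secure pseudorandom function, $\Pi$ = (Setup, Enc, Dec, Add) be a perfectly secure symmetric encryption scheme with homomorphic addition, and $H_1$, $H_2$, and $H_3$ be random oracles. We define $\mathcal{L} = (\mathcal{L}^{Srch}, \mathcal{L}^{Updt})$, where $\mathcal{L}^{Srch}(w) = (sp(w), rp(w), TimePDB(w))$, with $rp(w)$ denoting the files currently matching $w$, and $\mathcal{L}^{Updt}(op, w, bs) = \bot$. PDB-DSSE is $\mathcal{L}$-adaptively forward and Type-I⁻ backward private.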

Proof. Similar to [9], we formulate a sequence of games from REAL to IDEAL. Each pair of consecutive games differs only slightly and is indistinguishable, which leads to the conclusion that REAL and IDEAL are indistinguishable. Finally, we use the leakage function defined in Theorem 1 as the input of the simulator to simulate IDEAL.

Game $G_0$: $G_0$ is exactly the real game REAL defined in Section 2.3.

Game $G_1$: $G_1$ is identical to $G_0$ except that we do not use the pseudorandom function $F$ to generate the key but select the key uniformly at random. If $w$ has never been searched, the key is generated and stored in a table Key; otherwise, on input the keyword $w$, the corresponding key in the table Key is returned. $G_1$ is indistinguishable from $G_0$ due to the security of the pseudorandom function.

Game $G_2$: $G_2$ is the same as $G_1$, except that we use a random oracle instead of the hash function $H_1$. During the update process, the function $H_1$ is normally used to generate the update token; instead of computing it, we randomly select a string as the update token. In the search process, the random oracle is used to produce the update token. In the same way, we use random oracles instead of the hash functions $H_2$ and $H_3$. Since each search token is chosen at random, $G_2$ is indistinguishable from $G_1$.

Game $G_3$: $G_3$ is the same as $G_2$, except that we replace every position of the bit string with 0 while the length of the bit string remains the same. Due to the perfect security of the simple symmetric encryption with homomorphic addition, $G_3$ is indistinguishable from $G_2$.

Simulator $\mathcal{S}$: the simulator is the same as $G_3$, except that it uses the search pattern $sp(w)$ instead of the keyword $w$ to simulate the ideal world. During the update process, it chooses a new random string for each update, as in $G_3$. During the search process, the simulator starts from the current search token and generates new random strings for the previous search tokens one by one. For the keyword $w$, $\mathcal{S}$ uses the first timestamp in $sp(w)$. It then uses the random oracle to make the token consistent with the ciphertext. At the same time, $\mathcal{S}$ derives the remaining values from the leakage function and embeds all 0s in the remaining search tokens.

Hence, $G_3$ and the simulator $\mathcal{S}$ are indistinguishable. In addition, what is described in the simulator $\mathcal{S}$ is essentially the game IDEAL defined in Section 2.3. Hence, $G_3$ and IDEAL are indistinguishable.

In summary, the game IDEAL is indistinguishable from $G_0$, that is, $\left| \Pr[\mathrm{REAL} = 1] - \Pr[\mathrm{IDEAL} = 1] \right| \le \mathrm{negl}(\lambda)$, which completes the proof.

6. Experimental Analysis

In this section, we present the experimental analysis of our PDB-DSSE scheme, including the comparison with other research results in communication overhead, storage overhead, and the time complexity of query and update.

Table 2 presents the performance comparison between PDB-DSSE and existing schemes in terms of security, interaction rounds, and client storage overhead. The comparison is expressed in terms of the number of keyword/file-identifier pairs in the database, the number of times a file containing the keyword has been deleted, the number of distinct keywords, and the number of files stored in the database. Regarding security, as shown in Table 2, PDB-DSSE implements forward privacy and Type-I⁻ backward privacy. Regarding communication overhead, PDB-DSSE requires only one round of interaction, which greatly reduces the communication overhead. Furthermore, the client storage overhead is O(1) since the client only needs to store the secret key and the state. Therefore, PDB-DSSE achieves stronger backward privacy with only one interaction and less client overhead.

Table 3 compares the time complexity of the PDB-DSSE scheme with that of other schemes with strong backward privacy, namely Moneta [7], Orion [9], and FB-DSSE [13]. The comparison is expressed in terms of the number of keyword/file-identifier pairs in the database, the number of files containing the keyword, the number of update operations performed on the keyword, the time that symmetric encryption takes to perform encryption and decryption operations, and the calculation time for modular addition.

6.1. Implementation

The simulation experiment of our scheme uses a machine with an AMD Ryzen 7 4800U with a Radeon Graphics 1.8 GHz processor configured with a Windows 10 (64-bit) system and 16 GB RAM. We use the Java language programming to implement our scheme. During the experiments, the number of bits of the bitmap index was adjusted many times to perform experiments to simulate the performance of this scheme under the situation of different maximum file numbers of the database. In our experiments, the update time and search time of the scheme PDB-DSSE were tested when the number of bitmap index bits was to . We compare our scheme with the Orion scheme [9] that uses the ORAM structure and a higher level of backward security and the FB-DSSE scheme [13] that uses bitmap indexing efficiency. The comparison results are as follows.

Figure 3 shows the search time comparison of PDB-DSSE scheme, Orion scheme [9], and FB-DSSE scheme [13] under different bit lengths that indicate the maximum number of files supported in the scheme. In the Orion scheme, this is represented by the size of . The tested bit length ranges from to (Orion scheme is to ). Figure 3 shows that the search time of these three schemes increases with increasing bit length (). Since the Orion uses the ORAM structure, its search time complexity is O . When , its search time took 96 888.2 ms. We can observe from the experimental results that the search efficiency of the ORAM structure is extremely low for large datasets. Through test comparison, it can be found that PDB-DSSE maintains the same search efficiency as the FB-DSSE scheme to a certain extent. In summary, PDB-DSSE has high search efficiency while realizing forward privacy and stronger backward privacy.

Figure 4 compares the update time of PDB-DSSE with that of Orion [9] and FB-DSSE [13] under different bit lengths, where the update time of Orion covers both insertion and deletion. Because Orion must modify the ORAM structure on every update, an insertion in our test requires 51.823 ms and a deletion requires 29.454 ms. In particular, building the Orion database takes 3.472 days at one database magnitude and at least 35 days at a larger one. Testing the update time of Orion beyond this scale is therefore extremely difficult; PDB-DSSE does not have this problem. We tested the update time of PDB-DSSE and FB-DSSE over a range of bit lengths. For short bit lengths, the update time of FB-DSSE is stable at 0.2–0.3 ms and that of PDB-DSSE is approximately 0.4 ms. For longer bit lengths, the update time of both schemes increases with the bit length, because the influence of homomorphic addition and modular arithmetic is not significant while the bit length is short. The update time of PDB-DSSE is longer than that of FB-DSSE because it also needs to update the PDB, but the update remains efficient. Overall, the comparison shows that PDB-DSSE retains high update efficiency.
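To make the link between bit length and update cost concrete, the following Java fragment sketches a bitmap index updated under a simple additively homomorphic symmetric cipher. It assumes the index is encrypted in the style of the scheme of Castelluccia et al. [20], with encryption as key plus message modulo 2^ℓ, where ℓ is the bitmap bit length. The class and method names are hypothetical and this is not the authors' implementation. Every update is one modular addition on ℓ-bit integers, so the per-update cost grows with ℓ.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

/** Sketch only: bitmap index under additive encryption modulo 2^l (assumed, after [20]). */
final class BitmapIndexSketch {
    private final int bitLength;      // l: maximum number of files supported
    private final BigInteger modulus; // n = 2^l

    BitmapIndexSketch(int bitLength) {
        this.bitLength = bitLength;
        this.modulus = BigInteger.ONE.shiftLeft(bitLength);
    }

    /** Fresh one-time key, uniform in [0, 2^l). */
    BigInteger newKey(SecureRandom rng) {
        return new BigInteger(bitLength, rng);
    }

    /** Enc(key, m) = (key + m) mod n. */
    BigInteger encrypt(BigInteger key, BigInteger message) {
        return key.add(message).mod(modulus);
    }

    /** Dec(key, c) = (c - key) mod n; key may be a sum of one-time keys. */
    BigInteger decrypt(BigInteger key, BigInteger ciphertext) {
        return ciphertext.subtract(key).mod(modulus);
    }

    /** Homomorphic addition of two ciphertexts: (c0 + c1) mod n. */
    BigInteger add(BigInteger c0, BigInteger c1) {
        return c0.add(c1).mod(modulus);
    }

    /** Delta that sets the bit of fileId: 2^fileId. */
    BigInteger insertDelta(int fileId) {
        return bit(fileId);
    }

    /** Delta that clears a previously set bit of fileId: -2^fileId mod n. */
    BigInteger deleteDelta(int fileId) {
        return modulus.subtract(bit(fileId));
    }

    private BigInteger bit(int fileId) {
        if (fileId < 0 || fileId >= bitLength) {
            throw new IllegalArgumentException("fileId out of range: " + fileId);
        }
        return BigInteger.ONE.shiftLeft(fileId);
    }

    public static void main(String[] args) {
        BitmapIndexSketch idx = new BitmapIndexSketch(8); // file ids 0..7
        SecureRandom rng = new SecureRandom();
        BigInteger[] deltas = { idx.insertDelta(2), idx.insertDelta(5), idx.deleteDelta(2) };

        BigInteger serverSum = BigInteger.ZERO; // ciphertext accumulated by the server
        BigInteger keySum = BigInteger.ZERO;    // one-time keys accumulated by the client
        for (BigInteger delta : deltas) {
            BigInteger key = idx.newKey(rng);
            serverSum = idx.add(serverSum, idx.encrypt(key, delta));
            keySum = keySum.add(key);
        }
        // Only file 5 remains in the decrypted bitmap.
        System.out.println(idx.decrypt(keySum, serverSum).toString(2)); // prints 100000
    }
}
```

In this sketch the server only adds ciphertexts and never sees which bit an update touches. The client recovers the current bitmap by subtracting the sum of its one-time keys.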

7. Conclusion

In this paper, we construct a dynamic padding method for DSSE schemes and present an application scenario that realizes PDB-DSSE, a DSSE scheme with forward privacy and stronger backward privacy. First, we introduce outlier detection technology to design a dynamic database padding algorithm that can be used in DSSE schemes. The padded bogus files obscure the real files, which prevents leakage of information about the files that currently match the search keyword. Furthermore, we design a new index structure based on the bitmap index that suits the padded database and simplifies index modification during updates. Finally, we propose an application scenario of the padding method and implement a DSSE scheme with forward and stronger backward privacy. Simulation results and comparative experiments show that the proposed dynamic padding scheme is suitable for DSSE schemes and that PDB-DSSE, which incorporates it, still supports efficient search and update.

As the size of the database increases, bitmap indexing and padding algorithms face certain limitations. In future work, to support much larger numbers of files (for example, billions), we will consider dividing the index into blocks and padding each block separately, which should improve retrieval and update efficiency and reduce computational overhead. In addition, we will consider using bitmap indexes to support conjunctive keyword queries, thereby improving the retrieval efficiency of existing solutions.

Data Availability

The previously reported Enron Email Dataset was used to support this study and is available at https://www.cs.cmu.edu/~./enron/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. X. Song, D. Wagner, and A. Perrig, “Practical techniques for searches on encrypted data,” in Proceedings of the 2000 IEEE Symposium on Security and Privacy. S&P 2000, pp. 44–55, IEEE, Berkeley, CA, USA, May 2000.
[2] R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: improved definitions and efficient constructions,” Journal of Computer Security, vol. 19, no. 5, pp. 895–934, 2011.
[3] D. Cash, P. Grubbs, J. Perry, and T. Ristenpart, “Leakage-abuse attacks against searchable encryption,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 668–679, ACM, Denver, CO, USA, October 2015.
[4] Y. Zhang, J. Katz, and C. Papamanthou, “All your queries are belong to us: the power of file-injection attacks on searchable encryption,” in Proceedings of the 25th USENIX Security Symposium (USENIX Security 16), pp. 707–720, USENIX Association, Austin, TX, USA, August 2016.
[5] E. Stefanov, C. Papamanthou, and E. Shi, “Practical dynamic searchable encryption with small leakage,” NDSS, vol. 71, pp. 72–75, 2014.
[6] R. Bost, “Σoφoς: forward secure searchable encryption,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 1143–1154, ACM, Vienna, Austria, October 2016.
[7] R. Bost, B. Minaud, and O. Ohrimenko, “Forward and backward private searchable encryption from constrained cryptographic primitives,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1465–1482, ACM, Dallas, TX, USA, October 2017.
[8] S. F. Sun, X. Yuan, J. K. Liu et al., “Practical backward-secure searchable encryption from symmetric puncturable encryption,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 763–780, ACM, Toronto, Canada, October 2018.
[9] J. Ghareh Chamani, D. Papadopoulos, C. Papamanthou, and R. Jalili, “New constructions for forward and backward private symmetric searchable encryption,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 1038–1055, ACM, Toronto, Canada, October 2018.
[10] P. Rizomiliotis and S. Gritzalis, “Simple forward and backward private searchable symmetric encryption schemes with constant number of roundtrips,” in Proceedings of the 2019 ACM SIGSAC Conference on Cloud Computing Security Workshop, pp. 141–152, ACM, London, UK, November 2019.
[11] A. Najafi, H. H. S. Javadi, and M. Bayat, “Verifiable ranked search over encrypted data with forward and backward privacy,” Future Generation Computer Systems, vol. 101, pp. 410–419, 2019.
[12] L. Sardar and S. Ruj, “Fspvdsse: a forward secure publicly verifiable dynamic SSE scheme,” in Proceedings of the International Conference on Provable Security, pp. 355–371, Springer, Cairns, Australia, October 2019.
[13] C. Zuo, S. F. Sun, J. K. Liu, J. Shao, and J. Pieprzyk, “Dynamic searchable symmetric encryption with forward and stronger backward privacy,” in Proceedings of the European Symposium on Research in Computer Security, pp. 283–303, Springer, Luxembourg, September 2019.
[14] C. Zuo, S. Sun, J. K. Liu, J. Shao, J. Pieprzyk, and L. Xu, “Forward and backward private DSSE for range queries,” IEEE Transactions on Dependable and Secure Computing, vol. 99, p. 1, 2020.
[15] X. Li, T. Xiang, and P. Wang, “Achieving forward unforgeability in keyword-field-free conjunctive search,” Journal of Network and Computer Applications, vol. 166, Article ID 102755, 2020.
[16] I. Demertzis, J. G. Chamani, D. Papadopoulos, and C. Papamanthou, “Dynamic searchable encryption with small client storage,” in Proceedings of the Network and Distributed System Security Symposium, DBLP, San Diego, CA, USA, February 2020.
[17] S. Chatterjee, S. K. Parshuram Puria, and A. Shah, “Efficient backward private searchable encryption,” Journal of Computer Security, vol. 28, no. 2, pp. 229–267, 2020.
[18] R. Bost and P. A. Fouque, “Thwarting leakage abuse attacks against searchable encryption – a formal approach and applications to database padding,” IACR Cryptology ePrint Archive, vol. 2017, p. 1060, 2017.
[19] L. Xu, X. Yuan, C. Wang, Q. Wang, and C. Xu, “Hardening database padding for searchable encryption,” in Proceedings of the IEEE INFOCOM 2019-IEEE Conference on Computer Communications, pp. 2503–2511, IEEE, Paris, France, April 2019.
[20] C. Castelluccia, E. Mykletun, and G. Tsudik, “Efficient aggregation of encrypted data in wireless sensor networks,” in Proceedings of the Second Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services, pp. 109–117, IEEE, San Diego, CA, USA, July 2005.

Copyright

Copyright © 2021 Ruizhong Du et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
