Abstract

The Internet of Things (IoT) is now part of daily life and serves many useful functions: it is used in homes, hospitals, fire prevention, and the reporting and control of environmental changes. Data security is a crucial requirement for IoT because the number of new technologies in different domains is increasing day by day. Various attempts have been made to meet users' demands for more security and privacy. However, these benefits are accompanied by substantial security and privacy risks. Digital document security and copyright protection are also important issues in IoT because documents are distributed, reproduced, and disclosed through extensive use of communication technologies. The content of books, research papers, newspapers, legal documents, and web pages is based on plain text, and the ownership verification and authentication of such documents are essential. In the current IoT domain, only limited techniques are available for ownership verification and copyright protection. From this perspective, this study discusses text watermarking approaches, IoT security challenges, IoT device limitations, and future research directions in the area of text watermarking.

1. Introduction

With the rapid development of embedded technology, computer technology, mobile communication networks, and the Internet, IoT has emerged at a historic moment. The primary features of IoT are global perception, reliable transfer, and intelligent processing of information. The key is to realize the interaction of information between people and machines or between machines. Since its introduction, IoT has had major repercussions around the world: many human and material resources have been invested in supporting research, and remarkable results have been achieved. The rapid growth of IoT has brought significant changes to industry, and IoT is considered the third wave of the global information industry after the computer and the Internet. The Internet of Things is a collection of elements embedded with software, actuators, and electronic components that collect and share data over an Internet connection. IoT devices, which are equipped with sensors and have low processing power, can be used in many environments [1]. The significant difference between the traditional Internet and IoT is the absence of a human role. IoT devices can create, analyze, and act on information about an individual's behavior [2]. IoT offers many benefits to people but faces many security and privacy issues [3].

The current security challenges for IoT that need to be addressed are presented in Figure 1, which shows that data integrity, security, privacy, automation, updating, a common framework, and encryption capabilities are the main challenges. In IoT, the integrity and security of text documents remain open issues in the modern digital world [5]. A large number of text documents are generated daily and shared through IoT. Due to advanced technologies, these documents can be easily copied and redistributed [6]. Although IoT offers many benefits, illegal use of these documents creates a problem for the original owner. Attackers now use a number of ways to infect or access information. Digital text document protection is therefore a crucial issue for researchers [7]. Digital libraries, Internet technologies, mobile phones, e-commerce, and iPods provide a fast and easy way of broadcasting information [8]. However, the security and privacy of digital content are difficult to handle, so it is necessary to protect digital materials that travel over the Internet [9, 10].

2. IoT Security Challenges

Currently, 23 billion IoT devices are connected worldwide. This number is expected to reach 30 billion by the end of 2020 and over 60 billion by the end of 2025 [11]. The security challenges for IoT are described below.

2.1. Updating

The majority of IoT devices update their software automatically, while other devices have to be updated manually [12]. Some manufacturers offer updates only for a short period and then stop. It is challenging to manage the upgrade of the millions of devices that are connected to IoT. Not all devices support automatic updates; those that require manual updating are time-consuming to maintain, and any mistake can lead to security loopholes [13].

2.2. Automation

As IoT devices continue to spread into daily life, their number keeps growing, and it is challenging to manage the enormous amount of user data they produce. It cannot be denied that a single error in an algorithm can bring down the entire infrastructure [14].

2.3. Common Framework

IoT lacks a common framework, so each manufacturer handles privacy and security on its own and at its own risk. Once a standard framework is implemented, this security issue can be resolved [15].

2.4. Security and Privacy Issues

Different IoT devices can share data across various platforms. IoT devices gather and exchange data for multiple reasons, such as decision-making, better service, and improved efficiency. Thus, it is essential that the endpoints of the data are completely secured.

2.5. Data Integrity

Billions of IoT devices are interlinked and exchange data on a daily basis. Data integrity is a main issue in IoT: it must be guaranteed that no one can manipulate data at any point. Digital watermarking and blockchain should be implemented in order to ensure data integrity [16, 17].
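The integrity idea above can be illustrated with a minimal sketch of a hash chain, the basic building block behind blockchain-style tamper evidence. This is only an illustration under stated assumptions: the record layout and the function names `add_record` and `verify_chain` are hypothetical and are not taken from the cited works.

```python
import hashlib
import json


def add_record(chain: list, payload: dict) -> None:
    """Append a record that stores the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    chain.append({"payload": payload, "prev": prev_hash, "hash": digest})


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited record breaks the chain."""
    prev_hash = "0" * 64
    for record in chain:
        body = json.dumps(record["payload"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        if record["prev"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical sensor readings, used only for illustration.
readings = []
add_record(readings, {"sensor": "t1", "value": 21.5})
add_record(readings, {"sensor": "t1", "value": 21.7})
print(verify_chain(readings))  # True
```

Because each record's hash depends on the previous one, changing any stored payload makes `verify_chain` return False unless every later hash is recomputed as well.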

3. IoT Device Limitations

IoT devices have two main limitations: battery capacity and computing power [18]. Because some IoT devices are placed in environments where they cannot be recharged, they must perform their designed functions with limited energy, and heavy security instructions may drain the limited power available [19]. Three techniques can be used to mitigate this issue. The first is to minimize the security requirements. The second is to raise the battery capacity, which seems impractical because most IoT devices are small and designed to be lightweight, leaving no room for a larger battery. The third is to harvest energy from natural sources such as heat, light, wind, and vibration, but such techniques require hardware upgrades and increase the monetary cost. IoT devices also have limited memory, which cannot store and handle the computational requirements of advanced security algorithms [19]. IoT devices should therefore be smart enough to manage all these requirements.

4. IoT Devices Architecture

The Internet of Things includes many connected sensors and devices, and every device uses different communication standards and protocols. There are no precisely defined rules and standards for communication. In addition, the applications of IoT are not limited and increase from day to day. Different manufacturers produce different IoT devices even when the devices perform the same function. This challenge is therefore inherent in the nature of IoT and leads to a lack of unified standardization.

4.1. IoT Devices Data Storage Issues

Data storage becomes a significant issue as the amount of data increases rapidly. When stored information is damaged, restoring all of it from backup is a challenging task. There is no assurance that data and information are transmitted securely between IoT devices. Furthermore, it is a significant challenge for data storage and management companies to develop tools and standards that handle the data provided and the associated security issues.

4.2. Limited Resources of Infrastructure

IoT devices generally have limited memory and low processing capacity. Designing comprehensive security measures in 64 kB to 640 kB of memory is a big challenge for software developers and IoT hardware manufacturers. In addition, the devices must have enough storage for the security software that defends against security threats.

4.3. Data Privacy Protection

With IoT, integrated devices can be accessed by anyone from anywhere, which affects the confidentiality and privacy of sensitive data. Specific standards or rules must therefore be defined to avoid privacy violations. For example, some IoT devices share data with other devices, and the shared data become unsafe, which helps attackers and intruders breach the security of the IoT system.

4.4. Lack of Skills

Specific skills and expertise are critical factors in the design, development, implementation, and management of security. A deficiency in any of these factors may damage the IoT security system. In addition, the lack of skills and expertise slows down the adoption of IoT technologies [20]. The number of qualified people who have mastered IoT techniques and can adequately handle an IoT system is very limited. Realizing the benefits of IoT technology and dealing with its challenges depend mostly on individual capabilities.

5. Digital Watermarking

Digital watermarking belongs to the field of information hiding and plays an essential role in copyright protection, ownership verification, and authentication [21]. Among the digital media to which information hiding is applied, text is the least discussed. The protection of digital content is a difficult task, especially for plain text [22–24]. Information hiding is categorized into steganography, cryptography, and watermarking, as shown in Figure 2. In watermarking, a secret message is embedded in digital content without affecting the original text, and this message supports ownership verification [25, 26].

Researchers face a significant challenge: the growth rate of information is high, which calls for appropriate watermarking techniques. It is crucial to maintain data integrity while ensuring the confidentiality and availability of information [27]. However, with the practical development of watermarking applications, security issues of watermarking have emerged, and significant progress has been achieved in this field [28].

Many techniques have been proposed in the last two decades for hiding information through steganography and text watermarking, with the aims of copyright protection [29], authentication, copy control [30], and ownership verification [31–37]. The main contributions of this study are as follows:
(i) We briefly describe the current IoT security and privacy issues and recommendations
(ii) We conduct an extensive investigation of the approaches of text watermarking, IoT security challenges, IoT device limitations, and future research directions
(iii) We summarize the text watermarking approaches and techniques that are used for digital watermarking
(iv) We conduct a comparative analysis of previous techniques on the basis of robustness, security, capacity, and imperceptibility, evaluate their efficiency against set criteria, and identify the drawbacks of existing techniques

6. Digital Watermarking and Its Applications

In the real world, watermarking is used in a variety of applications, which are categorized by media type into image, audio, video, and text [38]. Documents such as websites, certificates, business plans, articles, poems, books, corporate documents, e-mails, and SMS can be protected through watermarking [39]. Digital watermarking can be used for authentication and copyright protection; these and other applications of watermarking to text are listed below [40–42].

6.1. Authentication

Plain text in articles and newspapers raises various authentication problems. Watermarking is a verification tool for authenticating the integrity of plain text. To prove authenticity, the watermark (author information) is checked: if it is detected, the document is genuine; otherwise, the text has been tampered with and cannot be trusted. The authentication mechanism can thus be applied to a text document to detect any tampering. If tampering is identified, the document cannot be considered original; authenticating text documents is also necessary for legal purposes [43].

6.2. Copyright Protection

Watermarking is also used in copyright protection of digital contents, like e-books, web content, research papers, poetry, and other documents. The author inserts a watermark in the document for copyright, and this watermark is extracted in the future from the given material to prove ownership. Digital watermarking is very helpful to settle the copyright issues in court.

6.3. Tamper Detection

A large number of text documents are available for users to read online, and these documents can be confronted with a series of attacks such as copying, unauthorized access, and redistribution. Tamper detection is one of the digital watermarking applications that can detect and recover the tampered region from the digital contents. Text watermarking is used as a fragile tool against these attacks [34, 44].

6.4. Copy Control

Publishers are looking for more consistent ways to control the copying of their important documents. At the same time, they want these documents to be available on the Internet for revenue generation. Watermarking is applied here to provide access control and stop illegal copying [43].

6.5. Forgery Detection

The reproduction and plagiarism of text documents are serious and rapidly growing issues. Text watermarking is applied here by embedding a watermark in the original document before it is published online [45]. Almost every private and public organization deals with text documents on a daily basis, and digital text watermarking can be applied to address the forgery detection problem.

The major applications of watermarking [46] are shown in Figure 3.

7. Text Watermarking Evaluation Criteria

Researchers consider many parameters when developing novel techniques. However, the evaluation criteria for digital text watermarking can be classified into security, capacity, robustness, imperceptibility, and computational cost. It is not possible to design a watermarking system that covers all of these properties. Each property is described below [44, 47, 48].

7.1. Robustness

Robustness means that the watermark survives even if the watermarked content is tampered with [49]. In other words, it should be almost impossible for an unlicensed party to defeat the watermark without degrading the content to the extent that it is no longer suitable or reliable [50]. When a watermarking technique is designed, it is essential to take into consideration the intended application and the corresponding attacks that are possible. The robustness of text watermarking is computed on the basis of the watermark distortion rate (WDR) and the pattern matching rate (PMR), which are formalized in equations (1) and (2), where Nm denotes the number of patterns matched correctly, which is compared against the total number of watermark patterns.
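The two robustness measures can be illustrated with a short sketch. It assumes the common reading in which PMR is the fraction of watermark patterns recovered correctly and WDR is its complement; the function names and the bit-string representation of patterns are hypothetical and are not taken from the source.

```python
def pattern_matching_rate(original_bits: str, extracted_bits: str) -> float:
    """Fraction of watermark patterns (here: single bits) matched correctly.

    Assumption: PMR = Nm / N, where Nm is the number of correctly matched
    patterns and N is the total number of watermark patterns.
    """
    if not original_bits:
        raise ValueError("the original watermark must not be empty")
    # Positions missing from the extracted watermark count as mismatches.
    matched = sum(
        1
        for i, bit in enumerate(original_bits)
        if i < len(extracted_bits) and extracted_bits[i] == bit
    )
    return matched / len(original_bits)


def watermark_distortion_rate(original_bits: str, extracted_bits: str) -> float:
    """Assumption: WDR = 1 - PMR."""
    return 1.0 - pattern_matching_rate(original_bits, extracted_bits)


original = "10110010"
extracted = "10100010"  # one bit flipped by a hypothetical attack
print(pattern_matching_rate(original, extracted))      # 0.875
print(watermark_distortion_rate(original, extracted))  # 0.125
```

Under this reading, a technique is more robust against a given attack when PMR stays close to 1 and WDR stays close to 0 after the attack.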

7.2. Imperceptibility

Imperceptibility is the primary and fundamental requirement; it means that the watermark is embedded in the document objects without being noticeable. The audience should not perceive the watermark, and the watermark should not affect the original text. The watermarked and original information should be similar, and the content should be perceptually equal [51]. Peak signal-to-noise ratio (PSNR) and similarity percentage (SIM) are used to assess imperceptibility. PSNR is given by equation (3) [52], where Odoc (Max) is the maximum pixel value in the document image and RMSE stands for root-mean-squared error, which is calculated using equation (4).

Equation (5) is used to calculate the similarity percentage (SIM).
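As a hedged illustration of the PSNR and RMSE measures named in this subsection, the sketch below uses their standard textbook definitions, PSNR = 20 log10(Max / RMSE), applied to two equally sized grids of pixel values. The function names and the list-of-rows image representation are assumptions made for the example; the SIM measure is not sketched because its definition cannot be recovered from the text.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[float]]


def rmse(original: Image, marked: Image) -> float:
    """Root-mean-squared error between two equally sized pixel grids."""
    if len(original) != len(marked):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for row_o, row_m in zip(original, marked):
        if len(row_o) != len(row_m):
            raise ValueError("rows must have the same length")
        for p_o, p_m in zip(row_o, row_m):
            total += (p_o - p_m) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return math.sqrt(total / count)


def psnr(original: Image, marked: Image, max_value: float = 255.0) -> float:
    """PSNR in dB, using the standard form 20 * log10(max / RMSE)."""
    error = rmse(original, marked)
    if error == 0:
        return math.inf  # identical images: no distortion at all
    return 20.0 * math.log10(max_value / error)


original_doc = [[0, 255], [255, 0]]
marked_doc = [[0, 255], [255, 10]]  # one pixel changed by embedding
print(rmse(original_doc, marked_doc))  # 5.0
print(psnr(original_doc, marked_doc))
```

A higher PSNR indicates that the watermarked document image is closer to the original, that is, the watermark is less perceptible.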

7.3. Capacity

Capacity indicates the maximum number of watermark bits that can be stored in the host document. A technique is considered suitable if it offers a large hiding capacity without affecting visibility. Capacity can be measured using equation (6).

7.4. Security

Security is another criterion for watermarking. It requires that the information of the author (the watermark) is hidden from unauthorized users, who should have no means of detecting the watermark. Security means that the watermark still exists and the payload remains covered, so that unapproved and unauthorized parties are not capable of identifying the author's information. Security is measured on the basis of imperceptibility, capacity, and robustness, as shown in equation (7).

7.5. Computational Cost

Text watermarking techniques are computationally less complex for small text documents. More computation power is required for text documents that occupy many pages. In general, less complex algorithms are used for systems with limited resources to reduce the cost [44].

8. Watermarking Embedding and Extraction Process

Watermarking is an information hiding technique that provides ownership verification and copyright protection for text documents against illegal usage [53–55]. Digital watermarking has two steps: watermark embedding and watermark extraction (or verification). In watermark embedding, secret information (the watermark) is inserted into the original document without affecting the content of the document. A key can be used to encrypt the secret information for security purposes, and the same key is then applied for decryption. When illegal use occurs, the watermark information is used to verify the original owner of the document. Watermark extraction is the reverse of watermark embedding and is applied to verify the originality of the document. The architecture of watermarking is presented in Figure 4.
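The embedding and extraction steps described above can be sketched in a few lines. The example below is a toy illustration, not the architecture of Figure 4: it hides the watermark as zero-width Unicode characters appended to the cover text and protects it with an XOR keystream derived from SHA-256, which is an assumption chosen to stay within the standard library and is not a secure cipher. The names `embed` and `extract` are hypothetical.

```python
import hashlib

ZERO, ONE = "\u200b", "\u200c"  # zero-width space / zero-width non-joiner


def _keystream(key: str, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(f"{key}:{counter}".encode("utf-8")).digest()
        counter += 1
    return stream[:length]


def _xor(data: bytes, key: str) -> bytes:
    """XOR the data with the keystream; applying it twice restores the data."""
    return bytes(b ^ k for b, k in zip(data, _keystream(key, len(data))))


def embed(cover_text: str, watermark: str, key: str) -> str:
    """Encrypt the watermark and append it as invisible characters."""
    encrypted = _xor(watermark.encode("utf-8"), key)
    bits = "".join(f"{byte:08b}" for byte in encrypted)
    return cover_text + "".join(ONE if bit == "1" else ZERO for bit in bits)


def extract(marked_text: str, key: str) -> str:
    """Collect the invisible characters and decrypt them with the same key."""
    bits = "".join(
        "1" if ch == ONE else "0" for ch in marked_text if ch in (ZERO, ONE)
    )
    encrypted = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return _xor(encrypted, key).decode("utf-8", errors="replace")


marked = embed("Quarterly report", "Owner: A. Author", key="secret")
print(extract(marked, key="secret"))  # Owner: A. Author
```

The visible content of `marked` is unchanged, which mirrors the requirement that embedding does not affect the document content. Retyping the text or converting it to a format that strips zero-width characters would remove this kind of watermark.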

9. Existing Techniques of Text Watermarking

Digital text watermarking arose in 1994 [56, 57] and has grown over time as communication technologies and the Internet have spread across the world. The existing techniques are based on words and sentences, acronyms, synonyms, presuppositions, syntactic trees, typographical errors, noun-verb pairs, and text images for the German, Persian, French, Spanish, and English languages. The text watermarking techniques and attacks are presented in Figure 5.

An information hiding technique is proposed in [58] that hides information in a binary text document by using the boundaries of characters. One hundred pairs of five-pixel-long border patterns were defined. Each pair consists of two different patterns, an "A" pattern and a "D" pattern, which can be changed into each other. A bit is embedded in a five-pixel-long border segment by switching between the patterns. Kim et al. suggested a technique that inserts a watermark based on the classification of words and interword spaces [59]. All words in the document are classified according to adjacent words and specific text attributes. The words form segments, which are further categorized according to the class names of the words in the segment. Each segment class contains the same amount of information.

Zhou et al. [60] introduced a method that uses a chaotic encryption algorithm to generate the watermarks, with the host document split into two blocks using Chinese mathematical expressions. Two different text blocks and keys are generated by calculating the stroke numbers and the Chinese character frequency. When the content of the watermarked text document is modified, the results from the two blocks of text do not match, and authentication of the text document fails. In [61], a text zero-watermarking technique based on a particular part of speech (POS) is proposed. A POS is a category of words that have similar grammatical properties. A chaotic function is used to extract the sequences from which the watermark is developed without altering the cover data, so the imperceptibility problem is also resolved. This method provides excellent security because the order of the selected POS tags is unknown to an attacker.

Meng et al. [62] introduced a technique in which sentence entropy is used to calculate the watermark key. Entropy is defined as the average expected amount of information that a message contains. The sentence entropy is calculated from word frequency and importance, and the watermark is embedded according to the order of the important sentences. Several attacks, including insertion, deletion, and synonym substitution, were applied to check the robustness of this method; the robustness is good, but the success rate is very low. Jalil et al. [63] suggested a technique that embeds by generating a watermark key. The occurrences of nonvowel ASCII characters are first analyzed in each partition to find the nonvowel character that occurs most frequently. The most frequent nonvowel characters and the letters of the author key are then used for watermark generation. The watermark is registered with a certification authority in order to provide security. The accuracy of the extracted watermark is analyzed under insertion and deletion attacks. In [64], the author proposed a watermarking technique that generates the watermark key on the basis of prepositions and double letters, with the cover file partitions analyzed through repeating letter frequency. The key is generated through a count of double letters in a time interval. An image is converted into text to generate the hidden data that is included in the host document. The proposed method is reported to be robust and more secure under insertion, deletion, reordering, and other attacks.
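The key-generation idea attributed to Jalil et al. [63] above can be sketched loosely as follows. This is not the authors' algorithm: the partition size, the alphabetical tie-breaking rule, and the use of SHA-256 to combine the partition letters with the author key are all assumptions made for illustration, and the function names are hypothetical.

```python
import hashlib
from collections import Counter

VOWELS = set("aeiou")


def partition_letters(text: str, words_per_partition: int = 5) -> str:
    """Most frequent nonvowel letter of each partition, concatenated."""
    words = text.lower().split()
    letters = []
    for start in range(0, len(words), words_per_partition):
        chunk = "".join(words[start:start + words_per_partition])
        counts = Counter(
            ch for ch in chunk
            if ch.isascii() and ch.isalpha() and ch not in VOWELS
        )
        if counts:
            # Ties are broken alphabetically so the result is deterministic.
            letters.append(min(counts, key=lambda ch: (-counts[ch], ch)))
    return "".join(letters)


def generate_watermark(text: str, author_key: str) -> str:
    """Combine the partition letters with the author key (assumed: SHA-256)."""
    material = f"{author_key}|{partition_letters(text)}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


document = "bbb ccc"
print(partition_letters(document, words_per_partition=1))  # bc
registered = generate_watermark(document, author_key="alice")
print(registered == generate_watermark(document, author_key="alice"))  # True
```

Because the watermark is derived from the text rather than inserted into it, the cover text is left unchanged; verification regenerates the value from a suspect document and compares it with the registered one.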

Cheng et al. [65] introduced an algorithm that uses a fragment regrouping strategy for watermark embedding. The original watermark is divided into fragments with order numbers, which are then embedded in the characters of the document. When some fragments are deleted or changed by a deletion or tampering attack, the destroyed fragments are recovered using other correct fragments that are embedded in the phrases. Kim et al. [66] proposed a natural language watermarking method for Korean that is based on syntactic displacement and morphological division. The approach uses syntax-based watermarking. A Korean word usually consists of a function morpheme and a content morpheme. Using these word characteristics, a word with two content morphemes is divided into two new words, which are used for watermark embedding.

In [67], a zero-watermarking model is introduced that builds a 3-D model from the 2-D coordinates of words and the weights of sentences. The 2-D word space is structured by the length and frequency of words, and the 2-D model is then extended into 3-D. Three frequent attacks are tested on the proposed model: synonym replacement, syntactic transformation, and deletion. The test report shows that the proposed method is robust and secure and offers good imperceptibility. Al-Wesabi et al. [68] proposed an approach based on the Markov model, in which the watermark key is generated from probabilistic features of the cover file. Hidden Markov model information is analyzed for text watermarking and stored for document authentication. The approach is reported to offer protection against attacks, with a higher percentage of watermark distortion across all attacks. In [69], the author suggests a zero-watermark approach that uses the characteristics of Arabic characters to embed the watermark without changing the original text. In an initial phase, the name/number of the sura and the number of verses are checked, and a key is then generated from each verse of the Holy Quran. With this algorithm, a character watermark bit of the word set is inserted. The proposed method provides a system for verifying the integrity of the sensitive digital text of the Holy Quran. With this technique, changes in the original text content can be detected, and only minimal hardware resources are required.

Alginahi et al. [70] introduced an approach that generates the watermark key by converting an image into text. A duplicate of the cover file is used for embedding the image logo, where it is classified and processed, and the watermark key is generated from its characteristics. The proposed technique offers authorized content manipulation and copyright protection. The watermark key is secured by using blind and fragile watermarking approaches. The method produces excellent results in terms of the computational time of watermark encoding and decoding. Ba-Alwi et al. [71] presented a novel technique based on probabilistic models for ownership verification and tamper detection in English documents. The probabilistic patterns are extracted by using natural language processing based on the Markov model. The content of each English text document is analyzed, and the probabilistic characteristics between these contents are extracted.

In [72], the authors suggested a technique based on word items and particular attributes that hides information in a Word document with robustness and good performance. To enhance the robustness of the watermark, the watermarking information is divided into five groups, which are then embedded into the plain text one by one according to their group number. An advantage of this method is that the hidden information is difficult to extract, because the information is first encrypted, then divided into several groups, and then embedded into Word properties. In the experiments, most of the watermarked text remained the same, but in two or three lines some characters changed, which also changed the meaning of the text. The scheme therefore does not perform well in terms of imperceptibility. Chen et al. [73] suggested a semantic technique for embedding watermark information in text. The watermark information is embedded through the mapping location of each digit. The proposed algorithm does not change the integrity or format of the text. The authors claim that it is robust against watermarking attacks and text format transformation.

Ahvanooey et al. [29] offer a novel text watermarking method for web pages. Structural and syntactic rules are used to embed the watermark, which is encoded and converted into zero-width control characters with a binary model classification. Hypertext Markup Language (HTML) is used as the cover file in which the transparent zero-width watermark is embedded. In [74], the author suggests a novel method based on font code, which embeds a watermark into text by perturbing the glyphs of text characters while retaining the text content. A glyph recognition method is also presented to recover the information that is embedded in the encrypted document. A new approach for Arabic text using pseudospace is proposed in [75]. Connected letters are isolated with a pseudospace in order to hide watermark bits. In the first method, the watermark is embedded in the punctuation of the Arabic text through pseudorandom insertion, and in the second method, the pseudospace is added to the standard space, thus increasing the capacity. The proposed method is imperceptible and robust against formatting and tampering attacks. Wen et al. [76] suggested algorithms for hiding information in Extensible Markup Language (XML) documents. The first method is related to eXtensible Stylesheet Language Transformation (XSLT) and includes additional code to provide copyright protection. In the second method, the functional dependency of the XML file is used as a function for the zero watermark. The proposed method performs well under alteration, compression, reorganization, and selection attacks. Hakak et al. [77] present a complete framework for the automatic authentication and distribution of digital Quran and Hadith verses. The process is divided into two phases, security and verification. In the security phase, the watermarking technique secures the confirmed and tested verse. For verification, the Boyer–Moore algorithm is used for extraction. The efficiency analysis of the existing techniques is presented in Table 1.
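The verification step that relies on Boyer–Moore string matching can be illustrated with a small sketch. For brevity it uses the Horspool simplification of Boyer–Moore (bad-character rule only), which is a swapped-in variant and not necessarily what the framework in [77] implements. The function names and the sample reference text are hypothetical.

```python
def horspool_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character table: distance from the last occurrence of each
    # character (excluding the final position) to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead based on the text character aligned with the pattern end.
        pos += shift.get(text[pos + m - 1], m)
    return -1


def is_authentic(candidate: str, reference_corpus: str) -> bool:
    """A candidate passage is accepted only if it occurs verbatim."""
    return horspool_search(reference_corpus, candidate) != -1


reference = "abracadabra"  # stand-in for a trusted reference text
print(horspool_search(reference, "cad"))  # 4
print(is_authentic("cab", reference))     # False
```

Skipping by the bad-character shift lets the search avoid inspecting every position, which matters when a short passage is checked against a large reference corpus.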

10. Attacks in Text Watermarking

Watermarked content faces specific attacks depending on the application, and some attacks are more significant than others. The basic types of attack are illegal insertion, illegal modification, illegal deletion, reordering, and combinations of these. Table 2 presents the analysis of robustness attacks, which include insertion, deletion, reordering, formatting, copy-and-paste, and retyping attacks. These attacks fall into the following categories [7, 72, 99–104].

10.1. Unauthorized Insertion

This type of attack occurs when an attacker wants to add false information, for example, in legal documents. Whenever a copyright dispute arises, the first recorded stamp on the content is used to identify the original.

10.2. Unauthorized Detection

In some applications, the ability to detect the watermark is restricted. An adversary's ability to quickly determine whether a mark is present in a particular document endangers the security of the watermarking system.

10.3. Unauthorized Deletion

An attacker can delete some words or sentences from the text to remove the original author's identity. All watermarking applications require security against illegal deletion, so it is crucial to prevent the attacker from removing watermark information. The system is called secure if the watermark can still be extracted from the text after the attack.
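The insertion, deletion, and reordering attacks discussed in this section can be simulated on a word list in order to test a watermarking scheme. The sketch below is a minimal, hypothetical test harness: the function names are assumptions, and a seeded random generator is used so that runs are repeatable.

```python
import random
from typing import List, Sequence


def insertion_attack(words: Sequence[str], extra: Sequence[str],
                     rng: random.Random) -> List[str]:
    """Insert foreign words at random positions."""
    attacked = list(words)
    for word in extra:
        attacked.insert(rng.randint(0, len(attacked)), word)
    return attacked


def deletion_attack(words: Sequence[str], count: int,
                    rng: random.Random) -> List[str]:
    """Delete up to `count` randomly chosen words."""
    attacked = list(words)
    for _ in range(min(count, len(attacked))):
        del attacked[rng.randrange(len(attacked))]
    return attacked


def reordering_attack(words: Sequence[str], rng: random.Random) -> List[str]:
    """Shuffle the word order while keeping the same words."""
    attacked = list(words)
    rng.shuffle(attacked)
    return attacked


rng = random.Random(42)  # fixed seed for repeatable experiments
text = "the watermark must survive these attacks".split()
print(len(insertion_attack(text, ["fake", "words"], rng)))  # 8
print(len(deletion_attack(text, 2, rng)))                   # 4
print(sorted(reordering_attack(text, rng)) == sorted(text))  # True
```

In an experiment, the watermark would be extracted from each attacked text and compared with the original watermark to judge whether the scheme is secure in the sense described above.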

11. Research Challenges and Future Direction

Text watermarking research is at an early stage, although the watermarking process has been extensively studied. There are several significant issues in text watermarking that have remained unresolved. In addition, applications continue to pose new challenges, and many organizations still need to implement text watermarks.

11.1. Information Availability

Information availability means that a user can access information easily and securely. Millions of Internet users around the world generate and share information on a daily basis, and this information requires full protection against illegal usage. In the text watermarking context, availability means that the information remains constant and that any change in the text content is prevented. An active system is required to ensure data availability in a secure manner, in which a user accesses information through an independent self-monitoring system.

11.2. Data Integrity

Data integrity is one of the critical aspects of text watermarking and is related to reliability, usability, relevance, value, and quality. Assurance of data consistency and accuracy can be part of integrity [105]. The growth of the Internet allows users to access a vast amount of information, whose integrity is also required. With the development of Internet technologies such as the cloud, data can be easily shared through different communication channels. The main issue is how to ensure the integrity of data over the Internet.

11.3. Originality Protection

It is difficult to identify the originality and quality of data that are available online or come from all sorts of databases, because such data are not well preserved in all cases. The processing time of the implemented techniques is still high, and the techniques lack imperceptibility. The challenge is to find an appropriate method that protects the originality of data and balances robustness, capacity, and imperceptibility. Most of the prior techniques are robust, are imperceptible, or improve the hiding capacity, but they fail to maintain the balance between all these parameters.

11.4. Sensitive Information Protection

Sensitive information cannot tolerate even the smallest change, such as a slight change in a character or word. When confidential information is altered, the meaning or the original purpose of the text can change [106]. This case usually involves religious writings, financial documents, government documents, and political documents. In text watermarking, such issues have been addressed for the protection of religious scriptures in Arabic text. Many studies address the sensitivity issue in text watermarking, but it has not been resolved yet. A precise text watermarking technique is required to resolve it.

11.5. Confidentiality of Information

Confidentiality, or secrecy, of information means that it is not available to unauthorized persons or organizations. Specific measures need to be taken to protect information from unauthorized persons, and specific techniques must be implemented to control the confidentiality of the content in text watermarking. A suitable technique is required for the protection of information confidentiality.

11.6. Cryptography

The embedded data must be further secured using cryptography, which helps protect the key and ensures that the watermark information is out of reach of unauthorized users. Many methods have been proposed in the past to solve copyright issues, but they still need improvement. A new security framework is necessary for trusted organizations that rely on text watermarking techniques.

11.7. Language Flexibility

The majority of text watermarking techniques are applicable only to certain languages, such as English, Arabic, and Chinese, which reduces the usability and applicability of the techniques. It is a core challenge for researchers to identify a suitable text watermarking technique that can be applied to text in any language.

11.8. Document Transformation

When a watermarked document is transformed into another format, such as Word to PDF or vice versa, there is a risk of losing the watermark information. It is crucial for researchers to identify a text watermarking technique that survives format transformation.

12. Recommendations

Text documents belong to almost all companies and organizations, such as banks, audit firms, and other public or private organizations. Both electronic and soft copies of sensitive text documents are processed, such as soft degrees, birth certificates, legal notes, financial statements, classified reports, and declarations. The challenge is to define a reliable method to authenticate these documents and to guarantee the originality and copyright protection of textual documents. An appropriate watermarking technique is needed that is robust against formatting attacks, imperceptible, and secure, and that improves hiding capacity. This problem can be solved by a new framework that addresses the current challenges in text watermarking.

12.1. Proposed Model

We propose a novel framework, based on digital watermarking, that overcomes the current challenges of security and privacy in the IoT paradigm, as shown in Figure 6. The proposed system can provide secure communication of text documents in both local and cloud paradigms. In the proposed framework, the watermark is embedded into the custom properties of a text document. These custom properties are suitable for three reasons. First, they are not referenced by the parts of the primary document. Second, the watermarking process does not change the original content of the document. Third, they can hide an adequate amount of the secret message.

In the proposed model, the secret message (MS) is given as input, and in the pre-processing phase MS is encrypted with the Advanced Encryption Standard (AES). AES is a symmetric encryption technique that is used here to secure MS. The encrypted message (ME) is converted into a binary string, which is then divided into n groups. The suitable components of the original document (DO) are inspected, and the ME groups are embedded into these components. As mentioned above, the custom components are ideal for three reasons: capacity, security, and robustness. The embedded message has no influence on the original content of the document and does not disturb imperceptibility. After the secret information is concealed, the watermarked document (DW) is generated and then stored or shared via the local or cloud paradigm.
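The embedding steps described above (binary conversion, grouping, and placement into custom properties) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this study. The AES step is left out, so any byte string can stand in for ME; the custom properties of the document are modelled as a plain dictionary; and the function names and the `wm_` property prefix are hypothetical choices made only for this example.

```python
from typing import Dict, List


def to_bit_string(data: bytes) -> str:
    """Render bytes as a string of '0'/'1' characters, 8 bits per byte."""
    return "".join(f"{byte:08b}" for byte in data)


def split_into_groups(bits: str, n: int) -> List[str]:
    """Split a bit string into n contiguous groups of near-equal length."""
    if n < 1:
        raise ValueError("n must be at least 1")
    size, extra = divmod(len(bits), n)
    groups: List[str] = []
    start = 0
    for i in range(n):
        # The first `extra` groups take one additional bit each.
        end = start + size + (1 if i < extra else 0)
        groups.append(bits[start:end])
        start = end
    return groups


def embed(properties: Dict[str, str], groups: List[str],
          prefix: str = "wm_") -> Dict[str, str]:
    """Return a copy of the custom properties with one new entry per group.

    Existing properties are kept as they are; the document body is never
    touched. The prefix is an assumption made for this sketch.
    """
    marked = dict(properties)
    for i, group in enumerate(groups):
        marked[f"{prefix}{i}"] = group
    return marked


def extract(properties: Dict[str, str], prefix: str = "wm_") -> bytes:
    """Reassemble the embedded bytes from the prefixed custom properties."""
    keys = sorted(
        (k for k in properties if k.startswith(prefix)),
        key=lambda k: int(k[len(prefix):]),
    )
    bits = "".join(properties[k] for k in keys)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Placeholder bytes standing in for ME (the AES ciphertext of MS).
me = bytes.fromhex("00ff10a5")
dw_properties = embed({"Author": "Example"},
                      split_into_groups(to_bit_string(me), 3))
assert extract(dw_properties) == me
```

In this sketch the extracted bytes would still have to be decrypted with the AES key to recover MS, and writing the dictionary into the custom properties of a real document depends on the file format in use.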

In the experimental results, the proposed model performs well on all the parameters. The proposed method is robust against all formatting attacks and is more secure than previous techniques, as shown in Figure 7. Various kinds of brute force attacks, including content-based and format-based attacks, are applied to check the robustness of the watermarked document. Figure 8 compares the proposed method with [59, 65, 72, 83] under content-based and format-based attacks and illustrates that the proposed model is robust against all the mentioned attacks.

Our system can be applied to copyright protection and owner authentication of text documents in both local and cloud computing paradigms. It can also protect text documents against illegal use.

In addition, the initial experimental results indicate that the proposed framework is robust, is imperceptible, and supports a high embedding capacity, because the watermark information is stored in document components rather than in the text content.

13. Conclusion

In this investigation, we have presented security and privacy issues in IoT, text watermarking issues, current techniques, attacks, future research directions, and recommendations. We also classified the existing approaches to text watermarking and the related security and privacy issues. Emerging IoT technologies and the latest applications have brought new challenges that require the attention of researchers. We have discussed and summarized the main difficulties of text document protection in IoT. Text is the most common medium that travels across the Internet, and it needs full protection. Digital text watermarking is best known for copyright protection and is also used to hide secret information in digital content. Many techniques have been proposed in this field of research, but a new model that identifies the approaches, requirements, and applications of text watermarking and its embedding process is still needed. This article deliberated the most common challenges and issues in text watermarking and proposed a novel method. A novel framework for evaluating text watermarking methods is also proposed; it is easily and readily accessible and can be consulted by the relevant organizations and the research community. The experimental results and analysis indicate that the proposed model is robust against content-based and format-based attacks and improves the ability of concealment compared with previous techniques. The main tasks for future research have been marked, and further investigation in the area of watermarking in IoT is awaited. In future work, we will analyze other possible attacks in order to enhance robustness and improve the ability of concealment. Components of Microsoft Word and Excel documents other than the custom properties will be examined for watermarking. We will also investigate the Portable Document Format (PDF), one of the most popular document formats in the world.

Conflicts of Interest

The authors declare that they have no conflicts of interest.