Introduction
Social media analytics draws insights from large volumes of unstructured and semi-structured social media data to support informed decisions. This paper uses the marketing department and its leadership as the case. Every department in an organization has its own research question and its own reasons for analyzing social media platforms. For the marketing department, the question is the brand’s position in the market: the organization may want to know what a given group of people says about its brand. Social media monitoring focuses on gathering and tracking information about social media users, including their interests and their recognition of the brand.
Types of Data
Social media monitoring and analysis draws on several data types: structured, unstructured, repetitive, and non-repetitive. First, structured data conforms to a pre-defined data model, which makes it easy to analyze (Sivarajah et al., 2017). In social media, structured data includes the annotations displayed when links are shared on various sites; these give the user a summary of what they are about to access. Second, unstructured data is not organized in a pre-defined format. It includes media such as images, video, and audio files; mobile data such as locations and text messages; text files; and communications such as chats (Sivarajah et al., 2017). Unstructured data is further categorized as repetitive or non-repetitive. Repetitive data consists of records that recur often in the same form, and it can be found in all data structures. Non-repetitive data consists of unique items, including sounds, video, and images.
Data Privacy Issues
Several data privacy issues affect social media monitoring and data analysis: surveillance, disclosure, discrimination, and personal abuse and embarrassment. Many organizations use surveillance to monitor their clients’ purchasing behavior, which enables them to develop a variety of offers and value-added services. For sentiment and opinion analysis, social media is a rich source of information. However, these insights depend on continuous observation of customers’ transactions, which is a severe privacy threat because people do not tolerate surveillance. Disclosure is also a significant threat to privacy, as the person or group holding the data can pass it to third parties. To limit this risk, individual data are anonymized so that the subject cannot be identified (Bazzaz Abkenar et al., 2021). Discrimination remains a significant issue when data is shared with a third party, even after changes such as anonymizing the person or group. The disclosed private information may lead to prejudice and bias if the third party can tell that the group or person held a distinct view.
The Balance between Data Privacy and Usability of Data
In the current digital world, it is crucial to maintain a good relationship with social media users while protecting the organization from data breaches. Balancing data privacy compliance against usability is central to a successful data management strategy. Previously, many organizations focused on building complex firewalls, which became expensive to manage and maintain. Currently, engaging users is also important in preventing third parties from learning much from the data. Statistical disclosure limitation (SDL) provides several methods for maintaining data privacy. SDL aims to prevent the disclosure of data to third parties while sacrificing as little of the data’s validity as possible (Abowd & Schmutte, 2019). SDL methods that can be applied to this case include de-identification, suppression, and coarsening.
First, de-identification removes variables that are not needed for data analysis or processing. These variables are direct identifiers, such as names, geographical information, and dates of birth. If part of the analysis requires these variables, the identifiers may have to be kept, in which case the data provider team must apply appropriate security measures before forwarding the data to the research team. Second, suppression is a fundamental element of SDL because it is governed by rules. These rules include modeling the sensitivity of each data item, withholding data that carries a significant disclosure risk, and withholding data with sensitive items that can be linked back to real records. Lastly, coarsening maps attributes that could serve as identifiers into broader categories with less detail (Abowd & Schmutte, 2019). It applies to quasi-identifiers and helps prevent re-identification of the data. Coarsening increases the number of records that match any given combination of values, making it difficult to single out a particular record.
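The three SDL methods described above can be sketched in a few lines of Python. This is a minimal illustration only: the record fields (name, birth_year, city, sentiment), the decade-wide bands, and the minimum group size of two are assumptions made for the example and do not come from Abowd and Schmutte (2019).

```python
from collections import Counter

# Hypothetical marketing records; all field names and values are invented.
RECORDS = [
    {"name": "A. Jones", "birth_year": 1988, "city": "Leeds", "sentiment": "positive"},
    {"name": "B. Smith", "birth_year": 1986, "city": "Leeds", "sentiment": "negative"},
    {"name": "C. Patel", "birth_year": 1991, "city": "York", "sentiment": "positive"},
    {"name": "D. Novak", "birth_year": 1955, "city": "Hull", "sentiment": "neutral"},
]

DIRECT_IDENTIFIERS = {"name"}  # assumed not to be needed by the analysis


def de_identify(record):
    """De-identification: drop direct identifiers the analysis does not need."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def coarsen(record):
    """Coarsening: replace the exact birth year with a broader decade band."""
    out = dict(record)
    decade = (out.pop("birth_year") // 10) * 10
    out["birth_decade"] = f"{decade}s"
    return out


def suppress(records, keys, min_count):
    """Suppression: withhold records whose quasi-identifier combination is rare."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return [r for r in records if counts[tuple(r[k] for k in keys)] >= min_count]


def protect(records, min_count=2):
    """Apply the three steps in sequence before handing data to analysts."""
    cleaned = [coarsen(de_identify(r)) for r in records]
    return suppress(cleaned, keys=("birth_decade", "city"), min_count=min_count)


if __name__ == "__main__":
    for row in protect(RECORDS):
        print(row)
```

In this sketch, only the two Leeds records survive, because each of the other two is the sole record in its decade and city combination. A real release rule would weigh the information lost through suppression against the disclosure risk.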
Privacy Techniques
Various privacy techniques can enhance data privacy in social media monitoring and analysis. These techniques are based on making the data anonymous. They include k-anonymity, l-diversity, randomization, data distribution, and cryptographic techniques. First, k-anonymity modifies the data before it is submitted for analytics so that every record is indistinguishable from at least k - 1 other records (Rao et al., 2018). This is done through generalization, which makes it difficult to identify the exact group or person with a specified feature in the data. These privacy techniques are essential because they prevent large-scale data loss.
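As a rough illustration of how generalization produces k-anonymity, the sketch below measures k before and after exact ages are replaced with ten-year bands. The records, the choice of age and region as quasi-identifiers, and the band width are all hypothetical.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Replace an exact age with a band such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def k_of(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values for every quasi-identifier."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Hypothetical records; the opinion column is the sensitive attribute.
RAW = [
    {"age": 23, "region": "North", "opinion": "likes brand"},
    {"age": 27, "region": "North", "opinion": "dislikes brand"},
    {"age": 34, "region": "South", "opinion": "likes brand"},
    {"age": 38, "region": "South", "opinion": "neutral"},
]

# Generalize the age column only; the other columns are left unchanged.
GENERALIZED = [{**r, "age": generalize_age(r["age"])} for r in RAW]

if __name__ == "__main__":
    print("k before generalization:", k_of(RAW, ("age", "region")))
    print("k after generalization:", k_of(GENERALIZED, ("age", "region")))
```

Before generalization every record is unique on age and region, so k is 1. After generalization each record shares its age band and region with one other record, so the table is 2-anonymous.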
Second, l-diversity is essential in handling homogeneity attacks. In this method, the data is grouped into equivalence classes, each of which must contain at least l distinct values of the sensitive attribute, making the data less prone to attack. This makes it difficult for third parties or attackers to infer sensitive values from the data. Third, randomization adds noise to the data, making it difficult to extract information that would identify a record. The noise is typically drawn from a probability distribution, and the technique is applied in areas such as sentiment analysis and surveys (Rao et al., 2018). This method is easy to use because it does not require information about other records, so it can be applied during data collection and pre-processing. It also carries no anonymization overhead, although its time complexity can limit its use on large datasets.
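The two ideas in this paragraph can be sketched as follows. The first function measures l for a table that has already been generalized. The second pair uses randomized response, which is only one simple way of adding noise at collection time and is chosen here for illustration, not because Rao et al. (2018) prescribe it. The records and the truth probability of 0.5 are assumptions.

```python
import random
from collections import defaultdict


def l_diversity(records, quasi_identifiers, sensitive):
    """Return l: the smallest number of distinct sensitive values
    found in any equivalence class."""
    classes = defaultdict(set)
    for r in records:
        classes[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return min(len(values) for values in classes.values())


def randomized_response(truth, p_truth, rng):
    """Report the true yes/no answer with probability p_truth,
    otherwise report the outcome of a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(responses, p_truth):
    """Recover the population rate of 'yes' from the noisy responses."""
    observed = sum(responses) / len(responses)
    return (observed - (1 - p_truth) * 0.5) / p_truth


# Hypothetical 2-anonymous table. The first class is homogeneous, so anyone
# known to be aged 20-29 in the North is revealed to hold a negative opinion.
TABLE = [
    {"age": "20-29", "region": "North", "opinion": "negative"},
    {"age": "20-29", "region": "North", "opinion": "negative"},
    {"age": "30-39", "region": "South", "opinion": "positive"},
    {"age": "30-39", "region": "South", "opinion": "negative"},
]

if __name__ == "__main__":
    print("l =", l_diversity(TABLE, ("age", "region"), "opinion"))
    rng = random.Random(0)  # fixed seed so the run is repeatable
    truths = [i < 6000 for i in range(20000)]  # 30% true 'yes' answers
    noisy = [randomized_response(t, 0.5, rng) for t in truths]
    print("estimated rate:", round(estimate_true_rate(noisy, 0.5), 3))
```

The table is 2-anonymous but only 1-diverse, which is exactly the situation a homogeneity attack exploits. In the randomized-response part, no individual answer can be trusted, yet the aggregate rate can still be estimated, and each answer is perturbed without reference to any other record.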
Fourth, in the data distribution technique, the data is distributed across various sites, either horizontally or vertically. In horizontal distribution, the records are spread across sites that all hold the same attributes. In vertical distribution, the attributes describing a person are split across sites held by distinct organizations. Lastly, cryptographic techniques are also crucial in preventing data privacy breaches. In this method, the data is encrypted before being released to the analyst (Rao et al., 2018). However, encrypting a large volume of data with conventional techniques is very difficult, so encryption is best applied during data collection. Differential privacy techniques are applied where aggregate computations are performed on the data without sharing the underlying records.
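The difference between horizontal and vertical distribution can be shown with a small sketch. The rows, the user_id join key, and the split into two sites are hypothetical and serve only to illustrate the two layouts.

```python
# Hypothetical rows; user_id is an assumed join key for the vertical split.
ROWS = [
    {"user_id": 1, "age_band": "20-29", "region": "North", "sentiment": "positive"},
    {"user_id": 2, "age_band": "30-39", "region": "South", "sentiment": "negative"},
    {"user_id": 3, "age_band": "20-29", "region": "South", "sentiment": "neutral"},
    {"user_id": 4, "age_band": "40-49", "region": "North", "sentiment": "positive"},
]


def horizontal_split(rows, n_sites):
    """Horizontal distribution: each site holds different rows,
    but every row keeps the full set of attributes."""
    return [rows[i::n_sites] for i in range(n_sites)]


def vertical_split(rows, key, attribute_groups):
    """Vertical distribution: each site holds every row,
    but only its own subset of attributes plus the join key."""
    return [
        [{key: r[key], **{a: r[a] for a in attrs}} for r in rows]
        for attrs in attribute_groups
    ]


if __name__ == "__main__":
    for site in horizontal_split(ROWS, 2):
        print("horizontal site:", site)
    groups = [("age_band", "region"), ("sentiment",)]
    for site in vertical_split(ROWS, "user_id", groups):
        print("vertical site:", site)
```

In the vertical layout, the site holding sentiment never sees age band or region, so neither organization alone can link an opinion to the quasi-identifiers.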
Encryption Techniques
Various encryption techniques can protect the data from a privacy breach. These include the Triple Data Encryption Standard (3DES), the Advanced Encryption Standard (AES), the asymmetric RSA algorithm, and Twofish. 3DES applies the DES cipher, which uses 56-bit keys, to the data three times. However, this method is slower than other encryption methods (Lewis & Swamidurai, 2018). Additionally, it uses a short 64-bit block length, which makes it prone to leaking information when large volumes of data are encrypted.
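To see why a short block length matters, the rough calculation below applies the birthday bound, a standard rule of thumb stating that repeated ciphertext blocks become likely after about 2^(n/2) blocks are encrypted under one key with an n-bit block cipher. The function name is hypothetical, and the result is an order-of-magnitude guide, not a precise security limit.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def birthday_bound_bytes(block_bits):
    """Approximate volume of data, in bytes, after which repeated ciphertext
    blocks become likely under a single key (about 2**(n/2) blocks)."""
    blocks = 2 ** (block_bits // 2)
    return blocks * (block_bits // 8)


# Nominal 3DES key length when three independent 56-bit DES keys are used.
NOMINAL_3DES_KEY_BITS = 3 * 56

if __name__ == "__main__":
    print("3DES nominal key bits:", NOMINAL_3DES_KEY_BITS)
    print("64-bit block bound (GiB):", birthday_bound_bytes(64) // GIB)
    print("128-bit block bound (GiB):", birthday_bound_bytes(128) // GIB)
```

For a 64-bit block the bound is roughly 32 GiB, a volume that big data backups can easily exceed, whereas for a 128-bit block cipher such as AES it is 2^68 bytes, far beyond practical data volumes.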
AES is among the most widely used encryption methods that are considered safe. It is a block cipher that encrypts information in fixed-size blocks, one block at a time. Because the same key encrypts and decrypts, access to the encrypted data depends on sharing the key securely. RSA uses public-key cryptography, which allows information to be shared over insecure channels (Lewis & Swamidurai, 2018). Its security rests on large integers, with key sizes such as 1024 and 2048 bits. Twofish is the successor to Blowfish and uses a block size of 128 bits with keys of up to 256 bits. It is notable because it runs efficiently on small central processing units (CPUs) as well as in hardware. Twofish is unpatented and free to use, making it accessible to anyone.
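The public-key idea behind RSA can be illustrated with a toy example. The sketch below uses the well-known textbook primes 61 and 53, which are far too small to be secure, and omits the padding that real systems require. It is meant only to show that what the public key encrypts, only the private key can decrypt. A production system would rely on a vetted cryptographic library and a key of at least 2048 bits.

```python
# Toy "textbook" RSA for illustration only. Requires Python 3.8+ for pow(e, -1, m).

P, Q = 61, 53             # tiny demonstration primes (insecure by design)
N = P * Q                 # public modulus: 3233
PHI = (P - 1) * (Q - 1)   # Euler's totient of N: 3120
E = 17                    # public exponent, coprime with PHI
D = pow(E, -1, PHI)       # private exponent: modular inverse of E mod PHI


def encrypt(message, public_key):
    """Encrypt an integer smaller than the modulus with the public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """Decrypt with the private key, which never leaves its owner."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


PUBLIC_KEY = (E, N)    # can be published over an insecure channel
PRIVATE_KEY = (D, N)   # kept secret by the data owner

if __name__ == "__main__":
    secret = 65
    sealed = encrypt(secret, PUBLIC_KEY)
    print("ciphertext:", sealed)
    print("recovered:", decrypt(sealed, PRIVATE_KEY))
```

Anyone may hold the public key and encrypt data for the marketing department, but only the holder of the private exponent can read it, which is why the key itself can travel over an insecure channel.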
Conclusion
In conclusion, social media monitoring and analysis will incorporate structured, unstructured, repetitive, and non-repetitive data. However, various privacy issues affect the use of such data: surveillance, disclosure, discrimination, and personal abuse and embarrassment. The marketing department must balance data privacy and data usability to protect the organization from data breaches. Privacy techniques are crucial in protecting the system from breaches, and they include k-anonymity, l-diversity, randomization, data distribution, and cryptographic techniques. The company can implement 3DES, AES, RSA, or Twofish encryption to protect data.
Recommendation
When using the data, the department must apply techniques that prevent a privacy breach. These include k-anonymity, l-diversity, randomization, data distribution, and cryptographic techniques. They will prevent third parties from accessing the parts of the data that identify a specific person or group. They will also promote good relations with the public: no one likes to be surveilled, and if people find out that their data has been leaked, the organization’s performance may suffer.
References
Abowd, J., & Schmutte, I. (2019). An economic analysis of privacy protection and statistical accuracy as social choices. American Economic Review, 109(1), 171-202.
Bazzaz Abkenar, S., Haghi Kashani, M., Mahdipour, E., & Jameii, S. (2021). Big data analytics meets social media: A systematic review of techniques, open issues, and future directions. Telematics and Informatics, 57, 101517.
Lewis, W., & Swamidurai, R. (2018). Backing up big data using encryption techniques. Southeastcon 2018.
Rao, P., Krishna, S., & Kumar, A. (2018). Privacy preservation techniques in big data analytics: A survey. Journal of Big Data, 5(1).
Sivarajah, U., Kamal, M., Irani, Z., & Weerakkody, V. (2017). Critical analysis of big data challenges and analytical methods. Journal of Business Research, 70, 263-286.