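Big data techniques are now widely used to research and analyze data, yielding innovative inferences and discoveries that benefit society. As data come to be used more widely, however, privacy risks have gradually emerged. To address the conflict between privacy protection and data utility, personal data that have been deidentified or anonymized fall outside the Personal Data Protection Act and may therefore be used beyond the original collection purpose or shared with third parties for a variety of uses. In the age of big data, however, numerous data sources are available for cross-referencing, so neither deidentified nor anonymized data can readily be kept in an irreversible, unrestorable state. Both inevitably carry a risk of reidentification, which can cause harm to privacy and other personality interests, economic loss, and a chilling effect on freedom of expression. Several well-known reidentification incidents have led the effectiveness of deidentification to be increasingly questioned. Others counter that the proportion of individuals actually reidentified is very small and that deidentification remains an effective mechanism. U.S. courts are likewise quite divided on this point. To address the conflict between deidentification and reidentification, the following measures are suggested: (1) Because deidentified data may still be combined with other data and reidentified, the more pragmatic solution is not to eliminate the risk of reidentification entirely but to reduce it to a very low level. Consistent with this notion of risk tolerance, the "reasonable" identification and deidentification standards commonly adopted in European and U.S. legislation likewise do not require that the risk of reidentification be completely eliminated. (2) Deidentification should be carried out on the basis of a reidentification risk assessment, combining proportionate and reasonable technical, administrative, and legal measures to reduce the risk of reidentification. (3) Civil and criminal liability should be imposed to prohibit improper reidentification.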
The concept of "personal data" appears workable as the cornerstone of information privacy law: any data relating to an identified or identifiable natural person triggers the mechanisms of personal data protection. Big data operations derive or infer hidden value from structured and unstructured raw data through novel reuse. Such reuse of personal data, however, is likely to exceed the scope of the original collection purpose and thus violate the principle of purpose limitation. Moreover, the ubiquitous use of personal data creates privacy risk. One solution is therefore to deidentify personal data so that they can be used for further purposes or shared with third parties. In the age of big data, however, deidentified or anonymized data may be combined with other datasets from various sources, so it is unlikely that one can guarantee absolutely that "a person cannot be identified from a dataset." Reidentification can cause harm to privacy, personality, and property interests, as well as a chilling effect on freedom of expression. Following several well-known reidentification cases over the past two decades, the effectiveness of deidentification and anonymization has been increasingly criticized. Some scholars nevertheless maintain that deidentification and anonymization remain effective in protecting privacy because the rate of reidentification is very low. U.S. courts are likewise divided on the effectiveness of these measures. Several solutions to the conflict between deidentification and reidentification are proposed. First, the key is to adopt a reasonable deidentification standard that reduces the risk of reidentification to an insignificant level, rather than attempting to rule it out absolutely. Second, data controllers should evaluate the risk of reidentification and, on that basis, adopt technical, legal, and organizational safeguards in accordance with the principle of proportionality. Finally, statutes should impose civil and criminal liability in order to prohibit improper reidentification.