Compliance Briefing | AI’s Tipping Point: Attention to Privacy and Ethical Issues

2023-06-19 18:01:47

This article was proofread and translated by ACFE China. Please give advance notice before reprinting.

The topic of Artificial Intelligence (AI) has recently been dominating the headlines, particularly with the emergence of Generative AI-powered chatbots such as OpenAI’s ChatGPT, Google’s Bard, Microsoft’s Bing Chat, Baidu’s ERNIE Bot and Alibaba’s Tongyi Qianwen. These powerful language tools can generate human-like responses and revolutionise the way we communicate and interact with technology. That said, in March 2023, thousands of AI experts, academics and business leaders signed an open letter to “call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4”, pending the development and implementation of a set of shared safety protocols for advanced AI design and development. It is therefore high time we revisited the implications of the use of AI for privacy and ethical values, and set out the relevant considerations for ensuring that AI is developed and used in a responsible manner.

Generative AI: A Game Changer

According to McKinsey, “Generative AI” is generally defined as “algorithms that can be used to create new content, including audio, code, images, text, simulations, and videos”. Earlier forms of AI focus on automation or primarily make decisions by analysing big data, and so may not be as visible to the public. Generative AI, by contrast, has quickly become the talk of the town thanks to its “magical” capabilities to respond to almost any request and to create new and convincingly human content based on prompts, and thanks to its accessibility in the form of chatbots, search engines, and image-generating online platforms.

Generative AI has the revolutionary potential to transform different industries by increasing efficiency and uncovering novel insights. Tech giants have reportedly been exploring Generative AI models and applying them to their productivity software, which could benefit countless businesses downstream. General knowledge AI chatbots based on Large Language Models (LLMs), like ChatGPT, can increase efficiency by assisting with drafting documents, creating personalised content and business ideas, providing insights in response to enquiries, and more. The legal industry is not immune to such transformation. Some law firms have started to use Generative AI to automate and enhance various aspects of legal work, such as contract analysis, due diligence, litigation and regulatory compliance.

With Growth Comes Risks

Examining Generative AI without rose-coloured spectacles, however, reveals that it also presents a myriad of privacy and ethical challenges.

Privacy Risks

AI chatbots based on LLMs are different from the less advanced forms of AI based on supervised machine learning. They leverage deep learning technology to analyse and learn from massive amounts of unstructured data without supervision. The training data often comes from public text extracted from the Internet, which may include sensitive personal data or even just trivial postings online. For instance, the developer of ChatGPT reportedly scraped as many as 300 billion words from the Internet to train ChatGPT. As many AI developers keep their datasets proprietary and disclose few details about how the data was collected, there is a risk that data protection laws, which typically require personal data to be collected in a fair manner and on an informed basis (such as Data Protection Principles (DPP) 1 and 5 of the Personal Data (Privacy) Ordinance (PDPO)), may be circumvented, posing privacy risks.

Outputs of AI chatbots may also generate privacy problems. User conversations may become new training data for the AI models. Users might inadvertently feed sensitive information to the AI systems, and that information is susceptible to misuse beyond the original purpose of collection, thereby contravening the limitation of use principle (DPP3 of the PDPO). An AI chatbot may also produce a response containing personal data whose original context has been removed and/or misinterpreted.

In addition, Generative AI developers may run into challenges concerning the rights of data subjects to access and correct their personal data (DPP6 of the PDPO) and the retention of personal data (DPP2 and section 26 of the PDPO). If outdated and/or inaccurate personal data was part of the AI’s training data and became part of the LLM, accessing, correcting and deleting such data on request could be difficult, if not impossible.

Furthermore, the data security risks of storing a large number of conversations in an AI chatbot’s model and database should not be overlooked. Even without malicious external threats, accidental leakage alone could be damaging. As recently as March 2023, ChatGPT suffered a major data breach that exposed the titles of some users’ conversation histories, along with their names, email addresses and the last four digits of their credit card numbers.

Needless to say, it is crucial to ensure that personal data is protected against unauthorised or accidental access, processing, erasure, loss or use (DPP4 of the PDPO).

Wider Ethical Risks

The “garbage in, garbage out” problem has always been an issue for AI models. In my view, this problem is particularly worrying in AI chatbots, which from time to time confidently provide incorrect yet seemingly plausible information. Experts refer to this phenomenon as “hallucination”. On one occasion I confronted a chatbot by pointing out that the answer it had given to my question was incorrect, and what I got from the chatbot was an instant reply: “Sorry, I made a mistake.” Inaccurate information, such as incorrect medical advice, can also lead to serious unintended consequences for human users.

To further complicate the picture, the ethical risks of discriminatory content or biased output behind the use of Generative AI cannot be overlooked. As a reflection of the real world, the training data for AI models may have embedded elements of bias and prejudice (such as those relating to racial, gender and age discrimination). Such data would be “baked into” AI models, which in turn generate discriminatory content or biased output.

Last, the unavoidable conundrum of developing a general purpose AI model is the risk of exploitation by bad actors. A case in point is “deepfake”, where fake audio, images or videos are synthesised and may be used to spread fake news or harmful propaganda. AI chatbots could also be asked to generate code for malware.

All of these highlight the need for concrete efforts to address the potential misuse of AI and develop effective safeguards to prevent exploitation.

Regulatory Landscape of AI

On the regulatory front, in the Mainland, the “Provisions on the Administration of Deep Synthesis of Internet-based Information Services”, the rules that regulate deep synthesis service providers, operators and users, came into force in January 2023. In April 2023, the Cyberspace Administration of China (CAC) also issued the draft “Measures for the Management of the Services by Generative AI” for public consultation. Among other things, the draft stipulates the harmful content that must not be generated, and requires providers of Generative AI products and services to submit a security assessment to the CAC before launching their services publicly. The providers are also expressly required to comply with the Mainland’s Personal Information Protection Law. On the other hand, the EU is planning to regulate AI, including Generative AI, through the proposed “Artificial Intelligence Act”, which suggests a prescriptive risk-based approach to regulating all AI systems and bans certain high-risk AI systems. Canada is also considering a similar law, namely the “Artificial Intelligence and Data Act”, which is undergoing public consultation. Most recently, the UK Government published a white paper in March 2023 on regulating AI through a principle-based framework, which is considered more pro-innovation and flexible than the EU’s approach. Despite the differences in approach, these initiatives share a common theme: they acknowledge the importance of data protection and ethical considerations. All of the proposed legislation would mandate that AI systems be developed and used in a way that respects individual privacy and data protection rights, and upholds ethical values such as prevention of bias and prejudice, fairness and transparency.

Governments and regulators have also been issuing guidelines on AI and recommending that organisations deploying Generative AI in their operations pay heed to such AI governance and ethics frameworks. My Office, the Office of the Privacy Commissioner for Personal Data, issued the “Guidance on the Ethical Development and Use of Artificial Intelligence” in August 2021 to help organisations develop and use AI systems in a privacy-friendly and ethical manner. The Guidance recommends internationally recognised ethical AI principles covering accountability, human oversight, transparency and interpretability, fairness, data privacy, beneficial AI, and reliability, robustness and security. In September 2021, the Mainland Government also issued the “Guidance on the Ethics of the New Generation AI” (《新一代人工智能伦理规范》), which adopts similar principles such as enhancing human well-being (增进人类福祉), promoting fairness and justice (促进公平公正), and protecting privacy and safety (保护隐私安全). Elsewhere in the Asia Pacific region, Singapore, Japan and South Korea have all published guidelines on the ethical governance of AI.

While global consensus has yet to form on whether AI should be regulated through legislation or other means, and on the extent of such regulation, one thing is for sure: the development and use of AI presents an exciting yet complicated landscape with opportunities waiting to unfold, but the possible harms it may do to data privacy and our ethical values must also be assessed and controlled. All stakeholders, including tech companies and AI developers, should join hands in co-creating a safe and healthy ecosystem to make sure that this transformative technology is used for human good.

Original title: AI’s Tipping Point: A Reminder On The Importance Of Privacy And Ethics

Original link: https://www.hk-lawyer.org/content/ai%E2%80%99s-tipping-point-reminder-importance-privacy-and-ethics

Original source: The Law Society of Hong Kong

Reposted from the ACFE China official account
