Salesforce AI Suit Could Settle, Yet Stall AI Adoption

Source: https://aibusiness.com/responsible-ai/salesforce-ai-suit-could-settle-yet-stall-ai-adoption
Summary:
Another lawsuit over the legality of the data used to train a tech company's AI models has drawn attention. On Oct. 15, novelists Molly Tanzer and Jennifer Gilmore filed a class action suit against Salesforce in U.S. District Court in San Francisco, accusing the CRM giant of using thousands of copyrighted books without authorization to train its xGen large language models.
The case resembles the $1.5 billion settlement reached last month by AI company Anthropic. Michael Bennett, associate vice chancellor for data science and AI strategy at the University of Illinois Chicago, noted that the judge in the Anthropic case drew a clear distinction: training data acquired through legal channels is protected by the fair use doctrine, while illegally obtained content is not.
Industry analyst Kashyap Kompella believes the Salesforce case is likely to end in an out-of-court settlement rather than go to trial. The Anthropic settlement suggests that negotiated resolution is becoming the pragmatic choice for AI companies facing copyright disputes. It signals both that copyright owners have leverage and that training data provenance is now a commercial as well as a legal risk.
Notably, such lawsuits are having knock-on effects. Kompella stressed that enterprise customers need assurance that their AI vendors' data sources are licensed, auditable, and legally defensible. Some vendors are already easing customer concerns with indemnity clauses, pledging to cover damages if disputes arise from the use of infringing content.
As AI adoption accelerates, disputes over the legality of training data may become a new obstacle for the industry. Balancing technological innovation against copyright protection has become a pressing challenge for AI companies.
English source:
The legal filing is similar to the Anthropic lawsuit, which was settled out of court.
The new lawsuit accusing Salesforce of using stolen works to train its xGen large language models is another example of legal action aimed at holding tech vendors responsible for how they train their AI models.
Novelists Molly Tanzer and Jennifer Gilmore filed a class action complaint on Oct. 15 in U.S. District Court in San Francisco, accusing Salesforce of copyright infringement when the CRM and CX giant allegedly used thousands of books to train its xGen series of LLMs.
The authors say in the suit that Salesforce "unlawfully downloaded, stored, copied and used the datasets to develop" the models.
Salesforce is not the first vendor to be accused of taking content from pirated copyrighted books. Last month, generative AI vendor Anthropic agreed to a $1.5 billion settlement after a judge ruled that the AI model maker had used millions of books included in several large pirated datasets to train its AI models.
"[Salesforce's lawsuit] seems to be a very similar situation to the Anthropic situation," said Michael Bennett, associate vice chancellor for data science and AI strategy at the University of Illinois Chicago.
He added that in the Anthropic case, the judge ruled that works acquired legally and used in training the models constituted fair use, while works not acquired legally do not fall under fair use protection. Fair use is the doctrine in copyright law that stipulates that limited use of copyrighted material is permissible to allow free expression.
"The method of acquisition of copyrighted protected works that are used to train a model, that's really where the question sits right now," Bennett said.
However, it is likely that the Salesforce case will settle rather than go to trial, similar to the settlement of the Anthropic lawsuit, said Kashyap Kompella, founder and analyst at RPA2AI.
"The Anthropic settlement suggests that negotiated resolution may be the pragmatic route forward for AI companies," Kompella added. "It signals that copyright owners have leverage and that training data provenance is both a commercial and legal issue."
The lawsuit against Salesforce is not only about fair use. It could also harm Salesforce because it could make customers of the vendor question whether they can trust its models and the data sets it used to train the models, Kompella continued.
"Enterprise clients need assurance that their AI vendor data sources are licensed, auditable, and defensible," Kompella said. "Enterprise clients should satisfy themselves about the provenance and traceability of training data and understand the indemnity clauses that the AI vendors provide."
Some vendors offer indemnity clauses to customers, pledging to compensate them if they are found to have used copyrighted content illegally.
Lawsuits like these could be another barrier to wider AI adoption.
Article link: https://qimuai.cn/?post=1665
All articles on this site are original; please do not use them for any commercial purpose without authorization.