
At first, I wasn't sure whether I should accept Anthropic's payment for my books. Now I do.




Source: https://www.wired.com/story/anthropic-settlement-books-copyright/

Summary:

The AI company Anthropic has reached a settlement of at least $1.5 billion with authors and publishers over the unauthorized use of writers' works to train its large language model Claude. The case followed a court finding that the company had infringed; under the proposed settlement, authors would receive a minimum of $3,000 per infringed book.

The case exposes the central copyright dispute behind AI development: is it acceptable for tech companies to train their models on copyrighted books? Although US copyright law includes a "fair use" doctrine, authors argue that AI companies have built trillion-dollar businesses on book content without paying creators fairly, and that this is unjust.

Notably, opposition to compensating authors is gaining political traction on "national security" grounds. Some politicians and tech executives claim that requiring AI companies to pay for training data would hinder America's ability to compete with China in AI. Authors strongly reject this argument, contending that tech giants are perfectly capable of building a licensing and payment system like the one that already exists in the music industry.

A deeper crisis is that, as AI becomes pervasive, the public's habit of deep reading is rapidly eroding. Some tech leaders have even openly questioned the value of intellectual property law, and this disregard for books and authors could have far-reaching consequences for how culture is passed on.

(This piece is based on Steven Levy's column; as a council member of the Authors Guild and a potential beneficiary of the settlement, he has disclosed his interest in the outcome.)


English source:

A billion dollars isn’t what it used to be—but it still focuses the mind. At least it did for me when I heard that the AI company Anthropic agreed to an at least $1.5 billion settlement for authors and publishers whose books were used to train an early version of its large language model, Claude. This came after a judge issued a summary judgment that it had pirated the books it used. The proposed agreement—which is still under scrutiny by the wary judge—would reportedly grant authors a minimum $3,000 per book. I’ve written eight and my wife has notched five. We are talking bathroom-renovation dollars here!
Since the settlement is based on pirated books, it doesn’t really address the big issue of whether it’s OK for AI companies to train their models on copyrighted works. But it’s significant that real money is involved. Previously the argument over AI copyright was based on legal, moral, and even political hypotheticals. Now that things are getting real, it’s time to tackle the fundamental issue: Since elite AI depends on book content, is it fair for companies to build trillion-dollar businesses without paying authors?
Legalities aside, I have been struggling with the issue. But now that we’re moving from the courthouse to the checkbook, the film has fallen from my eyes. I deserve those dollars! Paying authors feels like the right thing to do. Despite the powerful forces (including US president Donald Trump) arguing otherwise.
Fine-Print Disclaimer
Before I go further, let me drop a whopper of a disclaimer. As I mentioned, I’m an author myself, and stand to gain or lose from the outcome of this argument. I’m also on the council of the Authors Guild, which is a strong advocate for authors and is suing OpenAI and Microsoft for including authors’ works in their training runs. (Because I cover tech companies, I abstain on votes involving litigation with those firms.) Obviously, I’m speaking for myself today.
In the past, I’ve been a secret outlier on the council, genuinely torn on the issue of whether companies have the right to train their models on legally purchased books. The argument that humanity is building a vast compendium of human knowledge genuinely resonates with me. When I interviewed the artist Grimes in 2023, she expressed enthusiasm over being a contributor to this experiment: “Oh, sick, I might get to live forever!” she said. That vibed with me, too. Spreading my consciousness widely is a big reason I love what I do.
But embedding a book inside a large language model built by a giant corporation is something different. Keep in mind that books are arguably the most valuable corpus that an AI model can ingest. Their length and coherency are unique tutors of human thought. The subjects they cover are vast and comprehensive. They are much more reliable than social media and provide a deeper understanding than news articles. I would venture to say that without books, large language models would be immeasurably weaker.
So one might argue that OpenAI, Google, Meta, Anthropic, and the rest should pay handsomely for access to books. Late last month, at that shameful White House tech dinner, CEOs took turns impressing Donald Trump with the insane sums they were allegedly investing in US-based data centers to meet AI’s computation demands. Apple promised $600 billion, and Meta said it would match that amount. OpenAI is part of a $500 billion joint venture called Stargate. Compared to those numbers, the $1.5 billion that Anthropic agreed to distribute to authors and publishers as part of the infringement settlement doesn’t sound so impressive.
Unfair Use
Nonetheless, it could well be that the law is on the side of those companies. Copyright law allows for something called “fair use,” which permits the uncompensated exploitation of books and articles based on several criteria, one of which is whether the use is “transformational”—meaning that it builds on the book’s content in an innovative manner that doesn’t compete with the original product. The judge in charge of the Anthropic infringement case has ruled that using legally obtained books in training is indeed protected by fair use. Determining this is an awkward exercise, since we are dealing with legal yardsticks drawn before the internet—let alone AI.
Obviously, there needs to be a solution based on contemporary circumstances. The White House’s AI Action Plan announced this May didn’t offer one. But in his remarks about the plan, Trump weighed in on the issue. In his view, authors shouldn’t be paid—because it’s too hard to set up a system that would pay them fairly. “You can’t be expected to have a successful AI program when every single article, book, or anything else that you’ve read or studied, you’re supposed to pay for,” Trump said. “We appreciate that, but just can't do it—because it's not doable.” (An administration source told me this week that the statement “sets the tone” for official policy.)
The “too hard to implement” argument is absurd. The overlords of AI constantly boast that their products are going to solve the deep mysteries of the universe. Surely they can handle what is essentially a bookkeeping challenge. We’ve managed to pull off the more difficult task of accounting for creator rights in the music industry, where an elaborate system of tracking helps identify when and where songs are played. “All we need to do is put in place a collective licensing system, and we have come up with several different proposals for this,” says Authors Guild CEO Mary Rasenberger. “The AI companies aren’t biting because it might undermine their fair use argument.”
Nonetheless, that bogus argument about the difficulty of paying authors is a critical part of a new justification for not paying authors. This argument stipulates that the survival of the United States depends on beating China in AI. Since book content is crucial in building elite AI, paying authors is an unaffordable distraction. It’s a national security issue! When the judge in the Anthropic case ruled that training LLMs with books was fair use, AI czar David Sacks applauded the decision. “China is going to train on all the data regardless, so without fair use, the US would lose the AI race,” he said. So now, we’re supposed to emulate China in protecting creative rights?
National security is also the underpinning of an unsolicited “solution” to the problem offered by Stewart Baker, a former general counsel of the NSA. He recently posted that Trump should invoke a wartime provision called the Defense Production Act, which allows the government to take over businesses in times of war, to justify the unapproved use of copyrighted books to train AI models. Authors would be entitled to no more than the royalties they might receive from a single book. “If the companies bought the book there would be no damage [to the author],” he told me. Explain that to someone like Robert Caro, who spent years researching the US Senate for Book 3 of his Lyndon Johnson biography. His magisterial account of that institution likely informs what we see in LLMs, and millions of paid users benefit. Enjoy that pumpkin latte bought with your compensation, Bob!
Of course, even if authors do get four- or even five-figure sums for the use of their books to train AI, that would not address the most serious problem of all—the fact that people aren’t reading. In my years covering the tech business I am often shocked at the disregard some leaders have for books and authors. Last April, Jack Dorsey tweeted “Delete all IP law.” Elon Musk replied “I agree.” When I did my book about Google, Sergey Brin told me how antiquated an idea it was to spend so much time writing a long narrative when questions about the company might best be answered through a search engine. Now, I imagine, he and his colleagues would say that an appetite for deep learning can be more than satisfied by responses to AI prompts. I defy you to curl up with one.
This is an edition of Steven Levy’s Backchannel newsletter. Read previous newsletters here.
