
算法如何推高价格:博弈论视角

qimuai 发布 · 一手编译



内容来源:https://www.quantamagazine.org/the-game-theory-of-how-algorithms-can-drive-up-prices-20251022/

内容总结:

【算法定价暗藏猫腻?博弈论揭示新型市场垄断隐忧】

在传统商业竞争中,商家密谋抬价属于明令禁止的垄断行为。然而最新研究表明,当企业使用自主学习算法定价时,即便没有人为串通,市场也可能自发形成高价垄断局面。

2019年一项开创性研究首次揭示,两个简单算法在模拟市场中经过多轮博弈后,会自发形成"价格战威慑"策略——当对方降价时立即实施更大幅度降价报复,最终导致价格维持在高位。这种"默示共谋"现象引发了监管难题。正如宾夕法尼亚大学计算机科学家亚伦·罗斯所言:"算法显然不会相约喝酒谈价。"

更令人担忧的是,最新研究发现,即便使用理论上能保证公平定价的"无交换悔恨"算法,当遭遇采用固定概率定价的"非响应式"策略时,市场价格仍会持续走高。研究参与者娜塔莉·科利纳指出:"这种表面合理的定价策略,实际仍会导致消费者承担高价。"

目前学界对解决方案存在分歧。西北大学杰森·哈特兰主张强制推行"无交换悔恨"算法,并通过技术手段验证算法属性。但莱斯大学经济学家马莱什·派指出监管困境:"缺乏明确威胁证据时,监管机构很难认定定价违法。"

随着算法定价日益普及,这场关乎市场公平的博弈正在进入全新维度。正如派所强调:"这是我们时代亟待解决的重要课题。"监管机构或将面临传统反垄断工具失效的挑战,亟需建立适应数字时代的新型监管框架。

中文翻译:

算法如何推高价格:一场博弈论解析

想象一个小镇上有两家小商品店。顾客总是青睐低价商品,因此两家店主必须通过竞争来制定最低价格。由于对微薄利润不满,某天深夜他们在一家烟雾缭绕的酒馆密谋:如果共同提价而非相互竞争,双方都能获得更高收益。但这种被称为"合谋"的蓄意价格操纵行为早已被法律禁止。两位店主最终决定不冒险违法,居民们得以继续享受低价商品。

一个多世纪以来,美国法律始终遵循这一基本原则:禁止幕后交易以维持公平价格。但在当今时代,情况已变得复杂。在经济各领域,销售方日益依赖名为"学习算法"的计算机程序,这些程序会根据市场状态的新数据持续调整价格。虽然它们通常比驱动现代人工智能的"深度学习"算法简单,却仍可能产生意料之外的行为。

监管机构如何确保算法制定公平价格?传统监管手段已力不从心,因其依赖发现明确合谋证据。"算法显然不会相约喝酒谈价,"宾夕法尼亚大学计算机科学家亚伦·罗斯指出。

然而2019年一篇被广泛引用的论文表明,即使未被预设指令,算法也能学会隐性合谋。研究团队将两个简易学习算法置于模拟市场中对抗,任其探索提升利润的不同策略。随着时间的推移,每个算法都通过试错学会在对方降价时实施报复——以不成比例的幅度大幅削价。最终结果是在相互价格战威胁下形成的高价格局。
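这种"以价格战威胁支撑高价"的逻辑,可以用一个极简的数值模型示意。以下 Python 草稿为本文自拟的说明性例子,价格档位与需求规则均为假设,并非原论文的实际模拟:顾客全部流向低价一方,单次降价能抢到短期利润,但一旦引发对方报复性降价,长期收益反而远低于共同维持高价。

```python
def profits(p1, p2):
    """极简双寡头模型:顾客全部流向低价方,价格相同时平分需求(需求总量设为 1)。"""
    if p1 < p2:
        return p1, 0.0
    if p2 < p1:
        return 0.0, p2
    return p1 / 2, p1 / 2

HIGH, LOW = 10, 1  # 假设的"高价"档与"报复性低价"档

# 情形一:双方始终维持高价,每回合各得 5.0
collusive = profits(HIGH, HIGH)[0]

# 情形二:商家 1 第 1 回合降价到 9 抢走全部需求(得 9.0),
# 此后对手以低价报复,双方陷入价格战,每回合只得 0.5
deviation_round = profits(9, HIGH)[0]
war_round = profits(LOW, LOW)[0]

def total(rounds, first, rest):
    """第 1 回合利润为 first,其余回合均为 rest 时的总利润。"""
    return first + rest * (rounds - 1)

T = 10
print(total(T, deviation_round, war_round))  # 偏离高价:9 + 0.5*9 = 13.5
print(collusive * T)                         # 维持高价:5*10 = 50.0
```

在这个玩具模型里,10 回合内偏离高价的总利润(13.5)远低于共同维持高价的总利润(50.0)——这正是"相互威胁支撑高价"背后的算术。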

这种隐性威胁也支撑着许多人类合谋案例。那么若要保障公平价格,为何不直接要求商家使用本质上无法传递威胁的算法?

在近期论文中,罗斯与四位计算机科学家揭示了此举的局限性。他们证实即便是看似良性的利润优化算法,有时仍会导致消费者利益受损。"高价现象仍可能以看似合理的方式出现,"与罗斯合作的新研究合著者、研究生娜塔莉·科利纳表示。

学界对这项发现的解读尚未达成共识——关键分歧在于对"合理"的界定。但这揭示了算法定价问题的精妙之处,以及监管可能面临的挑战。

"缺乏威胁或协议的明确证据,监管机构很难断言'这些价格存在问题',"莱斯大学经济学家马莱什·派表示,"这正是我认为此项研究重要的原因之一。"

无悔之境

这篇新论文通过博弈论视角审视算法定价,这个横跨经济学与计算机科学的交叉学科专注于分析策略竞争的数学原理。这是在受控环境中探究定价算法失效机制的一种途径。

"我们试图在实验室环境中重构合谋现象,"宾夕法尼亚大学经济学家约瑟夫·哈林顿解释道,"成功复现后,再寻求破解合谋的方法。"他曾撰写关于算法合谋监管的重要综述论文,但未参与这项新研究。

要理解核心概念,不妨从简单的石头剪刀布游戏入手。在此语境下,学习算法可以是玩家根据过往回合数据制定当前策略的任何方法。玩家可能在游戏中尝试不同策略,但若策略得当,最终会达到博弈论称为"均衡"的状态。在均衡中,每位玩家的策略都是对对手策略的最佳回应,因此无人愿意改变策略。

石头剪刀布的完美策略很简单:每回合随机出招,三种选择概率均等。当对手采取非常规策略时,学习算法便显现优势。此时基于历史回合调整策略,比完全随机出招更能提升胜率。

例如经过多轮观察,你发现身为地质学家的对手出石头的概率超过50%。若你每回合都坚持出布,本可赢得更多回合。博弈论将这种痛苦的事后领悟称为"悔恨"(regret)。

研究者已设计出能确保"零悔恨"的简易学习算法。稍复杂的"无交换悔恨"算法更进一步保证:无论对手采取何种策略,你都无法通过系统性地替换某种出招(比如将所有剪刀替换为布)来改善结果。2000年博弈论学者证明,在任何游戏中让两个无交换悔恨算法对抗,它们终将达成特定均衡——这种均衡在单回合博弈中本是最优解。这一特性颇具吸引力,因为单回合博弈比多回合博弈简单得多。尤其重要的是,威胁机制在此失效,玩家无法实施后续报复。
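文中的"悔恨"与"交换悔恨"都可以直接从对局历史中算出来。下面给出一个说明性的 Python 草稿(计分规则取常规的胜 +1、负 −1、平 0;函数名与示例数据均为本文自拟,仅为演示定义,并非任何论文中的实现):

```python
ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, opp):
    """石头剪刀布计分:胜 +1,负 -1,平 0。"""
    if BEATS[mine] == opp:
        return 1
    if BEATS[opp] == mine:
        return -1
    return 0

def external_regret(my_moves, opp_moves):
    """普通悔恨:事后改成某个固定动作,最多还能多得多少分。"""
    actual = sum(payoff(m, o) for m, o in zip(my_moves, opp_moves))
    best_fixed = max(sum(payoff(a, o) for o in opp_moves) for a in ACTIONS)
    return best_fixed - actual

def swap_regret(my_moves, opp_moves):
    """交换悔恨:事后把每种动作各自替换为对它最优的动作,最多还能多得多少分。"""
    actual = sum(payoff(m, o) for m, o in zip(my_moves, opp_moves))
    best_swapped = 0
    for a in ACTIONS:
        rounds_a = [o for m, o in zip(my_moves, opp_moves) if m == a]
        best_swapped += max(sum(payoff(b, o) for o in rounds_a) for b in ACTIONS)
    return best_swapped - actual

# 示例:对手(地质学家)出石头的频率超过一半
opp = ["rock"] * 6 + ["scissors"] * 2 + ["paper"] * 2
me = ["rock", "paper", "scissors", "rock", "paper",
      "scissors", "rock", "paper", "scissors", "rock"]
print(external_regret(me, opp), swap_regret(me, opp))
```

由于"把所有动作都换成同一个固定动作"只是交换方案的特例,交换悔恨永远不小于普通悔恨;"无交换悔恨"算法保证的正是前者随时间趋近于零。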

在2024年的论文中,西北大学计算机科学家杰森·哈特兰与两位研究生将上述经典结论应用于竞争市场模型——玩家每回合都可重新定价。在此情境下,研究结果表明当两个无交换悔恨算法达到均衡时,总会形成竞争性价格。合谋成为不可能。

然而无交换悔恨算法并非线上市场唯一的定价策略。当无交换悔恨算法遭遇其他看似良性的对手时,会发生什么?

定价失灵

博弈论指出,对抗无交换悔恨算法的最佳策略很简单:为每个可能的动作设定固定概率,然后每回合随机选择,完全无视对手行为。这种"无响应"策略的最佳概率配置取决于具体游戏规则。

2024年夏季,科利纳与同事埃什瓦尔·阿鲁纳查莱斯瓦兰开始研究双人定价游戏中的最优概率配置。他们发现,最佳策略会以惊人的高概率选择极高价格,同时给一系列较低价格分配较小概率。对抗无交换悔恨算法时,这种奇特策略能实现利润最大化。"这完全出乎我的意料,"阿鲁纳查莱斯瓦兰坦言。

无响应策略表面上看人畜无害。它们无法传递威胁,因为根本不对对手的行为作出反应。但它们能诱导学习算法抬高定价,再通过偶尔低价抢单获利。

最初科利纳和阿鲁纳查莱斯瓦兰认为这种模拟场景与现实无关。使用无交换悔恨算法的玩家发现对手以其为代价获利后,理应会切换算法。

但随着深入研究并与罗斯等同事讨论,他们意识到直觉有误。场景中的双方已处于均衡状态。只要无人切换算法,双方利润几乎相当且都达到峰值。此时无人愿意改变策略,消费者则被迫承受高价。更重要的是,具体概率值并非关键——多种概率选择在与无交换悔恨算法对抗时都会推高价格。这看似合谋的结果,却不见合谋踪迹。

愚者得利

监管者该如何应对?罗斯坦言尚无答案。禁用无交换悔恨算法并非良策:若全员使用该算法,价格自会下降。但对亚马逊等电商平台的卖家而言,简单的无响应策略可能是自然选择,即便存在悔恨风险。

"产生悔恨的一种方式就是行事愚笨,"罗斯指出,"而从历史上看,愚笨并不违法。"

哈特兰认为算法合谋问题存在简明解:除博弈论学者长期推崇的无交换悔恨算法外,禁用所有定价算法。这可能存在可行的实施途径:在2024年的研究中,哈特兰团队设计了不查看代码即可验证算法是否具备无交换悔恨特性的方法。

哈特兰承认,当无交换悔恨算法与人类竞争时,其倾向方案不能杜绝所有不良结果。但他强调罗斯论文中的场景不属于算法合谋范畴。

"合谋本质上是双向行为,"他解释道,"必须存在单个参与者仅凭自身行动就能避免合谋的可能性。"

无论如何,这项新研究仍留下诸多待解之谜,关乎算法定价在现实世界中如何失控。

"我们的认知远未达到预期,"派总结道,"这是时代赋予我们的重要课题。"

英文来源:

The Game Theory of How Algorithms Can Drive Up Prices
Introduction
Imagine a town with two widget merchants. Customers prefer cheaper widgets, so the merchants must compete to set the lowest price. Unhappy with their meager profits, they meet one night in a smoke-filled tavern to discuss a secret plan: If they raise prices together instead of competing, they can both make more money. But that kind of intentional price-fixing, called collusion, has long been illegal. The widget merchants decide not to risk it, and everyone else gets to enjoy cheap widgets.
For well over a century, U.S. law has followed this basic template: Ban those backroom deals, and fair prices should be maintained. These days, it’s not so simple. Across broad swaths of the economy, sellers increasingly rely on computer programs called learning algorithms, which repeatedly adjust prices in response to new data about the state of the market. These are often much simpler than the “deep learning” algorithms that power modern artificial intelligence, but they can still be prone to unexpected behavior.
So how can regulators ensure that algorithms set fair prices? Their traditional approach won’t work, as it relies on finding explicit collusion. “The algorithms definitely are not having drinks with each other,” said Aaron Roth, a computer scientist at the University of Pennsylvania.
Yet a widely cited 2019 paper showed that algorithms could learn to collude tacitly, even when they weren’t programmed to do so. A team of researchers pitted two copies of a simple learning algorithm against each other in a simulated market, then let them explore different strategies for increasing their profits. Over time, each algorithm learned through trial and error to retaliate when the other cut prices — dropping its own price by some huge, disproportionate amount. The end result was high prices, backed up by mutual threat of a price war.
Implicit threats like this also underpin many cases of human collusion. So if you want to guarantee fair prices, why not just require sellers to use algorithms that are inherently incapable of expressing threats?
In a recent paper, Roth and four other computer scientists showed why this may not be enough. They proved that even seemingly benign algorithms that optimize for their own profit can sometimes yield bad outcomes for buyers. “You can still get high prices in ways that kind of look reasonable from the outside,” said Natalie Collina, a graduate student working with Roth who co-authored the new study.
Researchers don’t all agree on the implications of the finding — a lot hinges on how you define “reasonable.” But it reveals how subtle the questions around algorithmic pricing can get, and how hard it may be to regulate.
“Without some notion of a threat or an agreement, it’s very hard for a regulator to come in and say, ‘These prices feel wrong,’” said Mallesh Pai, an economist at Rice University. “That’s one reason why I think this paper is important.”
No Regrets
The recent paper studies algorithmic pricing through the lens of game theory, an interdisciplinary field at the border of economics and computer science that analyzes the mathematics of strategic competitions. It’s one way to explore the failures of pricing algorithms in a controlled setting.
“What we’re trying to do is create collusion in the lab,” said Joseph Harrington, a University of Pennsylvania economist who wrote an influential review paper on regulating algorithmic collusion and was not involved in the new research. “Once we do so, we want to figure out how to destroy collusion.”
To understand the key ideas, it helps to start with the simple game of rock-paper-scissors. A learning algorithm, in this context, can be any strategy that a player uses to choose a move in each round based on data from previous rounds. Players might try out different strategies over the course of the game. But if they’re playing well, they’ll ultimately converge to a state that game theorists call equilibrium. In equilibrium, each player’s strategy is the best possible response to the other’s strategy, so neither player has an incentive to change.
In rock-paper-scissors, the ideal strategy is simple: You should play a random move each round, choosing all three possibilities equally often. Learning algorithms shine if one player takes a different approach. In that case, choosing moves based on previous rounds can help the other player win more often than if they just played randomly.
Suppose, for instance, that after many rounds you realize that your opponent, a geologist, chose rock more than 50% of the time. If you’d played paper every round, you would have won more often. Game theorists refer to this painful realization as regret.
Researchers have devised simple learning algorithms that are always guaranteed to leave you with zero regret. Slightly more sophisticated learning algorithms called “no-swap-regret” algorithms also guarantee that whatever your opponent did, you couldn’t have done better by swapping all instances of any move with any other move (say, by playing paper every time you actually played scissors). In 2000, game theorists proved that if you pit two no-swap-regret algorithms against each other in any game, they’ll end up in a specific kind of equilibrium — one that would be the optimal equilibrium if they only played a single round. That’s an attractive property, because single-round games are much simpler than multi-round ones. In particular, threats don’t work because players can’t follow through.
In a 2024 paper, Jason Hartline, a computer scientist at Northwestern University, and two graduate students translated the classic results from the 2000 paper to a model of a competitive market, where players can set new prices every round. In that context, the results implied that dueling no-swap-regret algorithms would always end up with competitive prices when they reached equilibrium. Collusion was impossible.
However, no-swap-regret algorithms aren’t the only pricing game strategies in the world of online marketplaces. So what happens when a no-swap-regret algorithm faces a different benign-looking opponent?
The Price Is Wrong
According to game theorists, the best strategy to play against a no-swap-regret algorithm is simple: Start with a specific probability for each possible move, and then choose one move at random every round, no matter what your opponent does. The ideal assignment of probabilities for this “nonresponsive” approach depends on the specific game you’re playing.
In the summer of 2024, Collina and her colleague Eshwar Arunachaleswaran set out to find those optimal probabilities for a two-player pricing game. They found that the best strategy assigned strikingly high probabilities to very high prices, along with lower probabilities for a wide range of lower prices. If you’re playing against a no-swap-regret algorithm, this strange strategy will maximize your profit. “To me, it was a complete surprise,” Arunachaleswaran said.
Nonresponsive strategies look superficially innocuous. They can’t convey threats, because they don’t react to their opponents’ moves at all. But they can coax learning algorithms to raise their prices, and then reap profits by occasionally undercutting their competitors.
At first, Collina and Arunachaleswaran thought that this artificial scenario wasn’t relevant to the real world. Surely the player using the no-swap-regret algorithm would switch to a different algorithm after realizing that their competitor was profiting at their expense.
But as they studied the problem further and discussed it with Roth and two other colleagues, they realized their intuition was wrong. The two players in their scenario were already in a state of equilibrium. Their profits were nearly equal, and both were as high as possible as long as neither player switched to a different algorithm. Neither player would have an incentive to change strategy, so buyers would be stuck with high prices. What’s more, the precise probabilities weren’t that important. Many different choices led to high prices when pitted against a no-swap-regret algorithm. It’s an outcome you’d expect from collusion, but without any collusive behavior in sight.
It Pays To Be Dumb
So, what can regulators do? Roth admits he doesn’t have an answer. It wouldn’t make sense to ban no-swap-regret algorithms: If everyone uses one, prices will fall. But a simple nonresponsive strategy might be a natural choice for a seller on an online marketplace like Amazon, even if it carries the risk of regret.
“One way to have regret is just to be kind of dumb,” Roth said. “Historically, that hasn’t been illegal.”
As Hartline sees it, the problem of algorithmic collusion has a simple solution: Ban all pricing algorithms except the no-swap-regret algorithms that game theorists have long favored. There may be practical ways to do this: In their 2024 work, Hartline and his colleagues devised a method for checking if an algorithm has a no-swap-regret property without looking at its code.
Hartline acknowledged that his preferred solution wouldn’t prevent all bad outcomes when no-swap-regret algorithms compete with humans. But he argued that scenarios like the one in Roth’s paper aren’t cases of algorithmic collusion.
“Collusion is a two-way thing,” he said. “It fundamentally must be the case that there are actions a single player can do to not collude.”
Either way, the new work still leaves many open questions about how algorithmic pricing can go wrong in the real world.
“We still don’t understand nearly as much as we want,” Pai said. “It’s an important question for our time.”

quanta
