Come take a look, n8n has an update! The n8n Scalability Benchmark
Source: https://blog.n8n.io/the-n8n-scalability-benchmark/
Summary:
n8n stress test results are in: queue mode and hardware upgrades are the keys to performance
The well-known workflow automation platform n8n recently ran a series of extreme stress tests to evaluate its performance across different deployment modes and hardware configurations. The results clearly map out the platform's processing limits and give users hard data for optimizing their own deployments.
Test setup: simulating high-concurrency real-world scenarios
The tests ran on two AWS instance types: an entry-level C5.large (2 vCPUs, 4 GB RAM) and a high-performance C5.4xlarge (16 vCPUs, 32 GB RAM). The team compared n8n's single mode against queue mode, using the K6 load-testing tool to simulate 3 to 200 concurrent virtual users (VUs) while monitoring key metrics: requests per second, average response time, and failure rate.
Core findings: queue mode delivers a clear advantage, hardware sets the ceiling
- Single-workflow test: when handling a single webhook endpoint, enabling queue mode produced a leap in performance even on entry-level hardware. On the C5.large, queue mode handled a 200 VU load with zero failures and throughput of 72 requests per second. Upgrading to the C5.4xlarge with queue mode pushed performance to its peak: a steady 162 requests per second with latency under 1.2 seconds.
- Multi-workflow test: in the more complex scenario of 10 workflows running in parallel, queue mode mattered even more. On the C5.large, single mode hit a 38% failure rate at 200 VUs, while switching to queue mode brought failures down to zero with stable performance. The stronger hardware pushed throughput up further, to 162 requests per second.
- Large-file test: when processing large binary files such as images and PDFs, system resources became the main bottleneck. On entry-level hardware, both modes showed high failure rates under heavy load. On the C5.4xlarge, however, queue mode ran the entire test with zero failures, showing that ample compute and memory are indispensable for data-intensive tasks.
Key takeaways and recommendations
The conclusion: for users who need stability and high performance, enabling queue mode is the first step toward scalability. It decouples request intake from workflow execution and dramatically improves concurrent throughput. Hardware must also match workflow complexity: simple automations need little, but parallel multi-workflow or large-file processing calls for more CPU, memory, and storage.
The recommendation is to design for scalability from the start, rather than hitting bottlenecks only after the business grows. The benchmark shows n8n's strong potential, but unlocking it takes both the right architecture and matching hardware.
English source:
Ever wondered just how hard you can push n8n before it starts waving the white flag? We pushed n8n to the limits, with impressive results.
When you’re running mission-critical workflows, you need to know your limits. So we recently put different n8n deployments through their paces – simulating heavy traffic and maxing out resources to see which setups come out on top.
Whether you're running a side hustle or managing engineering for a multi-national organization, stress testing goes a long way to preventing downtime, bottlenecks, and broken promises. This benchmark blog and video will show you exactly how far n8n can go, and where it starts to fall apart!
A Workout for your Workflow
We stress tested n8n across two AWS instance types – C5.large and C5.4xlarge – using both n8n’s Single and Queue modes (multi-threaded, queue-based architecture). We used K6 for load testing, Beszel for live resource monitoring, and n8n’s own benchmarking workflows to automatically trigger each stress test scenario.
This workflow used a spreadsheet to iterate through different virtual user (VU) levels, running each test and recording the results as it went. Once the data was logged, we turned it into a graph that revealed key performance indicators. Plus, in real time we could see how well the system performed under varying loads – how fast it responded, how reliably it executed, and where it started to crack.
Here’s how we set it up:
The C5.large AWS instance comprised:
- 2 vCPUs (2 threads on 1 physical core)
- 4 GB RAM
- 10 Gbps bandwidth
When we scaled up to the C5.4xlarge, we moved to 16 vCPUs and 32 GB RAM.
We ran three critical benchmarking scenarios:
- Single Webhook: one flow triggered repeatedly
- Multi Webhook: 10 workflows triggered in parallel
- Binary Data: large file uploads and processing
Each test scaled from 3 to 200 virtual users to measure:
- Requests per second
- Average response time
- Failure rate under load
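These three metrics are simple to compute yourself. As a rough illustration of what a load tester like K6 measures – this is not the actual K6 or n8n benchmark code, and the stub webhook stands in for a real n8n endpoint – here is a minimal Python sketch that fires concurrent requests and reports requests per second, average response time, and failure rate:

```python
import statistics
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def run_load_test(url, virtual_users, requests_per_user):
    """Fire concurrent requests at `url` and compute the three benchmark metrics."""
    latencies, failures = [], 0
    lock = threading.Lock()

    def one_request():
        nonlocal failures
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                ok = resp.status == 200
        except Exception:
            ok = False
        elapsed = time.perf_counter() - start
        with lock:
            latencies.append(elapsed)
            if not ok:
                failures += 1

    total = virtual_users * requests_per_user
    t0 = time.perf_counter()
    # One thread per virtual user, each user sending its share of requests.
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        for _ in range(total):
            pool.submit(one_request)
    wall = time.perf_counter() - t0
    return {
        "requests_per_second": total / wall,
        "avg_response_time": statistics.mean(latencies),
        "failure_rate": failures / total,
    }

# Demo against a local stub "webhook" (a stand-in, not a real n8n server).
class StubWebhook(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), StubWebhook)
threading.Thread(target=server.serve_forever, daemon=True).start()
metrics = run_load_test(f"http://127.0.0.1:{server.server_port}", 10, 5)
server.shutdown()
print(metrics["failure_rate"])  # 0.0 against the healthy local stub
```

A real run would point the URL at an n8n webhook and sweep `virtual_users` through the 3-to-200 ladder used in the benchmark.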
If you’re keen to set up your own stress testing, we’ve included all the tools you need to get started at the end of this blog, including the n8n Benchmark Scripts.
Single Webhook
We started small with a single webhook. This mimicked sending a webhook request to an n8n server, which replied with a response confirming the call was received. This was just one workflow and one endpoint, gradually ramping up traffic to see how far a single n8n instance can be pushed.
Using a C5.large AWS instance the n8n Single mode deployment handled the pressure surprisingly well, as you can see from the comparison table below. While this instance held up to 100 VUs, once we reached 200 VUs we hit the ceiling for what a single-threaded setup can manage, with response times up to 12 seconds and a 1% failure rate.
When we enabled Queue mode, n8n’s more scalable architecture that decouples webhook intake from workflow execution, performance jumped to 72 requests per second, latency dropped under three seconds, and the system handled 200 virtual users with zero failures.
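That intake/execution split is the whole trick behind Queue mode: the webhook-facing process only enqueues a job and acknowledges, while separate workers pull jobs off the queue and run the workflow. n8n implements this with Redis and dedicated worker processes; the in-process Python sketch below is only meant to illustrate the pattern:

```python
import queue
import threading

job_queue = queue.Queue()
results = []

def webhook_intake(payload):
    """Fast path: acknowledge immediately, defer execution to a worker."""
    job_queue.put(payload)
    return "202 Accepted"

def worker():
    """Worker loop: pull jobs and run the (potentially slow) workflow."""
    while True:
        payload = job_queue.get()
        if payload is None:  # shutdown sentinel
            break
        results.append(f"processed:{payload}")  # stand-in for workflow execution
        job_queue.task_done()

# A small worker pool, mimicking separate `n8n worker` processes.
workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()

# Intake returns instantly even though execution happens elsewhere.
acks = [webhook_intake(f"event-{i}") for i in range(100)]
job_queue.join()  # wait until every job has been processed
for _ in workers:
    job_queue.put(None)
for w in workers:
    w.join()
print(len(results))  # 100
```

Because intake never blocks on execution, the webhook endpoint keeps absorbing bursts while workers drain the backlog at their own pace, which is exactly the behavior the benchmark numbers reflect.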
Scaling up to the C5.4xlarge (16 vCPUs, 32 GB RAM) we saw some impressive gains. In single mode, throughput rose slightly to 16.2 requests per second with modest latency improvements.
But it was Queue mode that really stole the show. We hit a consistent 162 requests per second and maintained that across a full 200 VU load, with latency under 1.2 seconds and zero failures. That’s a 10x throughput gain just by scaling vertically and choosing the right architecture.
Multiple Webhooks
For the next test, we wanted to simulate enterprise-grade multitasking to better reflect real-world n8n deployments, so we set up 10 distinct workflows, each triggered by its own webhook.
On the C5.large in single mode, performance fell off quickly. At 50 VUs, response time spiked above 14 seconds with an 11% failure rate. At 100 VUs, latency reached 24 seconds with a 21% failure rate. And at 200 VUs, the failure rate hit 38% and response time stretched to 34 seconds – essentially a meltdown.
Switching to Queue mode changed the game. It sustained 74 requests per second consistently from three to 200 VUs, with latency within acceptable bounds, and a 0% failure rate. Same hardware, totally different outcome.
Once again, the C5.4xlarge took things to another level. In Single mode, it peaked at 23 requests per second with a 31% failure rate. But in Queue mode, we hit and maintained 162 requests per second across all loads, with zero failures. Even under max stress, latency stayed around 5.8 seconds. Multitasking at scale demands more muscle and Queue mode absolutely delivers.
Binary File Uploads
Finally, we wanted to test the most RAM-hungry and disk-heavy tasks we could, so we set up a binary data benchmark with workflows that deal with large file uploads like images, PDFs, and media.
On a C5.large in single mode, the cracks appeared early. At just three virtual users we managed only three requests per second. At 200 VUs, response times ballooned and 74% of requests failed. That’s not just a slowdown, that’s total operational failure.
Queue mode offered a little more resilience, delaying the breakdown. But by 200 VUs, it too collapsed with an 87% failure rate and incomplete payloads.
Then we turned to the C5.4xlarge. With this larger instance in single mode, we reached 4.6 requests per second, trimmed response time by a third, and reduced the failure rate from 74% to just 11%. Vastly improved, but not perfect.
Then in Queue mode, we peaked at 5.2 requests per second and, crucially, held a 0% failure rate across the entire test. Every large file was successfully received, processed, and responded to. This test made it clear – it’s not just about architecture. Binary-heavy workflows demand serious CPU, RAM, and disk throughput.
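One standard way to keep binary-heavy workloads from exhausting RAM – in the same spirit as the shared-storage advice below, though not n8n's actual implementation – is to stream uploads to disk in fixed-size chunks instead of buffering whole files in memory. A minimal Python sketch of the idea:

```python
import hashlib
import io
import tempfile

CHUNK_SIZE = 64 * 1024  # 64 KiB chunks keep peak memory flat regardless of file size

def stream_upload_to_disk(source, dest_path):
    """Copy an incoming binary stream to disk chunk by chunk, returning
    size and checksum without ever holding the full file in memory."""
    sha = hashlib.sha256()
    size = 0
    with open(dest_path, "wb") as dest:
        while chunk := source.read(CHUNK_SIZE):
            sha.update(chunk)
            size += len(chunk)
            dest.write(chunk)
    return size, sha.hexdigest()

# Simulate a 10 MB upload (stands in for an image, PDF, or media file).
payload = b"\x00" * (10 * 1024 * 1024)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    size, digest = stream_upload_to_disk(io.BytesIO(payload), tmp.name)
print(size)  # 10485760
```

With chunked streaming, memory use is bounded by the chunk size rather than the file size, so the bottleneck shifts to disk throughput – which is why the larger instance's faster I/O made such a difference in this scenario.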
Key Takeaways
So what did all these tests tell us?
- Queue mode isn’t optional. It’s the first step toward real scalability. Even on entry-level hardware, it massively boosts performance with minimal setup.
- Hardware matters. Upgrading to a C5.4xlarge more than doubles throughput, cuts latency in half, and eliminates failure rates entirely.
- Binary data breaks everything—unless you’re prepared. You’ll need more RAM, faster disk, shared storage like S3, and parallel workers to manage it all.
If you’re building automation for internal teams, backend systems, or customer-facing apps, don’t wait for bottlenecks to force an upgrade. Plan for scale from the beginning. Use Queue mode to separate intake from processing, scale horizontally with workers for concurrent processing, and size your hardware to match your workload. Simple flows need less, but binary data and multitasking need more. n8n is built to scale, but like any engine, it needs the right fuel and the right track to reach full power.
Want to test your own setup?
- n8n Benchmarking Guide
- Queue Mode Setup
- Docker Installation Guide
- K6 Load Testing
- Beszel Monitoring
- n8n Benchmark Scripts on GitHub
And here's the full benchmarking video.