Scaling Throughput in Waterfall Private Network: Results from Tests 25-29
Disclosure
- This article text was prepared with AI assistance.
- All test data, measurements, and reported metrics were produced from actual test runs without AI-generated data.
- Main number: Best result is Test 29 with 128,333.33 TPS sustained, 93,095.24 TPS full-run, and 175,000 TPS peak.
- Main takeaway: Throughput scaling is strongest with 2s slots and 35 blocks/slot, but only if node capacity and distribution can retain performance beyond peak windows.
- Main limitation: Results come from a single-datacenter private environment; real distributed-network throughput is expected to be lower.
This Waterfall Network testing cycle was designed to answer a practical scaling question: which parameters increase throughput most effectively, and where system behavior breaks down under load.
Across tests 25-29, we changed three main dimensions:
- Slot duration (3s -> 2s)
- Blocks per slot (25 -> 35)
- Network decentralization level (node count increase from 10 to 17)
The central hypothesis was that protocol-level changes alone are not enough. Throughput growth in Waterfall Network depends on the interaction between protocol configuration, node distribution, and hardware capacity. The final runs confirm this: all high-load profiles reached very high peak values, but only some preserved that performance during the full run.
In short, the strongest profile in this series is Test 29 (2s, 35 blocks/slot), while the full dataset shows why retention and stability matter as much as peak TPS.
Test Matrix
| Test | Slot duration | Blocks per slot | Nodes | Validators | Full-run avg TPS | Best sustained 30s TPS | Peak TPS |
|---|---|---|---|---|---|---|---|
| 25 | 3 sec | 25 | 10 | 36,864 | 48,750.00 | 68,000.00 | 83,333.33 |
| 26 | 3 sec | 25 | 17 | 65,536 | 64,444.44 | 76,000.00 | 83,333.33 |
| 27 | 3 sec | 35 | 17 | 65,536 | 57,368.42 | 103,333.33 | 116,666.67 |
| 28 | 2 sec | 25 | 17 | 65,536 | 54,571.43 | 106,333.33 | 125,000.00 |
| 29 | 2 sec | 35 | 17 | 65,536 | 93,095.24 | 128,333.33 | 175,000.00 |
This matrix captures two primary control variables:
- Slot duration (3s vs 2s)
- Blocks per slot (25 vs 35)
The largest step-change appears when both are combined (2s + 35) in Test 29.
Configuration Impact (A/B)
| Comparison | Scenario | Full-run change | Sustained 30s change | Peak TPS change |
|---|---|---|---|---|
| 25 -> 26 | +nodes, same 3s/25 | +32.19% | +11.76% | +0.00% |
| 26 -> 27 | 3s/25 -> 3s/35 | -10.98% | +35.96% | +40.00% |
| 28 -> 29 | 2s/25 -> 2s/35 | +70.59% | +20.69% | +40.00% |
Interpretation: adding blocks per slot strongly increases ceiling metrics, but retention depends on node-level capacity and scheduling.
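To make these deltas easy to re-derive, the short Python sketch below recomputes the A/B percentages from the test-matrix figures. The per-test values are transcribed from this article, not read from the raw datasets.

```python
# Minimal sketch: reproduce the A/B percentage changes in the table above.
# Per test: (full-run avg TPS, best sustained 30s TPS, peak TPS), as reported in this article.
KPI = {
    25: (48_750.00, 68_000.00, 83_333.33),
    26: (64_444.44, 76_000.00, 83_333.33),
    27: (57_368.42, 103_333.33, 116_666.67),
    28: (54_571.43, 106_333.33, 125_000.00),
    29: (93_095.24, 128_333.33, 175_000.00),
}

def pct_change(before: float, after: float) -> float:
    """Relative change in percent between two KPI values."""
    return (after - before) / before * 100.0

for a, b in [(25, 26), (26, 27), (28, 29)]:
    full, sustained, peak = (pct_change(x, y) for x, y in zip(KPI[a], KPI[b]))
    print(f"{a} -> {b}: full-run {full:+.2f}%, sustained {sustained:+.2f}%, peak {peak:+.2f}%")
```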
Ceiling Utilization and Retention
Formula used:
Theoretical max TPS = (blocks_per_slot * 10,000) / slot_duration_sec
| Test | Theoretical max TPS (formula) | Peak TPS | Peak utilization | Sustained utilization | Full-run / Sustained |
|---|---|---|---|---|---|
| 25 | 83,333.33 (25*10,000/3) | 83,333.33 | 100.00% | 81.60% | 71.69% |
| 26 | 83,333.33 (25*10,000/3) | 83,333.33 | 100.00% | 91.20% | 84.80% |
| 27 | 116,666.67 (35*10,000/3) | 116,666.67 | 100.00% | 88.57% | 55.52% |
| 28 | 125,000.00 (25*10,000/2) | 125,000.00 | 100.00% | 85.07% | 51.32% |
| 29 | 175,000.00 (35*10,000/2) | 175,000.00 | 100.00% | 73.33% | 72.54% |
The tests consistently hit the configured peak ceiling. The practical question is retention: how much of peak performance survives across the full run.
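To make the ceiling math reproducible, here is a small Python sketch that applies the formula above and derives the utilization columns for Test 29. The TPS figures are the ones reported in this article.

```python
# Sketch: theoretical ceiling and utilization for Test 29 (2s slots, 35 blocks/slot).
# Each block is assumed to carry 10,000 transactions, per the methodology section.
TX_PER_BLOCK = 10_000

def theoretical_max_tps(blocks_per_slot: int, slot_duration_sec: float) -> float:
    """Theoretical max TPS = (blocks_per_slot * 10,000) / slot_duration_sec."""
    return blocks_per_slot * TX_PER_BLOCK / slot_duration_sec

ceiling = theoretical_max_tps(35, 2)                 # 175,000.00
peak_utilization = 175_000.00 / ceiling * 100        # 100.00%
sustained_utilization = 128_333.33 / ceiling * 100   # ~73.33%
print(ceiling, peak_utilization, sustained_utilization)
```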
Burst Window Throughput
| Test | Best 100 sequential blocks TPS |
|---|---|
| 25 | 55,555.56 |
| 26 | 55,555.56 |
| 27 | 55,555.56 |
| 28 | 62,500.00 |
| 29 | 71,428.57 |
This metric confirms that short high-density windows improved as slot duration decreased and load density increased.
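For reference, the sketch below shows one way to compute the 100-block window metric from per-block timestamps, following the best_100_blocks_tps formula listed in the KPI section further down. The function and its inputs are illustrative, not the exact tooling used for these runs.

```python
# Sketch: best TPS over any 100 sequential blocks, following
# best_100_blocks_tps = 1,000,000 / ((max_ts - min_ts) + slot_duration_sec),
# with each block assumed to carry 10,000 transactions.
from typing import Sequence

def best_100_blocks_tps(block_timestamps: Sequence[float], slot_duration_sec: float) -> float:
    """Slide a 100-block window over sorted block timestamps and keep the best rate."""
    ts = sorted(block_timestamps)
    window = 100
    tx_in_window = window * 10_000  # 1,000,000 transactions per 100 blocks
    best = 0.0
    for i in range(len(ts) - window + 1):
        elapsed = (ts[i + window - 1] - ts[i]) + slot_duration_sec
        best = max(best, tx_in_window / elapsed)
    return best
```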
KPI Comparison
Figure 1. Full-run average TPS vs sustained 30s TPS across tests 25-29. Test 29 leads both metrics (93,095.24 full-run, 128,333.33 sustained).
Figure 2. Peak slot TPS by test. Test 29 reaches 175,000 TPS, the highest ceiling in this series.
Figure 3. Stability ratio (Full-run / Sustained). Higher values indicate better retention of peak performance over the full run.
Additional Visual Analysis
Figure 4. Direct comparison of theoretical, peak, sustained, and full-run throughput by test.
Figure 5. Utilization percentages relative to the theoretical ceiling. This highlights how much performance is retained from peak to sustained and full-run.
Figure 6. 2x2 configuration heatmap (slot duration x blocks/slot) for sustained and full-run metrics.
Per-Node Block Pressure
| Test | Blocks per slot | Nodes | Blocks/slot per node (avg) |
|---|---|---|---|
| 25 | 25 | 10 | 2.50 |
| 26 | 25 | 17 | 1.47 |
| 27 | 35 | 17 | 2.06 |
| 28 | 25 | 17 | 1.47 |
| 29 | 35 | 17 | 2.06 |
This indicator helps explain why decentralization and CPU headroom are both critical at higher slot density.
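The per-node pressure values can be reproduced directly from the configuration; the minimal sketch below does exactly that, using the node counts and block densities listed in the test matrix.

```python
# Sketch: average block-production pressure per node, as used in the table above.
def blocks_per_slot_per_node(blocks_per_slot: int, nodes: int) -> float:
    return blocks_per_slot / nodes

CONFIG = {25: (25, 10), 26: (25, 17), 27: (35, 17), 28: (25, 17), 29: (35, 17)}
for test, (bps, nodes) in CONFIG.items():
    print(f"Test {test}: {blocks_per_slot_per_node(bps, nodes):.2f} blocks/slot per node")
```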
Infrastructure Profile
- Cloud provider: Microsoft Azure
- VM size: Standard_D4as_v5
- Typical node shape: 4 vCPU, 16 GiB RAM
- Max network bandwidth: 12,500 Mb/s
- CPU family: AMD EPYC 7763v (Milan) / AMD EPYC 9004 (Genoa)
- Disk: Premium_LRS, 40 GB, profile P6
- Observed higher-performance node: 8 vCPU
- Observed peak memory usage: ~10 GB
Methodology and Comparability
- Each block contains 10,000 transactions.
- Primary KPI: Best sustained 30s TPS.
- Full-run average is retained to expose tail degradation.
- Human-readable time in reports is shown in UTC.
- Run durations are not identical across tests, so full-run values should be interpreted together with sustained and retention metrics.
- All measurements in this article are based on private Waterfall Network test runs.
Stability Term Definition
retention = full-run avg TPS / best sustained 30s TPS
Why it matters:
- Peak TPS shows the short-window ceiling.
- Retention shows how much of that capacity survives across the full run.
- For operational planning, retention is often more informative than peak alone, because it captures degradation behavior.
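As a cross-check, the sketch below recomputes retention for each run from the full-run and sustained figures reported earlier in this article; the results match the Full-run / Sustained column of the ceiling table.

```python
# Sketch: retention = full-run avg TPS / best sustained 30s TPS, per test.
RESULTS = {
    25: (48_750.00, 68_000.00),
    26: (64_444.44, 76_000.00),
    27: (57_368.42, 103_333.33),
    28: (54_571.43, 106_333.33),
    29: (93_095.24, 128_333.33),
}
for test, (full_run, sustained) in RESULTS.items():
    print(f"Test {test}: retention = {full_run / sustained:.2%}")
```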
Data and Reproducibility
Raw datasets used
- Private Network Test 25 — Raw Data
- Private Network Test 26 — Raw Data
- Private Network Test 27 — Raw Data
- Private Network Test 28 — Raw Data
- Private Network Test 29 — Raw Data
KPI formulas
- slot_tps = slot_tx / slot_duration_sec
- timestamp_tps = timestamp_tx / slot_duration_sec
- full_run_avg_tps = total_tx / (slot_count * slot_duration_sec)
- best_sustained_30s_tps = max(rolling_window_slot_tx_sum / 30). For 3s slots use a 10-slot rolling window; for 2s slots use a 15-slot rolling window.
- peak_window_tps(w slots) = window_slot_tx_sum / (w * slot_duration_sec)
- best_100_blocks_tps = 1_000_000 / ((max_ts - min_ts) + slot_duration_sec), assuming each block has 10,000 transactions.
- retention = full_run_avg_tps / best_sustained_30s_tps
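A minimal Python sketch of these KPI definitions is shown below. It assumes the input is an ordered list of per-slot transaction counts for one run; the function and variable names are illustrative and not tied to the raw dataset schema.

```python
# Minimal sketch of the KPI formulas above. `slot_tx` is an ordered list of
# per-slot transaction counts for one run; names are illustrative.
from typing import Sequence

def full_run_avg_tps(slot_tx: Sequence[int], slot_duration_sec: float) -> float:
    """Total transactions divided by total run time (slot_count * slot_duration)."""
    return sum(slot_tx) / (len(slot_tx) * slot_duration_sec)

def best_sustained_30s_tps(slot_tx: Sequence[int], slot_duration_sec: float) -> float:
    """Best 30-second rolling window: 10 slots at 3s, 15 slots at 2s."""
    window = int(round(30 / slot_duration_sec))
    sums = [sum(slot_tx[i:i + window]) for i in range(len(slot_tx) - window + 1)]
    return max(sums) / 30 if sums else 0.0

def peak_window_tps(slot_tx: Sequence[int], slot_duration_sec: float, w: int) -> float:
    """Best w-slot window, expressed as transactions per second."""
    sums = [sum(slot_tx[i:i + w]) for i in range(len(slot_tx) - w + 1)]
    return max(sums) / (w * slot_duration_sec) if sums else 0.0

def retention(full_run: float, sustained: float) -> float:
    """Share of the best sustained 30s rate preserved over the full run."""
    return full_run / sustained
```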
Experiment Interpretation
- Test 25 exposed a clear infrastructure bottleneck under parallel block production.
- Increasing nodes to 17 (Test 26) improved both throughput and stability.
- Raising slot density to 35 blocks/slot (Test 27) increased ceilings but also increased sensitivity to per-node pressure.
- Moving to 2s slots (Tests 28 and 29) raised the throughput envelope further.
- Test 29 combined high density (35) and short slots (2s) and delivered the strongest overall profile in this series.
Important Limitation
These tests were executed in a Waterfall Network test environment within a single datacenter. In a real distributed network, final performance is likely to be lower.
Visual Topology Snapshots
Test 29 topology (2s, 35 blocks/slot)
Figure 7. Test 29 topology under 2s slots and 35 blocks/slot. Result: 128,333.33 TPS sustained, 93,095.24 TPS full-run, 175,000 TPS peak.
Conclusions
- Best overall operating point in this series is Test 29 (2s, 35 blocks/slot), with 128,333.33 TPS sustained, 93,095.24 TPS full-run, and 175,000 TPS peak.
- Peak capacity is not the bottleneck in these runs. All profiles reached their theoretical peak ceiling; the real differentiator is retention from peak to sustained and full-run.
- Configuration scaling is nonlinear: shorter slots increase ceiling and burst throughput, higher blocks per slot amplify gains, but both also increase pressure on per-node scheduling and CPU.
- Decentralization and hardware must scale together. Increasing node count improved stability, and higher-core nodes consistently performed better under heavy load.
- Results are strong but environment-bound. Since tests were executed in a single-datacenter test network, mainnet-like distributed conditions should be expected to reduce absolute throughput.
Overall, this Waterfall Network test set indicates that sustainable scaling requires a balanced profile across slot timing, block density, node topology, and compute resources, not a single-parameter optimization.