The decentralized compute thesis has been a fixture of crypto narratives for the past few years, and it has finally reached a proof point. The pitch, in which idle GPUs scattered across the planet are coordinated by token incentives to compete with hyperscaler infrastructure, is beginning to materialize.
TAO Plants a Flag in the Big Tech Backyard
The same names that surfaced during the AI momentum spillover in January, from Render to Bittensor, are back at the forefront of the digital-assets stack, outperforming the broader altcoin complex by an order of magnitude. Notably, parts of the stack are beginning to show early signs of real usage, with certain Bittensor subnets quietly evolving from experimental coordination layers into functional training networks.
Markets have periodically believed the pitch and, more often than not, moved on. What changed over the past week is that one network stopped pitching and started demonstrating, expanding beyond crypto-native boundaries in the process.
Bittensor's Subnet 3, Templar, operated by Covenant AI, completed a pre-training run on a 72bn-parameter model. That puts it roughly in the weight class of LLaMA-2-70B, Meta's canonical open frontier model, yet it was trained entirely across distributed commodity hardware with no controlling cluster, no permissioning layer and no fixed participant set. In other words, a network that no single entity controls produced a model that the entities who control everything may find themselves benchmarking against.
(Source: CoinMetrics)
Templar's project token, SN3, rose 83% in the week following the official paper release, reaching $83.4mn in market capitalization. TAO also responded. The token gained as much as 64% over the past week, briefly pressing against the $300 level that has capped the price since mid-November 2025, with futures and spot volumes back at levels not seen since the post-Trump-trade momentum.
Incentive markets for machine intelligence
Bittensor is not a single AI network. It is an architecture for building incentive markets, each one a self-contained economy where participants compete to produce something measurable and get paid in proportion to how well they produce it. Those markets are called subnets: independent networks that share Bittensor's underlying consensus layer and token system but operate with their own rules, their own validation logic and their own definition of useful work.
The result is a market that continuously prices productive compute with no central employer deciding what gets built or who gets paid. Every subnet is effectively a decentralized R&D budget with an automated performance review. A subnet can reward image generation, protein folding predictions, financial data feeds or the training of large language models. Miners do the work. Validators score it. TAO emissions flow to whoever the validation mechanism determines contributed most.
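The miner-validator-emission loop can be illustrated with a deliberately simplified sketch. It assumes validator scores have already been aggregated into one non-negative number per miner and splits a fixed emission pro rata. The function name `split_emissions` and the miner labels are hypothetical, and Bittensor's actual consensus and emission logic is more involved than this.

```python
def split_emissions(scores: dict[str, float], emission: float) -> dict[str, float]:
    """Split `emission` across miners in proportion to their (non-negative) scores.

    Illustrative only: a stand-in for a subnet's reward rule, not Bittensor's
    real consensus mechanism.
    """
    # Negative scores are treated as zero so they cannot reduce others' payouts.
    clipped = {miner: max(score, 0.0) for miner, score in scores.items()}
    total = sum(clipped.values())
    if total == 0:
        # Nobody produced useful work, so nothing is paid out.
        return {miner: 0.0 for miner in scores}
    return {miner: emission * score / total for miner, score in clipped.items()}


# Hypothetical round: three miners, 100 units of emission to distribute.
payouts = split_emissions({"miner_a": 3.0, "miner_b": 1.0, "miner_c": 0.0}, emission=100.0)
print(payouts)
```

The design point is that the payout rule never needs to know who the miners are, only how the validation mechanism scored their output.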
Templar is Subnet 3 and it has been running the most ambitious version of that experiment, using Bittensor's incentive architecture to coordinate the pre-training of frontier-scale language models across hardware that no single entity owns or controls.
Covenant AI and the lab without a lease
Templar does not operate in isolation. Covenant AI, the team behind it, has assembled three Bittensor subnets that together replicate the complete operational stack of a centralized AI laboratory. Templar handles pre-training, the compute-intensive process of training a base model from scratch on raw internet-scale data. Basilica provides the underlying distributed compute infrastructure that feeds it. Grail runs post-training and reinforcement learning, the refinement layer that turns a base model into something instruction-following and deployable. What was once an interesting subnet experiment has turned into a vertically integrated AI development pipeline, in which every component that a centralized lab internalizes as a cost centre instead becomes an open, permissionless and economically coordinated market.
Training a frontier-scale language model on permissionless infrastructure is not a harder version of training on a managed cluster. It is a categorically different problem. A managed cluster solves trust administratively, as participants are vetted, access is controlled and the system assumes cooperation. Remove that layer and three distinct failure modes emerge simultaneously.
The first is free-riding. Gradients are the signals that determine in which direction and by how much each parameter in the model should shift to make the next prediction less wrong. Free-riding nodes copy honest gradients, submit them as original work and collect rewards without contributing real compute. The second is noise injection, where participants submit degraded or random updates that pull the shared model in unproductive directions. The third and most corrosive is adversarial poisoning, where participants deliberately manipulate gradient updates to steer the model toward a compromised objective.
Gauntlet, Covenant's validation engine, addresses these failure modes by continuously scoring every submitted update and cross-referencing loss on assigned versus unassigned data batches to surface free-riders, making dishonesty the least profitable strategy available.
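One way to picture the assigned-versus-unassigned comparison is the sketch below. It is an interpretation, not Gauntlet's actual scoring code: it assumes a validator can measure loss before and after applying a single miner's update on two batches, and that an update genuinely computed on the assigned batch should help that batch more than an arbitrary one. The names `UpdateEval`, `assigned_data_edge`, `looks_like_free_rider` and the threshold `min_edge` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UpdateEval:
    """Losses a validator measures before and after applying one miner's update."""
    assigned_before: float
    assigned_after: float
    unassigned_before: float
    unassigned_after: float


def assigned_data_edge(ev: UpdateEval) -> float:
    """Loss improvement on the assigned batch minus improvement on an unassigned batch."""
    assigned_gain = ev.assigned_before - ev.assigned_after
    unassigned_gain = ev.unassigned_before - ev.unassigned_after
    return assigned_gain - unassigned_gain


def looks_like_free_rider(ev: UpdateEval, min_edge: float = 0.0) -> bool:
    """Flag updates that help the assigned batch no more than an arbitrary batch.

    `min_edge` is an assumed tuning knob, not a documented Gauntlet parameter.
    """
    return assigned_data_edge(ev) <= min_edge


# Hypothetical measurements: the honest update fits its assigned data better,
# while the copied update shows no special affinity for the data it was assigned.
honest = UpdateEval(2.50, 2.40, 2.50, 2.47)
copied = UpdateEval(2.50, 2.47, 2.50, 2.47)
print(looks_like_free_rider(honest), looks_like_free_rider(copied))
```

A production system would need to average such a signal over many rounds, since a single noisy batch can make an honest update look indistinguishable from a copied one.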
The hardware heterogeneity problem, involving nodes ranging from data centre A100s to bandwidth-constrained consumer hardware, is handled through Heterogeneous SparseLoCo. This is a parallelism framework that groups limited-resource nodes to jointly instantiate a single model replica across pipeline stages using activation compression tuned for real-world internet links. At inter-stage bandwidths of 100 Mb/s to 1 Gb/s, the bandwidth profile of ordinary colocation infrastructure, compute utilization sits above 97%. Without compression, those bandwidths make the entire exercise non-viable. With it, the long tail of global GPU capacity becomes absorbable.
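A back-of-envelope calculation shows why compression is the deciding factor at these link speeds. Every input below is a hypothetical illustration rather than a figure from the Covenant paper: the micro-batch shape, the 16-bit precision and the 100x compression ratio are assumptions chosen only to show the order of magnitude, and latency and protocol overhead are ignored.

```python
def transfer_seconds(num_values: int, bytes_per_value: int,
                     link_bits_per_second: float,
                     compression_ratio: float = 1.0) -> float:
    """Time to ship one activation tensor across a link (bandwidth only, no latency)."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    bits = num_values * bytes_per_value * 8 / compression_ratio
    return bits / link_bits_per_second


# Hypothetical micro-batch: 4,096 tokens x 8,192 hidden dimensions, 2 bytes each.
values = 4096 * 8192
raw = transfer_seconds(values, 2, 100e6)                 # 100 Mb/s link, uncompressed
compressed = transfer_seconds(values, 2, 100e6, 100.0)   # assumed 100x compression
print(f"{raw:.2f}s uncompressed, {compressed:.3f}s compressed")
```

Under these assumed numbers a single stage-to-stage hand-off takes seconds when sent raw and tens of milliseconds when compressed, which is the difference between GPUs mostly waiting and GPUs mostly computing.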
The efficiency result is where centralized benchmarks start to look uncomfortable. Model FLOPs Utilization (MFU), the share of theoretical compute actually doing productive training work, is the metric practitioners use to evaluate infrastructure quality. IBM's rigorously managed FSDP training of LLaMA-2-7B across 128 dedicated A100s achieved approximately 57% MFU. That is the centralized operator baseline from a serious team with full infrastructure control. Templar's incentive mechanism drove MFU from 30% at initialization to 66% across the 72B run, surpassing the baseline. The gains did not come from better hardware but from every node competing for TAO emissions with a direct financial incentive to optimize its training code.
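For readers unfamiliar with the metric, MFU can be sketched as achieved training FLOPs divided by the hardware's theoretical peak. The version below uses the common rule of thumb of roughly 6 FLOPs per parameter per token for a dense transformer's forward and backward pass, which is an approximation and not necessarily the accounting used by IBM or Covenant. The function name and the example inputs are hypothetical.

```python
def model_flops_utilization(params: float, tokens_per_second: float,
                            num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Fraction of theoretical peak FLOPs spent on useful training math."""
    # Assumed approximation: ~6 FLOPs per parameter per training token.
    achieved = 6.0 * params * tokens_per_second
    peak = num_gpus * peak_flops_per_gpu
    return achieved / peak


# Hypothetical setup: a 1bn-parameter model processing 40,000 tokens per second
# on 8 accelerators rated at 100 TFLOP/s each.
mfu = model_flops_utilization(params=1e9, tokens_per_second=40_000,
                              num_gpus=8, peak_flops_per_gpu=100e12)
print(f"{mfu:.0%}")
```

Because the denominator is fixed by the hardware, the only way a node raises its MFU is by pushing more tokens per second through the same silicon, which is exactly the behaviour the emissions reward.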
LLM benchmarking
The trained model is publicly evaluable, which is the only honest accountability mechanism in this space.
(Source: Sandmark)
Covenant-72B holds a narrow but real edge over LLaMA-2-70B on MMLU, scoring 67.1 against 65.6 on the broadest academic knowledge benchmark that remains among the hardest to game through narrow optimization. On HellaSwag and WinoGrande, Meta's model retains an edge. The honest read is near-parity across the standard evaluation suite, achieved on roughly half the training data and with fundamentally different infrastructure. Parity is not dominance, but parity achieved without a centralized cluster against a model that cost multiples more to produce repositions the terms of the comparison.
Benchmark scores reflect only the output. On the infrastructure side, Covenant-72B incurred just 70 seconds of idle time per synchronization round across a two-hour training window, against 8.3 minutes per round for INTELLECT-1, its closest decentralized predecessor, which trained a model 7.2 times smaller. Idle time in distributed training is dead capital, with GPUs consuming power while producing nothing as they wait for nodes to align. Compressing that overhead by a factor of seven on a network an order of magnitude larger is the infrastructure result that the benchmark table does not show but cannot be separated from.
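The factor-of-seven figure follows directly from the two idle times quoted above. A minimal check, with the helper name `idle_reduction_factor` chosen here purely for illustration:

```python
def idle_reduction_factor(baseline_idle_seconds: float, new_idle_seconds: float) -> float:
    """How many times smaller the new idle time is than the baseline."""
    if new_idle_seconds <= 0:
        raise ValueError("new_idle_seconds must be positive")
    return baseline_idle_seconds / new_idle_seconds


intellect_idle = 8.3 * 60   # INTELLECT-1: 8.3 minutes per round, in seconds
covenant_idle = 70.0        # Covenant-72B: 70 seconds per round
factor = idle_reduction_factor(intellect_idle, covenant_idle)
print(f"{factor:.1f}x less idle time per round")
```

The comparison is per synchronization round only. It says nothing about how many rounds each run needed, so it should be read as a measure of coordination overhead rather than total training cost.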
The base model is one result. What Grail does with it is another. The post-training results sharpen the commercial case further. A two-stage supervised fine-tuning pipeline extending context to 8k tokens, the unit of text a language model reads and generates, produced a chat model that outperforms comparable centralized alternatives specifically on IFEval and MATH. Instruction-following precision and quantitative reasoning are the two dimensions most predictive of enterprise utility, determining whether a model gets integrated into a production workflow or stays in a research demo.
The fact that Grail's post-training pipeline produced stronger results on those dimensions than several centralized chat models suggests the full Covenant stack is further along than any single result would indicate in isolation.
A relocating moat
The centralized AI stack runs on one premise that has until now been beyond challenge: the entity that controls the compute controls the model. OpenAI, Anthropic, Google DeepMind and xAI have collectively spent tens of billions constructing that control through GPU clusters, data centre capacity and the engineering depth to coordinate it all into something that produces frontier intelligence. The result is a barrier high enough to turn the coordination problem into a moat.
Templar does not try to clear that barrier. It makes the barrier irrelevant. AWS, Azure and Google Cloud are not facing a competitor with better pricing or faster provisioning. They are facing a system that routes around the pricing relationship entirely. When an enterprise trains on a hyperscaler, capital flows toward Amazon or Google. When a node operator joins Templar, capital flows in the other direction. The GPU capacity sitting dormant across universities, colocation facilities and decommissioned mining operations adds up to an inventory that dwarfs what any single institution could procure, and it becomes productive infrastructure the moment the incentive mechanism makes participation worth more than idleness.
But the gap between what Covenant has demonstrated and what a mid-sized company could actually deploy remains wide. Enterprise onboarding does not exist in any meaningful form, operational reliability at production scale is unproven and the interface between a permissionless training network and a conventional IT procurement process has yet to be built.