the investment industry is a placebo

Uselessly long feedback loops mean investing is an act of faith

Giant swaths of the investment industry are doing nothing but selling soothing balms. Placebos. It is the vitamin industry at best and a penis-pill pop-up banner when it “democratizes private investment.”

But it could be no other way.

If you can statistically prove your strategy has alpha, you also know exactly how to price it to take all the surplus. The market for making alpha is like any market: it equilibrates based on supply and demand. There’s more wealth out there in search of a return than there is capacity to absorb it, which is why pod shop bosses feast.

I’m not knocking this. Their CAGR (i.e., returns net of vol drag) through insane market environments, after fees, is perfectly fine. In fact, it seems the HF market is more efficient now compared to the amount of over-earning for beta that went on in the early 2000s, between the dot-com meltdown and the GFC, when people made several lifetimes of loot telling just-so stories to allocators who didn’t notice how Moneyball applied to their field.
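(Vol drag, for the uninitiated: compounding knocks roughly half the variance off your arithmetic return, CAGR ≈ μ − σ²/2, so two funds with the same average return but different volatility can compound very differently.)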

The GFC disillusionment revealed many stories to be no more than fairy tales. There was an opening for a new story that perfectly complemented the spread of technological capacity and its rider, technical skill.

Evidence-based investing.

With large data sets and faster computers we could solve investing like a physics problem. Engineers aren’t fooled by steak dinners and silver tongues. The softest stuff they read is Kahneman, who explains why their factors work. All of it has the sheen of the scientific method.

Except there’s one lingering inconvenience. It’s an inconvenience that’s obvious to gamblers, and I think most investors can feel it in their bones. Not surprising: we are all natural gamblers to some extent (we eat hot dogs and let strangers drive us around).

The inconvenience is the uselessly long feedback loops, which we are going to discuss. But it’s worth mentioning that the feedback loops double as a defense for asset managers. They get to say “this works over time, and if it worked all the time it wouldn’t work.” That’s true, but it doesn’t solve my problem — I STILL HAVE NO FEEDBACK — plus the defense IS convenient to the fee collector.

So we’re left with “keep buying my pills because they might work”. Good luck getting a refund if the fish oil doesn’t make you live longer. The whole arrangement is irreducibly uncomfortable. That’s why it’s called the Paradox of Provable Alpha (and why I need to one day finish writing moontowermoney).

All of these thoughts were stirred up again as I listened to Adam Butler on Excess Returns.

An early quote in the interview:

The size of these edges is so small relative to the noise we encounter daily — especially compared to the gyrations of the underlying indices — that it's very difficult to make high-confidence, informed choices in advance. In other words, it's hard to know which edges or strategies to allocate to in a portfolio with any certainty that they'll outperform a random selection of other possible strategies over the next 10, 20, or 30 years of your investment horizon…your skill in selecting strategies in advance based on even very long histories of performance is pretty close to zero.

I should clarify — Adam distinguishes investment or factor-style edges from trading, or niche forms of investing that rely on information advantages built up over time. I think Adam would agree with me when I say trading is like any other business but, for superficial reasons, gets confused with investing.

I agree with his understanding of pod shops:

Pod shops are really looking for people that genuinely have alpha, so I think it's useful to kind of distinguish between what we might call sort of systematic factor strategies and alpha.

Alpha comes from somebody who has very particular niche insight or information or experience within a fairly narrow domain of the market. So, for example, we have a client who allocates to a municipal bond manager. Now, this manager has a hard cap at about a billion dollars. The team that runs it spun out of what used to be the largest muni market-making desk — worked there for 20–30 years.

What did that give them? Well, it gave them access to knowledge of where all of the flows from muni bonds — all of the issuance from the muni bond sector — are coming from, the different state governments, who the decision-makers are there, how they can get inside information on what type of issuance is coming down the pipe. And then, being at the center of flows in the muni market — which is a very niche segment of the market — right?

I think that’s just one example, but there are many. For example, somebody who worked for 20 years in the electricity markets — and electricity is a very nuanced pricing market with a very small number of key players, and is largely driven by changes in regulations at the state level and the county level. So, having very specialized knowledge of that, from having worked and gained experience inside the sector, gives you a real edge.

Right, so these are the types of strategies and people that the pod shops are looking for right now. These typically tend to be fairly illiquid strategies — right? You can't have Elliott Management, a $70 billion firm, running just a niche electricity strategy or a niche muni strategy. But the goal is to find hundreds of people who are all running these niche little strategies, that will all require liquidity to take advantage of opportunities at completely different times from one another, and putting them all together in a diversified basket.

Now, I'm sure there are also very scalable strategies in there as well, that maybe are running more liquid equity strategies or option strategies or whatever. What I fundamentally believe — and my insight from knowing people at those shops — is that the majority of the alpha that you can't get anywhere else at scale comes from the assembly of many different, less liquid, small niche players that are all operating together in an ensemble.

That last sentence harkens right back to the idea of combining multiple strategies into a portfolio that has a higher Sharpe than any of its constituents by letting the zigs neutralize the zags to shrink the denominator (the volatility).
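A quick back-of-envelope illustration of that shrinking denominator (my own sketch with made-up numbers, not anything from the interview): blend uncorrelated return streams and the average return survives while the portfolio volatility falls roughly with the square root of the number of strategies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical: 10 uncorrelated strategies, each with a standalone
# annualized Sharpe of ~0.5 (monthly mean 0.5%, monthly vol ~3.46%)
n_strategies, n_months = 10, 120_000
returns = rng.normal(0.005, 0.0346, size=(n_months, n_strategies))

def sharpe(r):
    # Annualized Sharpe from monthly returns (risk-free rate assumed zero)
    return r.mean() / r.std() * np.sqrt(12)

print(f"standalone:          {sharpe(returns[:, 0]):.2f}")        # ~0.50
print(f"equal-weight basket: {sharpe(returns.mean(axis=1)):.2f}")  # ~1.58, about 0.5 * sqrt(10)
```

Same average return, roughly a third of the noise: that’s the whole pitch of the pod model.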

So I’m nodding along when the host, Jack, drops a tight line:

I think Corey [Hoffstein] showed in Factor Fimbulwinter that the amount of time we would need to show that is longer than our investing lifetime.

So this does become about faith.

The verb “showed” following the subject “Corey” is a clue that I get to learn something cool today.

So I read the article referenced:

🔗Factor Fimbulwinter (8 min read)

I’m going to jump to the end because I think Corey sets up why he takes the approach he does in the article (emphasis mine):

The question we must answer, then, is, “when does statistically significant apply and when does it not?” How can we use it as a justification in one place and completely ignore it in others?

Furthermore, if we are going to rely on hundreds of years of data to establish significance, how can we determine when something is “broken” if the statistical evidence does not support it?

Price-to-book may very well be broken. But that is not the point of this commentary. The point is simply that the same tools we use to establish and defend factors may prevent us from tearing them down.

Corey uses fire to fight fire.

This is where the learning begins. Let’s see what he does.

We ran the following experiment:

  1. Take the full history for the factor and calculate prior estimates for mean annualized return and standard error of the mean.
  2. De-mean the time-series.
  3. Randomly select a 12-month chunk of returns from the time series and use the data to perform a Bayesian update to our mean annualized return.
  4. Repeat step 3 until the annualized return is no longer statistically non-zero at a 99% confidence threshold.

For each factor, we ran this test 10,000 times, creating a distribution that tells us how many years into the future we would have to wait until we were certain, from a statistical perspective, that the factor is no longer significant.

Sixty-seven years.

Ok, I’m going to raise my hand in class.

I didn’t really understand the method.

Awesome, I get to learn something new... which means you do too! This is pretty cool.

Luckily, there’s a tireless teacher known as ChatGPT. I wrangled with this professor at office hours until it could teach me in words that I, or a middle-schooler, could understand.


🧪 How They “Repeat with New 12-Month Chunks” — Using a Coin Flip Example

We're walking through the logic of how a statistical test updates with each new batch of data — using a coin flip as our analogy.


🎯 The Goal

We want to detect whether a strategy (like a stock factor or a biased coin) has stopped working — i.e., its returns have gone flat.

Think of it like this:

  • A coin used to land heads 60% of the time.
  • But now it’s just a fair coin (50/50); we just don’t know it yet.
  • So we flip it 12 times (like one year of monthly returns), check what it shows, and keep flipping until we’re statistically convinced it’s no longer special.

🧪 The Setup

  • The coin is now fair (true heads probability = 50%).
  • We start with a belief: "maybe it's still biased to 60%."
  • We flip the coin in 12-flip chunks, and after each chunk, we update our belief.
  • This mimics what the researchers did — taking random 12-month samples from a flat return series and asking: “Does this still look like a working strategy?”

🔁 Step-by-Step Walkthrough (One Simulation)

Let’s pretend we have a long list of coin flips (each 1 = heads, 0 = tails), all drawn from a fair coin.

Here’s a fictional sequence:

[1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, ...]

We’ll take this in chunks of 12 flips, like 12 months of flat returns.


1️⃣ Year 1 — First 12 flips

[1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
  • Number of heads: 6
  • Sample mean: 6 / 12 = 0.50

We compare this to our hypothesis that the coin is biased to 60%:

“Is 50% close enough to 60% that we still believe the coin is special?”

Yes — this result is plausible from a 60% coin, so we keep going.
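To put a number on “plausible” (my addition, not part of the original walkthrough): under a coin that genuinely lands heads 60% of the time, seeing 6 or fewer heads in 12 flips is common.

```python
from scipy.stats import binom

# Probability of 6 or fewer heads in 12 flips if the coin truly lands heads 60% of the time
print(binom.cdf(6, n=12, p=0.6))  # ~0.33, i.e. about one year in three
```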


2️⃣ Year 2 — Next 12 flips

[1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
  • Heads: 5
  • Mean: 5 / 12 = 0.4167

Now combine both years:

  • 24 total flips
  • 11 total heads
  • Cumulative mean: 11 / 24 = 0.458

Still not far enough from 0.60 to be statistically confident the coin isn’t biased.


3️⃣ Year 3 — Another 12 flips

[0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
  • Heads: 6
  • Cumulative now: 36 flips, 17 heads
  • Cumulative mean: 17 / 36 = 0.472

Now let’s check this against our original belief (that the coin is 60%).

We run a z-test:

  • Expected mean under H₀ (the hypothesis): 0.60
  • Standard error: sqrt(0.6 * 0.4 / 36) ≈ 0.0816
  • Z = (0.472 - 0.60) / 0.0816 ≈ -1.57

Since -1.57 is not extreme enough (we need z < -2.58, the two-sided 99% cutoff, to reject), we still can’t say the coin is fair.
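The same arithmetic in a few lines of Python, as a sanity check (my code, not the article’s):

```python
import math

heads, flips = 17, 36
p0 = 0.60                              # H0: the coin is still "special"
mean = heads / flips                   # 0.472
se = math.sqrt(p0 * (1 - p0) / flips)  # ~0.0816
z = (mean - p0) / se                   # ~-1.57
print(f"z = {z:.3f}; reject at 99% only if z < -2.58")
```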


🔁 Repeat the Process

Each year:

  • Add 12 more flips
  • Update the total number of heads and flips
  • Recalculate the cumulative mean
  • Re-test: “Are we now confident the coin isn’t biased?”

Eventually, the sample mean will drift far enough from 0.60 that the test crosses the 99% threshold. At that point, we’d say:

“I’m now 99% confident this coin is no longer special.”

💡 Why This Works

Even if the coin is fair, short sequences can look biased just by chance. You might see 8 heads out of 12 once in a while — that doesn’t mean the coin is still special.

So the researchers repeat this full process — from scratch — 10,000 times, each with a new random sequence of fair flips.

Then they record:

“How many years did it take before the test figured out the coin was dead?”

On average, the answer was:

67 years.

🧮 What This Means for Investing

If a strategy (or "factor") stops working but still produces noisy returns — it might take decades before we’re statistically confident it no longer works.

The noise in the short term can mask the truth for a long time.
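To close the loop, here’s a minimal Python sketch of the coin-flip version of this experiment. The 60% belief, the 12-flip years, and the 99% cutoff come from the walkthrough above; everything else (the z-test form, the stopping rule) is my assumption, since Corey’s actual test used a Bayesian update on de-meaned factor returns, not coin flips.

```python
import numpy as np

rng = np.random.default_rng(42)

def years_until_rejected(p_belief=0.60, z_crit=2.58, max_years=500):
    """Flip a fair coin in 12-flip 'years' until we can reject, at the
    two-sided 99% level, the belief that it still lands heads 60% of the time."""
    heads, flips = 0, 0
    for year in range(1, max_years + 1):
        heads += rng.binomial(n=12, p=0.5)  # the coin is secretly fair now
        flips += 12
        se = np.sqrt(p_belief * (1 - p_belief) / flips)
        z = (heads / flips - p_belief) / se
        if abs(z) > z_crit:  # the belief no longer survives the data
            return year
    return max_years

# Repeat the whole experiment many times, as the article describes
results = [years_until_rejected() for _ in range(10_000)]
print(f"median years to reject: {np.median(results):.0f}")
```

Run it and the rejection typically arrives within a decade or so, because a ten-point edge on a coin is enormous relative to its noise. Factor returns carry a far smaller mean relative to their volatility, which is how the same machinery stretches out to 67 years.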

I enjoyed Corey’s technique because it gives you a sense of proportion between signal and noise. In so many domains where an assertion is made, that proportion is absent. I always think about how a CRO I worked with would reflexively try to put error bars on any metric presented in a chart. It’s good epistemological hygiene. It automatically triggers awareness of base rates and outside views. It’s not a panacea for truth, but it rules out obvious bullshit — randomness sold to you as signal. That will save you time and money in life. It may not increase your top line, but it will protect your bottom line.