I've been wading through terabytes of financial data over the past month. (I'm an ML/AI software engineer trying to create a trading bot that makes me more consistent income during times of uncertainty that are outside my control.)

This is some of the data I'm looking at: NVDA price on a particular day vs. the average position of trades within the bid-ask spread, aggregated over 10-second windows across all the exchanges it trades on. When the value is close to 1, trades are printing near the NBBO ask; when it is close to 0, they are printing near the NBBO bid.

https://imgur.com/a/lcCwDsj
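
For anyone who wants to reproduce the metric, here is a rough sketch of how a number like this could be computed. The tuple layout, the function names, and the choice of an unweighted mean are assumptions for illustration, not necessarily what produced the plots above.

```python
from collections import defaultdict


def spread_position(price, bid, ask):
    """Return 0.0 at the NBBO bid and 1.0 at the NBBO ask.

    Returns None when the quote is locked or crossed (ask <= bid),
    because the position is undefined there.
    """
    width = ask - bid
    if width <= 0:
        return None
    return (price - bid) / width


def bucket_average(trades, bucket_seconds=10):
    """Average spread position per time bucket.

    trades: iterable of (timestamp_seconds, price, nbbo_bid, nbbo_ask)
    tuples. This layout is hypothetical; adapt it to your own feed.
    Returns {bucket_start_seconds: mean_position}, sorted by time.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, price, bid, ask in trades:
        pos = spread_position(price, bid, ask)
        if pos is None:
            continue  # skip trades with no usable quote
        bucket = int(ts // bucket_seconds) * bucket_seconds
        sums[bucket] += pos
        counts[bucket] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


# Made-up example: two trades in the first 10 s window, one in the second.
sample = [
    (0.5, 100.00, 100.00, 100.10),   # prints at the bid
    (3.2, 100.10, 100.00, 100.10),   # prints at the ask
    (12.0, 100.05, 100.00, 100.10),  # prints mid-spread
]
averages = bucket_average(sample)
```

Values are not clipped, so a trade through the NBBO shows up below 0 or above 1. Weighting each trade by its size instead of counting trades equally would be a reasonable variant.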

One of the things I found is that the dark pool trades predict price action very well on a few-minute time horizon. "FINRA Alternative Display Facility", the 2nd plot from the top, is all the dark pool trades. If I had access to dark pool trade data in real time, I think I could piggyback off the manipulators and make bank.

Unfortunately the SEC only requires them to report transactions within 15 minutes, not in real time.
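
A rough way to sanity-check a "predicts price action a few minutes out" claim is to correlate the signal at time t with the forward return some number of buckets later. The sketch below is a generic check under that assumption, not the analysis behind the plots; `lead_lag_correlation` and its arguments are names made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; None if it is undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no correlation
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def lead_lag_correlation(signal, prices, lag):
    """Correlate signal[t] with the return from t to t + lag.

    signal and prices are equally spaced series on the same clock
    (e.g. one value per 10 s bucket); lag is counted in buckets.
    """
    n = min(len(signal), len(prices)) - lag
    if lag < 1 or n < 2:
        return None
    forward = [prices[t + lag] / prices[t] - 1.0 for t in range(n)]
    return pearson(signal[:n], forward)
```

With 10-second buckets, a 3-minute horizon is `lag=18`. A high in-sample number here is only a starting point: it says nothing about whether the signal is available early enough to trade on.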


> Unfortunately the SEC only requires them to report transactions within 15 minutes, not in real time.

TRF reports must happen within 10 seconds or be submitted with a late modifier [0]. An executing broker systematically submitting all reports 15 minutes late would be investigated pretty quickly.

You can buy access to the consolidated tape from a market data provider, although this is going to be prohibitively expensive for an individual.

[0] https://www.finra.org/rules-guidance/rulebooks/finra-rules/6...
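
If you have a feed that carries both an execution timestamp and a publication timestamp for each off-exchange print, you can measure the delay yourself instead of taking anyone's word for it. A minimal sketch, assuming each print is a dict with nanosecond epoch fields; `executed_ns` and `reported_ns` are hypothetical names, so map them to whatever your provider actually calls those fields:

```python
def reporting_delays(prints, limit_seconds=10.0):
    """Return (delays_in_seconds, late_prints).

    A print counts as late when it reached the tape more than
    limit_seconds after it executed. The 10 s default mirrors the
    reporting window mentioned above.
    """
    delays = []
    late = []
    for p in prints:
        delay = (p["reported_ns"] - p["executed_ns"]) / 1e9
        delays.append(delay)
        if delay > limit_seconds:
            late.append(p)
    return delays, late


def late_fraction(prints, limit_seconds=10.0):
    """Share of prints reported after the limit (0.0 for no data)."""
    delays, late = reporting_delays(prints, limit_seconds)
    return len(late) / len(delays) if delays else 0.0


# Made-up example: one print reported after 2 s, one after 15 minutes.
sample = [
    {"executed_ns": 0, "reported_ns": 2_000_000_000},
    {"executed_ns": 0, "reported_ns": 900_000_000_000},
]
share_late = late_fraction(sample)
```

If nearly every print lands inside the window, the delay is not coming from the reporting rule.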

I have a Polygon real-time market data subscription, and everything comes through in real time except for dark pools.

From the customer service bot:

"Yes, other trades are generally reported faster than dark pool trades on our WebSocket. We stream market data in real-time as we receive it, with most trades being reported very quickly. For US stocks, the average latency for trades and quotes is less than 20ms.

However, dark pool trades can be reported with a delay. FINRA allows up to 15 minutes for reporting trades from dark pools. This means that while most trades come through almost instantly, dark pool trades might have a longer reporting time."

I'm not sure which market data provider offers dark pool trades in real time.

Are you using Polygon real-time data for individuals? I suspect that if you're after more fine-grained data, you'd be looking for a data provider for "professionals". There are usually different offerings, one for personal use and one for business use ("professional subscriber"), but these market feeds are quite expensive.

I see. I have the "Stocks Advanced" real-time data for individuals feed. I didn't realize the "professional" real-time feed isn't the same feed.

I suspect that customer service bot is incorrect. As also noted on this page [0], all dark pool trades are published to the consolidated tape, which appears to be part of the Polygon API.

As an aside, the behavior you’re seeing might actually be causal: a large print appearing on the tape likely causes automated systems to widen out their quotes. Alternatively, if the trade is through the NBBO, you’re seeing the top of the book (the protected quote) being swept in compliance with Reg NMS.

[0] https://www.finra.org/investors/insights/can-you-swim-dark-p...

That's one of the purposes of darkpools, to let institutions trade large size without moving the market. No one's getting manipulated here: both the buyers and sellers know what's going on.

You can trade on these yourself - the darkpool operators are always looking for new market makers. All you need to do is agree to quote a few hundred US equities in decent size inside the NBBO, six and a half hours a day, and you too can have access to this flow.

If that's the case, couldn't the providers who operate the dark rooms (like OpenChronos) use their own realtime data to do so? Are there any who do that today? (i.e. monetized market surveillance). If so, does it run afoul of regulations?

I don't know, honestly. I would imagine it to be illegal to trade on that data before it's released to publicly-available trade streams.

I'm of the opinion that a huge amount of manipulation happens, and that the SEC either turns a blind eye to it or goes after easier fish with worse lawyers to get its income.

> I would imagine it to be illegal to trade on that data before it's released to publicly-available trade streams.

But what would be the penalty if you are caught? A fine of a few hundred million dollars? Absolute peanuts to any of the firms operating these pools. Of course there is corruption in the system; it is set up to facilitate corruption. Without corruption there would be little reason for these to even exist.

There might be other ways to know about the data in advance? Some kind of flaw somewhere that lets us sniff the “dark rooms”?

> There might be other ways to know about the data in advance?

Yeah, just have lunch with a few of the insiders. That's the issue and will always be the issue.

I tried to train a model to predict the dark pool transactions from the rest of the transactions. It didn't work. I'm back to the drawing board trying other things now.

Why do you think you will be able to generate "consistent income" with a trading bot in a field that is highly scrutinized by some of the best talents in the world, with access to better and faster data and backed by a mountain of capital?

A lot of people say this, but you miss 100% of the shots you don't take. If you think you can't do it because there are "better people already doing it", or because academics say the markets are efficient and therefore you shouldn't try, you will never succeed at anything.

- I really don't think they have the best talents in the world. The last place I'd want to work is a finance company. That's true of most AI/ML engineers I know. Therefore I think the best AI/ML talent is outside, not inside.

- HFT and speed aren't the game I'm playing. HFT firms that trade with FPGAs colocated with the exchange are also cursed by their own working business model, and they aren't usually interested in minute-scale AI/ML trades.

- Not counting speed, I can get most of their data; you can just pay for it. You just need to work out whether you can make more $ than the cost of the data. I suspect it's possible.

- Being backed by a mountain of capital means you take fewer risks, and you hedge volatility to please your clients, often at higher prices than is efficient. Nobody with $100 million would risk 20% of it to make 100% gains. I'm happy to risk 20% of my net worth to double my money, though. Big difference.

- The market is filled with price action traders that amplify volatility in surprisingly predictable ways. Markets drift quite predictably in response to Trump policies over hours, not milliseconds. It doesn't take a genius to realize that markets are headed down if he announces more tariffs.

- I suspect there are lots of $1000-per-trade opportunities in corners of the market that the big players do not take because they go after bigger fish, and because increasing the volume on those trades would move prices and break the strategy. For me, though, $1000 a day is meaningful. That's enough to replace a day job.
