If you use a tool that defaults the log spew to a cheap archive, samples a fraction into the fast store, and gives you a way to pull from the archive on demand, much of that is resolved. FWIW I think most orgs get big scared at seeing $$$ in their cloud bills, but don't properly account for time spent by engineers rummaging around for data they need but don't have.
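To make that concrete, here's a rough sketch of the routing idea, not any particular vendor's implementation. Everything in it is hypothetical: `route_log`, `rehydrate`, the `sample_rate` parameter, and the plain lists standing in for the cheap archive and the fast store.

```python
import random
from typing import Callable

def route_log(record: dict, archive: list, fast_store: list,
              sample_rate: float = 0.1,
              rng: Callable[[], float] = random.random) -> None:
    """Send every record to the archive and a random sample to the fast store."""
    # The cheap archive gets everything (a list stands in for object storage).
    archive.append(record)
    # Only a fraction is indexed in the expensive, queryable store.
    if rng() < sample_rate:
        fast_store.append(record)

def rehydrate(archive: list, predicate: Callable[[dict], bool]) -> list:
    """Pull matching records back out of the archive on demand."""
    return [r for r in archive if predicate(r)]
```

During an incident you'd call something like `rehydrate(archive, lambda r: r.get("level") == "error")` to get the full set back, rather than only whatever happened to be sampled.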
This is a tricky one that's come up recently. How do you quantify the value of a $$$ observability platform? Anecdotally I know robust tracing data can help me find problems in 5-15 minutes that would have taken hours or days with manual probing and scouring logs.
Even then you have the additional challenge of quantifying the impact of the original issue.
- Reliability as a cost center
- Vendor costs are to be limited
- CIO-driven rather than CTO-driven
Then it's going to be a given that they prioritize costs that are easy to see, and will do things like force a dev team to work for a month to shave ~2k/month off of a cloud bill. In my experience, these orgs will also sometimes do a 180 when they learn that their SLAs involve paying out to customers at a premium during incidents, which is always very funny to observe. Then you talk to some devs and they say things like "we literally told them this would happen years ago and it fell on deaf ears" or something.
> Also, logging everything creates yet another security hole to worry about.
I think the real problem isn’t logging, it’s the fact that your developers are logging sensitive information. If they’re doing that, then it’s a moot point whether those logs are also being pushed to a third party observability platform or not, because you’re already leaking sensitive information.
If a developer thinks “log everything” means “log PII” then that developer is a liability regardless.
Also, this is the sort of thing that should get picked up in non-prod environments before it becomes a problem.
If you get to the point where logging is a risk then you’ve had other failures in processes.
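As a safety net on top of catching this in non-prod (not a substitute for it), a redaction filter on the logger is one option. This is a minimal sketch using Python's standard `logging.Filter`; the `RedactEmails` class and the `EMAIL_RE` pattern are my own illustration, and the regex only catches simple email-shaped strings, not PII in general.

```python
import logging
import re

# Deliberately simple pattern for illustration; real PII detection needs more.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

class RedactEmails(logging.Filter):
    """Replace email-looking substrings in the formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message with its args first, then scrub the result.
        record.msg = EMAIL_RE.sub("[REDACTED]", record.getMessage())
        record.args = ()
        return True  # keep the record, just scrubbed

logger = logging.getLogger("app")
logger.addFilter(RedactEmails())
```

Filters attached to a logger only apply to records logged through that logger, so in practice you'd probably attach it to the handler that ships logs off-box instead.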
This is often easier said than done. And there are ginormous costs associated with logging everything. Money that can be better spent elsewhere.
Also, logging everything creates yet another security hole to worry about.