After Reddit's API shutdown, the writing was on the wall. Services like Reddit and Discord are huge data troves, and now this data has a concrete dollar value. Offering unrestricted API access means that third parties will store and sell this data. So shutting that access down (and monetizing the data yourself) is an obvious decision. Slack recently changed its ToS to disallow this as well - https://www.reuters.com/business/salesforce-blocks-ai-rivals....
Discord locked down the ability of bots to read message contents back in 2022. For bots used in over 100 servers, doing so now requires explicit approval from Discord (in addition to the standard approval the server owner would need to give). For most bots, the rest of the bot API is rich enough for this to not be an issue.
That cuts out a lot of the value for LLM training, and it will reduce the blast radius if Discord ever decides to fully pull the plug on message access.
To read message archives or to read messages in real time? I ask because I was working on a side project that required monitoring the messages in a channel, not just slash commands.
Some of them are absolutely up to dodgy stuff. They have access to countless private chats in servers where the users/admins are unaware that messages are being sent to a third party. I wouldn't be surprised if some of these bots are run by state actors.
This isn't Discord doing this out of the goodness of their privacy-concerned hearts. They'll eventually try a Salesforce-Slack kind of play here by preventing external entities from making money off their platform.
Tech will turn into a casino where the house (aka the platform) always wins.
One of the things I found surprising is that many of Discord's bot permissions are not scoped to servers at all. Numerous times I've been asked to authenticate with a bot service for one server, only for it to request access to things pertaining to every server I use, and that seems very wrong.
spy.pet used user accounts. botghost uses bot accounts, for which you need to enable certain intents in order to read messages.
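For anyone unfamiliar with how intents gate this, here's a rough sketch in plain Python. The bit positions are what I understand Discord's Gateway docs to list, so double-check them before relying on this. The names `Intent`, `PRIVILEGED`, and `privileged_intents` are made up for illustration and aren't from any library. As far as I know, a bot that connects without the message content intent still gets message events, but the content comes through empty except in DMs or when the bot is mentioned.

```python
from enum import IntFlag


class Intent(IntFlag):
    # A subset of Gateway intent flags (assumed bit positions, verify against the docs).
    GUILDS = 1 << 0
    GUILD_MEMBERS = 1 << 1
    GUILD_PRESENCES = 1 << 8
    GUILD_MESSAGES = 1 << 9
    MESSAGE_CONTENT = 1 << 15


# Intents that have to be switched on in the developer portal, and that
# need Discord's approval once a bot is in 100 or more servers.
PRIVILEGED = Intent.GUILD_MEMBERS | Intent.GUILD_PRESENCES | Intent.MESSAGE_CONTENT


def privileged_intents(requested: Intent) -> list[str]:
    """Return the names of the privileged intents contained in `requested`."""
    return [flag.name for flag in Intent if flag in requested and flag in PRIVILEGED]


# A slash-command-only bot gets by without any privileged intent.
slash_only = Intent.GUILDS

# A bot that monitors channel messages in real time needs MESSAGE_CONTENT.
reader = Intent.GUILDS | Intent.GUILD_MESSAGES | Intent.MESSAGE_CONTENT

print(int(reader), privileged_intents(reader))
```

The integer is what a bot would send as its `intents` value when it identifies to the Gateway. The point is just that reading message text is an explicit opt-in for bot accounts, which is not the case for a scraper running on ordinary user accounts.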
Yup. I ran a bot a long time ago and it was pretty trivial to quietly scrape the entire history of any server it was in.
I only ever did this on my own server for good reason, but still.
Really, a bot doesn't have any more access than a user does. You as a user can manually scroll back through the entire server history, you can check on roles, and you can see the names of channels that are hidden from you.
But it becomes a problem when bots are doing this at scale and selling the resulting data. Sort of like some other bots that people like to argue are doing the same thing a human could.
A while back there was a service called 'Spy Pet' that ran hundreds of Discord bots and sold access to searchable data logs. I wonder if Discord is primarily concerned about the massive logging capability of services like these.