Maybe this has been obvious or well known, but this was a revelation to me:

> It took six hours for my order to complete, and the accounts look legit; each has a profile picture, different companies that they work for, a couple of repositories, and a contribution to one or more open-source projects, next to being a GitHub member for over a year.

This is the motivation for garbage AI-generated PRs or insubstantial docs changes that "people" make. It doesn't matter if they're good. They only exist to add surface level legitimacy to fake accounts so that services like this one can exist.


I've noticed this on Product Hunt too; there are lots of accounts which just summarize the post, and you can tell they don't really know what the product is or haven't tried it. But other than that they seem completely legitimate.
And there are bots on Reddit and on Hacker News that will provide automatic links to paywall-skippers and such.

Not all bots have bad intent. Some compensate for a lack of a protocol/platform-native feature. Which makes all bots occasionally annoying, because human time is spent.

Intent is a matter of perspective.

I'm sure that from the perspective of a publication that paywalls, those bots are not a good thing.

Exactly. I’ve denied so many PRs which only fixed a typo in comments or some other trivial thing. I know what they’re doing.
Do you then fix the typo with no attribution to the original reporter? Or do you just choose not to fix a known error because you question the motives of the reporter?
Usually this is stuff like fixing "// TODO: this fails reguarly in the CI" (real example from the last such PR). It's the kind of thing I'll fix if I see it, but not really worth going out of my way fixing something like that: it's just a comment in some test code (not user-facing) and the few people that will ever see that comment understand "reguarly" as well. It's a non-issue and basically just spam IMHO.

These are always either bots or people looking to bolster their CV by bragging they "contributed" n PRs to n repos. I signed up to collaboratively make some (hopefully nice) software, not to deal with a stream of PRs like this.

Typos in README or publicly facing docs are different; I usually merge those (and those are almost always good faith too, because usually a real human picks up on them before the bots/script kiddies do).

You deny PRs just for typos? I've done a few of those in the past, all in good faith. Perhaps I'm the bot?
No, I have denied PRs that _only_ fixed a typo. Edited to be more clear.
I think the people responding are missing the point.

I've also submitted PRs that just fixed typos, and I've considered that a legit contribution.

But if I maintained a high-profile project right now, I'd at least take pause in thinking some of these accounts could be spam reputation-boosting accounts that only make comments/PRs to lend legitimacy to the account when it ultimately stars some artificially boosted repo.

And making it harder to detect star manipulation erodes the signals of trust which have been used on GitHub, and ultimately can be a security concern. Historically I've looked at numbers of contributors, stars, downloads, and issues open/closed as a rough idea of how secure some npm dependency might be: basically the idea that "more eyeballs" can mean slightly less chance of a massive security issue, especially in security-critical code like OAuth libraries.

I don't know what the solution is here. Maybe requiring people sign a CLA like some corporate open source projects do is at least enough of a barrier

I issued a PR to correct a typo in a popular ML library just two weeks ago, I am not a bot, this strategy seems flawed.
Any mechanism to combat abuse will have false positives as well.
What proof do you have that you are not a bot?
He passed the Voight-Kampff test.
So you’ve just left the typo in the project? I’ve opened PRs just to fix one or two misplaced characters before. Just trying to do my part to help
Could just fix the typo yourself and reference the PR in your commit then close the PR without pulling.

I imagine contributors won’t exactly be happy with that though.

Honestly, who cares if the end result is the same? Their forked repo with the patch on still exists, the patch was incorporated in whatever way made sense to the original repo owner.

OK, a credit, but really, who cares? People can see what an accepted pull request consisted of, so I'm not sure they're kidding anybody in terms of boosting their reputation with credits for fixing typos.

All the same, I'm just glad to see people improve their presentation, especially typos.

I probably submitted more than 10 PRs that only fixed a single word in projects I like.

I do that when I see typos in documentation.

Same, and I appreciate similar contributions to my own projects
I had the same thought. Is fixing typos not contributing? Should I not be submitting PRs to help with documentation/polish/etc? I never even thought that trying to help would be viewed as malicious.
Project maintainers frequently hold themselves back with amateur presentation, and that includes typos. It's hard to take some things seriously if they are failing at English, let alone their programming language of choice. It's sad because there's plenty of amazing open source out there, but the presentation is terribad.

IMO, the solution is simple: allow project maintainers to disable pointless metrics that would incentivize the GitHub equivalent of karma farming.

I also think the quality of comments on HN suffers from the fact that the karma score is a metric visible to the end user. Reddit particularly. The view count on tweets too.

Denying that PR and fixing it yourself is taking credit for others' work, and leaving it in is no good either. I don't see any upside to rejecting them. I'd be ashamed of myself for that.
Well, I submitted such a PR today. It hasn't been rejected (just yet).
Lol so you didn't accept PRs to "own the bots"? Your horse might be a bit too high you know.
Relevant xkcd: https://xkcd.com/810/
Thus, we win once the bots submit useful PRs.
Someone submitted a PR to me yesterday on my open source repo fixing a typo — did not occur to me it was part of this strange con
It probably was legitimate. There are tons of people (myself included) that send PRs to fix typos and such. I also accept PRs like that to my own projects.

A lot more people will read the docs than the code, and typos are annoying and for some people highly distracting (OCD)

It probably was not legitimate. There are wonderful people like you but they are dwarfed by the thousands of scam accounts using this method to try and engineer legitimacy.
I did it too and now I hope people don't think I tried to scam them by being nice...
I doubt it. I too submit small fixes.
Same here. Typos bother me. Rather than complaining, I just submit a PR to fix particularly egregious ones.
[Citation needed]
It's reassuring to know that I'm not the only one with this issue. Typos can be extremely disruptive, often compelling me to re-read the entire sentence.
how is it a con if it's a useful contribution? it's not my job to filter bot accounts used for starring repos. if a PR improves my project I'll accept it.
If it’s useful (actually), it’s not a con on you of course! It does pollute the overall ecosystem though, since it is used to prop up fake stars and the like.
Same. There ought to be a way to submit inconsequential things like typos through a separate system that don't get counted as pull requests. That way the people who are genuinely claiming good faith shouldn't really care if it doesn't count as part of your "PR score".
Yes, botspam is bad. But if a change is good, a change is good. Merge it and get on with your life.

If someone wants to write "check_spelling_bot" and get a ton of github karma, I have 0 issue with it. In fact, I encourage GitHub to do it :).

I've never seen "good" changes from these bots. The code they propose often isn't even working, it's just a change that looks like it might do something. Some of the spellcheck changes might be beneficial, but there's an ethical question about whether accepting those PRs (versus making the change in a new commit directly) is a net-negative to the community overall.

If you got spam, but the spam was useful to you personally, not marking it as spam prevents your email provider from flagging the account as spam to the wider world.

But if it's useful I'd argue it's definitionally not spam for you. And if most people don't mark it as spam so it doesn't get flagged, that means it just isn't spam. Bots send emails too, and not all bot-sent emails are spam.

Relevant xkcd: https://xkcd.com/810/

> But if it's useful I'd argue it's definitionally not spam for you.

That is absolutely false. Spam is not defined based on its utility, but based on whether a message is solicited.

If the author of a repo hasn't signed their repo up for an automated PR bot, that bot sending out PRs is absolutely spam.

Most definitions for it include whether the message is irrelevant or unwanted. Trying to define it purely as unsolicited gets into weird cases where you'd say that an emergency tornado alert is spam because it's unsolicited. Maybe there's a definition of spam where that's the case, but if the residents of an area find that alert useful it seems unlikely that they would describe it as spam, even though they didn't solicit the warning.
PR spam is annoying, you end up with “oxford-comma-bots” fighting “no-oxford-commas” forever to get credibility.
This is valuing a human's time spent filtering good changes from bad ones at 0. Imagine if you had to receive all email and judge each individual one for spam / not spam.
I occasionally ponder if I should send many small changes at the company I work for. Not because I'm not a real boy but because every so often my manager will be impressed by my commit count. Not that I'm trying to inflate it (I just try really hard to avoid meetings so I can write code which I actually enjoy), but...sure seems like it encourages the wrong behavior.
Oh wow, thanks for pointing it out. It’s now clicking for me.

A few months back, I noticed that there were some accounts posting issues on open source repositories, but their issues were a direct copy/paste of mine. I couldn’t figure out why they would copy/paste my issue so I dismissed it.

Now it makes sense!

What sort of "AI" model drives bots like this? Is it really AI? Or is it more like automated scrapers that run some basic deterministic functions (like spellchecks)?

Or are people actually deploying LLMs to inspect code and produce usable optimizations?

Because that would be an interesting beneficial side effect to an otherwise "nefarious" marketing hustle.

I expect it's a combination of scripting, some manual effort, and AI.

For what it's worth, I've seen lots of examples of this, and "usable optimizations" is entirely false. The PRs are often not working code. It's scattershot. The point isn't to make a PR that benefits the project in any way, it's to fill in the green square on the profile so it looks like there's a human doing things.

Anyone giving this stuff more than a cursory glance would see that it's all bullshit. But the point isn't to stand up to scrutiny. It's to defeat abuse protection measures with legitimate-looking activity. And in the case of stars, to make it look to anyone who's just glancing at the star-ers that there are real people starring the repos.

It's deviously clever and absolutely terrible.

Depending on where the scam is run, I wouldn't consider it out of the question that it would be economical to do manually.
