Hi all,
Context
Over the past few weeks I noticed that I was getting too many reel suggestions from "online creators" (aka e-girls, or e-who... that have an OnlyFans). I do like anime and cosplay, and I don't dislike cute girls, but it really bothers me that a significant percentage of my reel feed is girls in lingerie basically advertising their OF. And, similarly but to a lesser extent, all the other creators of whatever useless products. So I decided to do something about it.
I started blocking them by hand, but 1 week of blocking only got me up to 50 accounts, out of the 3 million OF creators in the world. Doing things manually was simply not an option.
How great was my joy when I discovered instagrapi, and I was even happier when I realized that scraping Instagram is easy!!! (with some limitations, of course). In just a couple of weeks, I managed to get a simple but not too bad script working.
General idea
The way things usually go is that since OF links are prohibited in the biography, people use link hosting services like linktr.ee or beacons.ai. So the idea is pretty simple: get the user ID, check the biography for links, visit each link, and see if there is a link to OnlyFans there. There's more we can do to make this better, but that's the general idea.
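As a rough illustration of the "check biography for links" step, here is a minimal sketch that pulls URL-looking strings out of a bio. It assumes the bio text has already been fetched (for example with instagrapi); the `extract_links` helper and the regular expression are illustrative assumptions, not the actual script.

```python
import re

# Rough URL pattern (an assumption, not exhaustive): optional scheme,
# a dotted host name, and an optional path.
URL_PATTERN = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/[^\s]*)?",
    re.IGNORECASE,
)


def extract_links(biography: str) -> list[str]:
    """Return URL-looking substrings of a bio, adding a scheme if missing."""
    links = []
    for match in URL_PATTERN.findall(biography):
        # Drop punctuation that often trails a link at the end of a sentence.
        url = match.rstrip(".,;!)")
        if not url.lower().startswith(("http://", "https://")):
            url = "https://" + url
        links.append(url)
    return links


if __name__ == "__main__":
    bio = "Cosplayer | all my links: linktr.ee/some_user"
    print(extract_links(bio))
```

Each returned URL would then be visited and its page searched for an OnlyFans link.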
The difficult part is that you cannot check all 2 billion Instagram accounts by brute force; you need a strategy. The strategy I use, which doesn't work too badly, relies on the following logic: OF creators usually have way more followers than accounts they are following. We can capitalize on that to add additional constraints on which accounts should be blacklisted. The other good thing is that OF creators tend to follow each other (like business partners), so rather than pulling out 100,000 followers, you only need to grab the 50 other people in their circle (and hopefully most of them are also OF creators).
And it works!!!
So now what?
The very first thing on my mind is: do you want this tool?
Am I going too far, or is it worth putting in more effort to share this tool with all of you folks, with documentation and everything?
The very second thing on my mind is: are you willing to help?
You absolutely don't have to, but I'm sure you guys are capable of making this tool so much better, and support is always welcome.
If yes, do you have any ideas/suggestions/comments?
Check out the section below where I dive a bit more into the details of what I am doing, and let me know if there are general things that you think can be improved upon.
Some details
The way I do it is I start with one popular OF account, grab the accounts it is following, check those, and then add more and more people to this "checklist" on the fly.
To avoid being banned by IG for data scraping, I added a random sleep time of 1-2 seconds between each account check. It's slow, but the entire check is also slow anyway...
Getting the bio and links is easy, but my first (bad) surprise came when I scanned the linked webpages, which I do using Selenium. beacons.ai happens to have very strong protection against bots, so I needed to tweak my setup quite extensively to get something that isn't detected. It's not too complicated, but the main limitations are that 1) I need to close and reopen the browser for each new link so that it's not recognized as a bot, and 2) I need to add a 2 to 4 second delay to mimic a human. This not only makes the script slower, it also means I have Chrome being opened and closed to check every link. Annoying, but this seems to be the price to pay to bypass bot blockers.
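The "fresh browser per link plus a human-like pause" pattern could be sketched as follows. The function name `fetch_page_source` and the `make_driver` parameter are assumptions for illustration; the anti-detection tweaks themselves are not shown, since the post does not detail them.

```python
import random
import time


def fetch_page_source(url, make_driver, sleep=time.sleep,
                      min_delay=2.0, max_delay=4.0):
    """Open a brand-new browser, load the URL, wait a bit, return the HTML.

    make_driver is any zero-argument callable returning an object with
    get(url), page_source and quit(), which is the interface a Selenium
    WebDriver exposes.
    """
    driver = make_driver()  # a new browser instance for every link
    try:
        driver.get(url)
        # Random 2 to 4 second pause to look less like a bot.
        sleep(random.uniform(min_delay, max_delay))
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading failed


# With Selenium installed, one possible driver factory would be:
#     from selenium import webdriver
#     html = fetch_page_source("https://beacons.ai/...", webdriver.Chrome)
```

Passing the driver factory in as a parameter keeps the sketch independent of a particular browser and makes it easy to try with a stub object.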
To reduce the number of websites to check, I created a whitelist so that I don't waste time opening YouTube, Instagram, Facebook, SoundCloud, etc.
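A whitelist check along these lines might look like the sketch below. The domain set and the `is_whitelisted` helper are assumptions; the real whitelist is presumably longer.

```python
from urllib.parse import urlparse

# Example domains taken from the post; the real list is an assumption.
WHITELIST = {"youtube.com", "instagram.com", "facebook.com", "soundcloud.com"}


def is_whitelisted(url: str, whitelist=WHITELIST) -> bool:
    """True if the URL's host is a whitelisted domain or one of its subdomains.

    The URL is assumed to include a scheme (https://...); without one,
    urlparse cannot find the host and the function returns False.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain)
               for domain in whitelist)


if __name__ == "__main__":
    for link in ("https://www.youtube.com/watch?v=1", "https://linktr.ee/x"):
        print(link, is_whitelisted(link))
```

Matching on the parsed host rather than on the raw string avoids skipping a page just because "youtube.com" appears somewhere in its path.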
Some accounts will actually be clean. In that case, I of course don't grab the accounts they follow, so they become dead ends.
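Put together, the crawl described so far is essentially a breadth-first walk over the "following" graph in which only flagged accounts are expanded. Here is a minimal sketch under that reading; `crawl`, `get_following`, `is_flagged` and `max_checks` are hypothetical names, and the two callables stand in for the real instagrapi and Selenium steps.

```python
import random
import time
from collections import deque


def crawl(seed, get_following, is_flagged, max_checks=100, sleep=time.sleep):
    """Walk the 'following' graph from seed and return the flagged accounts."""
    checklist = deque([seed])  # accounts still waiting to be checked
    seen = {seed}              # never queue the same account twice
    blacklist = []
    checks = 0
    while checklist and checks < max_checks:
        user = checklist.popleft()
        checks += 1
        if is_flagged(user):
            blacklist.append(user)
            # Only flagged accounts are expanded; clean ones are dead ends.
            for other in get_following(user):
                if other not in seen:
                    seen.add(other)
                    checklist.append(other)
        # Random 1 to 2 second pause between account checks.
        sleep(random.uniform(1.0, 2.0))
    return blacklist


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
    flagged = {"a", "b", "d"}
    print(crawl("a", lambda u: graph.get(u, []), lambda u: u in flagged,
                sleep=lambda seconds: None))
```

In the toy graph, account "c" is clean, so "e" is never queued even though it is reachable.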
I added 3 lists of triggers for flexibility: one on bio, one on links, and one on the content of the links.
At first, it sounded like including words like 'private', 'exclusive', 'shop' would make the thing more powerful, but that just made the checklist go nuts. This is especially tricky because any wannabe influencer will put 'exclusive' in their bio to refer to new products they're selling, and because they follow thousands of their "friends", the checklist just becomes insanely long very quickly. For this reason, I decided to limit myself to blacklisting accounts where "onlyfans" or "fansly" appears in the websites behind the bio links, and to only add the "followings" of those accounts to the checklist.
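The three trigger lists could be represented roughly like this. The `Triggers` class and `find_triggers` function are assumptions for illustration; only the "onlyfans" and "fansly" keywords come from the post.

```python
from dataclasses import dataclass, field


@dataclass
class Triggers:
    """One keyword list per source; empty lists mean 'no trigger there'."""
    bio: list = field(default_factory=list)
    links: list = field(default_factory=list)
    content: list = field(default_factory=lambda: ["onlyfans", "fansly"])


def find_triggers(bio, links, pages, triggers):
    """Return the matched keywords for the bio, the link URLs and the pages."""
    def hits(words, texts):
        joined = " ".join(texts).lower()
        return [word for word in words if word.lower() in joined]

    return {
        "bio": hits(triggers.bio, [bio]),
        "links": hits(triggers.links, links),
        "content": hits(triggers.content, pages),
    }


if __name__ == "__main__":
    found = find_triggers(
        bio="exclusive content, link below",
        links=["https://linktr.ee/x"],
        pages=['<a href="https://onlyfans.com/x">my page</a>'],
        triggers=Triggers(),
    )
    print(found)
```

With the default lists, a word like 'exclusive' in the bio matches nothing, which mirrors the decision to rely only on what the linked pages contain.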
Another related problem is that some OF creators have aggressive marketing: they follow thousands of accounts just to get more followers. A thousand dead ends is one full day of my script running for nothing. To mitigate that, I only add an account's "followings" to the checklist if that account follows fewer than 200 people (to make sure that it's not aggressive marketing). However, this is not a general rule, and I've definitely seen OF creators with 1 million followers who follow 1 or 2 thousand people. So a better way would be to define a following/follower ratio, but I need to investigate that more.
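To make the two options concrete, here is a small sketch comparing the fixed cutoff of 200 with a following/follower ratio. The function `should_expand` and the ratio threshold of 0.01 used in the example are placeholders, not tested values.

```python
def should_expand(follower_count, following_count,
                  max_following=200, max_ratio=None):
    """Decide whether an account's 'followings' go on the checklist.

    With max_ratio=None, use the fixed rule from the post: the account must
    follow fewer than max_following people. Otherwise compare the
    following/follower ratio against max_ratio (a hypothetical threshold).
    """
    if max_ratio is None:
        return following_count < max_following
    if follower_count == 0:
        return False  # no followers at all: the ratio is undefined, skip
    return following_count / follower_count <= max_ratio


if __name__ == "__main__":
    # 1 million followers, 1500 followings: rejected by the fixed cutoff,
    # accepted by a ratio rule with the placeholder threshold.
    print(should_expand(1_000_000, 1500))
    print(should_expand(1_000_000, 1500, max_ratio=0.01))
```

A ratio rule keeps the big accounts that the fixed cutoff drops, while still rejecting accounts that follow far more people than follow them back.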
The blacklist is a file that has user ID, name, date of last check, list of links in bio, and all trigger words.
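One possible shape for such a file is one JSON object per line. The post does not say which format is actually used, so the layout, the `BlacklistEntry` field names and the two helper functions below are all assumptions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class BlacklistEntry:
    user_id: str
    name: str
    last_check: str  # date of last check, e.g. "2024-01-31"
    links: list = field(default_factory=list)     # links found in the bio
    triggers: list = field(default_factory=list)  # trigger words that matched


def save_blacklist(entries, path):
    """Write one JSON object per line so the file is easy to append to or diff."""
    with open(path, "w", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(json.dumps(asdict(entry)) + "\n")


def load_blacklist(path):
    """Read the file back into BlacklistEntry objects, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [BlacklistEntry(**json.loads(line))
                for line in handle if line.strip()]
```

Keeping the links and trigger words next to each account is what would let a second script re-check the blacklist with a refined keyword list without rebuilding the whole network.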
The idea is that the procedure is split into 3 scripts. First script: build the network, checking a bunch of accounts and adding them to the blacklist. Second script: double-check the accounts in the blacklist, either with a refined keyword list or just by re-checking accounts with an updated version, without the need to rebuild the whole network. Third script: block everyone based on the user's preference (either block OF, or block everyone who has a "shop", or any type of link, up to you!). Maybe a 4th script to unblock all?
Anyway, that's all that comes to my mind. Any thoughts, comments, suggestions, or criticism are welcome!
Sincerely yours~
an0wen