InstaPy 0.7.0 #4279

Open
breuerfelix opened this issue Apr 7, 2019 · 20 comments

Comments

@breuerfelix
Collaborator

breuerfelix commented Apr 7, 2019

Ideas / Refactoring

I stumbled across multiple really good ideas for big refactoring.
Before really doing this, we should have a place to discuss the road to 5.0.

Want to discuss an idea written here? Quote the part and reply with your own thoughts! Quoting is important because this post might change often, and the quote shows everybody what the post said when you commented :)

API ideas by @ishandutta2007

I have been meaning to start this discussion, and maybe this is the right thread: we have way too many APIs that do the same or similar things. This kind of design makes the code hell to maintain.

The way I see it, there should be one and only one API that performs these three steps:

Choose the target user (by tag, by location, by hashtag, from the followers/following of a user, or from a specific list).
Choose the desired item of that user (post, profile or comment).
Interact with that item (follow, like or comment).
If we look at these three steps separately, we can improve any one of them iteratively instead of creating new workflows that are essentially new combinations of these three steps.
You can expose these three APIs separately or wrap them in one. To start, we can create a universal wrapper and refactor incrementally later, as and when we get the opportunity.
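
As a rough illustration of the three steps above, here is a minimal sketch of a single "universal wrapper". Every name in it (`Item`, `run_workflow`, and the three callables) is hypothetical and not part of InstaPy; it only shows how target selection, item selection and interaction could be kept separate and combined freely.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Item:
    """Hypothetical description of something that can be interacted with."""
    owner: str  # username the item belongs to
    kind: str   # "post", "profile" or "comment"
    ref: str    # link or identifier of the item


# Step 1: yields usernames (by tag, location, followers of a user, a list, ...)
TargetSource = Callable[[], Iterable[str]]
# Step 2: picks the items of one user (posts, profile, comments)
ItemSelector = Callable[[str], Iterable[Item]]
# Step 3: performs the interaction and reports whether it succeeded
Action = Callable[[Item], bool]


def run_workflow(
    choose_targets: TargetSource,
    choose_items: ItemSelector,
    interact: Action,
    limit: int = 10,
) -> List[Item]:
    """Combine the three steps; stop after `limit` successful interactions."""
    done: List[Item] = []
    for user in choose_targets():
        for item in choose_items(user):
            if len(done) >= limit:
                return done
            if interact(item):
                done.append(item)
    return done
```

With this shape, "follow users from a list" and "like posts by tag" would differ only in which three callables are passed in, so improving one step does not require a new workflow.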

HTTP API by @converge

would be awesome if you could wrap up your thoughts here

Extension Management by @breuerfelix

InstaPy package gets an additional folder named extensions. Every Extension (like Clarifai in this example) gets its own folder in this directory.
Extensions will be programmed only in their folder. This will keep the main codebase clean.

Interaction of Extensions with the core:
We create one singleton, called Event for example. Let's assume Clarifai wants to prevent the core loop from liking or commenting on some pictures.
There will be a function called event.before_interaction(image). The core calls this function before liking or commenting on a picture: if it returns true the picture is liked, and if it returns false the picture is skipped.
The Clarifai extension can then add its validator function for this event: event.add_listener('before_interaction', self.validate_picture). self.validate_picture is a function that checks whether the image is acceptable, returning true if it is and false if it is inappropriate.
event.before_interaction(image) now calls ALL functions added as listeners to its event, and if any of them returns false, it returns false. Only if all callback functions return true will the image be liked.

Users can then import the Clarifai extension in main.py (from instapy.extensions import clarifai) and add it to the main InstaPy object with instapy.add_extension(clarifai).
This function then adds all the listeners to the event object.

That way the core package is able to interact with any amount of extensions without knowing them.
The event singleton may grow in the future and will be the only gate between extensions and the core.
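
The listener mechanism described above could look roughly like the sketch below. The names `before_interaction` and `add_listener` come from the proposal; the `Event` class body, the `fire` helper and the module-level `event` instance are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Event:
    """Hypothetical event hub between the core and its extensions."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[..., bool]]] = defaultdict(list)

    def add_listener(self, name: str, callback: Callable[..., bool]) -> None:
        """Register a callback for the named event."""
        self._listeners[name].append(callback)

    def fire(self, name: str, *args) -> bool:
        """Call every listener; the result is True only if all return True."""
        results = [callback(*args) for callback in self._listeners[name]]
        return all(results)

    def before_interaction(self, image) -> bool:
        """Asked by the core before liking or commenting on a picture."""
        return self.fire("before_interaction", image)


# A single module-level instance plays the role of the singleton.
event = Event()
```

With no listeners registered, `before_interaction` returns True, so the core behaves as it does today when no extension is installed.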

Managing XPaths by @converge and @analyticsdept

Given the recent issue with XPaths, it would absolutely make sense to maintain them in a constants file separate to the functions so they can be updated in a single place hopefully minimising some risk to the code.

Idea: Create a new Git repository that contains only the current XPaths. Use jsDelivr to read the current XPaths from the repo's master branch on each startup.
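
A minimal sketch of that startup step, assuming the XPaths live in a JSON file. The URL uses jsDelivr's GitHub path format, but `OWNER/REPO` and `xpaths.json` are placeholders, and `LOCAL_XPATHS`, `load_xpaths` and `lookup_xpath` are hypothetical names, not existing InstaPy code. The bundled copy is used whenever the download fails.

```python
import json
import urllib.request
from typing import Callable, Dict

# Placeholder location: OWNER/REPO and the file name are assumptions.
XPATH_URL = "https://cdn.jsdelivr.net/gh/OWNER/REPO@master/xpaths.json"

# Bundled fallback shipped with the package (placeholder value).
LOCAL_XPATHS: Dict[str, Dict[str, str]] = {
    "like_image": {"like": "//placeholder"},
}


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """Download and decode a JSON document."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def load_xpaths(
    fetch: Callable[[str], dict] = fetch_json,
    url: str = XPATH_URL,
) -> Dict[str, Dict[str, str]]:
    """Return remote XPaths merged over the bundled ones, or only the
    bundled ones if the remote file cannot be read."""
    merged = dict(LOCAL_XPATHS)
    try:
        merged.update(fetch(url))
    except (OSError, ValueError):
        # Network errors are OSError subclasses; invalid JSON raises ValueError.
        pass
    return merged


def lookup_xpath(xpaths: Dict[str, Dict[str, str]], function_name: str, xpath_name: str) -> str:
    """Look up one XPath by the function that uses it and its name."""
    return xpaths[function_name][xpath_name]
```

Passing `fetch` as a parameter keeps the loader testable offline and leaves room to add integrity checks on the downloaded file later.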

This post may be edited by any collaborator! Add your ideas and thoughts.
Not a collaborator? Reply to this issue and I will add your ideas to this post!

@analyticsdept
Contributor

analyticsdept commented Apr 7, 2019

Thanks for starting this up @breuerfelix !

Firstly, I like the idea of APIs with singular functions; less hassle to manage and way simpler to implement new features on top. Also really stoked to see your thinking on how to handle extensions - it makes a lot of sense! As long as there's a defined standard around the available hooks it should be fine.

I have a few things to add...

Given the recent issue with XPaths, it would absolutely make sense to maintain them in a constants file separate to the functions so they can be updated in a single place hopefully minimising some risk to the code.

I've also been thinking about the best way to implement some kind of analytics into InstaPy to measure activity and be able to tie it to conversions like new followers. Either using local storage or pushing to your own server via an exposed endpoint.

Lastly - this is part refactor and part new feature - but I feel it would make sense to build a web interface that allows the visualisation of the analytics but also allows for dynamic configuration of interactions. If you want to change who you're interacting with or what those interactions should be on the fly, this should be possible. I'm pretty certain this will be a massively useful feature for proper campaign management.

@timgrossmann timgrossmann changed the title InstaPy 5.0 InstaPy 0.5.0 Apr 8, 2019
@timgrossmann timgrossmann changed the title InstaPy 0.5.0 InstaPy 5.0 Apr 8, 2019
@breuerfelix
Collaborator Author

Given the recent issue with XPaths, it would absolutely make sense to maintain them in a constants file separate to the functions so they can be updated in a single place hopefully minimising some risk to the code.

Really great idea! @converge had the idea to serve them through a CDN on each start of the bot with https://www.jsdelivr.com :) so people don't even have to update their bot for the updated XPaths, this just happens on its own!

Lastly - this is part refactor and part new feature - but I feel it would make sense to build a web interface that allows the visualisation of the analytics but also allows for dynamic configuration of interactions. If you want to change who you're interacting with or what those interactions should be on the fly, this should be possible. I'm pretty certain this will be a massively useful feature for proper campaign management.

That's what I already built! It's only available to GUI backers at the moment, but it will be released fully open source soon, once it's more stable :) If you are interested in developing this GUI, you can message me on Discord! That would be really great :)
Everything that is possible with the CLI is already possible with the GUI, plus some basic analytics, but we need more open source help to get the analytics going :D

@breuerfelix
Collaborator Author

so people don't even have to update their bot for the updated XPaths, this just happens on its own!

Yes, ideally all remote settings, configs, XPaths, or anything else that can be treated as data and can change frequently should move server-side (preferably to a key-value datastore) if you can afford a server. Since it was a fully client-side app, we had no other option.
So what do we have now for UX?
Remember, we have another server for Pods as well, but it's free, so we shouldn't overload it.

Since XPaths are a static resource, I would suggest using GitHub as a "database". We can access it via HTTP, everybody can see its source at any time, and everybody can open issues, pull requests, etc. to enhance or fix it.

@analyticsdept
Contributor

so people don't even have to update their bot for the updated XPaths, this just happens on its own!

...
Since XPaths are a static resource, I would suggest using GitHub as a "database". We can access it via HTTP, everybody can see its source at any time, and everybody can open issues, pull requests, etc. to enhance or fix it.

I decided it would be good to test part of this concept out, so I've added a PR (https://github.com/timgrossmann/InstaPy/pull/4301) with this implemented using a local JSON file created by a compiler. I've kept the XPath reader as a separate utility so it can easily be modified to pull a remote file - we'd just have to consider the security implications of whichever approach the project ends up implementing. For the time being, just updating the JSON file to the latest version would be enough.

There may be more efficient ways of doing it, but some of them would require refactoring all of the code anyway. I'm keen to take on any feedback and I'll keep updating it.

@VascoW

VascoW commented Apr 15, 2019

Hi Folks,

First off: I love InstaPy, and it is finally a reason to really dive into Python a bit more (and yes, I'm a Python noob, but I have used many other languages before, so I pretend to be a fast learner).

Since there are thoughts of a rework, it may be the right time to explain my struggles. Either the functionality is missing, it's there but badly described, or I'm an idiot. If it's the latter: please bear with me and simply ignore this :)

I'm planning to interact with InstaPy via Telegram. Doing a quickstart.py and interacting with my telegram library works but some issues pop up. Namely:

  • Custom logging seems rather limited. Grabbing results from the functions seems impossible to me. I'm still digging, but perhaps a return value on each function, or a JSON object with some results, would make it possible to grab those handy details. Some of the console logging is useless to me; I would prefer to focus on what matters. Surely someone (like me) will even use this to, e.g., keep looking for potential followers until a desired number is reached.
  • Running InstaPy in a function. It seems InstaPy needs to be the main process. But if you, e.g., use a Telegram bot to start InstaPy with some specific settings, you typically just want to call a function to run InstaPy. Some errors pop up if you try to do that (even though it seems to work somehow). At this point I should mention that I still struggle to understand the smart_run concept, which is probably the culprit in my thinking :)

And in general:

  • Delays. I'm not in a well-connected part of the world, so loading a web page alone takes ages. With the additional sleeps from InstaPy, I sometimes end up with just 20 actions in an hour. Hiding a bot from Instagram sounds good, but pretending to surf like my granny is perhaps the other extreme. Influencing the idle and sleep times seems rather difficult, and some page reloads seem unnecessary and unnatural. So I hope some "central setting" is being considered to reflect preferences/style/56k-modem speeds, and I trust the human-like behavior can be improved :)
  • Crash on internet hiccup. I realized InstaPy crashes when the internet connection is gone. That sounds fair, but again, it's not uncommon in my area. I'm not sure how difficult it is, but is there a way to error-proof it and just "wait for the internet to return", or close properly instead of crashing? I trust this ends up on the nice-to-have list unless someone much more clever than me gets it done in 2 lines of code :)
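
One possible shape for the "wait for the internet to return" request is a retry wrapper with growing pauses, sketched below. `retry_on_connection_error` is a hypothetical helper, not an InstaPy function, and it assumes the failure surfaces as Python's built-in `ConnectionError`; a Selenium-driven session may raise a different exception type, which would then have to be caught instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_connection_error(
    action: Callable[[], T],
    retries: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`, retrying on ConnectionError with exponential backoff.

    The pause doubles after every failed attempt (2s, 4s, 8s, ...). After
    the last attempt the error is re-raised so the caller can shut down
    cleanly instead of crashing mid-action.
    """
    for attempt in range(retries):
        try:
            return action()
        except ConnectionError:
            if attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise ValueError("retries must be at least 1")
```

The `retries` and `base_delay` parameters are also the kind of values a single "central setting" for slow connections could control.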

@timgrossmann
Collaborator

@sionking

crash on internet hiccup.

Coming back to what you PRed a while ago 😉

@samrap

samrap commented Apr 21, 2019

On the custom logging front, it would be cool if you exposed an easier way to configure logging. I usually just override the logger with my own, using something like this:

import logging

from instapy import InstaPy


def with_custom_logging(session):
    logger = logging.getLogger()
    # Add handlers, etc.

    session.logger = logger


session = InstaPy()
with_custom_logging(session)

It'd be pretty cool if you could maybe expose a better API for doing this in a canonical way, or make it easier to discover how to override the logger in documentation.

@analyticsdept
Contributor

On the custom logging front, it would be cool if you exposed an easier way to configure logging...
It'd be pretty cool if you could maybe expose a better API for doing this in a canonical way, or make it easier to discover how to override the logger in documentation.

@samrap I was thinking about doing something similar where you can set your logging endpoint in the config to either a URI or a function. Does that sound like what you had in mind?

I want to spend some time digging through the code again soon and propose a measurement strategy that we can hopefully move forward with.

@samrap

samrap commented Jun 4, 2019

@analyticsdept I was imagining exposing a hook to easily configure log handlers, that way the user has full control of what the logging does.

For example, you could attach a stdout handler that emits everything, a JSON handler that logs structured logs at level WARN to a file, etc.

I'm pretty sure you can do this now with the code example I have above, but it feels kind of hacky to override the logger property of InstaPy. I think an API for instantiating InstaPy similar to Dave Cheney's Functional options for friendly APIs would be super cool! There's a lot of instantiation right now that's hidden from the user, and optionally exposing it would be a nice feature.

(disclaimer: I haven't put much thought into this, just outlining areas I've had headaches in and some potential solutions to investigate. Hopefully I get some time to come up with some more concrete ideas soon)
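
The handler setup mentioned in this comment (everything to stdout, structured JSON at WARN level to a file) can be sketched with the standard `logging` module alone. `JsonFormatter` and `build_logger` are hypothetical names chosen for this example; nothing here is an existing InstaPy API.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name: str, json_path: str) -> logging.Logger:
    """Create a logger with a verbose console handler and a JSON file
    handler that only records warnings and above."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # avoid duplicate output via the root logger

    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.DEBUG)
    logger.addHandler(console)

    json_file = logging.FileHandler(json_path, encoding="utf-8")
    json_file.setLevel(logging.WARNING)
    json_file.setFormatter(JsonFormatter())
    logger.addHandler(json_file)

    return logger
```

Such a logger could be attached with the same `session.logger = ...` override shown earlier in the thread until a dedicated hook exists.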

@breuerfelix breuerfelix changed the title InstaPy 5.0 InstaPy 0.6.0 Jun 4, 2019
@stale stale bot added the wontfix label Jul 4, 2019
@InstaPy InstaPy deleted a comment from stale bot Jul 4, 2019
@Tachenz

Tachenz commented Aug 2, 2019

All these sound amazing but currently there is still a huge block issue, which affects the majority of the users...

@theveloped

Hi everyone, so today I've been digging around in the code and I was looking at the idea of wrapping/refactoring the codebase to closely match the resources (e.g. tags, users, posts, comments).

In this way, every resource can have its own distinct class with certain actions (e.g. a post can be liked or commented on) and can be a source to derive other resources from (e.g. post commenters are user objects). The objects themselves could have all the attributes needed for filtering, ordering, etc., allowing this logic to be handled with regular Python operations and easily combined into the high-level functions currently used by most InstaPy users.

I think InstaPy refactored/wrapped in this way would be extremely easy to use and pretty much self-explanatory. Below is a quick working example of what I mean for the post resource. Let me know what you guys think.

#!/usr/bin/env python3
"""
Example of building specific classes for the different resources on Instagram.
"""
import time
from enum import Enum

from instapy import InstaPy
from instapy import smart_run
from instapy.like_util import get_links_for_tag, like_image
from instapy.commenters_util import extract_post_info
from instapy.comment_util import get_comments_count, is_commenting_enabled, comment_image, open_comment_section
from instapy.util import web_address_navigator, get_current_url, explicit_wait, extract_text_from_element
from instapy.xpath import read_xpath


class Comment(object):

    def __init__(self, link, user, text, date=None):
        self.link = link
        self.user = user
        self.text = text
        self.date = date

    # Used for working with sets
    def __hash__(self):
        return hash(self.link + self.user + self.text)

    # Used for working with sets
    def __eq__(self, other):
        if isinstance(other, type(self)):
            return hash(self) == hash(other)
        else:
            return False

    def __repr__(self):
        return "Comment({0}, {1}, {2})".format(hash(self), self.user, self.link)

    def __str__(self):
        return repr(self)


class Post(object):

    class Types(Enum):
        PHOTO = 0
        CAROUSEL = 1
        VIDEO = 2


    def __init__(self, link):
        self.link = link

        self.type = None
        self.user = None
        self.like_count = None
        self.comment_count = None


    def __hash__(self):
        return hash(self.link)


    def __eq__(self, other):
        if isinstance(other, type(self)):
            return hash(self) == hash(other)
        else:
            return False


    def __repr__(self):
        return "Post({0}, {1}, {2})".format(hash(self), self.type, self.link)


    def __str__(self):
        return repr(self)


    def show(self, session):
        web_address_navigator(session.browser, self.link)


    def count_likes(self, session, refresh=False):
        print("[+] counting likes")

        if not self.like_count or refresh:
            self.show(session)

            count = session.browser.execute_script(
                "return window._sharedData.entry_data."
                "PostPage[0].graphql.shortcode_media.edge_media_preview_like"
                ".count"
            )

            if not count:
                count = 0

            self.like_count = count

        print(" - {0} likes".format(self.like_count))
        return self.like_count


    def count_comments(self, session, refresh=False):
        print("[+] counting comments")

        if not self.comment_count or refresh:
            self.show(session)

            # Check commenting is available
            commenting_state, msg = is_commenting_enabled(session.browser, session.logger)

            # Default to 0 so `count` is always bound, even when commenting is disabled
            count = 0
            if commenting_state:
                count, msg = get_comments_count(session.browser, session.logger)

            # Fallback in case of error
            if not count:
                count = 0

            self.comment_count = count

        print(" - {0} comments".format(self.comment_count))

        return self.comment_count


    def refresh(self, session):
        print("[+] refreshing post values")
        self.show(session)
        self.count_likes(session, refresh=True)
        self.count_comments(session, refresh=True)


    def like(self, session, verify=False):
        print("[+] liking post")
        self.show(session)

        if verify:
            like_image(session.browser, session.username, session.blacklist, session.logger, session.logfolder, 0)

        else:
            like_image(session.browser, session.username, session.blacklist, session.logger, session.logfolder, 1)


    def comment(self, session, comment=None):
        print("[+] comment on post")
        self.show(session)

        if comment:
            comment_image(session.browser, session.username, [comment], session.blacklist, session.logger, session.logfolder)

        else:
            comment_image(session.browser, session.username, session.comments, session.blacklist, session.logger, session.logfolder)


    # Retrieve all comments from a post
    # TODO: everything, including scroll for more comments and handle exceptions
    def get_comments(self, session, amount=None, randomize=False):
        print("[+] retrieving comments")
        self.show(session)

        if not amount:
            amount = self.count_comments(session)

        open_comment_section(session.browser, session.logger)
        link = get_current_url(session.browser)
        comments = set()
        time.sleep(3)

        # Comments block
        comments_block_XPath = read_xpath("get_comments_on_post", "comments_block")

        # wait for page fully load [IMPORTANT!]
        explicit_wait(session.browser, "PFL", [], session.logger, 10)
        comments_block = session.browser.find_elements_by_xpath(comments_block_XPath)

        for comment_line in comments_block:
            # Commenter that placed the comment
            commenter_elem = comment_line.find_element_by_xpath(read_xpath("get_comments_on_post", "commenter_elem"))
            commenter = extract_text_from_element(commenter_elem)

            # Text in the comment
            comment_elem = comment_line.find_elements_by_tag_name("span")[0]
            text = extract_text_from_element(comment_elem)

            # Make our comment object
            comment = Comment(link=link, user=commenter, text=text)
            comments.add(comment)

        return comments


class InstaSession(InstaPy):
    '''
    InstaPy Wrapper for use of decorators

    note: only used for demonstration NOT NEEDED
    '''

    def posts_from_tags(self, tags=None, amount=2, skip_top_posts=True, randomize=False, media=None):

        posts = set()
        # `tags` defaults to None to avoid a shared mutable default argument
        for tag in tags or []:

            # Get post link per tag
            tag_links = get_links_for_tag(
                self.browser,
                tag,
                amount,
                skip_top_posts,
                randomize,
                media,
                self.logger,
            )

            # Convert to a post object
            for link in tag_links:
                post = Post(link)
                posts.add(post)

        return posts


def test():
    """ Main entry point of the app """

    insta_username = 'your_username'
    insta_password = 'your_password'
    session = InstaSession(username=insta_username, password=insta_password, headless_browser=False, show_logs=True)

    with smart_run(session):
        posts = session.posts_from_tags(tags=["instapy"], amount=2)

        for post in posts:
            print("[+] updating post: " + str(post))

            # Example of filling up an object (e.g. like and comment count of a post)
            # post.refresh(session) # entire object in one go
            post.count_likes(session)
            post.count_comments(session)

            # Examples of performing an action on an object (e.g. for a post: liking/commenting)
            post.like(session)
            post.comment(session, comment="Damn!")

            # Retrieve comments
            comments = post.get_comments(session, amount=None, randomize=False)
            for comment in comments:
                print(" -  comment: " + str(comment))

if __name__ == "__main__":
    """ This is executed when run from the command line """
    test()

@analyticsdept
Contributor

@theveloped this refactor is so clean. I'd be really keen to see how this turns out. If you need help, let me know!

@theveloped

I've added a very basic version of the proposed wrapper on a fork here. The models (User, Post and Comment) can be found in /instapy/models and some tests in /tests.

I'm very curious if people can give some feedback on this approach and if the current approach would fit the InstaPy project as a whole (be suitable for a pull request later on). So let me know what you think.

@jeremycjang

Would love to help test and troubleshoot this in light of the recent block issues that a lot of users have been experiencing.

Has anyone been able to figure out what might be triggering the random action blocks for users that were otherwise fine running the same programs before? It seems like web requests sent by the program are identical to normal user interaction and the way elements are clicked doesn't make a difference as far as I can tell.

@breuerfelix breuerfelix changed the title InstaPy 0.6.0 InstaPy 0.7.0 Aug 13, 2019
@WaterBleu

Hi All,

I'm also a Python noob and a seasoned coder in other languages.
I would love to see refactoring happen to clean up the code, and would also love to see some basic structure in the code base, which would allow more manpower to join the project.

@Flaunkerton2395

I have been running with the headless browser off and I see that when comments are typed out, they are typed out very quickly. Is it possible to include some kind of delay between typing out each letter so the behavior appears more human?

@juauzynhu

I have been running with the headless browser off and I see that when comments are typed out, they are typed out very quickly. Is it possible to include some kind of delay between typing out each letter so the behavior appears more human?

I think we need random values in almost every timed function, so Instagram won't find a pattern in the bot's actions.

Another idea is a better analytics and logging viewer.
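
A per-character delay with random jitter, as suggested above, might be sketched like this. `type_like_human` and its default delay bounds are hypothetical; `send_key` stands in for whatever sends one character to the page.

```python
import random
import time
from typing import Callable


def type_like_human(
    send_key: Callable[[str], None],
    text: str,
    min_delay: float = 0.05,
    max_delay: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Send `text` one character at a time, pausing for a random
    interval between min_delay and max_delay seconds after each key."""
    for character in text:
        send_key(character)
        sleep(random.uniform(min_delay, max_delay))
```

With Selenium, `send_key` could be the `send_keys` method of the comment box element, for example `type_like_human(comment_box.send_keys, "Nice shot!")`, where `comment_box` is an assumed, already located element.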

@analyticsdept
Contributor

@breuerfelix recently there have been a lot of issues (that I've seen and experienced) running chromedriver on Raspberry Pi - I took a look at the code in browser.py and it's totally geared for Firefox/Gecko so I'm doing a lot of rewriting and refactoring to be able to properly support Chrome.

What are your (and anyone else who wants to jump in) thoughts on supporting a single browser? It's easy enough to abstract the construction of the browser object but that has maintenance and support implications.

@Abdulsamipy

@breuerfelix recently there have been a lot of issues (that I've seen and experienced) running chromedriver on Raspberry Pi - I took a look at the code in browser.py and it's totally geared for Firefox/Gecko so I'm doing a lot of rewriting and refactoring to be able to properly support Chrome.

What are your (and anyone else who wants to jump in) thoughts on supporting a single browser? It's easy enough to abstract the construction of the browser object but that has maintenance and support implications.

Are you getting action blocks in Chrome?

@kr6k3n

kr6k3n commented Jul 15, 2020

@analyticsdept
@breuerfelix

Lastly - this is part refactor and part new feature - but I feel it would make sense to build a web interface that allows the visualisation of the analytics but also allows for dynamic configuration of interactions. If you want to change who you're interacting with or what those interactions should be on the fly, this should be possible. I'm pretty certain this will be a massively useful feature for proper campaign management.

Hello, this is my first contribution to / interaction with this project.
I really like the idea of a web interface, and I think it would be even greater if it could manage multiple accounts at the same time.
I would also like to suggest limiting sessions by time as a new feature.
For my part, I have also implemented a script that automates posting a picture to Instagram; I'd like to know if the InstaPy project would be interested in that as well. Since this project is mainly about growth hacking, I think it would be interesting to have features such as automatically reposting from certain Instagram accounts or trending hashtags.
EDIT:
I also thought of another feature that could be added: we could implement an interaction strategy that focuses on your account's followers' followers. When you interact with them, they will see a "{username} follows {your account}" message and will therefore be more likely to follow you back.
Another nice thing about a web interface is that it could easily be embedded in an Electron app.
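
The time limit suggested in this comment could be as simple as checking a deadline before each action. The sketch below uses a hypothetical `run_for` helper and treats each action as a plain callable; it is not tied to any existing InstaPy function.

```python
import time
from typing import Callable, Iterable


def run_for(
    actions: Iterable[Callable[[], None]],
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run actions in order until they are exhausted or the time budget
    is used up; return how many actions were started."""
    deadline = clock() + max_seconds
    started = 0
    for action in actions:
        if clock() >= deadline:
            break
        action()
        started += 1
    return started
```

An action that is already running is allowed to finish, so the session ends between interactions rather than in the middle of one.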
