Thinking about a new ranking system #35

michaelrambeau · 2017-02-26T09:08:03Z

How could we rank projects in order to show the trending projects?
Could we assign a single score to all projects and show projects sorted by this score?

The GitHub API gives access to "stargazing" events.
For a given project, we can get a list of events that tell us who added a star and when.

Two rules I'd like to apply when handling stargazing events:

Time effect

The effect of stargazing events decreases with time.
If someone added a star one year ago, it's far less important than a star added yesterday.

Follower effect

The more followers a user has on GitHub, the more important the event is.
A user with a lot of followers on GitHub is considered an expert.
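The two rules above could be combined into a single score per project. The following is only a minimal sketch under stated assumptions, not a settled design: the exponential decay, the 30-day half-life, the logarithmic follower weight, and the `starred_at` / `followers` field names are all hypothetical choices made for illustration.

```python
import math
from datetime import datetime, timedelta, timezone


def star_event_weight(starred_at, followers, now, half_life_days=30.0):
    """Weight of one stargazing event (hypothetical formula).

    Time effect: the weight halves every `half_life_days`.
    Follower effect: grows with log10 of the follower count, so a user
    with 999 followers counts 4x as much as a user with none.
    """
    age_days = max((now - starred_at).total_seconds() / 86400.0, 0.0)
    time_factor = 0.5 ** (age_days / half_life_days)
    follower_factor = 1.0 + math.log10(1 + followers)
    return time_factor * follower_factor


def project_score(events, now, half_life_days=30.0):
    """Sum the weights of all stargazing events of a project."""
    return sum(
        star_event_weight(e["starred_at"], e["followers"], now, half_life_days)
        for e in events
    )


if __name__ == "__main__":
    now = datetime(2017, 2, 26, tzinfo=timezone.utc)
    events = [
        {"starred_at": now - timedelta(days=1), "followers": 10},
        {"starred_at": now - timedelta(days=365), "followers": 10},
    ]
    print(project_score(events, now))
```

Projects could then be sorted by `project_score` in descending order. The half-life controls how quickly old stars stop mattering, and the logarithm keeps a single very popular user from dominating the score.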

sradevski · 2017-07-18T11:21:26Z

@michaelrambeau I think ranking can range from very simple to very complicated, so it is important to improve it gradually. I think the points you mentioned are top-notch and definitely valuable for ranking libraries. From my personal perspective, these are the "features" I look at to decide whether a library is fit for my project (apart from the functionality it offers):

Number of stars and number of forks (this one is obvious I think)
How active are the contributors? (frequency of commits)
How responsive are the contributors? (average duration for an issue to be resolved; ratio of open to closed issues)
How many "core" contributors are there? If there is a single person who commits 95% of the code, there is a higher risk of the library being abandoned.
How many companies have adopted the library (especially the bigger ones)? If companies make a commitment and begin depending on a library, it is more likely that they will put some effort to maintain it.

These are off the top of my head, and there are probably other things that people consider before deciding to adopt a library. Note that none of these is a prerequisite for a library to be successful or well maintained, but they are useful heuristics for me when making a decision.

As for implementation, a simple weighted algorithm will probably do well for a start; as the number of features grows, it might be better to use one of the many machine learning libraries to do the job.
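To make the "simple weighted algorithm" idea concrete, here is a rough sketch. It assumes every feature has already been normalized to the range 0 to 1; the feature names and weight values are hypothetical and only loosely mirror the list above.

```python
# Hypothetical weights; the values are illustrative, not tuned.
DEFAULT_WEIGHTS = {
    "popularity": 3.0,          # stars and forks
    "commit_activity": 2.0,     # frequency of commits
    "responsiveness": 2.0,      # how quickly issues get resolved
    "contributor_spread": 1.0,  # low if one person writes 95% of the code
    "company_adoption": 1.0,    # companies depending on the library
}


def weighted_score(features, weights=DEFAULT_WEIGHTS):
    """Weighted average of normalized features (each expected in [0, 1]).

    Features missing from `features` count as 0.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weight * features.get(name, 0.0) for name, weight in weights.items()
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    library = {
        "popularity": 0.9,
        "commit_activity": 0.5,
        "responsiveness": 0.7,
        "contributor_spread": 0.2,
        "company_adoption": 0.4,
    }
    print(round(weighted_score(library), 3))
```

Dividing by the total weight keeps the result between 0 and 1, so weights can be adjusted or features added without changing the scale of the final score.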

michaelrambeau · 2017-07-18T20:35:11Z

Thank you for your comments @sradevski

How active are the contributors? (frequency of commits)

How responsive are the contributors? (average duration for an issue to be resolved; ratio of open to closed issues)

Regarding points 2 and 3, I think they are included in the "rating" provided by https://npms.io/.

Check the About page to read more about npms.io metrics: https://npms.io/about
Among other things, the score includes the following points:

Ratio of open issues vs. total issues
The time it takes to close issues
Most recent commit
Commit frequency
Release frequency

Do you think we should emphasize the npms.io score more? (Its meaning is not obvious!)

Anyway, I agree with your arguments and I'd love to see points like "frequency of commits" presented in a more visual way.

For example, in the block where we display the last commit date, instead of the white background we could display a graph that shows the commit activity over the last few months. Does that make sense to you?
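As a rough illustration of the idea, commit dates could be bucketed by month and turned into a tiny bar graph. This is only a sketch: the function names are hypothetical, and a text sparkline stands in for whatever graphic the page would actually render.

```python
from collections import Counter
from datetime import date

BARS = "▁▂▃▄▅▆▇█"


def monthly_commit_counts(commit_dates, months, today):
    """Count commits per calendar month for the last `months` months.

    Returns the counts ordered from oldest to newest month, ending with
    the month that contains `today`.
    """
    counts = Counter((d.year, d.month) for d in commit_dates)
    result = []
    year, month = today.year, today.month
    for _ in range(months):
        result.append(counts.get((year, month), 0))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return result[::-1]


def sparkline(values):
    """Render counts as one bar character per value, scaled to the peak."""
    peak = max(values, default=0)
    if peak == 0:
        return BARS[0] * len(values)
    return "".join(BARS[round(v / peak * (len(BARS) - 1))] for v in values)


if __name__ == "__main__":
    commits = [date(2017, 7, 1), date(2017, 7, 15), date(2017, 5, 3)]
    counts = monthly_commit_counts(commits, months=6, today=date(2017, 7, 18))
    print(counts, sparkline(counts))
```

The same monthly counts could feed a real chart component; the sparkline is just the smallest way to check that the bucketing gives a readable picture of activity.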

sradevski · 2017-07-22T05:46:13Z

@michaelrambeau I had no idea what the percentage means :) I think eventually it would be nice to have some visualization that summarizes some of these metrics. I actually wasn't aware of npms. Looking at it, it doesn't seem to make much sense for us to implement metrics ourselves; if there is a metric we want, it is more reasonable to make a pull request to npms. We can then focus on presenting the metrics in an easy-to-understand manner.

As for presenting the data, I will try to think of a good way to visualize all the data that we can get from npms. I will let you know if I get some ideas.

michaelrambeau · 2017-07-22T11:42:19Z

@sradevski Thank you for your ideas, Steve.

You're right, the displayed percentage does not make too much sense.
Maybe I should just hide it from the page.

Instead of this percentage, I'd rather see the number of dependencies of a given package right on the project list pages; I think it would be more useful.

michaelrambeau added the discussion label Feb 26, 2017
