added example + optimized dockerfile
hbt committed Nov 28, 2018
1 parent ba71197 commit 2524836
Showing 487 changed files with 79,589 additions and 53 deletions.
58 changes: 17 additions & 41 deletions Dockerfile
@@ -1,41 +1,8 @@
FROM ubuntu:16.04

#// TODO(hbt) ENHANCE optimize

RUN apt-get update && apt-get install -y git
RUN apt-get update && apt-get install -y python python-pip

RUN mkdir /deps && cd /deps && \
git clone https://github.com/frost-nzcr4/find_forks

RUN cd /deps/find_forks && pip install -r requirements-prod.txt

#// TODO(hbt) NEXT testing rm
RUN mkdir /tests && cd /tests && git clone https://github.com/hbt/mouseless

RUN apt-get install -y curl gawk autoconf automake bison libffi-dev libgdbm-dev libncurses5-dev libsqlite3-dev libtool libyaml-dev pkg-config sqlite3 zlib1g-dev libgmp-dev libreadline6-dev libssl-dev
RUN gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB && curl -sSL https://get.rvm.io | bash -s stable --ruby=2.1.3

RUN cd /deps && git clone https://github.com/hbt/github-backup

#// TODO(hbt) NEXT optimize

RUN /bin/bash -l -c "rvm use 2.1.3"
RUN /bin/bash -l -c "gem install bundler --no-ri --no-rdoc"
RUN /bin/bash -l -c "cd /deps/github-backup && bundle"
#RUN /bin/bash -l -c "ruby /deps/github-backup/bin/github-backup -u frost-nzcr4 -r find_forks -o /deps -f "
#https://github.com/frost-nzcr4/find_forks


#RUN cd /deps && git clone https://github.com/hbt/github-backup
#RUN cd /deps/github-backup && bundle

RUN apt-get install -y php

RUN cd / && git clone https://github.com/hbt/gitinspector

RUN apt-get install -y locales

RUN apt-get update && apt-get install -y git python python-pip curl gawk autoconf automake bison libffi-dev libgdbm-dev libncurses5-dev libsqlite3-dev libtool libyaml-dev pkg-config sqlite3 zlib1g-dev libgmp-dev libreadline6-dev libssl-dev php locales

RUN echo "LC_ALL=en_US.UTF-8" >> /etc/environment && \
echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && \
@@ -45,16 +12,25 @@ RUN echo "LC_ALL=en_US.UTF-8" >> /etc/environment && \
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

RUN cd /deps && git clone https://github.com/hbt/git_stats

# install ruby
RUN gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB && curl -sSL https://get.rvm.io | bash -s stable --ruby=2.1.3

# clone deps and install
RUN cd /deps
RUN mkdir /deps && cd /deps && \
git clone https://github.com/frost-nzcr4/find_forks && \
git clone https://github.com/hbt/github-backup && \
git clone https://github.com/hbt/gitinspector && \
git clone https://github.com/hbt/git_stats

RUN cd /deps/find_forks && pip install -r requirements-prod.txt

RUN /bin/bash -l -c "rvm use 2.1.3 && gem install bundler --no-ri --no-rdoc && cd /deps/github-backup && bundle"
RUN /bin/bash -l -c "rvm use 2.1.3 && cd /deps/git_stats && bundle"


ENV PATH="/git-forks-analysis/bin:${PATH}"
ADD . /git-forks-analysis
ADD ./config/php.ini /etc/php/7.0/cli/php.ini


#// TODO(hbt) NEXT testing rm
#RUN find-forks-recursively -u frost-nzcr4 -r find_forks -o /deps -f
#RUN cd /tests/mouseless && find-forks

79 changes: 68 additions & 11 deletions README.md
@@ -16,14 +16,11 @@ Analyze forks network to find interesting forks, commits, file changes.
* View other `Tips` below on how to search across the forks network using Git commands


[//]: # (// TODO(hbt) ENHANCE simplify Config)


// TODO(hbt) NEXT add instructions

## How to install it?

// TODO(hbt) NEXT add submodules init -- test from scratch
```bash

git clone https://github.com/hbt/git-forks-analysis
@@ -33,8 +30,6 @@ docker-compose pull hbtlabs/git-forks-analysis

## How to use it?

// TODO(hbt) ENHANCE check code highlight

To generate HTML visualization of forks

```
@@ -51,12 +46,6 @@ cd bin
```

View the generated HTML files in the `out` directory:

- gitinspector in /out/repo.html, e.g. /out/mouseless.html
- quick_stats in /out/repo/quick_stats/index.html, e.g. /out/mouseless/quick_stats/index.html


Calling the `gitinspector` directly via CLI

```bash
@@ -67,6 +56,74 @@ Calling the `gitinspector` directly via CLI

```

## What does it look like?

* gitinspector of mouseless [mouseless HTML](/example/mouseless/mouseless.html)
* gitinspector CLI output [mouseless CLI](/example/mouseless/mouseless.txt)
* quick_stats of mouseless [mouseless quick stats](/example/mouseless/git_stats/index.html)

## How to find interesting forks?

* Forks with stars / watchers -- use https://techgaun.github.io/active-forks/index.html
* Look at commit counts and at insertion and deletion counts
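
The counts mentioned above can also be pulled straight from Git. The following is a rough sketch, not part of this repository: the helper name `fork_author_stats` is hypothetical, and it assumes the forks have already been fetched into the current clone (for example the one under `out/mouseless`).

```bash
# Hypothetical helper: per-author commit, insertion, and deletion totals
# across every ref in the current repository (all fetched forks and branches).
fork_author_stats() {
  # LC_ALL=C keeps the --shortstat wording in English so awk can parse it.
  LC_ALL=C git log --all --no-merges --shortstat --format='author:%an' |
    awk '
      /^author:/ { name = substr($0, 8); commits[name]++; next }
      /files? changed/ {
        for (i = 1; i <= NF; i++) {
          if ($i ~ /^insertion/) ins[name] += $(i - 1)
          if ($i ~ /^deletion/)  del[name] += $(i - 1)
        }
      }
      END {
        for (n in commits)
          printf "%s\t%d commits\t+%d\t-%d\n", n, commits[n], ins[n], del[n]
      }'
}

# Usage (inside the repository):
#   fork_author_stats | sort -t "$(printf '\t')" -k2 -nr
```

Authors with many commits or large insertion counts who are not upstream contributors are a reasonable first place to look for interesting forks.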


## How to search for commits per file across all forks and branches?

```bash

cd out/mouseless
git log --all background_scripts/extension-reloader.js

# include diffs

git log -p --all background_scripts/extension-reloader.js


```
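
`git log --all` shows the commits but not which fork or branch they live on. A possible way to see that is sketched below; the helper name `refs_touching_file` is hypothetical, and the sketch assumes each fork is available as a remote or branch in the current clone.

```bash
# Hypothetical helper: for every commit that touches a path, print the commit
# followed by the refs (local branches, fork remotes, tags) that contain it.
refs_touching_file() {
  path=$1
  git log --all --format='%H' -- "$path" |
    while read -r sha; do
      printf '%s %s\n' \
        "$(git log -1 --format='%h %s' "$sha")" \
        "$(git for-each-ref --contains "$sha" --format='%(refname:short)' | paste -sd, -)"
    done
}

# Usage (inside the repository):
#   refs_touching_file background_scripts/extension-reloader.js
```

A commit listed with only one fork's refs has not been merged anywhere else, which makes it a candidate for closer inspection.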

## How to search for changes in a function across all forks and branches?

```bash

# look for contributions to a specific function across all forks
git log --all -p -L ":getCenters":lib/model/PersonInfo.php --ignore-all-space --ignore-space-change --ignore-space-at-eol --ignore-blank-lines


# also accepts line numbers range
git log --all -p -L 13,20:lib/model/PersonInfo.php --ignore-all-space --ignore-space-change --ignore-space-at-eol --ignore-blank-lines

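# add language support through gitattributes (a .gitattributes file in the repo;
# ~/.gitattributes is only read if core.attributesFile points at it)
#*.php diff=php
#*.js diff=node
# note: "node" is not one of Git's built-in diff drivers, so it also needs a diff.node.xfuncname setting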

# normalize the repo in case of ^M (CRLF line endings)
# Note: this can corrupt some file formats (e.g. binaries, images)
# see https://superuser.com/questions/293941/rewrite-git-history-to-replace-all-crlf-to-lf

# Note: perform all operations on tmpfs; it is much faster
#normalize the whole repo and its history
# specific file (much faster) -- few minutes
git filter-branch --tree-filter 'git ls-files lib/model/PersonInfo.php -z | xargs -0 fromdos' -- --all

#whole repo but takes longer
git filter-branch --tree-filter 'git ls-files -z | xargs -0 fromdos' -- --all


# Alternative if the language is not properly supported: use the pickaxe (-S)
git log --all -p -S"function createHints" content_scripts/hints.js

# for some languages, --function-context works well
git log --all -p -ScreateHints --function-context content_scripts/hints.js


```


## Other git data mining tools worth a mention

* [https://github.com/arzzen/git-quick-stats](https://github.com/arzzen/git-quick-stats)
* [https://github.com/src-d/hercules](https://github.com/src-d/hercules)


## Contribute: Get in touch if you have a git data mining tool recommendation
105 changes: 105 additions & 0 deletions example/mouseless/git_stats/activity/by_date.html

Large diffs are not rendered by default.

137 changes: 137 additions & 0 deletions example/mouseless/git_stats/activity/day_of_week.html
@@ -0,0 +1,137 @@
<!DOCTYPE html>
<html>
<head>
<title>GitStats - mouseless</title>
<meta charset='utf-8'>
<link href='../assets/bootstrap/css/bootstrap.min.css' rel='stylesheet' type='text/css'>
<style>
body { padding-top: 60px; }
</style>
<link href='../assets/bootstrap/css/bootstrap-responsive.min.css' rel='stylesheet' type='text/css'>
<script src='../assets/jquery.min.js' type='text/javascript'></script>
<script src='../assets/bootstrap/js/bootstrap.min.js' type='text/javascript'></script>
<script src='../assets/highstock.js' type='text/javascript'></script>
<script src='../assets/exporting.js' type='text/javascript'></script>
<script src='../assets/export-csv.js' type='text/javascript'></script>
</head>
<body>
<div class='navbar navbar-fixed-top'>
<div class='navbar-inner'>
<div class='container'>
<a class='btn btn-navbar' data-target='.nav-collapse' data-toggle='collapse'>
<span class='icon-bar'></span>
<span class='icon-bar'></span>
<span class='icon-bar'></span>
</a>
<a class='brand' href='../index.html'>GitStats - mouseless</a>
<div class='nav-collapse collapse'>
<ul class='nav'>
<li class=''>
<a href='../general.html'>General</a>
</li>
<li class='active'>
<a href='by_date.html'>Activity</a>
</li>
<li class=''>
<a href='../authors/best_authors.html'>Authors</a>
</li>
<li class=''>
<a href='../files/by_date.html'>Files</a>
</li>
<li class=''>
<a href='../lines/by_date.html'>Lines</a>
</li>
<li class=''>
<a href='../comments/by_date.html'>Comments</a>
</li>
</ul>
</div>
</div>
</div>
</div>
<div class='container'>
<div class='tabbable tabs-left'>
<ul class='nav nav-tabs'>
<li class=''>
<a href='by_date.html'>Activity by date</a>
</li>
<li class=''>
<a href='hour_of_day.html'>Hour of day</a>
</li>
<li class='active'>
<a href='day_of_week.html'>Day of week</a>
</li>
<li class=''>
<a href='hour_of_week.html'>Hour of week</a>
</li>
<li class=''>
<a href='month_of_year.html'>Month of year</a>
</li>
<li class=''>
<a href='year.html'>Year</a>
</li>
<li class=''>
<a href='year_month.html'>Year and month</a>
</li>
</ul>
<div class='tab-content'>
<div class='tab-pane active'>
<div class='page-header pagination-centered'>
<h1>Day of week</h1>
<h3> </h3>
</div>
</div>
<table class='table table-bordered table-condensed'>
<tr>
<th>Day</th>
<th>Sun</th>
<th>Mon</th>
<th>Tue</th>
<th>Wed</th>
<th>Thu</th>
<th>Fri</th>
<th>Sat</th>
</tr>
<tr>
<th>Commits</th>
<td>151</td>
<td>217</td>
<td>178</td>
<td>147</td>
<td>156</td>
<td>221</td>
<td>226</td>
</tr>
<tr>
<th>Percentage</th>
<td>11.7</td>
<td>16.7</td>
<td>13.7</td>
<td>11.3</td>
<td>12.0</td>
<td>17.1</td>
<td>17.4</td>
</tr>
</table>
<script type="text/javascript">
(function() {

var onload = window.onload;
window.onload = function(){
if (typeof onload == "function") onload();
var options = { "title": { "text": "" },"legend": { "enabled": false },"xAxis": { "title": { "text": "Day" },"categories": [ "Sun","Mon","Tue","Wed","Thu","Fri","Sat" ] },"yAxis": { "title": { "text": "Commits" },"labels": { } },"tooltip": { "enabled": true },"credits": { "enabled": false },"plotOptions": { "areaspline": { } },"chart": { "defaultSeriesType": "line","renderTo": "charts.activity_by_wday","type": "column" },"subtitle": { },"series": [{ "name": "Commits by day of week","data": [ 151,217,178,147,156,221,226 ] }] };

window.chart_charts.activity_by_wday = new Highcharts.Chart(options);

};
})()
</script>

<div id="charts.activity_by_wday"></div>
</div>
</div>


</div>
</body>
</html>