less shitty way of counting digits
ya, it wasn’t counting floats
elliewix committed Aug 27, 2016
1 parent 42161d2 commit 8392370
Showing 1,010 changed files with 2,894 additions and 2,006 deletions.
Empty file added .Rhistory
Empty file.
6 changes: 6 additions & 0 deletions .ipynb_checkpoints/Untitled-checkpoint.ipynb
@@ -0,0 +1,6 @@
+{
+ "cells": [],
+ "metadata": {},
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
6 changes: 6 additions & 0 deletions .ipynb_checkpoints/Untitled1-checkpoint.ipynb
@@ -0,0 +1,6 @@
+{
+ "cells": [],
+ "metadata": {},
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
12 changes: 9 additions & 3 deletions data_profile.py
@@ -63,7 +63,15 @@ def review_csv(file, mode = 'rt', headers = True, index_row = True, missing = ''
info['unique_value_content'] = "Not reported (More than 10 unique values)"
info['missing'] = data[i].count(missing)
info['percent_missing'] = "{:.0%}".format(info['missing'] / len(data[i]))
-digits = len([d for d in data[i] if d.isdigit()])
+#digits = len([d for d in data[i] if d.isdigit()])
+dcount = 0
+for d in data[i]:
+    try:
+        float(d)
+        dcount += 1
+    except:
+        pass #hahaha i'll pay for this
+digits = dcount
totalvalues = len([d for d in data[i] if len(d) > 0])
if totalvalues == 0:
info['percent_digit'] = "no digits"
@@ -166,8 +174,6 @@ def main(source, target, missingcode):
all_file_data[f] = ({'file_metadata': finfo, \
'csv_basic': csvinfo['csv_basic'], \
'columns': csvinfo['cols']})
-#print "looking at " + f
-#print len(all_file_data)
make_md(f, all_file_data[f], headers, target)
write_name = target.split('/')[-2].split('.')[0] + '_DataProfiles.json'
with open(target + write_name, 'wt') as jsonout:
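
Note on the `review_csv` hunk above: the commit replaces `str.isdigit()` with a `float()` attempt because `isdigit()` is false for values such as `3.14` or `-2`. Below is a minimal sketch of the same idea with a narrower `except` clause, assuming the column values are strings as they appear to be in the diff. The names `is_number` and `count_numeric` are hypothetical and not part of `data_profile.py`. Keep in mind that `float()` also accepts `nan`, `inf`, and `1e5`, which may or may not be wanted in a data profile.

```python
def is_number(value):
    """Return True if value parses as a float (e.g. '3.14', '-2', '1e5')."""
    try:
        float(value)
    except (TypeError, ValueError):
        # float() raises ValueError for strings like 'abc' or '',
        # and TypeError for None.
        return False
    return True


def count_numeric(values):
    """Count the entries that parse as floats."""
    return sum(1 for v in values if is_number(v))


# Hypothetical column values for illustration only.
column = ["12", "3.14", "-2", "", "abc", "1e5"]
print(count_numeric(column))                    # 4
print(len([v for v in column if v.isdigit()]))  # 1
```

Catching only `TypeError` and `ValueError` keeps the behavior of the committed loop for string input while letting unrelated errors (for example a `KeyboardInterrupt`) surface instead of being silently swallowed.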
2 changes: 1 addition & 1 deletion fakes/0_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/0.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/100_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/100.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/101_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/101.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/102_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/102.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/103_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/103.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/104_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/104.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/105_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/105.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/106_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/106.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/107_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/107.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/108_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/108.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/109_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/109.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/10_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/10.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/110_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/110.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/111_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/111.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/112_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/112.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/113_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/113.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/114_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/114.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/115_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/115.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/116_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/116.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/117_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/117.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/118_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/118.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/119_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/119.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/11_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/11.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/120_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/120.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/121_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/121.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/122_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/122.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/123_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/123.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/124_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/124.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/125_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/125.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/126_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/126.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/127_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/127.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/128_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/128.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/129_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/129.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/12_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/12.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/130_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/130.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/131_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/131.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/132_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/132.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/133_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/133.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:11


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/134_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/134.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:11


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/135_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/135.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:11


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/136_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/136.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:11


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/137_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/137.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:11


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/138_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/138.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:11


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/139_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/139.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:11


Number of columns: 10
2 changes: 1 addition & 1 deletion fakes/13_DataProfile.md
@@ -1,6 +1,6 @@
Data Profile for fakedata/13.csv

-Generated on: 2016-Aug-11 20:46:47
+Generated on: 2016-Aug-27 16:21:10


Number of columns: 10