Alternative coding #4

Closed
kamil-kielczewski opened this issue Sep 1, 2020 · 2 comments

kamil-kielczewski commented Sep 1, 2020

Check the octal encoding idea proposed by the jsfuck author aemkei here:

I like the idea of encoding the characters into numbers in a bootstrap to save space.

Have you thought about using octal sequences? This would save some space per character:

EG:

eval(eval("'91419154914591629164950961951'".replace(/9/g, "\\")))
The bootstrap code is ~25k but maybe we can save some bytes by replacing the quotes or the backslash.

'\141\154\145\162\164\50\61\51' in chrome console gives "alert(1)"

Check if this works with emoji/Chinese letters
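
The octal round trip and the emoji/Chinese question can be checked with a small sketch. The helper names `toOctalEscapes` and `fromOctalEscapes` are hypothetical (they are not part of jsfuck or of this project). Legacy octal escapes stop at \377, so code points above 255 are rejected here. Decoding goes through the Function constructor so that the literal is parsed in sloppy mode, where these escapes are allowed.

```javascript
// Hypothetical helper: encode every character as a legacy octal escape.
// Octal escapes only cover \0 to \377 (code points 0-255).
function toOctalEscapes(source) {
  return [...source]
    .map((ch) => {
      const code = ch.codePointAt(0);
      if (code > 0o377) {
        throw new RangeError(
          `U+${code.toString(16).toUpperCase()} does not fit in one octal escape`
        );
      }
      return "\\" + code.toString(8);
    })
    .join("");
}

// Hypothetical helper: decode by evaluating a quoted string literal.
// Function bodies are sloppy mode by default, so octal escapes are accepted.
function fromOctalEscapes(escaped) {
  return new Function(`return '${escaped}'`)();
}

const encoded = toOctalEscapes("alert(1)");
console.log(encoded);
console.log(fromOctalEscapes(encoded));
```

Under these assumptions, a Chinese letter or an emoji makes `toOctalEscapes` throw, so such input would need a different escape form.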

kamil-kielczewski commented Sep 1, 2020

  1. The "\n" escape works when n is an octal (base 8) number in the range 0-377 (decimal 0-255). The code below lists all of these characters:
[...Array(256)].map((x,i)=> `${i.toString(8)} ` +eval(`"\\${i.toString(8)}"`) )

As we can see, in this encoding:

  • a letter must be replaced by 4 characters (backslash + 3 digits), e.g. "\101\132\141\172" gives "AZaz",
  • a "special1" character is replaced by 4 characters (backslash + 3 digits), e.g. "\100\133\134\135\137\140\173\174\175" gives "@[\]_`{|}",
  • a digit must be replaced by 3 characters (backslash + 2 digits), e.g. "\60\61\62\71" gives "0129",
  • a "special2" character is replaced by 3 characters (backslash + 2 digits), e.g. "\41\42\44\45\46\47\50\51\52\53\54\55\56\57\12\40" gives "!"$%&'()*+,-./", a newline and a space.
  2. I wrote this tool to gather statistics and compute the proportion of 3-character codes to 4-character codes (100% = only 3-character codes, 0% = only 4-character codes). The closer to 100%, the more we gain from switching from base4 to base8.

Results for example libs (average prop=33%)

  • react prop: 28% -> {"letters":40767,"spec1":1479,"numbers":140,"spec2":16750,"total":60644}
  • jquery prop: 29% -> {"letters":173935,"spec1":7136,"numbers":2170,"spec2":71996,"total":287629}
  • rxjs prop: 43% -> {"letters":184207,"spec1":8584,"numbers":832,"spec2":147876,"total":351229}
  • d3.js prop: 41% -> {"letters":294070,"spec1":15719,"numbers":23163,"spec2":193491,"total":550259}
  • chartjs prop: 32% -> {"letters":345517,"spec1":15579,"numbers":9908,"spec2":162666,"total":578944}
  • three.js prop: 27% -> {"letters":788424,"spec1":27764,"numbers":23129,"spec2":279741,"total":1263667}

Results for example minified libs (average prop=24%)

  • react min prop: 22% -> {"letters":4500,"spec1":255,"numbers":128,"spec2":1275,"total":6674}
  • jquery min prop: 26% -> {"letters":56760,"spec1":4985,"numbers":1170,"spec2":20574,"total":89475}
  • vuejs min prop: 26% -> {"letters":59605,"spec1":4747,"numbers":955,"spec2":22446,"total":93670}
  • rxjs min prop: 19% -> {"letters":91976,"spec1":5195,"numbers":884,"spec2":22165,"total":127570}
  • chartjs min prop: 25% -> {"letters":147456,"spec1":10746,"numbers":7528,"spec2":46665,"total":226226}
  • d3.js min prop: 30% -> {"letters":157354,"spec1":13965,"numbers":17320,"spec2":58868,"total":265487}
  • three.js min prop: 20% -> {"letters":453285,"spec1":19428,"numbers":18070,"spec2":106569,"total":642740}

So in minified libs about 24% of the characters can be written using 3 characters in base8. Non-minified libs have 33% (probably due to the many whitespace characters). But we can assume that users will usually convert minified code to get the jsfuck version (because it is smaller).
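
The source of the statistics tool is not shown in this thread, so the following is only a sketch of how such counts could be produced. The classification in `countOctalClasses` is an assumption, and both function names are hypothetical. The formula in `propFromCounts` is inferred from the figures listed above (it reproduces the reported percentages when rounded down), not taken from the tool itself.

```javascript
// Assumed classification of characters by octal escape length.
// The real tool may draw the category boundaries differently.
function countOctalClasses(source) {
  const counts = { letters: 0, spec1: 0, numbers: 0, spec2: 0, total: 0 };
  for (const ch of source) {
    const code = ch.codePointAt(0);
    counts.total += 1;
    if (code > 0o377) continue; // no single octal escape exists
    if (/[A-Za-z]/.test(ch)) counts.letters += 1; // 3 octal digits
    else if (/[0-9]/.test(ch)) counts.numbers += 1; // 2 octal digits
    else if (code >= 0o100) counts.spec1 += 1; // 3 octal digits
    else counts.spec2 += 1; // at most 2 octal digits
  }
  return counts;
}

// Inferred formula: share of short codes among the classified characters,
// rounded down to a whole percent.
function propFromCounts({ letters, spec1, numbers, spec2 }) {
  const classified = letters + spec1 + numbers + spec2;
  if (classified === 0) return 0;
  return Math.floor((100 * (numbers + spec2)) / classified);
}

// Counts copied from the react line in the results above.
const react = { letters: 40767, spec1: 1479, numbers: 140, spec2: 16750 };
console.log(propFromCounts(react)); // 28
console.log(propFromCounts(countOctalClasses("a=1;"))); // 75
```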

  3. So in base8 we have 9 characters (0-7 and backslash) for which we want to find the shortest jsfuck representations. This is my proposal (I add + before each representation because it is needed to concatenate it with the rest of the string):
0 -> +(+!![])           //         1 ( 8 chars)
1 -> +!![]              //      true ( 5 chars)
2 -> +(+[])             //         0 ( 6 chars)
3 -> +[][[]]            // undefined ( 7 chars)
4 -> +(+[![]])          //       NaN ( 9 chars)
5 -> +(!![]+!![])       //         2 (12 chars)
6 -> +(![]+[])[+![]]    //         f (15 chars)
7 -> +(!![]+[])[+![]]   //         t (16 chars)
8 -> +![]               //     false ( 4 chars)
  • I chose the shortest jsf representation for the backslash \ (symbol 8 in the table above) because it appears before each character.
  • I use the second-shortest jsf code for the digit 1 because 4-character base8 codes (which make up >75% of minified code) always start with 1 (codes >= 200 are useless in typical application source code).
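
To make the cost of this map concrete, here is a small sketch that copies the nine representations from the table and adds up their lengths for one input character. The names `OCTAL_SYMBOLS`, `tokenOf` and `jsfLength` are hypothetical, and the sketch only measures length; it does not implement the decoding bootstrap.

```javascript
// Representations copied from the table above; "\\" is symbol 8 (backslash).
const OCTAL_SYMBOLS = {
  "0": { jsf: "+(+!![])", token: "1" },
  "1": { jsf: "+!![]", token: "true" },
  "2": { jsf: "+(+[])", token: "0" },
  "3": { jsf: "+[][[]]", token: "undefined" },
  "4": { jsf: "+(+[![]])", token: "NaN" },
  "5": { jsf: "+(!![]+!![])", token: "2" },
  "6": { jsf: "+(![]+[])[+![]]", token: "f" },
  "7": { jsf: "+(!![]+[])[+![]]", token: "t" },
  "\\": { jsf: "+![]", token: "false" },
};

// Evaluate "[]" + representation to see which token gets concatenated.
function tokenOf(symbol) {
  return new Function("return []" + OCTAL_SYMBOLS[symbol].jsf)();
}

// Total jsf length needed for one character with a code in the range 0-255.
function jsfLength(ch) {
  const escaped = "\\" + ch.charCodeAt(0).toString(8);
  return [...escaped].reduce((sum, s) => sum + OCTAL_SYMBOLS[s].jsf.length, 0);
}

console.log(tokenOf("4")); // "NaN"
console.log(jsfLength("A")); // \101 -> 4 + 5 + 8 + 5 = 22
console.log(jsfLength("!")); // \41  -> 4 + 9 + 5 = 18
```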
  4. Character comparison of base8 vs base4 with this tool. In this tool we use an optimized base4 map (built in the same way as for base8; details in "Check and change digits jsf representations #1"). After the comparison it is clear that base8 has shorter codes only for these 7 characters: !"#*+JK. However, only 5 of them, !"#*+, have a 3-character base8 representation (which gives us a gain).

  5. Conclusion: let's assume a set of ~95 critical ASCII characters used in typical code (ASCII decimal codes 32-126). In this set only 5 characters, !"#*+, have a 3-character base8 representation that gives us a gain (the others give no gain or a loss). Characters with a 3-character base8 representation make up about 25% of typical minified library code, and ASCII has ~32 such characters, of which those 5 are about 15%. So we gain on roughly 15% * 25% ≈ 4% of the input code; for the remaining ~96% there is no gain or a loss. It is not worth implementing this. You can test it yourself by typing code into this tool.
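
The arithmetic behind this estimate can be redone in a few lines. This is a rough sketch that, like the estimate above, assumes the short-code characters are evenly distributed in the input; the 25% share and the 5 profitable characters are taken from the discussion above, and the variable names are only illustrative.

```javascript
// Printable ASCII: codes 32..126 (95 characters).
const printable = Array.from({ length: 95 }, (_, i) => 32 + i);

// Characters whose octal code has at most 2 digits (3 characters with backslash).
const shortCodes = printable.filter((code) => code.toString(8).length <= 2);

// The 5 characters reported above as shorter in base8 with a 3-character code.
const profitable = [..."!\"#*+"];

const shareAmongShort = profitable.length / shortCodes.length; // 5 / 32
const shortShareInMinified = 0.25; // ~25%, from the statistics above
const shareOfInput = shareAmongShort * shortShareInMinified;

console.log(shortCodes.length); // 32
console.log((100 * shareAmongShort).toFixed(1) + "%"); // 15.6%
console.log((100 * shareOfInput).toFixed(1) + "%"); // 3.9%
```

The result lands at a few percent of the input, which supports the conclusion that the switch is not worth implementing.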

kamil-kielczewski commented Sep 2, 2020

Here: #6 is a more promising modification of this approach, which can be checked and may be implemented in the near future.
