0 parents
commit 8c732df
Showing 13 changed files with 425 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
/target | ||
/classes | ||
/checkouts | ||
pom.xml | ||
pom.xml.asc | ||
*.jar | ||
*.class | ||
/.lein-* | ||
/.nrepl-port | ||
.hgignore | ||
.hg/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
The MIT License (MIT) | ||
|
||
Copyright (c) 2015 Zensight | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# header-utils | ||
|
||
[![Clojars][clojars-img]][clojars-url] | ||
[![Build Status][travis-image]][travis-url] | ||
[![MIT License][license-image]][license] | ||
![Phasers to stun][phasers-image] | ||
|
||
A Clojure library for handling gross things in HTTP headers. Specifically, it can encode and parse Content-Disposition headers with special characters, and exposes clean APIs for working with RFC 5987. | ||
|
||
## Usage | ||
|
||
### Content-Disposition | ||
|
||
```clj | ||
user=> (use 'header-utils.content-disposition) | ||
nil | ||
user=> (def s (encode "attachment" "Y͢o҉u f̴ee̡l̡ ̶fée͝bl̢e.͡.pdf")) | ||
#'user/s | ||
user=> s | ||
"attachment;filename*=UTF-8''Y%CD%A2o%D2%89u%20f%CC%B4ee%CC%A1l%CC%A1%20%CC%B6f%C3%A9e%CD%9Dbl%CC%A2e.%CD%A1.pdf" | ||
user=> (parse-type s) | ||
"attachment" | ||
user=> (parse-filename s) | ||
"Y͢o҉u f̴ee̡l̡ ̶fée͝bl̢e.͡.pdf" | ||
``` | ||
|
||
You can also specify language and additional parameters. | ||
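
A minimal sketch of those extra arguments, assuming the four-argument arity of `encode` (type, filename, language, map of other parameters) defined in `header-utils.content-disposition`. The `cd` alias is only an illustrative choice:

```clj
(require '[header-utils.content-disposition :as cd])

;; A language tag forces the extended (RFC 5987) form, even for plain ASCII.
(cd/encode "inline" "need-language" "en" {})
;; => "inline;filename*=UTF-8'en'need-language"

;; Other parameters go in a map; the filename may be nil.
(cd/encode "inline" nil nil {:foo "bar"})
;; => "inline;foo=bar"
```

Passing `nil` for the language keeps the plain `name=value` form whenever the value is all ASCII.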
|
||
### Other tools | ||
|
||
My goal in writing this library was to handle Content-Disposition, but I took some pains to make the proximate tools as useful as possible. Specifically: | ||
|
||
* `header-utils.parameters` - encode and parse RFC-5987 parameters | ||
* `header-utils.encoding` - common tools for reading/writing header values | ||
* `header-utils.parser` - internal tool useful in extending the library (e.g. adding direct support for additional headers) | ||
|
||
## Todo | ||
|
||
This library contains all the utilities required for adding explicit support for other headers, and actually doing so should be relatively easy. I'll add that support if/when I need it or you send me a PR. | ||
|
||
## License | ||
|
||
Copyright © 2015 Zensight | ||
|
||
Distributed under the MIT License. See [LICENSE][] for more info. | ||
|
||
[documentation-url]: http://icambron.github.io/twix.js/docs.html | ||
|
||
[license-image]: http://img.shields.io/badge/license-MIT-blue.svg?style=flat-square | ||
[license]: LICENSE.md | ||
|
||
[clojars-url]: https://clojars.org/co.zensight/header-utils | ||
[clojars-img]: https://img.shields.io/clojars/v/co.zensight/header-utils.svg?style=flat-square | ||
|
||
[travis-url]: http://travis-ci.org/zensight/header-utils | ||
[travis-image]: http://img.shields.io/travis/zensight/header-utils/develop.svg?style=flat-square | ||
|
||
[phasers-image]: https://img.shields.io/badge/phasers-stun-green.svg?style=flat-square |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
(defproject co.zensight/header-utils "0.1.0-SNAPSHOT" | ||
:description "Tools for working with HTTP headers" | ||
:url "http://github.com/zensight/header-utils" | ||
:license {:name "The MIT License (MIT)" | ||
:url "http://opensource.org/licenses/mit-license.html"} | ||
:scm {:name "git" | ||
:url "https://github.com/zensight/header-utils"} | ||
:dependencies [[org.clojure/clojure "1.6.0"] | ||
[instaparse "1.4.1"]]) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
;rfc-6266, see content-disposition.clj | ||
content-disposition = <LWSP> content-disposition-type *( <LWSP> <";"> <LWSP> content-disposition-param) | ||
content-disposition-type = token | ||
content-disposition-param = parameter | ||
|
||
;rfc-5987, see parameters.clj | ||
parameter = reg-parameter / ext-parameter | ||
reg-parameter = parmname LWSP "=" LWSP reg-value | ||
ext-parameter = parmname "*" LWSP "=" LWSP ext-value | ||
parmname = 1*attr-char | ||
|
||
ext-value = charset <"'"> [ language ] <"'"> ext-value-chars | ||
charset = "UTF-8" / "ISO-8859-1" / mime-charset | ||
mime-charset = 1*mime-charsetc | ||
mime-charsetc = ALPHA / DIGIT / #'[!#$%&+\-^_`{}~]' | ||
language = *( ALPHA / DIGIT / "-" ) | ||
|
||
ext-value-chars = *( pct-encoded / attr-char ) | ||
|
||
pct-encoded = "%" HEXDIG HEXDIG | ||
attr-char = #'[^()<>@,;:\\"/\[\]?={} \t*\'%]' | ||
|
||
;rfc-2616, see parser.clj | ||
reg-value = token / quoted-string | ||
|
||
token = 1*token-char | ||
token-char = #'[^()<>@,;:\\"/\[\]?={} \t]' | ||
|
||
quoted-string = ( <DQUOTE> *( qdtext / quoted-pair ) <DQUOTE> ) | ||
qdtext = #'[^"]' | ||
quoted-pair = "\" DQUOTE |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
(ns header-utils.content-disposition | ||
"Tools for encoding and parsing Content-Disposition headers according to RFC 6266." | ||
(:require [clojure.string :as str] | ||
[header-utils.parameters :as parm] | ||
[header-utils.parser :as p])) | ||
|
||
(def header-name "Content-Disposition") | ||
|
||
(def ^:private xform-6266 | ||
{:content-disposition (fn [& children] | ||
(reduce (fn [result [name & children]] | ||
(condp = name | ||
:content-disposition-type (assoc result :type (str/lower-case (first children))) | ||
:content-disposition-param (update-in result [:parameters] merge (first children)) | ||
result)) | ||
{} children))}) | ||
|
||
(defn- parse [value] | ||
(if (empty? value) | ||
nil | ||
(->> | ||
(p/parse value :content-disposition (merge xform-6266 parm/xform-5987)) | ||
(merge {:parameters []})))) | ||
|
||
(defn- disposition-type [parsed] | ||
(when parsed | ||
(:type parsed))) | ||
|
||
(defn- parameter [parsed name] | ||
(when parsed | ||
(parm/find-parameter (:parameters parsed) name))) | ||
|
||
(defn- filename [parsed] | ||
(parameter parsed "filename")) | ||
|
||
(def parse-type | ||
"Retrieve the disposition type from the Content-Disposition value. Typically \"inline\" or \"attachment\"." | ||
(comp disposition-type parse)) | ||
|
||
(def parse-filename | ||
"Retrieve the (decoded) filename from the Content-Disposition value. Prefers extended values to regular ones if both are present. May be nil." | ||
(comp filename parse)) | ||
|
||
(defn parse-parameter | ||
"Retrieve an arbitrary parameter from the Content-Disposition value." | ||
[value param-name] | ||
(-> value parse (parameter param-name))) | ||
|
||
(defn encode | ||
"Write the value of the Content-Disposition header (i.e. just the right-hand side) for a type, a filename, an optional language (e.g. \"en\"), and an optional map of other parameters. Filename may be nil." | ||
([type filename] (encode type filename nil {})) | ||
([type filename language more] | ||
(let [parameters (if filename (merge more {:filename filename}) more)] | ||
(->> | ||
(conj (map | ||
(fn [[k v]] (parm/encode (name k) v language)) parameters) | ||
type) | ||
(str/join ";"))))) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
(ns header-utils.encoding | ||
"Tools for encoding and decoding strings in http headers." | ||
(:require [clojure.string :as s] | ||
[clojure.set :as se]) | ||
(:import [java.net URLDecoder URLEncoder])) | ||
|
||
;;Some of this is redundant with the grammar in the parser. That's a bummer, but it's hard to fix without inlining | ||
;;the grammar itself, which is a pain because Clojure lacks heredocs. | ||
(def separator-chars #{\( \) \< \> \@ \, \; \: \\ \" \/ \[ \] \? \= \{ \} \space \tab}) | ||
(def non-attr-chars (se/union #{\* \' \%} separator-chars)) | ||
|
||
(defn- normalize-charset [charset] | ||
(s/upper-case charset)) | ||
|
||
(defn- ascii? [c] | ||
(< 31 (int c) 127)) | ||
|
||
(defn- attr-char? [c] | ||
(and (ascii? c) | ||
(not (non-attr-chars c)))) | ||
|
||
(defn quote-str | ||
"Quote if needed, otherwise leave as-is." | ||
[value] | ||
(if (some separator-chars value) | ||
(as-> | ||
value | ||
$ | ||
(s/replace $ #"\\" "\\\\\\\\") ;;yes, 8 backslashes: escaped once for the Clojure string and once for the regex replacement, yielding two literal backslashes | ||
(s/replace $ #"\"" "\\\\\"") | ||
(str "\"" $ "\"")) | ||
value)) | ||
|
||
(defn percent-decode | ||
"Decode %HEX HEX to the appropriate encoding." | ||
[value encoding] | ||
(URLDecoder/decode value (normalize-charset encoding))) | ||
|
||
(defn percent-encode | ||
"Encode with %HEX HEX for values outside of allowed attribute values." | ||
[value encoding] | ||
(as-> | ||
(for [c value] | ||
(if (and (attr-char? c) (not= \+ c)) ;;we're cheating here so that we can use URLEncoder, which replaces spaces with + | ||
c | ||
(URLEncoder/encode (str c) (normalize-charset encoding)))) | ||
$ | ||
(apply str $) | ||
(s/replace $ #"\+" "%20"))) | ||
|
||
(defn all-ascii? [value] | ||
(every? ascii? value)) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
(ns header-utils.parameters | ||
"Tools for encoding and parsing http header parameters according to RFC 5987." | ||
(:require [clojure.string :as str] | ||
[header-utils.encoding :as e] | ||
[header-utils.parser :as p])) | ||
|
||
(def xform-5987 | ||
{:parameter identity | ||
:reg-parameter (fn [name & others] {name (p/value-in-tag :reg-value others)}) | ||
:ext-parameter (fn [name & others] {(str name "*") (p/value-in-tag :ext-value others)}) | ||
:ext-value (fn [& items] | ||
(let [charset (p/value-in-tag :charset items) | ||
value-chars (p/value-in-tag :ext-value-chars items)] | ||
[:ext-value (e/percent-decode value-chars (str/upper-case charset))])) | ||
:ext-value-chars (fn [& s] [:ext-value-chars (apply str s)]) | ||
:parmname str | ||
:mime-charsetc str | ||
:mime-charset str | ||
:attr-char str | ||
:pct-encoded str | ||
:HEXDIG str | ||
:ALPHA str | ||
:DIGIT str}) | ||
|
||
(defn encode | ||
"Encodes the name, value, and optionally language as an RFC 5987 header parameter, handling all encoding for you. Returns a string that looks like | ||
name=value or name=quoted-value or name*=encoded-value, depending on the contents of the string and the options provided." | ||
([name value] (encode name value nil)) | ||
([name value language] | ||
(let [simple-name (.replace name "*" "")] | ||
;;I'm *super* unclear on whether ISO-8859-1 chars outside of US-ASCII are allowed in tokens or quoted strings. | ||
;;2616 implies they aren't but 5987 implies they are. We're going to assume they're not. | ||
(if (or language (not (e/all-ascii? value))) | ||
(str simple-name "*=" (str "UTF-8'" language "'" (e/percent-encode value "UTF-8"))) | ||
(str simple-name "=" (e/quote-str value)))))) | ||
|
||
(defn parse | ||
"Parse a parameter, e.g. `(parse \"name=value\")`. Handles encodings transparently. Currently discards language." | ||
[string] | ||
(p/parse string :parameter xform-5987)) | ||
|
||
(defn find-parameter | ||
"Find a value by `param-name` in a `param-map` (such as the one produced by calling `parse` and merging the results). Prefers extended versions of the parameters. Useful for eliding the difference between [param]* and [param]." | ||
[param-map param-name] | ||
(if-let [starred (get param-map (str param-name "*"))] | ||
starred | ||
(get param-map param-name))) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
(ns header-utils.parser | ||
"Parsing utilities for header-utils. Probably only useful in extending the library." | ||
(:require [clojure.java.io :as io] | ||
[instaparse.core :as insta])) | ||
|
||
(def ^:private parser (insta/parser (io/resource "grammar.txt") :input-format :abnf)) | ||
|
||
;;utilities for xforming the parsing results | ||
|
||
(defn tagged | ||
"Returns a function that takes an instaparse node and returns true if the node has name `tag`." | ||
[tag] | ||
#(#{tag} (first %))) | ||
|
||
(defn value-in-tag | ||
"Shortcut for finding the first value in the first node with a given tag." | ||
[tag list] (-> tag tagged (filter list) first second)) | ||
|
||
(def ^:private xform-2616 | ||
{:token str | ||
:token-char str | ||
:quoted-pair (constantly "\"") | ||
:qdtext str | ||
:quoted-string str}) | ||
|
||
(defn parse* | ||
"Given a string and a starting rule from the grammar, parse the string and return the raw instaparse tree. Useful for debugging." | ||
[value start] | ||
(binding [instaparse.abnf/*case-insensitive* true] | ||
(parser value :start start))) | ||
|
||
(defn parse | ||
"Given a string, a starting rule from the grammar, and a transformation map, return a parsed structure. Useful for extending this library." | ||
[value start xform] | ||
(->> | ||
(parse* value start) | ||
(insta/transform (merge xform xform-2616)))) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
(ns header-utils.content-disposition-test | ||
(:require [header-utils.content-disposition :as cd] | ||
[clojure.test :refer :all])) | ||
|
||
(deftest filename-test | ||
(are [x y] (= (cd/parse-filename x) y) | ||
"Attachment; filename=example.html" "example.html" | ||
"attachment;\r\n filename*= UTF-8''%e2%82%ac%20rates" "€ rates" | ||
"attachment; filename=\"EURO rates\"; filename*=utf-8''%e2%82%ac%20rates" "€ rates" | ||
"attachment;filename=\"this has spaces.pdf\"" "this has spaces.pdf" | ||
"inline" nil | ||
"asdfas:e҉rasdfasfe;a*^*&F" nil | ||
"attachment:filename=improperly spaced" nil | ||
"" nil | ||
nil nil)) | ||
|
||
(deftest disposition-type-test | ||
(are [x y] (= (cd/parse-type x) y) | ||
"Attachment; filename=example.html" "attachment" | ||
"inline" "inline" | ||
"INLINE" "inline" | ||
"" nil | ||
nil nil)) | ||
|
||
(deftest parameter-test | ||
(are [x y] (= (cd/parse-parameter x "random") y) | ||
"Attachment;random=cheese" "cheese" | ||
"inline;filename=dude; random=goats" "goats" | ||
"inline;random*=UTF-8''special%D2%89" "special҉")) | ||
|
||
(deftest encode-test | ||
(are [x y] (= (apply cd/encode x) y) | ||
["inline" nil] "inline" | ||
["inline" "foo.txt"] "inline;filename=foo.txt" | ||
["inline" "foo bar.txt"] "inline;filename=\"foo bar.txt\"" | ||
["inline" "foo \" bar.txt"] "inline;filename=\"foo \\\" bar.txt\"" | ||
["inline" "special҉"] "inline;filename*=UTF-8''special%D2%89" | ||
["inline" "need-language" "en" {}] "inline;filename*=UTF-8'en'need-language" | ||
["inline" "more-params" nil {:foo "bar"}] "inline;filename=more-params;foo=bar" | ||
["inline" "more-params" nil {:foo "special҉"}] "inline;filename=more-params;foo*=UTF-8''special%D2%89")) | ||
|