#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
-#> 1 ada 227µs 227µs 4405. 2.49KB 0
-#> 2 urltools 229µs 229µs 4373. 2.49KB 0
+#> 1 ada 2.43ms 2.43ms 411. 2.49KB 0
+#> 2 urltools 526.26µs 526.26µs 1900. 2.49KB 0
For further benchmark results, see benchmark.md in data_raw.
There are four more groups of functions available to work with url parsing:
@@ -239,7 +239,7 @@ Dev status
-
+
diff --git a/pkgdown.yml b/pkgdown.yml
index 86a93b1..a0010a2 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -3,7 +3,7 @@ pkgdown: 2.0.7
pkgdown_sha: ~
articles:
adaR: adaR.html
-last_built: 2024-01-31T21:44Z
+last_built: 2024-02-01T13:42Z
urls:
reference: https://schochastics.github.io/adaR/reference
article: https://schochastics.github.io/adaR/articles
diff --git a/search.json b/search.json
index 81d492b..f743425 100644
--- a/search.json
+++ b/search.json
@@ -1 +1 @@
-[{"path":"https://schochastics.github.io/adaR/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2023 adaR authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"a-primer-on-urls","dir":"Articles","previous_headings":"","what":"A primer on URLs","title":"Introduction to adaR","text":"URL (Uniform Resource Locator) serves reference web resource specific components give information resource can fetched. table gives overview components valid URL. full URL might look something like : However, URLs can simple just scheme host (e.g., http://example.com). presence specific combination components can vary based exact nature purpose URL. terms necessarily unambiguous (sub) terms need explanation. protocol can also called scheme. hostname+port called host adaR. Additionally, query referred search fragment hash adaR. relevant subcomponents given following table. wait, . 
table gives definition several terms relevance dealing URLs adaR package.","code":"https://username:password@example.com:8080/directory/file.html?key1=value1&key2=value2#section2"},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"whatwg-compliant","dir":"Articles","previous_headings":"","what":"“WHATWG compliant”","title":"Introduction to adaR","text":"underlying C++ code adaR, ada-url “WHATWG copliant”. /WHATWG? Web Hypertext Application Technology Working Group (WHATWG) community people interested evolving web standards tests. founded individuals Apple, Mozilla Foundation, Opera Software 2004, W3C workshop. Apple, Mozilla Opera becoming increasingly concerned W3C’s direction XHTML, lack interest HTML, apparent disregard needs real-world web developers. , response, organisations set mission address concerns Web Hypertext Application Technology Working Group born. WHATWG working ? WHATWG’s focus standards implementable web browsers, associated tests. existing work can found . standard relevance package, url standard. “WHATWG compliant” means, ada-url follows url standard.","code":""},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"parsing-urls","dir":"Articles","previous_headings":"","what":"Parsing urls","title":"Introduction to adaR","text":"function ada_url_parse() decomposes url components shown first table. function can deal punycode percent encoding generally handle types edge cases well. ada_url_parse() power horse adaR always returns components URL. Specific components can parsed ada_get_*() set functions. ada_has_*() can used check certain components present . ada_set_*() can used set specific components URL. 
ada_clear_*() can used remove certain components.","code":"ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") #> href protocol username #> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https: user_1 #> password host hostname port pathname search hash #> 1 password_1 example.org:8080 example.org 8080 /api ?q=1 #frag corner_cases <- c( \"https://example.com:8080\", \"http://user:password@example.com\", \"http://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:8080\", \"https://example.com/path/to/resource?query=value&another=thing#fragment\", \"http://sub.sub.example.com\", \"ftp://files.example.com:2121/download/file.txt\", \"http://example.com/path with spaces/and&special=characters?\", \"https://user:pa%40ssword@example.com/path\", \"http://example.com/..//a/b/../c/./d.html\", \"https://example.com:8080/over/under?query=param#and-a-fragment\", \"http://192.168.0.1/path/to/resource\", \"http://3com.com/path/to/resource\", \"http://example.com/%7Eusername/\", \"https://example.com/a?query=value&query=value2\", \"https://example.com/a/b/c/..\", \"ws://websocket.example.com:9000/chat\", \"https://example.com:65535/edge-case-port\", \"file:///home/user/file.txt\", \"http://example.com/a/b/c/%2F%2F\", \"http://example.com/a/../a/../a/../a/\", \"https://example.com/./././a/\", \"http://example.com:8080/a;b?c=d#e\", \"http://@example.com\", \"http://example.com/@test\", \"http://example.com/@@@/a/b\", \"https://example.com:0/\", \"http://example.com/%25path%20with%20encoded%20chars\", \"https://example.com/path?query=%26%3D%3F%23\", \"http://example.com:8080/?query=value#fragment#fragment2\", \"https://example.xn--80akhbyknj4f/path/to/resource\", \"https://example.co.uk/path/to/resource\", \"http://username:pass%23word@example.net\", \"ftp://downloads.example.edu:3030/files/archive.zip\", \"https://example.com:8080/this/is/a/deeply/nested/path/to/a/resource\", \"http://another-example.com/..//test/./demo.html\", 
\"https://sub2.sub1.example.org:5000/login?user=test#section2\", \"ws://chat.example.biz:5050/livechat\", \"http://192.168.1.100/a/b/c/d\", \"https://secure.example.shop/cart?item=123&quantity=5\", \"http://example.travel/%60%21%40%23%24%25%5E%26*()\", \"https://example.museum/path/to/artifact?search=ancient\", \"ftp://secure-files.example.co:4040/files/document.docx\", \"https://test.example.aero/booking?flight=abc123\", \"http://example.asia/%E2%82%AC%E2%82%AC/path\", \"http://subdomain.example.tel/contact?name=john\", \"ws://game-server.example.jobs:2020/match?id=xyz\", \"http://example.mobi/path/with/mobile/content\", \"https://example.name/family/tree?name=smith\", \"http://192.168.2.2/path?query1=value1&query2=value2\", \"http://example.pro/professional/services\", \"https://example.info/information/page\", \"http://example.int/internal/systems/login\", \"https://example.post/postal/services\", \"http://example.xxx/age/verification\", \"https://example.xxx/another/edge/case/path?with=query#and-fragment\" ) df <- ada_url_parse(corner_cases) df[, -1] #> protocol username password host #> 1 https: example.com:8080 #> 2 http: user password example.com #> 3 http: [2001:db8:85a3::8a2e:370:7334]:8080 #> 4 https: example.com #> 5 http: sub.sub.example.com #> 6 ftp: files.example.com:2121 #> 7 http: example.com #> 8 https: user pa@ssword example.com #> 9 http: example.com #> 10 https: example.com:8080 #> 11 http: 192.168.0.1 #> 12 http: 3com.com #> 13 http: example.com #> 14 https: example.com #> 15 https: example.com #> 16 ws: websocket.example.com:9000 #> 17 https: example.com:65535 #> 18 file: #> 19 http: example.com #> 20 http: example.com #> 21 https: example.com #> 22 http: example.com:8080 #> 23 http: example.com #> 24 http: example.com #> 25 http: example.com #> 26 https: example.com:0 #> 27 http: example.com #> 28 https: example.com #> 29 http: example.com:8080 #> 30 https: example.испытание #> 31 https: example.co.uk #> 32 http: username pass#word 
example.net #> 33 ftp: downloads.example.edu:3030 #> 34 https: example.com:8080 #> 35 http: another-example.com #> 36 https: sub2.sub1.example.org:5000 #> 37 ws: chat.example.biz:5050 #> 38 http: 192.168.1.100 #> 39 https: secure.example.shop #> 40 http: example.travel #> 41 https: example.museum #> 42 ftp: secure-files.example.co:4040 #> 43 https: test.example.aero #> 44 http: example.asia #> 45 http: subdomain.example.tel #> 46 ws: game-server.example.jobs:2020 #> 47 http: example.mobi #> 48 https: example.name #> 49 http: 192.168.2.2 #> 50 http: example.pro #> 51 https: example.info #> 52 http: example.int #> 53 https: example.post #> 54 http: example.xxx #> 55 https: example.xxx #> hostname port #> 1 example.com 8080 #> 2 example.com #> 3 [2001:db8:85a3::8a2e:370:7334] 8080 #> 4 example.com #> 5 sub.sub.example.com #> 6 files.example.com 2121 #> 7 example.com #> 8 example.com #> 9 example.com #> 10 example.com 8080 #> 11 192.168.0.1 #> 12 3com.com #> 13 example.com #> 14 example.com #> 15 example.com #> 16 websocket.example.com 9000 #> 17 example.com 65535 #> 18 #> 19 example.com #> 20 example.com #> 21 example.com #> 22 example.com 8080 #> 23 example.com #> 24 example.com #> 25 example.com #> 26 example.com 0 #> 27 example.com #> 28 example.com #> 29 example.com 8080 #> 30 example.испытание #> 31 example.co.uk #> 32 example.net #> 33 downloads.example.edu 3030 #> 34 example.com 8080 #> 35 another-example.com #> 36 sub2.sub1.example.org 5000 #> 37 chat.example.biz 5050 #> 38 192.168.1.100 #> 39 secure.example.shop #> 40 example.travel #> 41 example.museum #> 42 secure-files.example.co 4040 #> 43 test.example.aero #> 44 example.asia #> 45 subdomain.example.tel #> 46 game-server.example.jobs 2020 #> 47 example.mobi #> 48 example.name #> 49 192.168.2.2 #> 50 example.pro #> 51 example.info #> 52 example.int #> 53 example.post #> 54 example.xxx #> 55 example.xxx #> pathname search #> 1 / #> 2 / #> 3 / #> 4 /path/to/resource ?query=value&another=thing #> 5 / #> 6 
/download/file.txt #> 7 /path with spaces/and&special=characters #> 8 /path #> 9 //a/c/d.html #> 10 /over/under ?query=param #> 11 /path/to/resource #> 12 /path/to/resource #> 13 /~username/ #> 14 /a ?query=value&query=value2 #> 15 /a/b/ #> 16 /chat #> 17 /edge-case-port #> 18 /home/user/file.txt #> 19 /a/b/c/// #> 20 /a/ #> 21 /a/ #> 22 /a;b ?c=d #> 23 / #> 24 /@test #> 25 /@@@/a/b #> 26 / #> 27 /%path with encoded chars #> 28 /path ?query=&=?# #> 29 / ?query=value #> 30 /path/to/resource #> 31 /path/to/resource #> 32 / #> 33 /files/archive.zip #> 34 /this/is/a/deeply/nested/path/to/a/resource #> 35 //test/demo.html #> 36 /login ?user=test #> 37 /livechat #> 38 /a/b/c/d #> 39 /cart ?item=123&quantity=5 #> 40 /`!@#$%^&*() #> 41 /path/to/artifact ?search=ancient #> 42 /files/document.docx #> 43 /booking ?flight=abc123 #> 44 /€€/path #> 45 /contact ?name=john #> 46 /match ?id=xyz #> 47 /path/with/mobile/content #> 48 /family/tree ?name=smith #> 49 /path ?query1=value1&query2=value2 #> 50 /professional/services #> 51 /information/page #> 52 /internal/systems/login #> 53 /postal/services #> 54 /age/verification #> 55 /another/edge/case/path ?with=query #> hash #> 1 #> 2 #> 3 #> 4 #fragment #> 5 #> 6 #> 7 #> 8 #> 9 #> 10 #and-a-fragment #> 11 #> 12 #> 13 #> 14 #> 15 #> 16 #> 17 #> 18 #> 19 #> 20 #> 21 #> 22 #e #> 23 #> 24 #> 25 #> 26 #> 27 #> 28 #> 29 #fragment#fragment2 #> 30 #> 31 #> 32 #> 33 #> 34 #> 35 #> 36 #section2 #> 37 #> 38 #> 39 #> 40 #> 41 #> 42 #> 43 #> 44 #> 45 #> 46 #> 47 #> 48 #> 49 #> 50 #> 51 #> 52 #> 53 #> 54 #> 55 #and-fragment ada_get_hostname(corner_cases) #> [1] \"example.com\" \"example.com\" #> [3] \"[2001:db8:85a3::8a2e:370:7334]\" \"example.com\" #> [5] \"sub.sub.example.com\" \"files.example.com\" #> [7] \"example.com\" \"example.com\" #> [9] \"example.com\" \"example.com\" #> [11] \"192.168.0.1\" \"3com.com\" #> [13] \"example.com\" \"example.com\" #> [15] \"example.com\" \"websocket.example.com\" #> [17] \"example.com\" \"\" #> [19] 
\"example.com\" \"example.com\" #> [21] \"example.com\" \"example.com\" #> [23] \"example.com\" \"example.com\" #> [25] \"example.com\" \"example.com\" #> [27] \"example.com\" \"example.com\" #> [29] \"example.com\" \"example.испытание\" #> [31] \"example.co.uk\" \"example.net\" #> [33] \"downloads.example.edu\" \"example.com\" #> [35] \"another-example.com\" \"sub2.sub1.example.org\" #> [37] \"chat.example.biz\" \"192.168.1.100\" #> [39] \"secure.example.shop\" \"example.travel\" #> [41] \"example.museum\" \"secure-files.example.co\" #> [43] \"test.example.aero\" \"example.asia\" #> [45] \"subdomain.example.tel\" \"game-server.example.jobs\" #> [47] \"example.mobi\" \"example.name\" #> [49] \"192.168.2.2\" \"example.pro\" #> [51] \"example.info\" \"example.int\" #> [53] \"example.post\" \"example.xxx\" #> [55] \"example.xxx\" ada_has_search(corner_cases) #> [1] FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE #> [13] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE #> [25] FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE #> [37] FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE FALSE TRUE #> [49] TRUE FALSE FALSE FALSE FALSE FALSE TRUE ada_set_hostname(\"https://example.de/test\", \"example.com\") #> [1] \"https://example.com/test\" url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_clear_port(url) #> [1] \"https://user_1:password_1@example.org/api?q=1#frag\" ada_clear_hash(url) #> [1] \"https://user_1:password_1@example.org:8080/api?q=1\" ada_clear_search(url) #> [1] \"https://user_1:password_1@example.org:8080/api#frag\""},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"public-suffic-extraction","dir":"Articles","previous_headings":"","what":"Public suffic extraction","title":"Introduction to adaR","text":"package also implements public suffix extractor public_suffix(), based lookup Public Suffix List. 
Note list, include registry suffixes (e.g., com, co.uk), controlled domain name registry governed ICANN. include “private” suffixes (e.g., blogspot.com) allow people register subdomains. Hence, use term domain sense “top domain registry suffix”. See https://github.com/google/guava/wiki/InternetDomainNameExplained details. wondering last url. list also contains wildcard suffixes *.kawasaki.jp need matched.","code":"urls <- c( \"https://subsub.sub.domain.co.uk\", \"https://domain.api.gov.uk\", \"https://thisisnotpart.butthisispartoftheps.kawasaki.jp\" ) public_suffix(urls) #> [1] \"co.uk\" \"gov.uk\" #> [3] \"butthisispartoftheps.kawasaki.jp\""},{"path":"https://schochastics.github.io/adaR/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"David Schoch. Author, maintainer. Chung-hong Chan. Author. Yagiz Nizipli. Contributor, copyright holder. author ada-url : Daniel Lemire. Contributor, copyright holder. author ada-url : ","code":""},{"path":"https://schochastics.github.io/adaR/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Schoch D, Chan C (2024). adaR: Fast 'WHATWG' Compliant URL Parser. R package version 0.3.2, https://github.com/gesistsa/adaR, https://gesistsa.github.io/adaR/.","code":"@Manual{, title = {adaR: A Fast 'WHATWG' Compliant URL Parser}, author = {David Schoch and Chung-hong Chan}, year = {2024}, note = {R package version 0.3.2, https://github.com/gesistsa/adaR}, url = {https://gesistsa.github.io/adaR/}, }"},{"path":"https://schochastics.github.io/adaR/index.html","id":"adar-","dir":"","previous_headings":"","what":"A Fast WHATWG Compliant URL Parser","title":"A Fast WHATWG Compliant URL Parser","text":"adaR wrapper ada-url, WHATWG-compliant fast URL parser written modern C++ . 
implements several auxilliary functions work urls: public suffix extraction (top level domain excluding private domains) like psl fast c++ implementation utils::URLdecode (~40x speedup) general information URL parsing can found introductory vignette via vignette(\"adaR\"). adaR part series R packages analyse webtracking data: webtrackR: preprocess raw webtracking data domainator: classify domains adaR: parse urls","code":""},{"path":"https://schochastics.github.io/adaR/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"A Fast WHATWG Compliant URL Parser","text":"can install development version adaR GitHub : version CRAN can installed ","code":"# install.packages(\"devtools\") devtools::install_github(\"gesistsa/adaR\") install.packages(\"adaR\")"},{"path":"https://schochastics.github.io/adaR/index.html","id":"example","dir":"","previous_headings":"","what":"Example","title":"A Fast WHATWG Compliant URL Parser","text":"basic example shows returned components URL. solves problems urltools complex urls. “raw” url parse using ada extremely fast (see ada-url.com) carry R tricky. performance still compatible urltools::url_parse noted advantage accuracy practical circumstances. benchmark results, see benchmark.md data_raw. 
four groups functions available work url parsing: ada_get_*() get specific component ada_has_*() check specific component present ada_set_*() set specific component URLS ada_clear_*() remove specific component URLS","code":"library(adaR) ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") #> href protocol username #> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https: user_1 #> password host hostname port pathname search hash #> 1 password_1 example.org:8080 example.org 8080 /api ?q=1 #frag /* * https://user:pass@example.com:1234/foo/bar?baz#quux * | | | | ^^^^| | | * | | | | | | | `----- hash_start * | | | | | | `--------- search_start * | | | | | `----------------- pathname_start * | | | | `--------------------- port * | | | `----------------------- host_end * | | `---------------------------------- host_start * | `--------------------------------------- username_end * `--------------------------------------------- protocol_end */ urltools::url_parse(\"https://www.google.com/maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14. 
7z/data=!4m5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519\") #> scheme domain port #> 1 https 40.7519848,-74.0015045,14.\\n 7z #> path #> 1 data=!4m5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519 #> parameter fragment #> 1 ada_url_parse(\"https://www.google.com/maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14.7z/data=!4m 5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519\") #> href #> 1 https://www.google.com/maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14.7z/data=!4m 5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519 #> protocol username password host hostname port #> 1 https: www.google.com www.google.com #> pathname #> 1 /maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14.7z/data=!4m 5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519 #> search hash #> 1 bench::mark( ada = ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\", decode = FALSE), urltools = urltools::url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\"), iterations = 1, check = FALSE ) #> # A tibble: 2 × 6 #> expression min median `itr/sec` mem_alloc `gc/sec` #> #> 1 ada 227µs 227µs 4405. 2.49KB 0 #> 2 urltools 229µs 229µs 4373. 2.49KB 0"},{"path":"https://schochastics.github.io/adaR/index.html","id":"public-suffix-extraction","dir":"","previous_headings":"","what":"Public Suffix extraction","title":"A Fast WHATWG Compliant URL Parser","text":"public_suffix() extracts top level domain public suffix list, excluding private domains. wondering last url. 
list also contains wildcard suffixes *.kawasaki.jp need matched.","code":"urls <- c( \"https://subsub.sub.domain.co.uk\", \"https://domain.api.gov.uk\", \"https://thisisnotpart.butthisispartoftheps.kawasaki.jp\" ) public_suffix(urls) #> [1] \"co.uk\" \"gov.uk\" #> [3] \"butthisispartoftheps.kawasaki.jp\""},{"path":"https://schochastics.github.io/adaR/index.html","id":"acknowledgement","dir":"","previous_headings":"","what":"Acknowledgement","title":"A Fast WHATWG Compliant URL Parser","text":"logo created portrait Ada Lovelace, early pioneer Computer Science.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":null,"dir":"Reference","previous_headings":"","what":"Clear a specific component of URL — ada_clear_port","title":"Clear a specific component of URL — ada_clear_port","text":"functions clears specific component URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Clear a specific component of URL — ada_clear_port","text":"","code":"ada_clear_port(url, decode = TRUE) ada_clear_hash(url, decode = TRUE) ada_clear_search(url, decode = TRUE)"},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Clear a specific component of URL — ada_clear_port","text":"url character. one URL parsed decode logical. 
Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Clear a specific component of URL — ada_clear_port","text":"character, NA valid URL","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Clear a specific component of URL — ada_clear_port","text":"","code":"url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_clear_port(url) #> [1] \"https://user_1:password_1@example.org/api?q=1#frag\" ada_clear_hash(url) #> [1] \"https://user_1:password_1@example.org:8080/api?q=1\" ada_clear_search(url) #> [1] \"https://user_1:password_1@example.org:8080/api#frag\""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":null,"dir":"Reference","previous_headings":"","what":"Get a specific component of URL — ada_get_href","title":"Get a specific component of URL — ada_get_href","text":"functions get specific component URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get a specific component of URL — ada_get_href","text":"","code":"ada_get_href(url, decode = TRUE) ada_get_username(url, decode = TRUE) ada_get_password(url, decode = TRUE) ada_get_port(url, decode = TRUE) ada_get_hash(url, decode = TRUE) ada_get_host(url, decode = TRUE) ada_get_hostname(url, decode = TRUE) ada_get_pathname(url, decode = TRUE) ada_get_search(url, decode = TRUE) ada_get_protocol(url, decode = TRUE) ada_get_domain(url, decode = TRUE) ada_get_basename(url)"},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get a specific component of 
URL — ada_get_href","text":"url character. one URL parsed decode logical. Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get a specific component of URL — ada_get_href","text":"character, NA valid URL","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get a specific component of URL — ada_get_href","text":"","code":"url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_get_href(url) #> [1] \"https://user_1:password_1@example.org:8080/api?q=1#frag\" ada_get_username(url) #> [1] \"user_1\" ada_get_password(url) #> [1] \"password_1\" ada_get_port(url) #> [1] \"8080\" ada_get_hash(url) #> [1] \"#frag\" ada_get_host(url) #> [1] \"example.org:8080\" ada_get_hostname(url) #> [1] \"example.org\" ada_get_pathname(url) #> [1] \"/api\" ada_get_search(url) #> [1] \"?q=1\" ada_get_protocol(url) #> [1] \"https:\" ada_get_domain(url) #> [1] \"example.org\" ada_get_basename(url) #> [1] \"https://example.org\" ## these functions are vectorized urls <- c(\"http://www.google.com\", \"http://www.google.com:80\", \"noturl\") ada_get_port(urls) #> [1] \"\" \"\" NA"},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":null,"dir":"Reference","previous_headings":"","what":"Check if URL has a certain component — ada_has_credentials","title":"Check if URL has a certain component — ada_has_credentials","text":"functions check URL certain component.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Check if URL has a certain component — ada_has_credentials","text":"","code":"ada_has_credentials(url) 
ada_has_empty_hostname(url) ada_has_hostname(url) ada_has_non_empty_username(url) ada_has_non_empty_password(url) ada_has_port(url) ada_has_hash(url) ada_has_search(url)"},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Check if URL has a certain component — ada_has_credentials","text":"url character. one URL parsed","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Check if URL has a certain component — ada_has_credentials","text":"logical, NA valid URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Check if URL has a certain component — ada_has_credentials","text":"","code":"url <- c(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") ada_has_credentials(url) #> [1] TRUE ada_has_empty_hostname(url) #> [1] FALSE ada_has_hostname(url) #> [1] TRUE ada_has_non_empty_username(url) #> [1] TRUE ada_has_non_empty_password(url) #> [1] TRUE ada_has_port(url) #> [1] TRUE ada_has_hash(url) #> [1] TRUE ada_has_search(url) #> [1] TRUE ## these functions are vectorized urls <- c(\"http://www.google.com\", \"http://www.google.com:80\", \"noturl\") ada_has_port(urls) #> [1] FALSE FALSE NA"},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":null,"dir":"Reference","previous_headings":"","what":"Set a specific component of URL — ada_set_href","title":"Set a specific component of URL — ada_set_href","text":"functions set specific component URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set a specific component of URL — 
ada_set_href","text":"","code":"ada_set_href(url, input, decode = TRUE) ada_set_username(url, input, decode = TRUE) ada_set_password(url, input, decode = TRUE) ada_set_port(url, input, decode = TRUE) ada_set_host(url, input, decode = TRUE) ada_set_hostname(url, input, decode = TRUE) ada_set_pathname(url, input, decode = TRUE) ada_set_protocol(url, input, decode = TRUE) ada_set_search(url, input, decode = TRUE) ada_set_hash(url, input, decode = TRUE)"},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set a specific component of URL — ada_set_href","text":"url character. one URL parsed input character. containing new component URL. Vector length 1 length url. decode logical. Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set a specific component of URL — ada_set_href","text":"character, NA valid URL","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Set a specific component of URL — ada_set_href","text":"","code":"url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_set_href(url, \"https://google.de\") #> [1] \"https://google.de/\" ada_set_username(url, \"user_2\") #> [1] \"https://user_2:password_1@example.org:8080/api?q=1#frag\" ada_set_password(url, \"hunter2\") #> [1] \"https://user_1:hunter2@example.org:8080/api?q=1#frag\" ada_set_port(url, \"1234\") #> [1] \"https://user_1:password_1@example.org:1234/api?q=1#frag\" ada_set_hash(url, \"#section1\") #> [1] \"https://user_1:password_1@example.org:8080/api?q=1#section1\" ada_set_host(url, \"example.de\") #> [1] \"https://user_1:password_1@example.de:8080/api?q=1#frag\" 
ada_set_hostname(url, \"example.de\") #> [1] \"https://user_1:password_1@example.de:8080/api?q=1#frag\" ada_set_pathname(url, \"path/\") #> [1] \"https://user_1:password_1@example.org:8080/path/?q=1#frag\" ada_set_search(url, \"q=2\") #> [1] \"https://user_1:password_1@example.org:8080/api?q=2#frag\" ada_set_protocol(url, \"ws:\") #> [1] \"ws://user_1:password_1@example.org:8080/api?q=1#frag\""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":null,"dir":"Reference","previous_headings":"","what":"Use ada-url to parse a url — ada_url_parse","title":"Use ada-url to parse a url — ada_url_parse","text":"Use ada-url parse url","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Use ada-url to parse a url — ada_url_parse","text":"","code":"ada_url_parse(url, decode = TRUE)"},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Use ada-url to parse a url — ada_url_parse","text":"url character. one URL parsed decode logical. 
Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Use ada-url to parse a url — ada_url_parse","text":"data frame url components: href, protocol, username, password, host, hostname, port, pathname, search, hash","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Use ada-url to parse a url — ada_url_parse","text":"details returned components refer introductory vignette.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Use ada-url to parse a url — ada_url_parse","text":"","code":"ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") #> href protocol username #> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https: user_1 #> password host hostname port pathname search hash #> 1 password_1 example.org:8080 example.org 8080 /api ?q=1 #frag"},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract the public suffix from a vector of domains or hostnames — public_suffix","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"Extract public suffix vector domains hostnames","code":""},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract the public suffix from a vector of domains or hostnames — 
public_suffix","text":"","code":"public_suffix(domains)"},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"domains character. vector domains hostnames","code":""},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"public suffixes domains character vector","code":""},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"","code":"public_suffix(\"http://example.com\") #> [1] \"com\" # doesn't work for general URLs public_suffix(\"http://example.com/path/to/file\") #> [1] NA # extracting hostname first does the trick public_suffix(ada_get_hostname(\"http://example.com/path/to/file\")) #> [1] \"com\""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":null,"dir":"Reference","previous_headings":"","what":"Function to percent-decode characters in URLs — url_decode2","title":"Function to percent-decode characters in URLs — url_decode2","text":"Similar utils::URLdecode","code":""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Function to percent-decode characters in URLs — url_decode2","text":"","code":"url_decode2(url)"},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Function to percent-decode characters in URLs — url_decode2","text":"url character 
vector","code":""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Function to percent-decode characters in URLs — url_decode2","text":"precent decoded URLs character vector","code":""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Function to percent-decode characters in URLs — url_decode2","text":"","code":"url_decode2(\"Hello%20World\") #> [1] \"Hello World\""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-032","dir":"Changelog","previous_headings":"","what":"adaR 0.3.2","title":"adaR 0.3.2","text":"fixed #66","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-031","dir":"Changelog","previous_headings":"","what":"adaR 0.3.1","title":"adaR 0.3.1","text":"CRAN release: 2023-11-16 bumped ada-url 2.7.3 transferred repository schochastics gesistsa","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-030","dir":"Changelog","previous_headings":"","what":"adaR 0.3.0","title":"adaR 0.3.0","text":"CRAN release: 2023-10-16 bump ada_url version 2.7.0 #58 export ada_clear_*() functions #57 export ada_set_*() functions #15 h/t @chainsawriot c++ template added ada_get_basename() #56","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-020","dir":"Changelog","previous_headings":"","what":"adaR 0.2.0","title":"adaR 0.2.0","text":"CRAN release: 2023-10-01 split C++ file isolate original ada-url code h/t Chung-hong Chan (@chainsawriot) add support public suffix extraction #14 add support punycode #18 added url_decode2 fast alternative utils::URLdecode improved vectorization ada_get_* ada_has_* #26 #30 h/t Chung-hong Chan (@chainsawriot) fixed #47 h/t Chung-hong Chan (@chainsawriot) added ada_get_domain() 
#43","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-010","dir":"Changelog","previous_headings":"","what":"adaR 0.1.0","title":"adaR 0.1.0","text":"added ada_url_parser added ada_get_* error handling wrong urls #2 fixed #5 h/t Chung-hong Chan (@chainsawriot) add checks #7 vectorized functions #4 tests h/t Chung-hong Chan (@chainsawriot)","code":""}]
+[{"path":"https://schochastics.github.io/adaR/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2023 adaR authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"a-primer-on-urls","dir":"Articles","previous_headings":"","what":"A primer on URLs","title":"Introduction to adaR","text":"URL (Uniform Resource Locator) serves reference web resource specific components give information resource can fetched. table gives overview components valid URL. full URL might look something like : However, URLs can simple just scheme host (e.g., http://example.com). presence specific combination components can vary based exact nature purpose URL. terms necessarily unambiguous (sub) terms need explanation. protocol can also called scheme. hostname+port called host adaR. Additionally, query referred search fragment hash adaR. relevant subcomponents given following table. wait, . 
table gives definition several terms relevance dealing URLs adaR package.","code":"https://username:password@example.com:8080/directory/file.html?key1=value1&key2=value2#section2"},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"whatwg-compliant","dir":"Articles","previous_headings":"","what":"“WHATWG compliant”","title":"Introduction to adaR","text":"underlying C++ code adaR, ada-url “WHATWG copliant”. /WHATWG? Web Hypertext Application Technology Working Group (WHATWG) community people interested evolving web standards tests. founded individuals Apple, Mozilla Foundation, Opera Software 2004, W3C workshop. Apple, Mozilla Opera becoming increasingly concerned W3C’s direction XHTML, lack interest HTML, apparent disregard needs real-world web developers. , response, organisations set mission address concerns Web Hypertext Application Technology Working Group born. WHATWG working ? WHATWG’s focus standards implementable web browsers, associated tests. existing work can found . standard relevance package, url standard. “WHATWG compliant” means, ada-url follows url standard.","code":""},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"parsing-urls","dir":"Articles","previous_headings":"","what":"Parsing urls","title":"Introduction to adaR","text":"function ada_url_parse() decomposes url components shown first table. function can deal punycode percent encoding generally handle types edge cases well. ada_url_parse() power horse adaR always returns components URL. Specific components can parsed ada_get_*() set functions. ada_has_*() can used check certain components present . ada_set_*() can used set specific components URL. 
ada_clear_*() can used remove certain components.","code":"ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") #> href protocol username #> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https: user_1 #> password host hostname port pathname search hash #> 1 password_1 example.org:8080 example.org 8080 /api ?q=1 #frag corner_cases <- c( \"https://example.com:8080\", \"http://user:password@example.com\", \"http://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:8080\", \"https://example.com/path/to/resource?query=value&another=thing#fragment\", \"http://sub.sub.example.com\", \"ftp://files.example.com:2121/download/file.txt\", \"http://example.com/path with spaces/and&special=characters?\", \"https://user:pa%40ssword@example.com/path\", \"http://example.com/..//a/b/../c/./d.html\", \"https://example.com:8080/over/under?query=param#and-a-fragment\", \"http://192.168.0.1/path/to/resource\", \"http://3com.com/path/to/resource\", \"http://example.com/%7Eusername/\", \"https://example.com/a?query=value&query=value2\", \"https://example.com/a/b/c/..\", \"ws://websocket.example.com:9000/chat\", \"https://example.com:65535/edge-case-port\", \"file:///home/user/file.txt\", \"http://example.com/a/b/c/%2F%2F\", \"http://example.com/a/../a/../a/../a/\", \"https://example.com/./././a/\", \"http://example.com:8080/a;b?c=d#e\", \"http://@example.com\", \"http://example.com/@test\", \"http://example.com/@@@/a/b\", \"https://example.com:0/\", \"http://example.com/%25path%20with%20encoded%20chars\", \"https://example.com/path?query=%26%3D%3F%23\", \"http://example.com:8080/?query=value#fragment#fragment2\", \"https://example.xn--80akhbyknj4f/path/to/resource\", \"https://example.co.uk/path/to/resource\", \"http://username:pass%23word@example.net\", \"ftp://downloads.example.edu:3030/files/archive.zip\", \"https://example.com:8080/this/is/a/deeply/nested/path/to/a/resource\", \"http://another-example.com/..//test/./demo.html\", 
\"https://sub2.sub1.example.org:5000/login?user=test#section2\", \"ws://chat.example.biz:5050/livechat\", \"http://192.168.1.100/a/b/c/d\", \"https://secure.example.shop/cart?item=123&quantity=5\", \"http://example.travel/%60%21%40%23%24%25%5E%26*()\", \"https://example.museum/path/to/artifact?search=ancient\", \"ftp://secure-files.example.co:4040/files/document.docx\", \"https://test.example.aero/booking?flight=abc123\", \"http://example.asia/%E2%82%AC%E2%82%AC/path\", \"http://subdomain.example.tel/contact?name=john\", \"ws://game-server.example.jobs:2020/match?id=xyz\", \"http://example.mobi/path/with/mobile/content\", \"https://example.name/family/tree?name=smith\", \"http://192.168.2.2/path?query1=value1&query2=value2\", \"http://example.pro/professional/services\", \"https://example.info/information/page\", \"http://example.int/internal/systems/login\", \"https://example.post/postal/services\", \"http://example.xxx/age/verification\", \"https://example.xxx/another/edge/case/path?with=query#and-fragment\" ) df <- ada_url_parse(corner_cases) df[, -1] #> protocol username password host #> 1 https: example.com:8080 #> 2 http: user password example.com #> 3 http: [2001:db8:85a3::8a2e:370:7334]:8080 #> 4 https: example.com #> 5 http: sub.sub.example.com #> 6 ftp: files.example.com:2121 #> 7 http: example.com #> 8 https: user pa@ssword example.com #> 9 http: example.com #> 10 https: example.com:8080 #> 11 http: 192.168.0.1 #> 12 http: 3com.com #> 13 http: example.com #> 14 https: example.com #> 15 https: example.com #> 16 ws: websocket.example.com:9000 #> 17 https: example.com:65535 #> 18 file: #> 19 http: example.com #> 20 http: example.com #> 21 https: example.com #> 22 http: example.com:8080 #> 23 http: example.com #> 24 http: example.com #> 25 http: example.com #> 26 https: example.com:0 #> 27 http: example.com #> 28 https: example.com #> 29 http: example.com:8080 #> 30 https: example.испытание #> 31 https: example.co.uk #> 32 http: username pass#word 
example.net #> 33 ftp: downloads.example.edu:3030 #> 34 https: example.com:8080 #> 35 http: another-example.com #> 36 https: sub2.sub1.example.org:5000 #> 37 ws: chat.example.biz:5050 #> 38 http: 192.168.1.100 #> 39 https: secure.example.shop #> 40 http: example.travel #> 41 https: example.museum #> 42 ftp: secure-files.example.co:4040 #> 43 https: test.example.aero #> 44 http: example.asia #> 45 http: subdomain.example.tel #> 46 ws: game-server.example.jobs:2020 #> 47 http: example.mobi #> 48 https: example.name #> 49 http: 192.168.2.2 #> 50 http: example.pro #> 51 https: example.info #> 52 http: example.int #> 53 https: example.post #> 54 http: example.xxx #> 55 https: example.xxx #> hostname port #> 1 example.com 8080 #> 2 example.com #> 3 [2001:db8:85a3::8a2e:370:7334] 8080 #> 4 example.com #> 5 sub.sub.example.com #> 6 files.example.com 2121 #> 7 example.com #> 8 example.com #> 9 example.com #> 10 example.com 8080 #> 11 192.168.0.1 #> 12 3com.com #> 13 example.com #> 14 example.com #> 15 example.com #> 16 websocket.example.com 9000 #> 17 example.com 65535 #> 18 #> 19 example.com #> 20 example.com #> 21 example.com #> 22 example.com 8080 #> 23 example.com #> 24 example.com #> 25 example.com #> 26 example.com 0 #> 27 example.com #> 28 example.com #> 29 example.com 8080 #> 30 example.испытание #> 31 example.co.uk #> 32 example.net #> 33 downloads.example.edu 3030 #> 34 example.com 8080 #> 35 another-example.com #> 36 sub2.sub1.example.org 5000 #> 37 chat.example.biz 5050 #> 38 192.168.1.100 #> 39 secure.example.shop #> 40 example.travel #> 41 example.museum #> 42 secure-files.example.co 4040 #> 43 test.example.aero #> 44 example.asia #> 45 subdomain.example.tel #> 46 game-server.example.jobs 2020 #> 47 example.mobi #> 48 example.name #> 49 192.168.2.2 #> 50 example.pro #> 51 example.info #> 52 example.int #> 53 example.post #> 54 example.xxx #> 55 example.xxx #> pathname search #> 1 / #> 2 / #> 3 / #> 4 /path/to/resource ?query=value&another=thing #> 5 / #> 6 
/download/file.txt #> 7 /path with spaces/and&special=characters #> 8 /path #> 9 //a/c/d.html #> 10 /over/under ?query=param #> 11 /path/to/resource #> 12 /path/to/resource #> 13 /~username/ #> 14 /a ?query=value&query=value2 #> 15 /a/b/ #> 16 /chat #> 17 /edge-case-port #> 18 /home/user/file.txt #> 19 /a/b/c/// #> 20 /a/ #> 21 /a/ #> 22 /a;b ?c=d #> 23 / #> 24 /@test #> 25 /@@@/a/b #> 26 / #> 27 /%path with encoded chars #> 28 /path ?query=&=?# #> 29 / ?query=value #> 30 /path/to/resource #> 31 /path/to/resource #> 32 / #> 33 /files/archive.zip #> 34 /this/is/a/deeply/nested/path/to/a/resource #> 35 //test/demo.html #> 36 /login ?user=test #> 37 /livechat #> 38 /a/b/c/d #> 39 /cart ?item=123&quantity=5 #> 40 /`!@#$%^&*() #> 41 /path/to/artifact ?search=ancient #> 42 /files/document.docx #> 43 /booking ?flight=abc123 #> 44 /€€/path #> 45 /contact ?name=john #> 46 /match ?id=xyz #> 47 /path/with/mobile/content #> 48 /family/tree ?name=smith #> 49 /path ?query1=value1&query2=value2 #> 50 /professional/services #> 51 /information/page #> 52 /internal/systems/login #> 53 /postal/services #> 54 /age/verification #> 55 /another/edge/case/path ?with=query #> hash #> 1 #> 2 #> 3 #> 4 #fragment #> 5 #> 6 #> 7 #> 8 #> 9 #> 10 #and-a-fragment #> 11 #> 12 #> 13 #> 14 #> 15 #> 16 #> 17 #> 18 #> 19 #> 20 #> 21 #> 22 #e #> 23 #> 24 #> 25 #> 26 #> 27 #> 28 #> 29 #fragment#fragment2 #> 30 #> 31 #> 32 #> 33 #> 34 #> 35 #> 36 #section2 #> 37 #> 38 #> 39 #> 40 #> 41 #> 42 #> 43 #> 44 #> 45 #> 46 #> 47 #> 48 #> 49 #> 50 #> 51 #> 52 #> 53 #> 54 #> 55 #and-fragment ada_get_hostname(corner_cases) #> [1] \"example.com\" \"example.com\" #> [3] \"[2001:db8:85a3::8a2e:370:7334]\" \"example.com\" #> [5] \"sub.sub.example.com\" \"files.example.com\" #> [7] \"example.com\" \"example.com\" #> [9] \"example.com\" \"example.com\" #> [11] \"192.168.0.1\" \"3com.com\" #> [13] \"example.com\" \"example.com\" #> [15] \"example.com\" \"websocket.example.com\" #> [17] \"example.com\" \"\" #> [19] 
\"example.com\" \"example.com\" #> [21] \"example.com\" \"example.com\" #> [23] \"example.com\" \"example.com\" #> [25] \"example.com\" \"example.com\" #> [27] \"example.com\" \"example.com\" #> [29] \"example.com\" \"example.испытание\" #> [31] \"example.co.uk\" \"example.net\" #> [33] \"downloads.example.edu\" \"example.com\" #> [35] \"another-example.com\" \"sub2.sub1.example.org\" #> [37] \"chat.example.biz\" \"192.168.1.100\" #> [39] \"secure.example.shop\" \"example.travel\" #> [41] \"example.museum\" \"secure-files.example.co\" #> [43] \"test.example.aero\" \"example.asia\" #> [45] \"subdomain.example.tel\" \"game-server.example.jobs\" #> [47] \"example.mobi\" \"example.name\" #> [49] \"192.168.2.2\" \"example.pro\" #> [51] \"example.info\" \"example.int\" #> [53] \"example.post\" \"example.xxx\" #> [55] \"example.xxx\" ada_has_search(corner_cases) #> [1] FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE #> [13] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE #> [25] FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE #> [37] FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE FALSE TRUE #> [49] TRUE FALSE FALSE FALSE FALSE FALSE TRUE ada_set_hostname(\"https://example.de/test\", \"example.com\") #> [1] \"https://example.com/test\" url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_clear_port(url) #> [1] \"https://user_1:password_1@example.org/api?q=1#frag\" ada_clear_hash(url) #> [1] \"https://user_1:password_1@example.org:8080/api?q=1\" ada_clear_search(url) #> [1] \"https://user_1:password_1@example.org:8080/api#frag\""},{"path":"https://schochastics.github.io/adaR/articles/adaR.html","id":"public-suffic-extraction","dir":"Articles","previous_headings":"","what":"Public suffic extraction","title":"Introduction to adaR","text":"package also implements public suffix extractor public_suffix(), based lookup Public Suffix List. 
Note list, include registry suffixes (e.g., com, co.uk), controlled domain name registry governed ICANN. include “private” suffixes (e.g., blogspot.com) allow people register subdomains. Hence, use term domain sense “top domain registry suffix”. See https://github.com/google/guava/wiki/InternetDomainNameExplained details. wondering last url. list also contains wildcard suffixes *.kawasaki.jp need matched.","code":"urls <- c( \"https://subsub.sub.domain.co.uk\", \"https://domain.api.gov.uk\", \"https://thisisnotpart.butthisispartoftheps.kawasaki.jp\" ) public_suffix(urls) #> [1] \"co.uk\" \"gov.uk\" #> [3] \"butthisispartoftheps.kawasaki.jp\""},{"path":"https://schochastics.github.io/adaR/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"David Schoch. Author, maintainer. Chung-hong Chan. Author. Yagiz Nizipli. Contributor, copyright holder. author ada-url : Daniel Lemire. Contributor, copyright holder. author ada-url : ","code":""},{"path":"https://schochastics.github.io/adaR/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Schoch D, Chan C (2024). adaR: Fast 'WHATWG' Compliant URL Parser. R package version 0.3.2, https://github.com/gesistsa/adaR, https://gesistsa.github.io/adaR/.","code":"@Manual{, title = {adaR: A Fast 'WHATWG' Compliant URL Parser}, author = {David Schoch and Chung-hong Chan}, year = {2024}, note = {R package version 0.3.2, https://github.com/gesistsa/adaR}, url = {https://gesistsa.github.io/adaR/}, }"},{"path":"https://schochastics.github.io/adaR/index.html","id":"adar-","dir":"","previous_headings":"","what":"A Fast WHATWG Compliant URL Parser","title":"A Fast WHATWG Compliant URL Parser","text":"adaR wrapper ada-url, WHATWG-compliant fast URL parser written modern C++ . 
implements several auxilliary functions work urls: public suffix extraction (top level domain excluding private domains) like psl fast c++ implementation utils::URLdecode (~40x speedup) general information URL parsing can found introductory vignette via vignette(\"adaR\"). adaR part series R packages analyse webtracking data: webtrackR: preprocess raw webtracking data domainator: classify domains adaR: parse urls","code":""},{"path":"https://schochastics.github.io/adaR/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"A Fast WHATWG Compliant URL Parser","text":"can install development version adaR GitHub : version CRAN can installed ","code":"# install.packages(\"devtools\") devtools::install_github(\"gesistsa/adaR\") install.packages(\"adaR\")"},{"path":"https://schochastics.github.io/adaR/index.html","id":"example","dir":"","previous_headings":"","what":"Example","title":"A Fast WHATWG Compliant URL Parser","text":"basic example shows returned components URL. solves problems urltools complex urls. “raw” url parse using ada extremely fast (see ada-url.com) carry R tricky. performance still compatible urltools::url_parse noted advantage accuracy practical circumstances. benchmark results, see benchmark.md data_raw. 
four groups functions available work url parsing: ada_get_*() get specific component ada_has_*() check specific component present ada_set_*() set specific component URLS ada_clear_*() remove specific component URLS","code":"library(adaR) ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") #> href protocol username #> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https: user_1 #> password host hostname port pathname search hash #> 1 password_1 example.org:8080 example.org 8080 /api ?q=1 #frag /* * https://user:pass@example.com:1234/foo/bar?baz#quux * | | | | ^^^^| | | * | | | | | | | `----- hash_start * | | | | | | `--------- search_start * | | | | | `----------------- pathname_start * | | | | `--------------------- port * | | | `----------------------- host_end * | | `---------------------------------- host_start * | `--------------------------------------- username_end * `--------------------------------------------- protocol_end */ urltools::url_parse(\"https://www.google.com/maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14. 
7z/data=!4m5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519\") #> scheme domain port #> 1 https 40.7519848,-74.0015045,14.\\n 7z #> path #> 1 data=!4m5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519 #> parameter fragment #> 1 ada_url_parse(\"https://www.google.com/maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14.7z/data=!4m 5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519\") #> href #> 1 https://www.google.com/maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14.7z/data=!4m 5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519 #> protocol username password host hostname port #> 1 https: www.google.com www.google.com #> pathname #> 1 /maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14.7z/data=!4m 5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519 #> search hash #> 1 bench::mark( ada = ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\", decode = FALSE), urltools = urltools::url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\"), iterations = 1, check = FALSE ) #> # A tibble: 2 × 6 #> expression min median `itr/sec` mem_alloc `gc/sec` #> #> 1 ada 2.43ms 2.43ms 411. 2.49KB 0 #> 2 urltools 526.26µs 526.26µs 1900. 2.49KB 0"},{"path":"https://schochastics.github.io/adaR/index.html","id":"public-suffix-extraction","dir":"","previous_headings":"","what":"Public Suffix extraction","title":"A Fast WHATWG Compliant URL Parser","text":"public_suffix() extracts top level domain public suffix list, excluding private domains. wondering last url. 
list also contains wildcard suffixes *.kawasaki.jp need matched.","code":"urls <- c( \"https://subsub.sub.domain.co.uk\", \"https://domain.api.gov.uk\", \"https://thisisnotpart.butthisispartoftheps.kawasaki.jp\" ) public_suffix(urls) #> [1] \"co.uk\" \"gov.uk\" #> [3] \"butthisispartoftheps.kawasaki.jp\""},{"path":"https://schochastics.github.io/adaR/index.html","id":"acknowledgement","dir":"","previous_headings":"","what":"Acknowledgement","title":"A Fast WHATWG Compliant URL Parser","text":"logo created portrait Ada Lovelace, early pioneer Computer Science.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":null,"dir":"Reference","previous_headings":"","what":"Clear a specific component of URL — ada_clear_port","title":"Clear a specific component of URL — ada_clear_port","text":"functions clears specific component URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Clear a specific component of URL — ada_clear_port","text":"","code":"ada_clear_port(url, decode = TRUE) ada_clear_hash(url, decode = TRUE) ada_clear_search(url, decode = TRUE)"},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Clear a specific component of URL — ada_clear_port","text":"url character. one URL parsed decode logical. 
Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Clear a specific component of URL — ada_clear_port","text":"character, NA valid URL","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_clear_port.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Clear a specific component of URL — ada_clear_port","text":"","code":"url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_clear_port(url) #> [1] \"https://user_1:password_1@example.org/api?q=1#frag\" ada_clear_hash(url) #> [1] \"https://user_1:password_1@example.org:8080/api?q=1\" ada_clear_search(url) #> [1] \"https://user_1:password_1@example.org:8080/api#frag\""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":null,"dir":"Reference","previous_headings":"","what":"Get a specific component of URL — ada_get_href","title":"Get a specific component of URL — ada_get_href","text":"functions get specific component URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get a specific component of URL — ada_get_href","text":"","code":"ada_get_href(url, decode = TRUE) ada_get_username(url, decode = TRUE) ada_get_password(url, decode = TRUE) ada_get_port(url, decode = TRUE) ada_get_hash(url, decode = TRUE) ada_get_host(url, decode = TRUE) ada_get_hostname(url, decode = TRUE) ada_get_pathname(url, decode = TRUE) ada_get_search(url, decode = TRUE) ada_get_protocol(url, decode = TRUE) ada_get_domain(url, decode = TRUE) ada_get_basename(url)"},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get a specific component of 
URL — ada_get_href","text":"url character. one URL parsed decode logical. Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get a specific component of URL — ada_get_href","text":"character, NA valid URL","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_get_href.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get a specific component of URL — ada_get_href","text":"","code":"url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_get_href(url) #> [1] \"https://user_1:password_1@example.org:8080/api?q=1#frag\" ada_get_username(url) #> [1] \"user_1\" ada_get_password(url) #> [1] \"password_1\" ada_get_port(url) #> [1] \"8080\" ada_get_hash(url) #> [1] \"#frag\" ada_get_host(url) #> [1] \"example.org:8080\" ada_get_hostname(url) #> [1] \"example.org\" ada_get_pathname(url) #> [1] \"/api\" ada_get_search(url) #> [1] \"?q=1\" ada_get_protocol(url) #> [1] \"https:\" ada_get_domain(url) #> [1] \"example.org\" ada_get_basename(url) #> [1] \"https://example.org\" ## these functions are vectorized urls <- c(\"http://www.google.com\", \"http://www.google.com:80\", \"noturl\") ada_get_port(urls) #> [1] \"\" \"\" NA"},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":null,"dir":"Reference","previous_headings":"","what":"Check if URL has a certain component — ada_has_credentials","title":"Check if URL has a certain component — ada_has_credentials","text":"functions check URL certain component.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Check if URL has a certain component — ada_has_credentials","text":"","code":"ada_has_credentials(url) 
ada_has_empty_hostname(url) ada_has_hostname(url) ada_has_non_empty_username(url) ada_has_non_empty_password(url) ada_has_port(url) ada_has_hash(url) ada_has_search(url)"},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Check if URL has a certain component — ada_has_credentials","text":"url character. one URL parsed","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Check if URL has a certain component — ada_has_credentials","text":"logical, NA valid URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_has_credentials.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Check if URL has a certain component — ada_has_credentials","text":"","code":"url <- c(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") ada_has_credentials(url) #> [1] TRUE ada_has_empty_hostname(url) #> [1] FALSE ada_has_hostname(url) #> [1] TRUE ada_has_non_empty_username(url) #> [1] TRUE ada_has_non_empty_password(url) #> [1] TRUE ada_has_port(url) #> [1] TRUE ada_has_hash(url) #> [1] TRUE ada_has_search(url) #> [1] TRUE ## these functions are vectorized urls <- c(\"http://www.google.com\", \"http://www.google.com:80\", \"noturl\") ada_has_port(urls) #> [1] FALSE FALSE NA"},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":null,"dir":"Reference","previous_headings":"","what":"Set a specific component of URL — ada_set_href","title":"Set a specific component of URL — ada_set_href","text":"functions set specific component URL.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set a specific component of URL — 
ada_set_href","text":"","code":"ada_set_href(url, input, decode = TRUE) ada_set_username(url, input, decode = TRUE) ada_set_password(url, input, decode = TRUE) ada_set_port(url, input, decode = TRUE) ada_set_host(url, input, decode = TRUE) ada_set_hostname(url, input, decode = TRUE) ada_set_pathname(url, input, decode = TRUE) ada_set_protocol(url, input, decode = TRUE) ada_set_search(url, input, decode = TRUE) ada_set_hash(url, input, decode = TRUE)"},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set a specific component of URL — ada_set_href","text":"url character. one URL parsed input character. containing new component URL. Vector length 1 length url. decode logical. Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set a specific component of URL — ada_set_href","text":"character, NA valid URL","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_set_href.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Set a specific component of URL — ada_set_href","text":"","code":"url <- \"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\" ada_set_href(url, \"https://google.de\") #> [1] \"https://google.de/\" ada_set_username(url, \"user_2\") #> [1] \"https://user_2:password_1@example.org:8080/api?q=1#frag\" ada_set_password(url, \"hunter2\") #> [1] \"https://user_1:hunter2@example.org:8080/api?q=1#frag\" ada_set_port(url, \"1234\") #> [1] \"https://user_1:password_1@example.org:1234/api?q=1#frag\" ada_set_hash(url, \"#section1\") #> [1] \"https://user_1:password_1@example.org:8080/api?q=1#section1\" ada_set_host(url, \"example.de\") #> [1] \"https://user_1:password_1@example.de:8080/api?q=1#frag\" 
ada_set_hostname(url, \"example.de\") #> [1] \"https://user_1:password_1@example.de:8080/api?q=1#frag\" ada_set_pathname(url, \"path/\") #> [1] \"https://user_1:password_1@example.org:8080/path/?q=1#frag\" ada_set_search(url, \"q=2\") #> [1] \"https://user_1:password_1@example.org:8080/api?q=2#frag\" ada_set_protocol(url, \"ws:\") #> [1] \"ws://user_1:password_1@example.org:8080/api?q=1#frag\""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":null,"dir":"Reference","previous_headings":"","what":"Use ada-url to parse a url — ada_url_parse","title":"Use ada-url to parse a url — ada_url_parse","text":"Use ada-url parse url","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Use ada-url to parse a url — ada_url_parse","text":"","code":"ada_url_parse(url, decode = TRUE)"},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Use ada-url to parse a url — ada_url_parse","text":"url character. one URL parsed decode logical. 
Whether decode output (see utils::URLdecode()), default TRUE","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Use ada-url to parse a url — ada_url_parse","text":"data frame url components: href, protocol, username, password, host, hostname, port, pathname, search, hash","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Use ada-url to parse a url — ada_url_parse","text":"details returned components refer introductory vignette.","code":""},{"path":"https://schochastics.github.io/adaR/reference/ada_url_parse.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Use ada-url to parse a url — ada_url_parse","text":"","code":"ada_url_parse(\"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag\") #> href protocol username #> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https: user_1 #> password host hostname port pathname search hash #> 1 password_1 example.org:8080 example.org 8080 /api ?q=1 #frag"},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract the public suffix from a vector of domains or hostnames — public_suffix","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"Extract public suffix vector domains hostnames","code":""},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract the public suffix from a vector of domains or hostnames — 
public_suffix","text":"","code":"public_suffix(domains)"},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"domains character. vector domains hostnames","code":""},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"public suffixes domains character vector","code":""},{"path":"https://schochastics.github.io/adaR/reference/public_suffix.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract the public suffix from a vector of domains or hostnames — public_suffix","text":"","code":"public_suffix(\"http://example.com\") #> [1] \"com\" # doesn't work for general URLs public_suffix(\"http://example.com/path/to/file\") #> [1] NA # extracting hostname first does the trick public_suffix(ada_get_hostname(\"http://example.com/path/to/file\")) #> [1] \"com\""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":null,"dir":"Reference","previous_headings":"","what":"Function to percent-decode characters in URLs — url_decode2","title":"Function to percent-decode characters in URLs — url_decode2","text":"Similar utils::URLdecode","code":""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Function to percent-decode characters in URLs — url_decode2","text":"","code":"url_decode2(url)"},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Function to percent-decode characters in URLs — url_decode2","text":"url character 
vector","code":""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Function to percent-decode characters in URLs — url_decode2","text":"precent decoded URLs character vector","code":""},{"path":"https://schochastics.github.io/adaR/reference/url_decode2.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Function to percent-decode characters in URLs — url_decode2","text":"","code":"url_decode2(\"Hello%20World\") #> [1] \"Hello World\""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-032","dir":"Changelog","previous_headings":"","what":"adaR 0.3.2","title":"adaR 0.3.2","text":"fixed #66","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-031","dir":"Changelog","previous_headings":"","what":"adaR 0.3.1","title":"adaR 0.3.1","text":"CRAN release: 2023-11-16 bumped ada-url 2.7.3 transferred repository schochastics gesistsa","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-030","dir":"Changelog","previous_headings":"","what":"adaR 0.3.0","title":"adaR 0.3.0","text":"CRAN release: 2023-10-16 bump ada_url version 2.7.0 #58 export ada_clear_*() functions #57 export ada_set_*() functions #15 h/t @chainsawriot c++ template added ada_get_basename() #56","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-020","dir":"Changelog","previous_headings":"","what":"adaR 0.2.0","title":"adaR 0.2.0","text":"CRAN release: 2023-10-01 split C++ file isolate original ada-url code h/t Chung-hong Chan (@chainsawriot) add support public suffix extraction #14 add support punycode #18 added url_decode2 fast alternative utils::URLdecode improved vectorization ada_get_* ada_has_* #26 #30 h/t Chung-hong Chan (@chainsawriot) fixed #47 h/t Chung-hong Chan (@chainsawriot) added ada_get_domain() 
#43","code":""},{"path":"https://schochastics.github.io/adaR/news/index.html","id":"adar-010","dir":"Changelog","previous_headings":"","what":"adaR 0.1.0","title":"adaR 0.1.0","text":"added ada_url_parser added ada_get_* error handling wrong urls #2 fixed #5 h/t Chung-hong Chan (@chainsawriot) add checks #7 vectorized functions #4 tests h/t Chung-hong Chan (@chainsawriot)","code":""}]