Go libraries and utilities for working with Unicode character data.
This package was forked from the cooperhewitt/ucd package but builds everything under its own aaronland namespace.
package main

import (
	"flag"
	"fmt"

	"github.com/aaronland/go-ucd/v13"
)

func main() {
	flag.Parse()
	char := flag.Arg(0)

	name := ucd.Name(char)
	fmt.Println(name)
}
The following tools are included in the cmd directory. Note, however, that you will need to compile them yourself. You can do this (and all the steps in between) using the handy Makefile and the tools target included in this repository, like this:
$> make tools
This will build the ucd and ucd-server applications and place them in the bin directory.
$> bin/ucd A
LATIN CAPITAL LETTER A
$> bin/ucd THIS → WAY
LATIN CAPITAL LETTER T
LATIN CAPITAL LETTER H
LATIN CAPITAL LETTER I
LATIN CAPITAL LETTER S
SPACE
RIGHTWARDS ARROW
SPACE
LATIN CAPITAL LETTER W
LATIN CAPITAL LETTER A
LATIN CAPITAL LETTER Y
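As the output above shows, ucd reports one name per character, spaces included. Here is a minimal sketch of how a Go program can walk a string one code point at a time using only the standard library. Note that codePoints is a hypothetical helper written for this example and is not part of go-ucd.

```go
package main

import "fmt"

// codePoints is a hypothetical helper (not part of go-ucd). It returns the
// hexadecimal code point of every character in s, one entry per rune.
func codePoints(s string) []string {
	out := make([]string, 0, len(s))
	// Ranging over a string yields runes (code points), not bytes, so the
	// three-byte arrow in the example below still counts as one character.
	for _, r := range s {
		out = append(out, fmt.Sprintf("%04X", r))
	}
	return out
}

func main() {
	for _, cp := range codePoints("THIS → WAY") {
		fmt.Println(cp)
	}
}
```

Each hex value could then be used as a key to look up a name, which is roughly the shape of the problem ucd solves.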
ucd supports the Unicode Han Data character set, or at least endeavours to. There may still be bugs.
$> bin/ucd 䍕
NET; WEB; NETWORK, NET FOR CATCHING RABBIT
$> bin/ucd-server --help
Usage of ./ucd-server:
-host="localhost": host
-port=8080: port
To install as an init.d script, copy the example provided, replace the values of UCD_USER, UCD_DAEMON and UCD_PORT, and start the service.
$> sudo cp init.d/ucd-server.sh.example /etc/init.d/ucd-server.sh
$> sudo service ucd-server start
$> curl -X GET -s 'http://localhost:8080/?text=♕%20HAT' | python -mjson.tool
{
"Chars": [
{
"Char": "\u2655",
"Hex": "2655",
"Name": "WHITE CHESS QUEEN"
},
{
"Char": " ",
"Hex": "0020",
"Name": "SPACE"
},
{
"Char": "H",
"Hex": "0048",
"Name": "LATIN CAPITAL LETTER H"
},
{
"Char": "A",
"Hex": "0041",
"Name": "LATIN CAPITAL LETTER A"
},
{
"Char": "T",
"Hex": "0054",
"Name": "LATIN CAPITAL LETTER T"
}
]
}
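If you want to consume that JSON from Go rather than Python, a sketch along these lines should work. The Char and Response types are assumptions that simply mirror the field names in the sample output above; they are not types exported by go-ucd.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Char and Response mirror the field names in the sample ucd-server output.
// They are assumptions made for this sketch, not types exported by go-ucd.
type Char struct {
	Char string
	Hex  string
	Name string
}

type Response struct {
	Chars []Char
}

// parseResponse decodes a JSON body shaped like the sample output.
func parseResponse(body []byte) (*Response, error) {
	var rsp Response
	if err := json.Unmarshal(body, &rsp); err != nil {
		return nil, err
	}
	return &rsp, nil
}

func main() {
	// A trimmed copy of the sample response, used here instead of a live request.
	body := []byte(`{"Chars":[{"Char":"\u2655","Hex":"2655","Name":"WHITE CHESS QUEEN"}]}`)

	rsp, err := parseResponse(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, c := range rsp.Chars {
		fmt.Printf("%s U+%s %s\n", c.Char, c.Hex, c.Name)
	}
}
```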
$> curl -H 'Accept: text/plain' -s 'http://localhost:8080/?text=♕%20HAT%20WITH%20😸'
WHITE CHESS QUEEN
SPACE
LATIN CAPITAL LETTER H
LATIN CAPITAL LETTER A
LATIN CAPITAL LETTER T
SPACE
LATIN CAPITAL LETTER W
LATIN CAPITAL LETTER I
LATIN CAPITAL LETTER T
LATIN CAPITAL LETTER H
SPACE
GRINNING CAT FACE WITH SMILING EYES
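The same plain-text request can be built in Go. This is only a sketch: newTextRequest is a hypothetical helper, and it assumes the server is listening on localhost:8080 as in the examples above. Note that url.Values encodes spaces as + rather than %20, which Go's own query parser decodes back to a space.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newTextRequest is a hypothetical helper (not part of go-ucd). It builds a
// GET request for a ucd-server endpoint and asks for a plain-text reply.
func newTextRequest(endpoint, text string) (*http.Request, error) {
	q := url.Values{}
	q.Set("text", text) // Encode() takes care of percent-escaping non-ASCII input

	req, err := http.NewRequest(http.MethodGet, endpoint+"/?"+q.Encode(), nil)
	if err != nil {
		return nil, err
	}
	// Without this header the examples above show a JSON response instead.
	req.Header.Set("Accept", "text/plain")
	return req, nil
}

func main() {
	req, err := newTextRequest("http://localhost:8080", "♕ HAT WITH 😸")
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to http.DefaultClient.Do would actually send it; that step is left out so the sketch runs without a server.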
go-ucd supports Unicode 13.0 as of February 16, 2021 and requires Go 1.16 or higher to compile.
This package exports data defined in the UnicodeData.txt and Unihan.zip files. Both are available from
http://unicode.org/Public/UCD/latest/ucd/.
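For reference, each line in UnicodeData.txt is a semicolon-separated record whose first two fields are the code point (in hex) and the character name. A rough sketch of reading one such record follows; parseLine is a hypothetical helper and is not how the ucd-build-unicodedata tool is actually implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLine is a hypothetical helper, not code taken from go-ucd. It splits
// one UnicodeData.txt record and returns its first two fields: the code
// point in hex and the character name.
func parseLine(line string) (hex string, name string, ok bool) {
	fields := strings.Split(line, ";")
	if len(fields) < 2 {
		return "", "", false
	}
	return fields[0], fields[1], true
}

func main() {
	// The UnicodeData.txt record for U+0041.
	hex, name, ok := parseLine("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;")
	if ok {
		fmt.Printf("%s => %s\n", hex, name)
	}
}
```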
If the Unicode Consortium releases newer data files and you want or need to
update your version of go-ucd before we do, you can do so manually by using the
ucd-build-unicodedata and ucd-build-unihan tools included in the cmd
directory. For example:
go run ./cmd/ucd-build-unicodedata.go > ./unicodedata/unicodedata.go
go run ./cmd/ucd-build-unihan.go > ./unihan/unihan.go
Note: You will need to recompile your ucd and ucd-server binaries manually.
Many thanks to friend and Go-friend Richard Crowley who is always kind and patient answering my Go-related questions. Go is lovely but Go is weird.