Using Tika Translate

Chris Mattmann edited this page Mar 22, 2017 · 2 revisions

Start the Tika server with the classpath pointing to your language keys, then call it from Tika-Python:

$ java -cp ./language-keys:tika-server/target/tika-server-1.15-SNAPSHOT.jar org.apache.tika.server.TikaServerCli
Mar 21, 2017 9:17:19 PM org.apache.tika.parser.image.ImageParser <clinit>
WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
Mar 21, 2017 9:17:19 PM org.apache.tika.server.TikaServerCli main
INFO: Starting Apache Tika 1.15-SNAPSHOT server
Mar 21, 2017 9:17:20 PM org.apache.cxf.endpoint.ServerImpl initDestination
INFO: Setting the server's publish address to be http://localhost:9998/
Mar 21, 2017 9:17:20 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: jetty-8.y.z-SNAPSHOT
Mar 21, 2017 9:17:20 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Started SelectChannelConnector@localhost:9998
Mar 21, 2017 9:17:20 PM org.apache.tika.server.TikaServerCli main
INFO: Started
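
Before calling the server from Python, it can help to confirm that something is listening on the publish address shown in the log. The following is a minimal sketch, not part of Tika or Tika-Python: the helper name `is_listening` is hypothetical, and the default host and port are simply the `localhost:9998` values from the log above.

```python
import socket


def is_listening(host="localhost", port=9998, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A successful connection only shows that the port is open; it does not verify that the process behind it is Tika or that the translation keys were loaded.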

Here is the structure of language-keys:

$ tree language-keys
./language-keys
└── org
    └── apache
        └── tika
            └── language
                └── translate
                    ├── translator.google.properties
                    ├── translator.lingo24.properties
                    └── translator.microsoft.properties

5 directories, 3 files 
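
The layout above can be scaffolded with a short script. This is a sketch under stated assumptions: the directory path and file names come from the tree listing, but the property names and placeholder values inside each file are assumptions that this page does not confirm, so check each translator's documentation for the keys it actually reads. The function name `scaffold_language_keys` is hypothetical.

```python
from pathlib import Path

# Package path taken from the tree listing above.
PACKAGE_DIR = Path("org/apache/tika/language/translate")

# ASSUMPTION: these property names and values are illustrative placeholders,
# not confirmed by this page. Replace them with the keys each service expects.
PLACEHOLDER_PROPERTIES = {
    "translator.google.properties": "translator.client-secret=YOUR_GOOGLE_KEY\n",
    "translator.lingo24.properties": "translator.user-key=YOUR_LINGO24_KEY\n",
    "translator.microsoft.properties": (
        "translator.client-id=YOUR_MICROSOFT_CLIENT_ID\n"
        "translator.client-secret=YOUR_MICROSOFT_CLIENT_SECRET\n"
    ),
}


def scaffold_language_keys(root="language-keys"):
    """Create the directory tree and placeholder files; return the file paths."""
    target = Path(root) / PACKAGE_DIR
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, body in PLACEHOLDER_PROPERTIES.items():
        path = target / name
        if not path.exists():  # never overwrite a file that may hold real keys
            path.write_text(body)
        paths.append(path)
    return paths
```

Calling `scaffold_language_keys()` from the directory where you start the server produces a `language-keys` folder that matches the `-cp ./language-keys:...` entry in the command above.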

Each of the properties files holds the keys for its translation service. With the server running, translate text from Tika-Python:

$ python2.7
Python 2.7.11 (default, Apr 14 2016, 22:11:07) 
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from tika import translate
>>> translate.from_buffer('bonjour mon ami!', 'fr', 'en')
u'Hello, my friend!'
>>>
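
The interactive call above can also be wrapped for use in a script. The sketch below assumes the `tika` package is installed and that the server started earlier is reachable. The helper `translate_all` and its `translate_fn` parameter are hypothetical conveniences, not part of Tika-Python; by default the helper delegates to `translate.from_buffer` with the same three positional arguments used in the session.

```python
def translate_all(texts, src_lang, dest_lang, translate_fn=None):
    """Translate each non-blank string; blank strings pass through unchanged."""
    if translate_fn is None:
        # Imported lazily so the helper can be exercised without tika installed.
        from tika import translate
        translate_fn = translate.from_buffer
    results = []
    for text in texts:
        if text.strip():
            results.append(translate_fn(text, src_lang, dest_lang))
        else:
            results.append(text)  # nothing to translate, so skip the server call
    return results
```

For example, `translate_all(['bonjour mon ami!'], 'fr', 'en')` issues the same request as the session above. Passing your own `translate_fn` makes it possible to try the helper without a running server.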