GitHub - charlesreiss/trace-analysis

What is this?

  Tools for converting the Google trace to LZO-compressed protobuf files
  (for which Twitter's Elephant Bird has Hadoop input/output formats).
  
  Some tools for doing possibly interesting joins of that data in Spark.

  The ability to run an interactive Spark shell against the trace data.

To build:

  Dependencies: 
    You will need sbt, the Scala build tool.
    
    If you aren't using a 64-bit Linux JVM, you will need to get versions
    of everything in native-lib-Linux-amd64 built for your platform.

    You will need the 'protoc' binary installed somewhere.

    Get Spark from github.com/mesos/spark and build it with
    sbt/sbt publish-local

  Copy project/Local.scala.example to project/Local.scala and customize it.
  The only mandatory settings are the directories, which may be HDFS
  paths.

  Copy and customize spark-home/spark-executor.

  sbt test mklauncher
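
  The build steps above can be sketched as a small shell helper. This is
  only an illustration: the function names prepare_local_config and
  check_build_tools are hypothetical and not part of this repository, and
  the sketch assumes sbt and protoc are on your PATH, which the
  instructions above do not strictly require.

```shell
#!/bin/sh
# Hypothetical helpers; these names are illustrative, not part of the repository.

# Copy project/Local.scala.example to project/Local.scala unless a
# customized copy already exists. $1 is the checkout directory (default ".").
prepare_local_config() {
    repo="${1:-.}"
    example="$repo/project/Local.scala.example"
    target="$repo/project/Local.scala"
    if [ ! -f "$example" ]; then
        echo "missing $example" >&2
        return 1
    fi
    if [ -f "$target" ]; then
        echo "keeping existing $target"
    else
        cp "$example" "$target"
        echo "created $target; set the directories before building"
    fi
}

# Report which of the named tools are not on PATH.
# Returns 0 if all were found, 1 otherwise.
check_build_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage from the repository root:
#   check_build_tools sbt protoc && prepare_local_config . && sbt test mklauncher
```

  The helper never overwrites an existing project/Local.scala, so it is
  safe to re-run after you have customized the file.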

To run:

  # start a Mesos cluster
  export MASTER='mesos://master@MACHINE-WHERE-MESOS-MASTER-RUNS:port/'
  target/scala-sbt spark.repl.Main
  [at the scala> prompt] :l scripts/some-script.scala
  (or other scala commands)

You should start with the 'import.scala' script.
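
The MASTER value above follows the pattern mesos://master@HOST:port/. As a
minimal sketch, a hypothetical helper (mesos_master_url is not part of this
repository) can assemble it from a host and port so the two pieces are not
mistyped. The host and port in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a MASTER value in the form shown in the README.
# $1 is the machine where the Mesos master runs, $2 is its port.
mesos_master_url() {
    host="$1"
    port="$2"
    if [ -z "$host" ] || [ -z "$port" ]; then
        echo "usage: mesos_master_url HOST PORT" >&2
        return 1
    fi
    printf 'mesos://master@%s:%s/\n' "$host" "$port"
}

# Example usage (placeholder host and port):
#   export MASTER="$(mesos_master_url mesos-master.example.com 5050)"
```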
