Skip to content

charlesreiss/trace-analysis

Repository files navigation

What is this?

  Tools for converting the Google trace to LZO compressed protobuf files
  (for which Twitter's Elephant Bird has Hadoop input/output formats).
  
  Some tools for doing possibly interesting joins of that data in Spark.

  The ability to run an interactive spark shell against the trace data.

To build:

  Dependencies: 
    You will need scala-build-tool.
    
    If you aren't using 64-bit Linux JVM, you will need to get versions of
    everything in native-lib-Linux-amd64.

    You will need the 'protoc' binary installed somewhere.

    Get spark from github.com/mesos/spark;
    build it with sbt/sbt publish-local

  Copy and customize project/Local.scala.example to project/Local.scala
  The only mandatory piece is specifying the directories. You can use HDFS
  paths.

  Copy and customize spark-home/spark-executor.

  sbt test mklauncher

To run:

  # start a Mesos cluster
  export MASTER='mesos://master@MACHINE-WHERE-MESOS-MASTER-RUNS:port/'
  target/scala-sbt spark.repl.Main
  [at the scala> prompt] :l scripts/some-script.scala
  (or other scala commands)

You should start with the 'import.scala' script.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published