From 016a9ec6a4f8f8585c70634ac3f5b5290b6aec58 Mon Sep 17 00:00:00 2001
From: Jaeho Shin
Date: Fri, 12 Feb 2016 05:16:05 -0800
Subject: [PATCH] Corrects links to walkthrough to spouse example tutorial

---
 doc/browsing.md                     |   2 +-
 doc/changelog/0.03-alpha.md         |   2 +-
 doc/changelog/0.04.2-alpha.md       |   4 +-
 doc/ddlog.md                        | 108 ----------------------------
 doc/example-chunking.md             |   2 +-
 doc/example-smoke.md                |   2 +-
 doc/generating_negative_examples.md |   2 +-
 doc/incremental.md                  |  11 +--
 doc/installation.md                 |   2 +-
 doc/labeling.md                     |   2 +-
 doc/opendata/index.md               |   2 +-
 doc/opendata/schema.md              |   2 +-
 doc/paleo.md                        |  10 +--
 13 files changed, 22 insertions(+), 129 deletions(-)
 delete mode 100644 doc/ddlog.md

diff --git a/doc/browsing.md b/doc/browsing.md
index 58214b13a..c320bfa62 100644
--- a/doc/browsing.md
+++ b/doc/browsing.md
@@ -78,7 +78,7 @@ DeepDive applications written in DDlog typically use multiple relations falling
 3. Relation that holds predictions (random variables)
     * whose expectation predicted by DeepDive
 
-For example, in the spouse example we use in [DeepDive's tutorial](walkthrough.md), the relations are:
+For example, in the spouse example we use in [DeepDive's tutorial](example-spouse.md), the relations are:
 
 1. Source
     * `articles`
diff --git a/doc/changelog/0.03-alpha.md b/doc/changelog/0.03-alpha.md
index 32d5fb233..a709f822c 100644
--- a/doc/changelog/0.03-alpha.md
+++ b/doc/changelog/0.03-alpha.md
@@ -47,7 +47,7 @@ no_toc: true
 - Updated `spouse_example` with implementations of different styles of extractors.
 - The `nlp_extractor` example has different table requirements and usage. See here:
-  [NLP extractor](../walkthrough-extras.md#nlp_extractor).
+  NLP extractor.
 - In the `db.default` configuration, users should define `dbname`, `host`, `port`
   and `user`. If not defined, by default system will use the environmental
diff --git a/doc/changelog/0.04.2-alpha.md b/doc/changelog/0.04.2-alpha.md
index a69213ccb..22fc41b42 100644
--- a/doc/changelog/0.04.2-alpha.md
+++ b/doc/changelog/0.04.2-alpha.md
@@ -10,8 +10,8 @@ This release focuses mostly on bug fixing and new features.
 - A first version of the [generic features library](../gen_feats.md) is now
   available as part of `ddlib`, the utility library included in DeepDive.
-- The `spouse_example` example and the [application
-  walkthrough](../walkthrough.md) were
+- The `spouse_example` example and the application
+  walkthrough were
   expanded to cover the use of [MindTagger](../labeling.md) and of the
   [generic features library](../gen_feats.md).
 - The ` --reg_param ` option was added to the
diff --git a/doc/ddlog.md b/doc/ddlog.md
deleted file mode 100644
index 3392194f8..000000000
--- a/doc/ddlog.md
+++ /dev/null
@@ -1,108 +0,0 @@
----
-layout: default
-title: Writing Applications in DDlog
----
-
-# Writing Applications in DDlog
-
-DDlog is a higher-level language for writing DeepDive applications in succinct Datalog-like syntax.
-We are gradually extending the language to allow expression of advanced SQL queries used by complex extractors as well as a variety of factors with Markov Logic-like syntax.
-A reference for DDlog language features can be found [here](https://github.com/HazyResearch/ddlog/wiki/DDlog-Language-Features).
-
-## Writing a DDlog Program
-
-A DDlog program consists of the following parts:
-
-1. Schema declarations for your source data, intermediate data, and variables.
-2. Data transformation rules for extracting candidates, supervision labels, and features.
-3. Scope rules for defining the domain of the variables.
-4. Inference rules for describing the dependencies between the variables.
-
-We show what each part looks like using the [spouse example in our walkthrough](walkthrough.md).
-All DDlog code should be placed in a file named `app.ddlog` under the DeepDive application.
-A complete example written in DDlog can be found at [examples/spouse_example/postgres/ddlog](https://github.com/HazyResearch/deepdive/blob/master/examples/spouse_example/postgres/ddlog).
-
-### Basic Syntax
-
-Each declaration and rule ends with a period (`.`).
-The order of the rules has no meaning.
-Comments in DDlog begin with a hash (`#`) character.
-
-### Schema Declaration
-
-First of all, we declare the relations we use throughout the program.
-The order doesn't matter, but it's a good idea to place them at the beginning because that makes the program easier to understand.
-
-#### Source Relations
-We need a relation for holding the raw text of all source documents.
-Each relation has multiple columns whose names and types are declared one at a time, separated by commas.
-
-#### Intermediate Relations
-We'll populate the following `sentences` relation by parsing sentences from every article and collecting NLP tags.
-
-Then, we'll map candidate mentions of people in each sentence using the following `people_mentions` relation (ignore the `@` annotations).
-
-Candidate mentions of spouse relationships between these people will then be stored in `has_spouse_candidates`.
-
-For each relationship candidate, we'll extract features.
-
-#### Variable Relations
-Finally, we declare a variable relation whose marginal probability we want DeepDive to predict.
-
-### Candidate Mapping and Supervision Rules
-A user-defined function for mapping candidates and supervising them is written in Python and declared below with the name `ext_people`.
-The function takes as input a triple of sentence id, array of words, and array of NER tags of the words, and outputs rows like the `people_mentions` relation.
-The Python implementation `udf/ext_people.py` takes the input as tab-separated values in each line and outputs in the same format.
-
-Then this user-defined function `ext_people` can be called in the following way to add more tuples to `people_mentions`, taking triples from the `sentences` relation:
-
-In a similar way, we can have another UDF map candidate relationships and supervise them.
-
-### Feature Extraction Rules
-We also use a UDF to extract features for the candidate relationships.
-
-### Inference Rules
-Now we need to generate variables from the candidate relation. This is done through
-the *scoping rule*/*supervision rule* below. In the scoping rule, variables are generated from the
-body and made distinct on the key given in the head. The supervision for the variables
-is annotated with `@label(l)`.
-
-Finally, we define a binary classifier for our boolean variable `has_spouse`.
-In this rule, we define a classifier based on the features, and the weight is tied
-to the feature.
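The code blocks of the deleted `doc/ddlog.md` do not survive in this patch, so the prose above refers to listings that are no longer visible. For orientation only, here is a minimal sketch of the kind of DDlog the deleted text describes, modeled on the spouse example; the exact relation schemas and column names are assumptions, not the file's original contents:

```
# Source relation holding the raw text of every document (schema declaration).
articles(
  article_id text,
  text       text).

# Candidate person mentions, produced by a Python UDF.
people_mentions(
  sentence_id text,
  mention_id  text,
  text        text).

# UDF declaration: rows are exchanged with udf/ext_people.py as
# tab-separated lines. A sentences relation with these three columns
# is assumed to be declared elsewhere.
function ext_people
  over (sentence_id text, words text[], ner_tags text[])
  returns rows like people_mentions
  implementation "udf/ext_people.py" handles tsv lines.

# Candidate mapping rule: call the UDF over triples from sentences.
people_mentions += ext_people(sentence_id, words, ner_tags) :-
  sentences(sentence_id, words, ner_tags).

# Variable relation: one Boolean random variable per candidate pair.
has_spouse?(p1_id text, p2_id text).

# Scoping/supervision rule: variables are made distinct on the head key,
# and the label column l supervises them.
@label(l)
has_spouse(p1_id, p2_id) :- has_spouse_candidates(p1_id, p2_id, l).

# Inference rule: a binary classifier whose weight is tied to the feature f.
@weight(f)
has_spouse(p1_id, p2_id) :-
  has_spouse_candidates(p1_id, p2_id, _),
  has_spouse_features(p1_id, p2_id, f).
```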
diff --git a/doc/example-chunking.md b/doc/example-chunking.md
index d0b77870b..988e12331 100644
--- a/doc/example-chunking.md
+++ b/doc/example-chunking.md
@@ -7,7 +7,7 @@ title: Text Chunking Example
 
 ## Introduction
 
-In this document, we will describe an example application of text chunking using DeepDive, and demonstrate how to use **Multinomial variables**. This example assumes a working installation of DeepDive, and basic knowledge of how to build an application in DeepDive. Please go through the [example application walkthrough](walkthrough.md) before preceding.
+In this document, we will describe an example application of text chunking using DeepDive, and demonstrate how to use **Multinomial variables**. This example assumes a working installation of DeepDive and basic knowledge of how to build an application in DeepDive. Please go through the [tutorial with the spouse example application](example-spouse.md) before proceeding.
 
 Text chunking consists of dividing a text into syntactically correlated parts of words. For example, the following sentence:
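Where the chunking example declares its variable relation, it uses a categorical (multinomial) variable rather than the Boolean variables of the spouse example. A hedged DDlog sketch of that pattern follows; the relation names and the arity 13 are illustrative assumptions, not necessarily the example's exact schema:

```
# One categorical variable per word, ranging over a fixed set of chunk tags.
tag?(word_id bigint) Categorical(13).

# Supervision: the training tag comes from a label column.
@label(l)
tag(word_id) :- words(word_id, l).

# A classifier over per-word features; the weight is tied to the feature.
@weight(f)
tag(word_id) :- word_features(word_id, f).
```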
diff --git a/doc/example-smoke.md b/doc/example-smoke.md
index 5b61e8cbd..2ff83b657 100644
--- a/doc/example-smoke.md
+++ b/doc/example-smoke.md
@@ -63,7 +63,7 @@ deepdive {
 ```
 
 ### Setting Up
-Before running the example, please check that DeepDive has been properly [installed](http://deepdive.stanford.edu/doc/basics/installation.html) and the necessary files (app.ddlog, db.url, and deepdive.conf) and directory (input/) that are associated with this example are stored in the current working directory. Input directory should have data files (friends.tsv, person_has_cancer.tsv, person_smokes.tsv, and person.tsv). In order to use DeepDive a database instance must be running to accept requests, and the database location must be specified in the db.url. You can refer to the detailed [walkthrough](http://deepdive.stanford.edu/doc/basics/walkthrough/walkthrough.html) to setup the environemnt.
+Before running the example, please check that DeepDive has been properly [installed](http://deepdive.stanford.edu/doc/basics/installation.html) and that the necessary files (app.ddlog, db.url, and deepdive.conf) and the directory (input/) associated with this example are stored in the current working directory. The input directory should contain the data files (friends.tsv, person_has_cancer.tsv, person_smokes.tsv, and person.tsv). In order to use DeepDive, a database instance must be running to accept requests, and the database location must be specified in db.url. You can refer to the detailed [tutorial](example-spouse.md) to set up the environment.
 
 ### Running
diff --git a/doc/generating_negative_examples.md b/doc/generating_negative_examples.md
index ae7a25cb8..57c13d479 100644
--- a/doc/generating_negative_examples.md
+++ b/doc/generating_negative_examples.md
@@ -43,5 +43,5 @@ For example, most people mention pairs in sentences are not spouses, so we can r
 
-To see an example of how we generate negative evidence in DeepDive, refer to the [example application walkthrough](walkthrough.md#candidate_relations).
+To see an example of how we generate negative evidence in DeepDive, refer to the [example application tutorial](example-spouse.md#1-3-extracting-candidate-relation-mentions).
 
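To make the distant-supervision idea in the hunk above concrete, here is a DDlog-style sketch of how positive and negative evidence might be generated; every relation name here (`spouse_candidate`, `known_spouses`, `known_non_spouses`, `spouse_label`) is an illustrative assumption rather than part of the example application:

```
# Positive evidence: candidate pairs that appear in a curated list of
# known married couples are labeled true.
spouse_label(p1, p2, true) :-
  spouse_candidate(p1, p2),
  known_spouses(p1, p2).

# Negative evidence: candidate pairs known to be in some other relationship
# (e.g. parent/child) are assumed not to be spouses and labeled false.
spouse_label(p1, p2, false) :-
  spouse_candidate(p1, p2),
  known_non_spouses(p1, p2).
```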
diff --git a/doc/incremental.md b/doc/incremental.md
index fa0f5cb44..f531df4dc 100644
--- a/doc/incremental.md
+++ b/doc/incremental.md
@@ -5,8 +5,10 @@ title: Incremental Workflow
 
 # Building an Incremental Application
 
-This document describes how to build an incremental application using [DDlog][] and
-DeepDive. The example application is [the spouse example](walkthrough.md).
+Drop this page?
+
+This document describes how to build an incremental application using DDlog and
+DeepDive. The example application is [the spouse example](example-spouse.md).
 This document assumes you are familiar with basic concepts in DeepDive and the
 spouse application tutorial.
@@ -38,7 +40,7 @@ the base part. The workflow can be summarized as follows.
 
 The incremental version of the spouse example is under [`examples/spouse_example/postgres/incremental`](https://github.com/HazyResearch/deepdive/tree/master/examples/spouse_example/postgres/incremental).
 The only difference is that all the arrays are transformed into strings using `array_to_string` with the delimiter `'~^~'` due to DDlog's limited support for array types.
-You can follow the [corresponding section in the original walkthrough](walkthrough.md#loading_data) to load the data.
+You can follow the [corresponding section in the original walkthrough](example-spouse.md#1-1-loading-raw-input-data) to load the data.
 Alternatively, you can try the handy scripts included in the incremental example
 provided in the source tree.
 
 cd examples/spouse_example/postgres/incremental
 
@@ -51,7 +53,7 @@
 ### Writing Application in DDlog
 
 In order to make use of the incremental support of DeepDive, the application must be written in DDlog.
-Please refer to [DDlog tutorial][DDlog] for how to write your DeepDive application in DDlog.
+Please refer to the DDlog tutorial for how to write your DeepDive application in DDlog.
 Let's assume you have put the DDlog program shown below in a file named `spouse_example.f1.ddlog` under the application folder.
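The patch as captured here ends before the DDlog program that the last sentence refers to (and before the hunks for doc/installation.md, doc/labeling.md, doc/opendata/, and doc/paleo.md listed in the diffstat). For orientation, the opening of such a base program, `spouse_example.f1.ddlog`, might look like the sketch below; the column lists are assumptions, and note how array-valued columns are declared as plain `text` because, as the hunk above explains, arrays are flattened with `array_to_string` and the `'~^~'` delimiter:

```
# Source relation holding the raw article text.
articles(
  article_id text,
  text       text).

# words and ner_tags would naturally be text[] columns; in the incremental
# example they are '~^~'-delimited strings produced by array_to_string.
sentences(
  document_id text,
  sentence    text,
  words       text,
  ner_tags    text,
  sentence_id text).
```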