add prompts documentation
Fabio9705 committed Sep 18, 2024
1 parent d563222 commit d9a9784
Showing 5 changed files with 638 additions and 0 deletions.
205 changes: 205 additions & 0 deletions docs/prompts/completion.md
@@ -0,0 +1,205 @@
---
sidebar_position: 3
---

# Completion

This page shows the **Completion** prompts created to perform the cell entity annotation (CEA) task on tables, together with the responses obtained from GPT-4.

## Completion Prompt (no example and no pool)

**Input prompt**:

```
Given a table T, it is possible to identify each of its elements through a pair of indices (i, j), where
i is the row index and j is the column index.
The elements of the table can be associated with a Wikidata ID if they have one.
Given the table
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
the association between the elements of the table and the Wikidata’s IDs is:
```

**Response obtained from GPT-4**:

```
Result =
(0, 0): Q7158
(0, 1): Q12197
(0, 2): Q12348
(0, 3): Q12198
(1, 0): Q18286
(1, 1): Q1204
(1, 2): Q1867
(1, 3): Q1849
(2, 0): Q17050
(2, 1): Q1348
(2, 2): Q9245
(2, 3): Q13357
(3, 0): Q513
(3, 1): Q513
(3, 2): Q1867
(3, 3): Q1849
The table T can be represented as a list of lists in Python as follows:
T = [['Mont Blanc', '4810
```
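
The response above mixes annotation lines with a truncated continuation, and the responses on this page use different separators (`:`, `=` and `->`). The following is a minimal sketch of how such output could be turned into a cell-to-ID mapping; the function name `parse_annotations` and the regular expression are illustrative assumptions, not part of the original experiment.

```python
import re
from typing import Dict, Optional, Tuple

# Matches lines such as "(0, 0): Q7158", "(1, 1) = Q1216" or "(0, 1) -> None".
# The pattern is an assumption based on the response formats shown on this page.
CELL_LINE = re.compile(r"^\((\d+),\s*(\d+)\)\s*(?::|=|->)\s*(Q\d+|None)\s*$")


def parse_annotations(response: str) -> Dict[Tuple[int, int], Optional[str]]:
    """Collect (row, column) -> Wikidata ID pairs, ignoring any other text."""
    annotations: Dict[Tuple[int, int], Optional[str]] = {}
    for line in response.splitlines():
        match = CELL_LINE.match(line.strip())
        if match is None:
            # Skips "Result =" and any trailing text the model kept generating.
            continue
        row, col, qid = match.groups()
        annotations[(int(row), int(col))] = None if qid == "None" else qid
    return annotations


response = """Result =
(0, 0): Q7158
(0, 1): Q12197
The table T can be represented as a list of lists in Python as follows:
T = [['Mont Blanc', '4810
"""
print(parse_annotations(response))
```

The indices are kept exactly as the model wrote them; whether they are 0-based or 1-based has to be decided separately for each response.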

## Completion Prompt with example (no pool)

**Input prompt**:

```
Given a table T, it is possible to identify each of its elements through a pair of indices (i, j), where
i is the row index and j is the column index.
The elements of the table can be associated with a Wikidata ID if they have one.
For example, if we had the cell (2, 5) in table T containing the entity ”Milano,” and in Wikidata,
we had the entity ”Q490 Milano,” then through CEA, we associate (2, 5) = Q490
Given the table
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
the association between the elements of the table and the Wikidata’s IDs is:
```

**Response obtained from GPT-4**:

```
Result =
(1, 1) = Q1216
(1, 3) = Q1234
(2, 1) = Q1823
(2, 3) = Q1819
(3, 1) = Q8917
(3, 4) = Q170787
(4, 1) = Q513
(4, 3) = Q1819
# TODO: implement this function
#raise NotImplementedError()
wikidata = {}
for i in range(len(table)):
for j in range(len(table[i])):
if table[i][j] != "":
wikidata[(i+1,j+1)] = cea(table[i][j])
return wikidata
```

## Completion Prompt with pool (no example)

**Input prompt**:

```
Given a table T, it is possible to identify each of its elements through a pair of indices (i, j), where
i is the row index and j is the column index.
The elements of the table can be associated with a Wikidata ID if they have one.
The Wikidata IDs are in a pool P.
Given the table
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
and the pool
P =
Q513 Everest, Q524 Vesuvio, Q583 Mont Blanc
Q791 Taumatawhakatangihangakōauauotamateapōkaiwhenuakitānatahu, Q1286 Alps, Q1637969 K2
Q1373 Matterhorn, Q18869 Caucasus, Q1451 Pinatubo, Q1484 Mayon Volcano
Q5469 Karakoram, Q17009782 East African Mountain, Q3428 Bungsberg, Q3660 Mount Davis,
Q2353 Topfer, Q265406 Kilimanjaro, Q3309 Pec, Q3375 Zugspitze, Q1262 Graian Alps
Q3388 Grossglockner, Q43105 Mount Elbrus, Q655495 Tanzania, Q3403 Dufourspitze
Q4049 Medvědí vrch, Q4052 Dlouhé stráně, Q686902 Himalayas
the association between the elements of the table and the IDs in the pool is:
```

**Response obtained from GPT-4**:

```
(0, 0) -> Q583
(0, 1) -> None
(0, 2) -> Q1286
(0, 3) -> Q1262
(1, 0) -> Q1637969
(1, 1) -> None
(1, 2) -> Q686902
(1, 3) -> Q5469
(2, 0) -> Q265406
(2, 1) -> None
(2, 2) -> Q655495
(2, 3) -> Q17009782
(3, 0) -> Q513
(3, 1) -> None
(3, 2) -> Q686902
(3, 3) -> Q686902
The function should return the table with the Wikidata IDs instead of the names of
```
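
When a pool P is supplied, one simple sanity check is that every returned ID actually belongs to the pool. The sketch below assumes the pool is written as in the prompt above (comma-separated `ID label` entries spread over several lines); `parse_pool` and `ids_outside_pool` are hypothetical helper names introduced only for illustration.

```python
from typing import Dict, Optional, Tuple


def parse_pool(pool_text: str) -> Dict[str, str]:
    """Map each Wikidata ID in the pool text to its label."""
    pool: Dict[str, str] = {}
    # Entries are separated by commas and by line breaks.
    for entry in pool_text.replace("\n", ",").split(","):
        entry = entry.strip()
        if not entry:
            continue  # a trailing comma leaves an empty entry
        qid, _, label = entry.partition(" ")
        pool[qid] = label
    return pool


def ids_outside_pool(
    annotations: Dict[Tuple[int, int], Optional[str]], pool: Dict[str, str]
) -> Dict[Tuple[int, int], str]:
    """Return the annotated cells whose ID does not appear in the pool."""
    return {
        cell: qid
        for cell, qid in annotations.items()
        if qid is not None and qid not in pool
    }


# Two lines of the pool from the prompt above (shortened for the example).
pool = parse_pool(
    "Q513 Everest, Q524 Vesuvio, Q583 Mont Blanc\n"
    "Q5469 Karakoram, Q17009782 East African Mountain, Q3428 Bungsberg, Q3660 Mount Davis,"
)

# A few pairs from the response above, plus one ID (Q12348) taken from the
# no-pool response to show what a violation looks like.
annotations = {(0, 0): "Q583", (0, 1): None, (1, 3): "Q5469", (0, 2): "Q12348"}
print(ids_outside_pool(annotations, pool))
```

Cells annotated with `None` are treated as "no entity" and are never reported.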

## Completion Prompt with pool and with example

**Input prompt**:

```
Given a table T, it is possible to identify each of its elements through a pair of indices (i, j), where
i is the row index and j is the column index.
The elements of the table can be associated with a Wikidata ID if they have one.
For example, if we had the cell (2, 5) in table T containing the entity ”Milano,” and in pool P, we
had the entity ”Q490 Milano,” then through CEA, we associate (2, 5) –>Q490.
The Wikidata IDs are in a pool P.
Given the table
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
and the pool
P =
Q513 Everest, Q524 Vesuvio, Q583 Mont Blanc
Q791 Taumatawhakatangihangakōauauotamateapōkaiwhenuakitānatahu, Q1286 Alps, Q1637969 K2
Q1373 Matterhorn, Q18869 Caucasus, Q1451 Pinatubo, Q1484 Mayon Volcano
Q5469 Karakoram, Q17009782 East African Mountain, Q3428 Bungsberg, Q3660 Mount Davis,
Q2353 Topfer, Q265406 Kilimanjaro, Q3309 Pec, Q3375 Zugspitze, Q1262 Graian Alps
Q3388 Grossglockner, Q43105 Mount Elbrus, Q655495 Tanzania, Q3403 Dufourspitze
Q4049 Medvědí vrch, Q4052 Dlouhé stráně, Q686902 Himalayas
the association between the elements of the table and the IDs in the pool is:
```

**Response obtained from GPT-4**:

```
(0, 0) = Q583
(0, 1) = None
(0, 2) = Q1286
(0, 3) = Q1262
(1, 0) = Q1637969
(1, 1) = None
(1, 2) = Q686902
(1, 3) = Q5469
(2, 0) = Q265406
(2, 1) = None
(2, 2) = Q655495
(2, 3) = Q17009782
(3, 0) = Q513
(3, 1) = None
(3, 2) = Q686902
(3, 3) = Q686902
```
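
The four completion prompts on this page differ only in whether the example sentence and the pool P are included. As a rough sketch of that structure, the function below assembles the variants from a table and an optional pool. The name `build_completion_prompt` and its parameters are assumptions made for illustration; the wording follows the prompts above, but quotation marks, the arrow notation in the example sentence and the line wrapping of the pool are simplified.

```python
from typing import Dict, List, Optional


def build_completion_prompt(
    rows: List[List[str]],
    pool: Optional[Dict[str, str]] = None,
    with_example: bool = False,
) -> str:
    """Assemble a completion-style CEA prompt, leaving the last sentence open."""
    source = "pool P" if pool else "Wikidata"
    lines = [
        "Given a table T, it is possible to identify each of its elements "
        "through a pair of indices (i, j), where",
        "i is the row index and j is the column index.",
        "The elements of the table can be associated with a Wikidata ID if they have one.",
    ]
    if with_example:
        lines.append(
            "For example, if we had the cell (2, 5) in table T containing the entity "
            f'"Milano," and in {source}, we had the entity "Q490 Milano," '
            "then through CEA, we associate (2, 5) = Q490"
        )
    if pool:
        lines.append("The Wikidata IDs are in a pool P.")
    lines += ["Given the table", "T ="]
    lines += [",".join(row) for row in rows]
    if pool:
        lines += ["and the pool", "P ="]
        lines.append(", ".join(f"{qid} {label}" for qid, label in pool.items()))
        lines.append(
            "the association between the elements of the table and the IDs in the pool is:"
        )
    else:
        lines.append(
            "the association between the elements of the table and the Wikidata's IDs is:"
        )
    return "\n".join(lines)


table = [
    ["Mont Blanc", "4810", "Alps", "Graian Alps"],
    ["K2", "8611", "Himalayas", "Karakoram"],
]
print(build_completion_prompt(table, pool={"Q583": "Mont Blanc", "Q1637969": "K2"}))
```

Ending the prompt on an unfinished sentence is what makes it a completion prompt: the model is expected to continue the text with the cell-to-ID associations.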
169 changes: 169 additions & 0 deletions docs/prompts/execution.md
@@ -0,0 +1,169 @@
---
sidebar_position: 1
---

# Execution

This page shows the **Execution** prompts created to perform the cell entity annotation (CEA) task on tables, together with the responses obtained from GPT-4.

## Execution Prompt (no example and no pool)

**Input prompt**:

```
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
Each element in the table T is identified by the indexes couple (i, j) where i is the row index and j
is the column index.
Perform a Cell Entity Annotation task on the table T using Wikidata as a Knowledge Graph
```

**Response obtained from GPT-4**:

```
public static void main(String[] args) throws IOException {
String table = "Mont Blanc,4810,Alps,Graian Alps\n" +
"K2,8611,Himalayas,Karakoram\n" +
"Kilimanjaro,5895,Tanzania,East African Mountains\n" +
"Mount Everest,8848,Himalayas,Himalayas";
String[] rows = table.split("\n");
String[] headers = rows[0].split(",");
List<List<String>> data = new ArrayList<>();
for (int i = 1; i < rows.length; i++) {
String[] row = rows[i].split(",");
data.add(Arrays.asList(row));
}
System.out.println(headers);
System.out.println(data);
// create a Wikidata object
Wikidata wikidata = new Wikidata();
// get the entity for each cell
List<List<Entity>> entities = new ArrayList<>();
```

## Execution Prompt with example (no pool)

**Input prompt**:

```
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
Each element in the table T is identified by the indexes couple (i, j) where i is the row index and j
is the column index.
Perform a Cell Entity Annotation task on the table T using Wikidata as a Knowledge Graph
For example, if we had the cell (2, 5) in table T containing the entity ”Milano,” and in Wikidata,
we had the entity ”Q490 Milano,” then through CEA, we associate (2, 5) –>Q490
```

**Response obtained from GPT-4**:

```
GENERAZIONE
*/
public static void main(String[] args) throws IOException {
String path = "src/main/resources/table.csv";
Table table = new Table(path);
Wikidata wikidata = new Wikidata();
CellEntityAnnotation cea = new CellEntityAnnotation(table, wikidata);
cea.annotate();
System.out.println(table);
}
}
```

## Execution Prompt with pool (no example)

**Input prompt**:

```
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
Each element in the table T is identified by the indexes couple (i, j) where i is the row index and j
is the column index.
P =
Q513 Everest, Q524 Vesuvio, Q583 Mont Blanc
Q791 Taumatawhakatangihangakōauauotamateapōkaiwhenuakitānatahu, Q1286 Alps, Q1637969 K2
Q1373 Matterhorn, Q18869 Caucasus, Q1451 Pinatubo, Q1484 Mayon Volcano
Q5469 Karakoram, Q17009782 East African Mountain, Q3428 Bungsberg, Q3660 Mount Davis,
Q2353 Topfer, Q265406 Kilimanjaro, Q3309 Pec, Q3375 Zugspitze, Q1262 Graian Alps
Q3388 Grossglockner, Q43105 Mount Elbrus, Q655495 Tanzania, Q3403 Dufourspitze
Q4049 Medvědí vrch, Q4052 Dlouhé stráně, Q686902 Himalayas
In the pool P there are a group of wikidata’s ids. Perform a Cell Entity Annotation task on the
table T using the pool
```

**Response obtained from GPT-4**:

```
@Test
public void testCellEntityAnnotation() throws Exception {
String text = "Mont Blanc is the highest mountain in the Alps. K2 is the second highest mountain in the world. Kilimanjaro is the highest mountain in Africa. Mount Everest is the highest mountain in the world.";
List<Table> tables = new ArrayList<>();
Table table = new Table();
table.setDocId("test");
table.setId(0);
table.setCaption("Mountains");
table.setHeader(Arrays.asList("Name", "Height", "Location", "Range"));
List<List<String>> rows = new ArrayList<>();
rows.add(Arrays.asList("Mont Blanc", "4810", "Alps", "Graian Alps"));
rows.add(Arrays.asList("K2", "8611", "Himalayas", "Karakoram"));
rows.add(Arrays.asList("Kilimanjaro", "5895", "Tanzania", "East African Mountains
```

## Execution Prompt with pool and with example

**Input prompt**:

```
T =
Mont Blanc,4810,Alps,Graian Alps
K2,8611,Himalayas,Karakoram
Kilimanjaro,5895,Tanzania,East African Mountains
Mount Everest,8848,Himalayas,Himalayas
Each element in the table T is identified by the indexes couple (i, j) where i is the row index and j
is the column index.
P =
Q513 Everest, Q524 Vesuvio, Q583 Mont Blanc
Q791 Taumatawhakatangihangakōauauotamateapōkaiwhenuakitānatahu, Q1286 Alps, Q1637969 K2
Q1373 Matterhorn, Q18869 Caucasus, Q1451 Pinatubo, Q1484 Mayon Volcano
Q5469 Karakoram, Q17009782 East African Mountain, Q3428 Bungsberg, Q3660 Mount Davis,
Q2353 Topfer, Q265406 Kilimanjaro, Q3309 Pec, Q3375 Zugspitze, Q1262 Graian Alps
Q3388 Grossglockner, Q43105 Mount Elbrus, Q655495 Tanzania, Q3403 Dufourspitze
Q4049 Medvědí vrch, Q4052 Dlouhé stráně, Q686902 Himalayas
In the pool P there are a group of wikidata’s ids.
Perform a Cell Entity Annotation task on the table T using the pool P.
For example, if we had the cell (2, 5) in table T containing the entity ”Milano,” and in pool P, we
had the entity ”Q490 Milano,” then through CEA, we associate (2, 5) –>Q490.
```

**Response obtained from GPT-4**:

```
@Test
public void testCellEntityAnnotation() throws Exception {
String table = "Mont Blanc,4810,Alps,Graian Alps\n" +
"K2,8611,Himalayas,Karakoram\n" +
"Kilimanjaro,5895,Tanzania,East African Mountains\n" +
"Mount Everest,8848,Himalayas,Himalayas";
String pool = "Q513 Everest, Q524 Vesuvio, Q583 Mont Blanc\n" +
"Q791 Taumatawhakatangihangakōauauotamateapōkaiwhenuakitānatahu, Q1286 Alps, Q1637969 K2\n" +
"Q1373 Matterhorn, Q18869 Caucasus, Q1451 Pinatubo, Q1484 Mayon Volcano\n" +
"Q5469 Karakoram, Q17009782 East African Mountain
```
8 changes: 8 additions & 0 deletions docs/prompts/index.mdx
@@ -0,0 +1,8 @@
# Prompts

This section presents the prompts provided to GPT-4 for the cell entity annotation task. The prompts are organized into four categories:

1. **Execution**: the prompt gives the model a clear instruction to perform a specific task on the given data;
2. **Request**: the prompt is structured as a question, asking for the result of performing a specified task on the provided data;
3. **Completion**: the prompt deliberately leaves part of a statement incomplete, tasking the model with filling in the missing information;
4. **Programming**: the prompt is formulated as pseudocode that presents a specific programming logic or structure, which the model is expected to understand and follow to perform the desired task.
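
The Execution and Completion pages each show four variants of their prompt: with or without an example, and with or without a pool. The sketch below enumerates that grid for all four categories. It assumes that the Request and Programming categories follow the same scheme, which is not confirmed here; `PromptVariant` and `prompt_variants` are hypothetical names.

```python
from itertools import product
from typing import List, NamedTuple

CATEGORIES = ["Execution", "Request", "Completion", "Programming"]


class PromptVariant(NamedTuple):
    category: str
    with_example: bool
    with_pool: bool


def prompt_variants() -> List[PromptVariant]:
    """Enumerate every category combined with the example and pool options."""
    return [
        PromptVariant(category, with_example, with_pool)
        # Assumption: each category has the same 2 x 2 grid of options.
        for category, with_example, with_pool in product(
            CATEGORIES, (False, True), (False, True)
        )
    ]


variants = prompt_variants()
for variant in variants:
    print(variant)
```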