-doc = nlp("Discharge Date: November 15, 2008. Patient had temp reading of 102.6 degrees.")
+doc = nlp("Discharge Date: 11/15/2008. Patient had temp reading of 102.6 degrees.")
 for e in doc.ents:
     if e._.value_extract:
         print(e.text, e.label_, e._.value_extract)
-## Discharge Date DISCHARGE_DATE November 15, 2008
+## Discharge Date DISCHARGE_DATE 11/15/2008
 ## temp reading TEMP_READING 102.6 degrees
 ```

 ### Value Extraction patterns
-There are two options for extracting values: n tokens and first found pattern.
-
-#### N Tokens
-This method will return n tokens past an entity of interest.
+Returns all patterns within n tokens of entity of interest or within the same sentence. It relies on [spaCy token matching syntax](https://spacy.io/usage/rule-based-matching#matcher).

-**Note:**
-* if the immediate next token is whitespace or punctuation, it will be skipped.
-* if the span of n tokens is part of an entity, the entire entity will be returned, even if it is past n tokens
This method will return the first found pattern past an entity of interest within n tokens or within the same sentence. It relies on [spaCy token matching syntax](https://spacy.io/usage/rule-based-matching#matcher).
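The windowing idea described above can be sketched without spaCy. This is a simplified illustration, not extractacy's implementation: it assumes whitespace tokenization and a regular expression in place of spaCy's `Matcher`, it ignores the same-sentence scope, and the names `find_values`, `clean`, and `DATE` are hypothetical.

```python
import re

# Hypothetical pattern for dates such as 11/15/2008.
DATE = r"\d{1,2}/\d{1,2}/\d{4}"


def clean(token):
    """Drop leading and trailing punctuation left over from whitespace tokenization."""
    return token.strip(".,:;")


def find_values(tokens, ent_start, ent_end, pattern, n):
    """Return every token matching `pattern` within n tokens of the
    entity span [ent_start, ent_end), looking both left and right."""
    lo = max(0, ent_start - n)            # stay inside the document start
    hi = min(len(tokens), ent_end + n)    # stay inside the document end
    window = tokens[lo:ent_start] + tokens[ent_end:hi]
    return [clean(t) for t in window if re.fullmatch(pattern, clean(t))]


tokens = "Claim sent 11/15/2008. 12/31/2008: Payment received.".split()

# The entity "Payment received" covers tokens 4 and 5.
near = find_values(tokens, 4, 6, DATE, n=1)
wide = find_values(tokens, 4, 6, DATE, n=2)
print(near)  # ['12/31/2008']
print(wide)  # ['11/15/2008', '12/31/2008']
```

Returning a list rather than a single value mirrors the change in the tests from a string (or `None`) to a list (or `[]`) per entity.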
extractacy/test.py (48 additions & 14 deletions)
@@ -8,14 +8,50 @@ def build_docs():
     docs=list()
     docs.append(
         (
-            "Discharge Date: November 15, 2008. Patient had temp reading of 102.6 degrees. Insurance claim sent to patient's account on file: 1112223.",
+            "Discharge Date: 11/15/2008. Patient had temp reading of 102.6 degrees. Insurance claim sent to patient's account on file: 1112223. 12/31/2008: Payment received.",
             [
-                ("Discharge Date", "November 15, 2008"),
-                ("November 15, 2008", None),
-                ("temp", "102.6 degrees"),
-                ("102.6 degrees", None),
-                ("account", "1112223"),
-                ("1112223", None),
+                ("Discharge Date", ["11/15/2008"]),
+                ("11/15/2008", []),
+                ("temp", ["102.6 degrees"]),
+                ("102.6 degrees", []),
+                ("account", ["1112223"]),
+                ("1112223", []),
+                # ("12/31/2008", []),
+                ("Payment received", ["12/31/2008"])
+            ],
+        )
+    )
+    # testing a case where algorithm attempts to go left of a document start boundary
+    docs.append(
+        (
+            "Payment update: Funds deposited.",
+            [
+                ("Payment update", []),
+            ],
+        )
+    )
+    # testing a case where algorithm attempts to go right of a document end boundary
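The two boundary cases named in the comments above come down to clamping the search window to the document. The sketch below shows that arithmetic under the assumption of a simple token-index window. `window_bounds` is a hypothetical helper, not a function from extractacy.

```python
def window_bounds(ent_start, ent_end, n, doc_len):
    """Clamp an n-token window around the entity span [ent_start, ent_end)
    so it never reaches past either end of the document."""
    lo = max(0, ent_start - n)        # left boundary: document start
    hi = min(doc_len, ent_end + n)    # right boundary: document end
    return lo, hi


tokens = "Payment update: Funds deposited.".split()

# Entity at the very start: the unclamped left edge would be 0 - 5 = -5,
# and a negative index in Python wraps around to the end of the list.
left_case = window_bounds(0, 2, 5, len(tokens))

# Entity at the very end: the unclamped right edge would be 4 + 5 = 9,
# past the last token.
right_case = window_bounds(2, 4, 5, len(tokens))

print(left_case, right_case)
```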