Skip to content

Commit ebe88f1

Browse files
Implement lookup_citation model to deal w/ lambda
** Why are these changes being introduced: We need a way for TACOS to communicate with the new detector lambda that serves the next iteration of our citation detector. ** Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/tco-183 ** How does this address that need: * The big change is that we add a lookup_citation model that handles communication with the lambda. The patterns used are similar to our other external integrations (the other lookup_ classes) - although this is not consistently implemented. * A secondary change is to define a fourth instance variable on the citation model, for @features. This is compiled from the existing work, reformatting and joining two other instance variables. * There is a mismatch between the features generated by the existing citation class and those expected by the new detector. Two features have different names, while one feature is not used at all. That change is handled via the extract_features method on lookup_citation, in order to make these changes as close as possible to the external call. ** Document any side effects to this change: * Using a lookup_* pattern for this is a departure from the way the other classes are used, and I'm tempted to refactor this PR in order to adopt a gateway pattern instead. The other lookup classes are used by GraphQL to look up external information after a detector has fired - while this lookup_citation is much earlier in the application flow, being part of the detector itself.
1 parent d2d6163 commit ebe88f1

File tree

8 files changed

+251
-3
lines changed

8 files changed

+251
-3
lines changed

.env.test

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,3 +3,6 @@ LINKRESOLVER_BASEURL=https://mit.primo.exlibrisgroup.com/discovery/openurl?insti
33
TACOS_EMAIL=[email protected]
44
LIBKEY_KEY=FAKE_LIBKEY_KEY
55
LIBKEY_ID=FAKE_LIBKEY_ID
6+
DETECTOR_LAMBDA_URL=http://localhost:3000
7+
DETECTOR_LAMBDA_PATH=/foo
8+
DETECTOR_LAMBDA_CHALLENGE_SECRET=secret_phrase

app/models/detector/citation.rb

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ class Detector
1010
# hallmarks of being a citation.
1111
# Phrases whose score is higher than the REQUIRED_SCORE value can be registered as a Detection.
1212
class Citation
13-
attr_reader :score, :subpatterns, :summary
13+
attr_reader :features, :score, :subpatterns, :summary
1414

1515
# shared singleton methods
1616
extend Detector::BulkChecker
@@ -67,10 +67,13 @@ def detection?
6767
# @return Nothing intentional. Data is written to Hashes `@subpatterns`, `@summary`,
6868
# and `@score` during processing.
6969
def initialize(phrase)
70+
@features = {}
7071
@subpatterns = {}
7172
@summary = {}
7273
pattern_checker(phrase)
7374
summarize(phrase)
75+
@features = @subpatterns.deep_dup.transform_values(&:length).merge(summary)
76+
@subpatterns.delete_if { |_, v| v == [] }
7477
@score = calculate_score
7578
end
7679

@@ -141,7 +144,7 @@ def commas(phrase)
141144
# @return hash
142145
def pattern_checker(phrase)
143146
CITATION_PATTERNS.each_pair do |type, pattern|
144-
@subpatterns[type.to_sym] = scan(pattern, phrase) if scan(pattern, phrase).present?
147+
@subpatterns[type.to_sym] = scan(pattern, phrase)
145148
end
146149
end
147150

app/models/lookup_citation.rb

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,97 @@
1+
# frozen_string_literal: true
2+
3+
class LookupCitation
4+
# The info method is the way to return information about whether a given phrase is a citation. It consults an
5+
# external lambda service (address in env) and returns either a true or a false. The default if anything goes wrong
6+
# is to return false.
7+
#
8+
# @return Boolean or nil
9+
def info(phrase)
10+
return unless expected_env?
11+
12+
external_data = fetch(phrase)
13+
return if external_data == 'Error'
14+
15+
external_data
16+
end
17+
18+
private
19+
20+
def lambda_path
21+
ENV.fetch('DETECTOR_LAMBDA_PATH', nil)
22+
end
23+
24+
def lambda_secret
25+
ENV.fetch('DETECTOR_LAMBDA_CHALLENGE_SECRET', nil)
26+
end
27+
28+
def lambda_url
29+
ENV.fetch('DETECTOR_LAMBDA_URL', nil)
30+
end
31+
32+
# define_lambda connects to the detector lambda.
33+
#
34+
# @return Faraday connection
35+
def define_lambda
36+
Faraday.new(
37+
url: lambda_url,
38+
params: {}
39+
)
40+
end
41+
42+
# define_payload defines the Hash that will be sent to the lambda.
43+
#
44+
# @return Hash
45+
def define_payload(phrase)
46+
{
47+
action: 'predict',
48+
features: extract_features(phrase),
49+
challenge_secret: lambda_secret
50+
}
51+
end
52+
53+
# expected_env? confirms that all three required environment variables are defined.
54+
#
55+
# @return Boolean
56+
def expected_env?
57+
Rails.logger.error('No lambda URL defined') if lambda_url.nil?
58+
59+
Rails.logger.error('No lambda path defined') if lambda_path.nil?
60+
61+
Rails.logger.error('No lambda secret defined') if lambda_secret.nil?
62+
63+
[lambda_url, lambda_path, lambda_secret].all?(&:present?)
64+
end
65+
66+
# extract_features passes the search phrase through the citation detector, and massages the resulting features object
67+
# to correspond with what the lambda expects.
68+
#
69+
# @return Hash
70+
def extract_features(phrase)
71+
features = Detector::Citation.new(phrase).features
72+
features[:apa] = features.delete :apa_volume_issue
73+
features[:year] = features.delete :year_parens
74+
features.delete :characters
75+
features
76+
end
77+
78+
# Fetch handles the communication with the detector lambda: defining the connection, building the payload, and any
79+
# error handling with the response.
80+
#
81+
# @return Boolean or 'Error'
82+
def fetch(phrase)
83+
lambda = define_lambda
84+
payload = define_payload(phrase)
85+
86+
response = lambda.post(lambda_path, payload.to_json)
87+
88+
if response.status == 200
89+
JSON.parse(response.body)['response'] == 'true'
90+
else
91+
Rails.logger.error(response.body)
92+
Rails.logger.error(response.body['error'])
93+
94+
'Error'
95+
end
96+
end
97+
end

test/models/detector/citation_test.rb

Lines changed: 26 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,10 +4,12 @@
44

55
class Detector
66
class CitationTest < ActiveSupport::TestCase
7-
test 'detector::citation exposes three instance variables' do
7+
test 'detector::citation exposes four instance variables' do
88
t = terms('citation')
99
result = Detector::Citation.new(t.phrase)
1010

11+
assert_predicate result.features, :present?
12+
1113
assert_predicate result.score, :present?
1214

1315
assert_predicate result.summary, :present?
@@ -196,6 +198,29 @@ class CitationTest < ActiveSupport::TestCase
196198
assert_operator 0, :<, result.score
197199
end
198200

201+
test 'features instance method is a hash of integers' do
202+
result = Detector::Citation.new('simple search phrase')
203+
204+
assert_instance_of(Hash, result.features)
205+
206+
assert(result.features.all? { |_, v| v.integer? })
207+
end
208+
209+
test 'features instance method includes all elements of citation detector regardless of search string' do
210+
result_simple = Detector::Citation.new('simple')
211+
result_complex = Detector::Citation.new('Science Education and Cultural Diversity: Mapping the Field. Studies in Science Education, 24(1), 49–73.')
212+
213+
assert_equal result_simple.features.length, result_complex.features.length
214+
end
215+
216+
test 'features instance method should include all elements of citation patterns and summary thresholds' do
217+
patterns = Detector::Citation.const_get :CITATION_PATTERNS
218+
summary = Detector::Citation.const_get :SUMMARY_THRESHOLDS
219+
result = Detector::Citation.new('simple')
220+
221+
assert_equal (patterns.length + summary.length), result.features.length
222+
end
223+
199224
test 'detection? convenience method returns true for obvious citations' do
200225
result = Detector::Citation.new(terms('citation').phrase)
201226

test/models/lookup_citation_test.rb

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
# frozen_string_literal: true
2+
3+
require 'test_helper'
4+
5+
class LookupCitationTest < ActiveSupport::TestCase
6+
test 'DETECTOR_LAMBDA_CHALLENGE_SECRET is required' do
7+
ClimateControl.modify DETECTOR_LAMBDA_CHALLENGE_SECRET: nil do
8+
assert_nil(LookupCitation.new.info('ping'))
9+
end
10+
end
11+
12+
test 'DETECTOR_LAMBDA_PATH is required' do
13+
ClimateControl.modify DETECTOR_LAMBDA_PATH: nil do
14+
assert_nil(LookupCitation.new.info('ping'))
15+
end
16+
end
17+
18+
test 'DETECTOR_LAMBDA_URL is required' do
19+
ClimateControl.modify DETECTOR_LAMBDA_URL: nil do
20+
assert_nil(LookupCitation.new.info('ping'))
21+
end
22+
end
23+
24+
test 'lookup returns true when lambda running' do
25+
# These cassettes should be regenerated once the lambda is running in AWS. For now it will need to be running
26+
# on localhost should the cassettes need to be regenerated.
27+
VCR.use_cassette('lambda running') do
28+
prediction = LookupCitation.new.info('ping')
29+
30+
assert(prediction)
31+
end
32+
end
33+
34+
test 'lookup returns nil when challenge_secret is wrong' do
35+
ClimateControl.modify DETECTOR_LAMBDA_CHALLENGE_SECRET: 'something wrong' do
36+
VCR.use_cassette('lambda with wrong secret') do
37+
prediction = LookupCitation.new.info('oops')
38+
39+
assert_nil(prediction)
40+
end
41+
end
42+
end
43+
end

test/test_helper.rb

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@
1515
require 'rails/test_help'
1616

1717
VCR.configure do |config|
18+
config.ignore_localhost = false
1819
config.cassette_library_dir = 'test/vcr_cassettes'
1920
config.hook_into :webmock
2021

test/vcr_cassettes/lambda_running.yml

Lines changed: 37 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

test/vcr_cassettes/lambda_with_wrong_secret.yml

Lines changed: 39 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)