This is an idea for a major feature + refactor I've discussed with @jgonggrijp. The core idea is to support backend "plugins" for analysis / visualisations, and to convert the existing visualisations into separate plugins.
There are a few things I'd hope to accomplish with this:
Modularise the application
Make it possible to build integration of I-analyzer with other applications, without losing its generalisability
Make I-analyzer more suitable to be hosted by other teams and interface with their software
How it works
Fundamentally, you would write an independent Python module or package that is responsible for some kind of analysis. Our current visualisations (results count, search term frequency, wordcloud, related words, etc.) would all work as such modules.
When you set up an I-analyzer instance, you list these modules in the backend settings.py, which enables that analysis for your environment.
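As a rough sketch of what that configuration could look like: the setting name ANALYSIS_PLUGINS and the load_plugins helper below are assumptions made for illustration, not existing I-analyzer code. Standard-library modules stand in for real plugin packages so the example runs as-is.

```python
from importlib import import_module
from types import ModuleType

# Hypothetical setting in settings.py: dotted paths of the enabled plugin modules.
# Standard-library modules are used as placeholders for real plugin packages.
ANALYSIS_PLUGINS = [
    "json",        # placeholder for e.g. a wordcloud plugin package
    "statistics",  # placeholder for e.g. a term frequency plugin package
]


def load_plugins(paths: list[str]) -> dict[str, ModuleType]:
    """Import each configured module and index it by its dotted path."""
    return {path: import_module(path) for path in paths}


plugins = load_plugins(ANALYSIS_PLUGINS)
```

Importing by dotted path would keep plugins installable as ordinary packages, so another team could enable their own analysis without touching the I-analyzer code base.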
Of course, modules would need to conform to an API that I-analyzer expects to work with. If you're turning, say, the wordcloud into a plugin, the module should ultimately offer analysis on a set of documents for which the user has made a query. You could end up with the following endpoints:
Some metadata about the visualisations offered: name, description
A method that determines whether the visualisation should be available in a particular context
A method that returns a specification for a short form to set parameters for the analysis
A method that takes a query, an Elasticsearch client, and the configured parameters (per the specification above), and returns results (more on that under "Results format")
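To make the list above a bit more concrete, here is a minimal sketch of such an interface. Everything in it is hypothetical: the class and method names, the shape of the query dict (an "index" plus an "es_query"), and the stub client. The stub assumes a count(index=..., query=...) method that returns a dict with a "count" key, which is how I understand the official Python client to behave.

```python
from abc import ABC, abstractmethod
from typing import Any


class AnalysisPlugin(ABC):
    """Hypothetical base class covering the four endpoints listed above."""

    name: str = ""
    description: str = ""

    @abstractmethod
    def is_available(self, context: dict[str, Any]) -> bool:
        """Whether the analysis should be offered in this context."""

    @abstractmethod
    def form_specification(self) -> list[dict[str, Any]]:
        """Describe the form fields used to set parameters."""

    @abstractmethod
    def analyse(
        self, query: dict[str, Any], es_client: Any, parameters: dict[str, Any]
    ) -> dict[str, Any]:
        """Run the analysis and return results."""


class ResultsCount(AnalysisPlugin):
    name = "Results count"
    description = "Number of documents matching the query."

    def is_available(self, context):
        # Assumption: the context names the corpus the user is searching.
        return context.get("corpus") is not None

    def form_specification(self):
        return []  # this analysis has no parameters

    def analyse(self, query, es_client, parameters):
        response = es_client.count(index=query["index"], query=query["es_query"])
        return {"count": response["count"]}


class StubClient:
    """Stand-in for an Elasticsearch client so the sketch runs without a cluster."""

    def count(self, index, query):
        return {"count": 42}


plugin = ResultsCount()
result = plugin.analyse(
    {"index": "example-corpus", "es_query": {"match_all": {}}}, StubClient(), {}
)
```

Passing the client in, rather than letting the plugin create one, would keep connection settings in I-analyzer and make plugins easy to test with a stub like the one above.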
Right now, we have two types of analysis that I'd want to convert to this plugin structure, namely:
Analysis on a documents query
Analysis on a word model query
For generalisability, I would add a third option, namely:
Analysis on a single document
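One way to represent these three types, sketched under the assumption that each plugin declares which kind of input it analyses. The enum, the PluginInfo record and the registered names are all illustrative.

```python
from dataclasses import dataclass
from enum import Enum


class AnalysisTarget(Enum):
    """The three kinds of input a plugin could analyse (hypothetical names)."""

    DOCUMENTS_QUERY = "documents_query"
    WORD_MODEL_QUERY = "word_model_query"
    SINGLE_DOCUMENT = "single_document"


@dataclass(frozen=True)
class PluginInfo:
    name: str
    target: AnalysisTarget


def plugins_for(target: AnalysisTarget, plugins: list[PluginInfo]) -> list[PluginInfo]:
    """Select the plugins that apply to the given kind of input."""
    return [plugin for plugin in plugins if plugin.target is target]


REGISTERED = [
    PluginInfo("wordcloud", AnalysisTarget.DOCUMENTS_QUERY),
    PluginInfo("example word model plugin", AnalysisTarget.WORD_MODEL_QUERY),
]

document_plugins = plugins_for(AnalysisTarget.DOCUMENTS_QUERY, REGISTERED)
```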
Results format
This is a tricky question. In our current visualisations (or at least the most neatly structured ones), we return JSON with the data (e.g. a value per year), and let the frontend figure out how to turn that into an interactive chart.
You could use this approach and generalise the data format somewhat, but it's quite limited. You can only use visualisations that we've written frontend support for, so you can't write a plugin for a network or map visualisation until we add that to the frontend.
My proposal would be that backend modules return JSON specifications of visualisations using the Vega / Vega-Lite grammar. Vega is a JavaScript visualisation library, but importantly for us, it is entirely declarative, so you can fully define (interactive!) visualisations in a JSON object. Vega also supports a wide range of visualisation types (see their examples page).
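For illustration, a module that currently returns a value per year could wrap that data in a Vega-Lite specification like this. The function name and the field names "year" and "count" are my own choices; the top-level keys ($schema, data, mark, encoding) follow the Vega-Lite grammar.

```python
import json


def bar_chart_spec(values_per_year: dict[int, int]) -> dict:
    """Build a Vega-Lite bar chart specification from a value per year."""
    rows = [
        {"year": year, "count": count}
        for year, count in sorted(values_per_year.items())
    ]
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "description": "Number of results per year",
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": "year", "type": "ordinal"},
            "y": {"field": "count", "type": "quantitative"},
        },
    }


spec = bar_chart_spec({1901: 30, 1900: 12})
spec_json = json.dumps(spec)  # what the backend would send to the frontend
```

The frontend would then only need a generic Vega renderer, rather than dedicated code per chart type.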
I imagine we'll also want the module to present results in a format suitable for table data / CSV downloads, but that will be the smaller hurdle.
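A minimal sketch of the tabular side, assuming the module exposes its results as a list of flat rows (dicts with identical keys). The helper name is hypothetical.

```python
import csv
import io


def rows_to_csv(rows: list[dict]) -> str:
    """Serialise flat result rows to CSV text, using the first row's keys as header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv([{"year": 1900, "count": 12}, {"year": 1901, "count": 30}])
```

If the same rows also feed the chart specification, the table and the chart cannot drift apart.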
An even more powerful option would be for modules to return a web component to embed in our frontend. That gives you a lot of power, but it adds complexity to both supporting and developing such modules.
For single-document analysis, you could also consider an option to return annotations on the text.
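One possible shape for such annotations, purely as an assumption on my part: character offsets into the document text plus a label. The simple term matcher below only serves to produce some example annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int  # character offset where the span begins (inclusive)
    end: int    # character offset where the span ends (exclusive)
    label: str


def annotate_term(text: str, term: str, label: str) -> list[Annotation]:
    """Mark every non-overlapping, case-sensitive occurrence of `term` in `text`."""
    if not term:
        return []
    annotations = []
    position = text.find(term)
    while position != -1:
        end = position + len(term)
        annotations.append(Annotation(position, end, label))
        position = text.find(term, end)
    return annotations


annotations = annotate_term("the cat saw the dog", "the", "determiner")
```

Offsets leave the original text untouched, so the frontend can decide how to highlight the spans.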
Extra hooks
We may want to consider adding extra "hooks" for plugins to interact with I-analyzer. For example, a module might add analysed multifields to an Elasticsearch mapping, or provide extra options in the corpus configuration. You might also consider making other features plugin-based. None of that is immediately relevant, though.
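As an example of the mapping hook, a plugin could be handed the mapping and return an extended copy. The function name and the way the hook is called are assumptions; the "fields" key is how Elasticsearch mappings declare multifields, and "english" is one of its built-in language analyzers.

```python
from copy import deepcopy


def add_multifield(mapping: dict, field: str, subfield: str, definition: dict) -> dict:
    """Return a copy of the mapping with an extra multifield on `field`."""
    updated = deepcopy(mapping)  # leave the caller's mapping untouched
    properties = updated.get("properties", {})
    if field not in properties:
        raise KeyError(f"field {field!r} is not defined in the mapping")
    properties[field].setdefault("fields", {})[subfield] = definition
    return updated


base_mapping = {"properties": {"content": {"type": "text"}}}
extended = add_multifield(
    base_mapping, "content", "stemmed", {"type": "text", "analyzer": "english"}
)
```

Returning a copy rather than mutating in place would let I-analyzer apply several plugins in sequence and still inspect the mapping it started with.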
Labels: enhancement (improvements to user functionality), major (major changes to functionality and/or the code base), code quality (code & performance improvements that do not affect user functionality)
This discussion was converted from issue #1340 on February 08, 2024 13:18.