chore(deps): update dependency spacy to v3.5.0 #46
This PR contains the following updates:
spacy: ==3.4.4 -> ==3.5.0
Release Notes
explosion/spaCy
v3.5.0: New CLI commands, language updates, bug fixes and much more
✨ New features and improvements
- `apply` CLI command to annotate new documents with a trained pipeline (#11376).
- `benchmark` CLI command to benchmark pipelines. The new `benchmark speed` subcommand measures the speed of a pipeline, the `benchmark accuracy` subcommand is a new alias for `evaluate` (#11902).
- `find-threshold` CLI command to identify an optimal threshold for classification models (#11280).
- `FUZZY` `Matcher` operator for fuzzy matches based on Levenshtein edit distance. In addition, the `FUZZY` and `REGEX` operators are now supported in combination with `IN`/`NOT_IN` (#11359).
- […] `typer` v0.7.x (#11720), `mypy` 0.990 (#11801) and `typing_extensions` v4.4.x (#12036).
- […] `spacy.ConsoleLogger.v3` with expanded progress tracking (#11972).
- […] `textcat` with `spacy.textcat_scorer.v2` (#11696 and #11971) and `spacy.textcat_multilabel_scorer.v2` (#11820).
- […] `InMemoryLookupKB` (#11268).
- […] `before_update` callback that is invoked at the start of each training step (#11739).
- […] `SpanGroup` (#11380).
- […] `displacy.serve` when the default port is in use (#11948).
- […] `tok2vec` version (#11618).

🔴 Bug fixes
- […] `tok2vec` or `transformer` layer.
- […] `textcat`.
- `Vocab.to_disk` respects the exclude setting for `lookups` and `vectors`.
- […] `SpanGroup` and `Span` objects.

The following changes may require you to update code that is using the relevant functionality:
- […] `textcat` or `textcat_multilabel` model - ensure that values are 0.0 or 1.0 as explained in the docs.
- `KnowledgeBase` is now an abstract class. To use spaCy's default KB implementation, call the constructor of the new `InMemoryLookupKB` instead. If you've written a custom KB that inherits from `KnowledgeBase`, you'll need to implement its abstract methods or inherit from `InMemoryLookupKB` instead.

The following changes may influence the output of your language pipeline or trained models:
- […] `pymorphy3` (#11345, #11811).
- […] `tok2vec` defaults in all components (#11618).
- […] `textcat` and `textcat_multilabel` components (#11698).
- […] `textcat` and `textcat_multilabel` to fix a bug related to `threshold` for `textcat` and to make it possible to score multiple `textcat`/`textcat_multilabel` components in a single pipeline with custom scorers. If no custom scorers are used, the `cat_p/r/f` scores will now only reflect the final component's labels and performance (#11696, #11820).
- […] `token_acc` score to report the intended measure (`# correct tokens / # predicted tokens`, the same as in spaCy v2). The `token_acc` scores for v3.5 will be lower for the same performance because they were incorrectly inflated in v3.0-v3.4. The `token_p/r/f` scores should remain unchanged (#12073).

The following functionality will be changed in the near future, so it's best to start updating your scripts now to make them more generic:
- […] `master` branch to `main`.

📦 Trained pipelines updates
- […] `IS_SPACE` as a `tok2vec` feature for `tagger` and `morphologizer` components to improve tagging of non-whitespace vs. whitespace tokens.
- […] `spacy-transformers` v1.2, which uses the exact alignment from `tokenizers` for fast tokenizers instead of the heuristic alignment from `spacy-alignments`. For all trained pipelines except `ja_core_news_trf`, the alignments between spaCy tokens and transformer tokens may be slightly different. More details about the `spacy-transformers` changes are in the v1.2.0 release notes.

📖 Documentation and examples
- […] `biluo_to_iob` and `iob_to_biluo` functions.

👥 Contributors
@aaronzipp, @adrianeboyd, @albertvillanova, @ArchiDevil, @cfuerbachersparks, @damian-romero, @danieldk, @darigovresearch, @DSLituiev, @essenmitsosse, @gremur, @honnibal, @ines, @jmyerston, @JosPolfliet, @kadarakos, @koaning, @kwhumphreys, @ljvmiranda921, @MarcoGorelli, @orglce, @pmbaumgartner, @polm, @richardpaulhudson, @rmitsch, @ryndaniels, @shadeMe, @svlandeg, @thomashacker, @TrellixVulnTeam, @wannaphong, @zhiiw, @zrpxx
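
For reviewers weighing whether the new `FUZZY` operator affects existing `Matcher` patterns: the release notes describe it as based on Levenshtein edit distance. The following is a minimal, stdlib-only sketch of that distance measure, not spaCy's implementation. The helper names `levenshtein` and `fuzzy_equal` and the `max_edits` parameter are hypothetical and chosen for illustration only; spaCy's actual thresholds and defaults are not reproduced here.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def fuzzy_equal(token: str, target: str, max_edits: int) -> bool:
    """Hypothetical helper: accept `token` if it is within
    `max_edits` edits of `target` (case-sensitive)."""
    return levenshtein(token, target) <= max_edits
```

Under this sketch, a misspelling such as "definitly" is one edit away from "definitely", so it would be accepted with `max_edits=1` and rejected by an exact match.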
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR has been generated by Mend Renovate. View repository job log here.