Whole document embedding #152

Hi there,

I was wondering whether it makes sense to "trick" LASER into treating a whole document made up of multiple sentences as a single sentence. That way I'd get a whole-document embedding and wouldn't need to devise an aggregation method.

I know there's a limit of 12000 tokens on sentences (as per https://github.com/facebookresearch/LASER/blob/5767f189fb4b87227e9a5a36e03ace108dc3cc2f/source/embed.py#L343), but let's set that aside for now, please :)
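
To make the question concrete, here is a rough sketch of the two options being compared. Note that `embed_sentences` is a hypothetical stand-in that returns deterministic pseudo-random vectors; it is not LASER's API, and the dimension of 8 is arbitrary. Only the shape of the two approaches is meant to carry over.

```python
import random


def embed_sentences(sentences, dim=8):
    """Hypothetical stand-in for a sentence encoder (NOT LASER's API).

    Returns one deterministic pseudo-embedding (a list of floats) per string.
    """
    vectors = []
    for text in sentences:
        rng = random.Random(text)  # same text always gives the same vector
        vectors.append([rng.gauss(0.0, 1.0) for _ in range(dim)])
    return vectors


def embed_document_as_one_sentence(sentences):
    # Option A: join all sentences and feed them to the encoder as one "sentence".
    joined = " ".join(s.strip() for s in sentences)
    return embed_sentences([joined])[0]


def embed_document_by_mean_pooling(sentences):
    # Option B: embed each sentence separately, then average per dimension.
    vectors = embed_sentences(sentences)
    return [sum(column) / len(vectors) for column in zip(*vectors)]


doc = ["LASER maps sentences to vectors.", "A document has many sentences."]
vec_a = embed_document_as_one_sentence(doc)
vec_b = embed_document_by_mean_pooling(doc)
```

With a real encoder, option A needs no aggregation step but is bound by whatever input-length limit the encoder has, while option B is the kind of aggregation method I'd like to avoid having to design.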

