lambda.pytorch

PyTorch implementation of LambdaNetworks: Modeling Long-Range Interactions Without Attention.

Lambda Networks apply the associative law of matrix multiplication to reverse the computation order of self-attention, which makes the cost of content interactions linear in the number of positions.
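The reordering can be illustrated with a minimal sketch. It is written in dependency-free Python rather than PyTorch so it runs anywhere, and it is not the code in lambda_layer.py: it ignores the normalization and positional terms of the actual layer and only shows that (Q Kᵀ) V equals Q (Kᵀ V). The names `matmul`, `transpose`, `random_matrix` and the sizes `n`, `k`, `v` are assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def random_matrix(rows, cols, rng):
    """Small integer entries so both orderings compare exactly."""
    return [[rng.randint(-3, 3) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n, k, v = 6, 2, 3  # n positions, key depth k, value depth v (illustrative sizes)
Q = random_matrix(n, k, rng)  # queries, n x k
K = random_matrix(n, k, rng)  # keys,    n x k
V = random_matrix(n, v, rng)  # values,  n x v

# Attention-style order: the n x n map Q K^T is built first,
# so the cost grows quadratically with n.
attention_order = matmul(matmul(Q, transpose(K)), V)

# Reversed order: K^T V is a k x v matrix whose size does not depend on n,
# so the cost grows linearly with n.
content_lambda = matmul(transpose(K), V)
lambda_order = matmul(Q, content_lambda)

assert attention_order == lambda_order
```

Both orderings give the same n x v output; only the size of the intermediate result changes, from n x n to k x v.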

Similar techniques have been used previously in A²-Net and CGNL. A collection of self-attention modules is available in a separate repository, dot-product-attention.

Training Configuration

✓ SGD optimizer, initial learning rate 0.1, momentum 0.9, weight decay 0.0001

✓ 130 epochs, batch size 256, 8x Tesla V100 GPUs, cosine LR decay

✓ label smoothing 0.1
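
The schedule and label smoothing listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the training script used for the checkpoint: it assumes the cosine decay is stepped once per epoch, anneals to zero, and has no warmup, and it assumes the common form of label smoothing that mixes the one-hot target with a uniform distribution. The function names `cosine_lr` and `smoothed_target` are hypothetical.

```python
import math

BASE_LR = 0.1    # initial learning rate from the list above
EPOCHS = 130     # total epochs from the list above
SMOOTHING = 0.1  # label smoothing from the list above

def cosine_lr(epoch, base_lr=BASE_LR, total_epochs=EPOCHS):
    """Cosine decay from base_lr at epoch 0 to 0 at total_epochs (assumed floor)."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

def smoothed_target(label, num_classes, smoothing=SMOOTHING):
    """Mix a one-hot target with a uniform distribution over all classes."""
    uniform = smoothing / num_classes
    target = [uniform] * num_classes
    target[label] += 1.0 - smoothing
    return target

# Learning rate at the start, midpoint, and end of training.
for epoch in (0, 65, 130):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.4f}")

# Smoothed target for class 2 in a toy 4-class problem.
print(smoothed_target(label=2, num_classes=4))
```

In PyTorch the same settings would typically map to `torch.optim.SGD`, `torch.optim.lr_scheduler.CosineAnnealingLR`, and the `label_smoothing` argument of `torch.nn.CrossEntropyLoss`; whether the original training run used these exact classes is not stated here.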

Pre-trained checkpoints

Architecture	Parameters	FLOPs	Top-1 / Top-5 Acc. (%)	Download
Lambda-ResNet-50	14.995M	6.576G	78.208 / 93.820	model | log

Files

README.md
lambda_layer.py
lambda_resnet.py

QueeneTam/lambda.pytorch

Folders and files

Latest commit

History

Repository files navigation

lambda.pytorch

Training Configuration

Pre-trained checkpoints

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages