Sparse masks #108

Merged: 64 commits merged into main on Sep 19, 2024
Conversation

@oleksost (Collaborator) commented on Sep 5, 2024

Implements sparse masks in 3 different ways:

  • masked linear: compute- and memory-inefficient, since it has to update sparse weights that are kept in dense format
  • scattered sparse: more memory- and compute-efficient; keeps only the sparse weights and uses torch.scatter_add to update just those entries (see the sketch after this list)
  • sparse linear: uses spops kernels to make things even faster. This also supports structured operations, so block sparsity should be fast out of the box (not 100% sure, will double-check). This does not work on some GPUs (spops is compiled for sm_80 architectures like A100).
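
For reference, here is a minimal sketch of the scattered-sparse idea in plain PyTorch. This is not the PR's actual implementation; the class name, the random index selection, and the keep_ratio argument are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScatteredSparseLinearSketch(nn.Module):
    """Keeps only the sparse trainable values plus their flat indices."""

    def __init__(self, base_linear: nn.Linear, keep_ratio: float = 0.005):
        super().__init__()
        self.base = base_linear
        self.base.weight.requires_grad_(False)  # dense weights stay frozen
        n_params = self.base.weight.numel()
        n_keep = max(1, int(keep_ratio * n_params))
        # Random mask for illustration only; the PR selects indices with SNIP.
        self.register_buffer("flat_idx", torch.randperm(n_params)[:n_keep])
        self.sparse_values = nn.Parameter(torch.zeros(n_keep))

    def forward(self, x):
        # Scatter the trainable values into a flat zero buffer, reshape it to
        # the dense weight shape, and add it to the frozen base weight.
        delta = torch.zeros_like(self.base.weight).view(-1)
        delta = delta.scatter_add(0, self.flat_idx, self.sparse_values)
        weight = self.base.weight + delta.view_as(self.base.weight)
        return F.linear(x, weight, self.base.bias)
```

Only sparse_values is trainable here; the backward of scatter_add routes the gradient of the dense delta back to those entries, which is what makes this variant cheaper than keeping a full dense trainable copy.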

Also implements mask updates. Currently, only the SNIP updater is implemented; SPieL is in the pipeline.
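
To make the SNIP criterion concrete, here is a rough sketch of how a mask could be selected from it (saliency = |weight * grad|, keep the top-k entries). The function name and signature are hypothetical, not the PR's API:

```python
import torch


def snip_mask(weight: torch.Tensor, grad: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    # SNIP saliency: magnitude of weight * gradient, computed from one
    # (or a few accumulated) backward passes.
    scores = (weight * grad).abs().flatten()
    n_keep = max(1, int(keep_ratio * scores.numel()))
    topk_idx = torch.topk(scores, n_keep).indices
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask[topk_idx] = True
    return mask.view_as(weight)
```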

TODOs:

  • Tests are not implemented yet.
  • When updating the mask periodically with SNIP, should we accumulate the weight updates for all masks used so far on CPU (as the masked linear case does by default)?
  • Do some profiling
  • Make sure block structure is leveraged
  • SPieL mask updater

Currently, a manual profiler gives me these numbers (for GPT-neo 125M with 0.5% sparsity):

  • SparseLinearModule (spops) with regular sparsity - Runtime: 0.066590s, Allocated Memory: 4552.14MB, Reserved Memory: 4645.19MB
  • SparseLinearModule (spops) with block sparsity - Runtime: 0.067642s, Allocated Memory: 4553.58MB, Reserved Memory: 4645.19MB
  • ScatteredSparseLinearModule with block sparsity - Runtime: 0.052826s, Allocated Memory: 4734.14MB, Reserved Memory: 4817.16MB
  • ScatteredSparseLinearModule with regular sparsity - Runtime: 0.052953s, Allocated Memory: 4734.66MB, Reserved Memory: 4817.16MB
  • MaskedLinear with regular sparsity - Runtime: 0.056629s, Allocated Memory: 4892.71MB, Reserved Memory: 4970.25MB
  • MaskedLinear with block sparsity - Runtime: 0.055440s, Allocated Memory: 4889.36MB, Reserved Memory: 4978.64MB

So ScatteredSparseLinearModule is currently the fastest, but the spops SparseLinearModule uses the least memory.
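
For context, numbers like the above can be collected with a manual loop along these lines (a sketch, assuming a CUDA device and an HF-style model whose forward returns a .loss; not the PR's profiler code):

```python
import time

import torch


def profile_step(model, batch, n_iters: int = 20):
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_iters):
        loss = model(**batch).loss  # assumes an HF-style forward with .loss
        loss.backward()
        model.zero_grad(set_to_none=True)
    torch.cuda.synchronize()
    runtime = (time.perf_counter() - start) / n_iters
    allocated = torch.cuda.max_memory_allocated() / 1024**2
    reserved = torch.cuda.max_memory_reserved() / 1024**2
    print(f"Runtime: {runtime:.6f}s, "
          f"Allocated Memory: {allocated:.2f}MB, Reserved Memory: {reserved:.2f}MB")
```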


Profiled block-sparse multiplication with profile_block_sparcity.py: the stk and Triton block-sparse kernels outperform a naive torch.matmul:
[profiling plot: stk and Triton block-sparse matmul vs. torch.matmul]
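
As a side note, here is a toy illustration of the difference between the "regular" (unstructured) and "block" sparsity patterns compared above; the block size and helper names are made up, and the weight dims are assumed divisible by the block size:

```python
import torch


def random_unstructured_mask(shape, keep_ratio=0.005):
    scores = torch.rand(shape)
    threshold = torch.quantile(scores.flatten(), 1.0 - keep_ratio)
    return scores >= threshold


def random_block_mask(shape, keep_ratio=0.005, block=16):
    # Score whole block x block tiles instead of individual weights, then
    # expand every kept tile back to element resolution.
    rows, cols = shape[0] // block, shape[1] // block
    block_scores = torch.rand(rows, cols)
    threshold = torch.quantile(block_scores.flatten(), 1.0 - keep_ratio)
    block_mask = block_scores >= threshold
    return block_mask.repeat_interleave(block, dim=0).repeat_interleave(block, dim=1)
```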

@oleksost changed the title from "WIP, DO NOT MERGE: Sparse masks" to "Sparse masks" on Sep 5, 2024
Resolved review threads (outdated): mttl/config.py, mttl/models/modifiers/sparse_mask.py (two threads)
@pclucas14 (Contributor) left a comment:

Nice man, very happy with this PR

@@ -0,0 +1,200 @@
# several options to compare for block sparce operations:
Contributor:

small typo in the file name (sparcity instead of sparsity)

@oleksost (Collaborator, Author):

addressed

@@ -19,6 +20,26 @@
from mttl.utils import generate_random_string, rank_zero_only_and_wait, remote_login


def setup_profiler(args: ExpertConfig):
Contributor:

can we put this in utils? @matheper I know you want single-use code to stay out of utils, but I think this could be useful somewhere else in the future

nltk
Contributor:

where is this used?

@oleksost (Collaborator, Author):

addressed

@oleksost (Collaborator, Author):

it's used by the rouge evaluators (not an automatically installed dependency)

@oleksost merged commit 0163d2f into main on Sep 19, 2024. 6 checks passed.