CSR/CSC Tracking issue #443

hameerabbasi · 2021-03-16T09:39:14Z

ivirshup · 2021-03-18T04:24:39Z

It would also be good to figure out which attributes should be immutable, and for which classes this applies.

To me, information that is "part of the type" should be immutable. Now, because there isn't a super strong idea of "type of an array" in the ecosystem, I think this can be fuzzy. E.g. is the dtype of an array part of its type? In a sense yes, since it controls what operations are allowed, but we can't do isinstance(x, array[int]).

To me, it makes sense for arrays to be parametric on their number of dimensions and dtype. For sparse arrays with compressed axes, I think it makes sense for the compressed axis to be part of the type too.
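As a rough illustration of the "part of the type means immutable" idea, here is a minimal sketch. `ArrayType` and its fields are hypothetical names made up for this example, not classes from the library, and the convention that axis 0 is compressed for a CSR-like layout is an assumption of the sketch.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class ArrayType:
    """Hypothetical description of the "type" of a compressed sparse array."""

    dtype: str            # e.g. "float64"
    ndim: int             # number of dimensions
    compressed_axis: int  # assumed convention: 0 is CSR-like, 1 is CSC-like

    def __post_init__(self):
        # The compressed axis has to be one of the array's axes.
        if not 0 <= self.compressed_axis < self.ndim:
            raise ValueError("compressed_axis must be a valid axis index")


csr_like = ArrayType(dtype="float64", ndim=2, compressed_axis=0)
csc_like = ArrayType(dtype="float64", ndim=2, compressed_axis=1)

# Same dtype and ndim, but a different compressed axis: different "types".
assert csr_like != csc_like

# Attributes that are part of the type cannot be reassigned after creation.
try:
    csr_like.compressed_axis = 1
except FrozenInstanceError:
    print("compressed_axis is immutable")
```

Changing the compressed axis would then mean constructing a new object (a conversion) rather than mutating the existing one.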

ivirshup · 2021-03-18T04:24:55Z

@GenevieveBuckley, @ryan-williams, potentially of interest to you. I think this is a key target for getting good sparse array support in dask -> having dask support in anndata.

GenevieveBuckley · 2021-03-18T06:44:04Z

It's great to see you working on this @ivirshup

Was there any specific input you wanted from me, or just cc-ing for awareness?

Also, if you want to do some pair programming to work on dask + anndata, I'd be up for that. Let me know if that's something you'd find useful.

hameerabbasi · 2021-03-18T09:29:34Z

> To me, it makes sense for arrays to be parametric on their number of dimensions and dtype. For sparse arrays with compressed axes, I think it makes sense for the compressed axis to be part of the type too.

+1, I agree with all of this.

I also added the points @ivirshup mentioned.

AmPhIbIaN26 · 2021-04-07T21:17:04Z

I was looking to work on this issue. @hameerabbasi @GenevieveBuckley @ivirshup, if you don't mind, could you tell me a bit more about it, since I am new to open source and sparse?

hameerabbasi · 2021-04-08T08:46:05Z

@AmPhIbIaN26 Please don't tag everyone personally, it sends out a load of e-mails to people who may not want them.

As for the answer to your question, look at our contributing page, and follow the links; read closely. If you have any specific questions, ask in the Gitter chat and not here.

AmPhIbIaN26 · 2021-04-08T09:52:48Z

Thanks, and sorry for the inconvenience. I'll follow up in Gitter.

ivirshup · 2021-04-21T08:00:43Z

Sorry for the late response on this! I'm quite busy with PhD/life commitments at the moment but should hopefully have more time to work on this next month.

@GenevieveBuckley, mostly just letting you know! The pair programming could definitely be useful, I'll let you know when I have a chance to dip my toes back into dask.

@hameerabbasi a couple questions:

First, I'm trying to figure out the broadcasting, in particular result types. How does this sound for a promotion hierarchy:

| Input types | Output type |
| --- | --- |
| CSR | CSR |
| CSC | CSC |
| CSR, CSC | CSR |
| T<:{CSC,CSR}, COO(1d) | T |

The main ideas here being:

- CSR is preferred over CSC if we have to choose.
  - This sorta fits the "C-ordered by default" numpy API.
  - Alternatives include: this becomes COO, or we choose whichever type came first.
- 1d COO is treated similarly to dense arrays for finding the result type (no effect). That is, a one-dimensional COO array won't cause the output to be COO.
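To make the proposed hierarchy concrete, here is a minimal sketch of the rule as a plain function. `result_format` and the `(name, ndim)` operand tuples are hypothetical and exist only for this illustration. Only the four cases from the table are covered; anything else raises instead of guessing.

```python
def result_format(*operands):
    """Return the proposed output format for a broadcasting operation.

    Each operand is a (name, ndim) tuple such as ("CSR", 2) or ("COO", 1).
    Hypothetical helper: it encodes only the cases proposed in the table.
    """
    # 1-d COO operands behave like dense arrays: they do not affect the result.
    formats = {
        name for name, ndim in operands if not (name == "COO" and ndim == 1)
    }
    if not formats:
        raise NotImplementedError("no CSR/CSC operand to take the format from")
    if not formats <= {"CSR", "CSC"}:
        raise NotImplementedError(f"no rule proposed for {sorted(formats)}")
    # CSR is preferred whenever both CSR and CSC are present.
    return "CSR" if "CSR" in formats else "CSC"


print(result_format(("CSR", 2), ("CSC", 2)))  # mixed operands
print(result_format(("CSC", 2), ("COO", 1)))  # 1-d COO has no effect
```

The "whichever type came first" alternative would need an ordered sequence instead of a set, since a set discards the operand order.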

Second, while trying some benchmarks for the stuff in my sparse_wrapper repo, I noticed sparsity structure can have a large effect on performance. I think it would be important to have a good sample of matrices to use in the benchmarks. There are quite a few collections of sparse matrices, and a number of benchmarking papers in this field I was looking at getting cases from. Do you have any preferences or recommendations here?

Benchmark dataset sources

Sources, strategies

Often taken from the SuiteSparse Matrix Collection:

ivirshup · 2021-05-05T04:24:46Z

I now have some time to work on this!

@hameerabbasi, do you think we could have a short call about this? Maybe early next week? In particular I'd like to get a sense of where this sits in the broader vision for the project, especially given the work on taco integration.

hameerabbasi · 2021-05-05T04:29:52Z

@ivirshup I've sent you a message on Gitter, let's chat there. 😄

hameerabbasi added the "enhancement" label (indicates new feature requests) Mar 16, 2021

ryan-williams mentioned this issue Mar 26, 2021

improve support for np.matrix / scipy.sparse.spmatrix Array-blocks dask/dask#7468

Closed

3 tasks

hammer mentioned this issue Sep 20, 2021

CSC/ CSR classes #433

Open
