Possibility to emit indices in bulk action with `return_stream?: true` #1367

maennchen · 2024-08-07T17:57:05Z

Is your feature request related to a problem? Please describe.

I'm writing a Mix task that imports a large file via a bulk insert. To display progress, I would like to log a message every n entries. To do this, I currently have to enable return_records?, which uses memory unnecessarily.

Describe the solution you'd like

A new option emit_indices? that returns the current index instead of the whole record. It would only be available when return_stream?: true and return_records?: false.

Describe alternatives you've considered

None

Express the feature either with a change to resource syntax, or with a change to the resource interface

For example

rows
|> Ash.bulk_create(Resource, :action, return_stream?: true, emit_indices?: true)
|> Enum.each(fn
  # Log progress on every 100th index; rem/2 is allowed in guards.
  {:omitted_record, i} when rem(i, 100) == 0 -> IO.puts("imported #{i} records")
  _ -> :discard
end)

Additional context

None

maennchen · 2024-08-13T23:38:43Z

We should consider whether we'd rather add an option emit_insert_counts?. This would allow us to skip RETURNING in Postgres and therefore save a lot of memory.

zachdaniel · 2024-08-14T00:23:45Z

We would synthesize the indices based on batch sizes, so we won't have to return records

zachdaniel · 2024-08-14T00:24:06Z

Like add the next 500 integers to the stream.
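
A minimal sketch of the idea above, assuming a hypothetical helper module (`IndexSketch` and `insert_batch` are illustrative names, not part of Ash): inputs are processed in batches, the data layer result is discarded, and one synthesized index per input is added to the stream.

```elixir
defmodule IndexSketch do
  # Hypothetical helper, not part of Ash. Emits one zero-based index per input
  # after each batch has been handled, without holding on to any records.
  def stream_indices(inputs, batch_size, insert_batch) do
    inputs
    |> Stream.chunk_every(batch_size)
    |> Stream.transform(0, fn batch, offset ->
      # insert_batch stands in for the data layer call; its result is ignored.
      _ = insert_batch.(batch)
      count = length(batch)
      # Emit the next `count` integers and advance the running offset.
      {offset..(offset + count - 1)//1, offset + count}
    end)
  end
end
```

With a batch size of 500, each handled batch contributes the next 500 integers, so a consumer can log progress with a guard such as `rem(i, 100) == 0`.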

maennchen · 2024-08-14T06:43:50Z

I don't think we always can. If it is an upsert, we won't know which indexes were created and which not, right?

zachdaniel · 2024-08-14T12:53:28Z

🤔 I was thinking that emit_indices returns the number of inputs that were handled, not the number of records that were created.

maennchen · 2024-08-14T12:56:56Z

@zachdaniel I think that would be confusing given that it would also not show up in the result records...

zachdaniel · 2024-08-14T12:59:58Z

🤔 Potentially. It could be confusing the other way around as well, e.g. if you do a huge bulk upsert where everything matches and you get back 0. But maybe not. Perhaps we should emit both for each batch? So ask the data layer to return a count of inserts (or the records, if it can't do that), and then emit something like {500, 350} after each batch: the number of inputs handled and the number of records created.

Perhaps, for future-proofing, we should emit something like %Ash.BulkResult.BatchStatus{inputs: 500, created: 350}
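
To illustrate how a consumer might use such per-batch statuses, here is a rough sketch. `BatchStatusSketch` is a hypothetical stand-in for the proposed struct, not an existing Ash module, and the field names are taken from the example above.

```elixir
defmodule BatchStatusSketch do
  # Hypothetical stand-in for the proposed %Ash.BulkResult.BatchStatus{}.
  defstruct inputs: 0, created: 0

  # Folds per-batch statuses into running totals and calls `report` with the
  # totals after every batch (for example, to print progress).
  def track(statuses, report \\ fn _total -> :ok end) do
    Enum.reduce(statuses, %__MODULE__{}, fn %__MODULE__{} = batch, total ->
      total = %__MODULE__{
        inputs: total.inputs + batch.inputs,
        created: total.created + batch.created
      }

      report.(total)
      total
    end)
  end
end
```

For an upsert where the second batch only matches existing rows, the totals would show 1000 inputs handled but only 350 records created, which keeps both numbers visible to the caller.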

maennchen · 2024-08-14T13:35:08Z

That sounds like a good compromise :)

maennchen · 2024-08-14T19:56:18Z

Started working on a PR.

maennchen added the enhancement and needs review labels Aug 7, 2024

maennchen mentioned this issue Aug 7, 2024

Warn on bulk action with return_stream?: true with no results. #1368

Closed

zachdaniel removed the needs review label Aug 7, 2024

maennchen mentioned this issue Aug 15, 2024

Emit Batch Status for Bulk Actions #1390

Closed
