{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":135768037,"defaultBranch":"main","name":"DALI","ownerLogin":"NVIDIA","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2018-06-01T22:18:01.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/1728152?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1726480220.0","currentOid":""},"activityList":{"items":[{"before":"94f02ad69abe149f345684ef2aba3e13d246881a","after":"f34a2270b4dea7f8bce25f5ca8028e01385df1a7","ref":"refs/heads/main","pushedAt":"2024-09-19T15:34:42.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Optimize TensorList::Resize (#5638)\n\n* simple inline functions are moved to the header\r\n* shared_ptr in ShareData is now passed by value, allowing move semantics and reducing the number of atomic operations\r\n* some code motion to improve inlining (e.g. wrapping frequent calls to DLL_PUBLIC functions into a trampoline function)\r\n\r\n---------\r\n\r\nSigned-off-by: Michal Zientkiewicz ","shortMessageHtmlLink":"Optimize TensorList::Resize (#5638)"}},{"before":"56b4acc9ac1477bb627eba5065bbed8049e47295","after":"3c97d9b611f242bf455fc79b324b2db53cc6544a","ref":"refs/heads/release_v1.42","pushedAt":"2024-09-18T12:46:23.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"stiepan","name":"Kamil Tokarski","path":"/stiepan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11878086?s=80&v=4"},"commit":{"message":"Add metadata-only inputs. (#5635)\n\n* Assign a stream to CPU inputs with GPU (non-metadata) inputs\r\n* Add Metadata input device - this declares that the input is used for metadata (shape, dtype, etc) access only\r\n* Don't synchronize metadata inputs in executor.\r\n* Don't prolong the lifetime of metadata-only inputs.\r\n* Add InputDevice specifier to random number generators shape_like input.\r\n\r\n---------\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Add metadata-only inputs. (#5635)"}},{"before":"89042095c46285ee4a9fa93920411fc7045310d7","after":"94f02ad69abe149f345684ef2aba3e13d246881a","ref":"refs/heads/main","pushedAt":"2024-09-18T12:44:33.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Add metadata-only inputs. (#5635)\n\n* Assign a stream to CPU inputs with GPU (non-metadata) inputs\r\n* Add Metadata input device - this declares that the input is used for metadata (shape, dtype, etc) access only\r\n* Don't synchronize metadata inputs in executor.\r\n* Don't prolong the lifetime of metadata-only inputs.\r\n* Add InputDevice specifier to random number generators shape_like input.\r\n\r\n---------\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Add metadata-only inputs. (#5635)"}},{"before":"869afb340a9c9963a7f84672317a8546bb89cd57","after":"89042095c46285ee4a9fa93920411fc7045310d7","ref":"refs/heads/main","pushedAt":"2024-09-17T15:19:59.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"TypeTable/TypeInfo optimization (#5634)\n\n* TypeTable/TypeInfo optimization\r\n\r\n- TypeInfo uses string_view for type name\r\n- TypeTable stores types in an array\r\n- TypeTable read access is lockless\r\n\r\n---------\r\n\r\nSigned-off-by: Michal Zientkiewicz ","shortMessageHtmlLink":"TypeTable/TypeInfo optimization (#5634)"}},{"before":"3399b74e518c13e14bd908345a6c78f91dac3553","after":"869afb340a9c9963a7f84672317a8546bb89cd57","ref":"refs/heads/main","pushedAt":"2024-09-16T13:39:20.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Schema-based input device check (#5631)\n\n* Check input device in OpSpec::AddInput.\r\n* Add `Any` and `MatchBackendOrCPU` input devices\r\n* Fix InputDevice in operators. Add Any device capability to \"shape_like\" inputs.\r\n* Add python-side backend validation in logical expressions.\r\n\r\n---------\r\n\r\nSigned-off-by: Michal Zientkiewicz ","shortMessageHtmlLink":"Schema-based input device check (#5631)"}},{"before":null,"after":"56b4acc9ac1477bb627eba5065bbed8049e47295","ref":"refs/heads/release_v1.42","pushedAt":"2024-09-16T09:50:20.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"stiepan","name":"Kamil Tokarski","path":"/stiepan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11878086?s=80&v=4"},"commit":{"message":"Update VERSION to 1.42.0\n\nSigned-off-by: Kamil Tokarski ","shortMessageHtmlLink":"Update VERSION to 1.42.0"}},{"before":"408c18bb0d8a7c1b300e02fd7f6bb58369fdf4c6","after":"3399b74e518c13e14bd908345a6c78f91dac3553","ref":"refs/heads/main","pushedAt":"2024-09-16T09:47:38.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"stiepan","name":"Kamil Tokarski","path":"/stiepan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11878086?s=80&v=4"},"commit":{"message":"Update VERSION to 1.43.0dev\n\nSigned-off-by: Kamil Tokarski ","shortMessageHtmlLink":"Update VERSION to 1.43.0dev"}},{"before":"4b1833a65eb213be1e361a9b9dafb8fe8143c62e","after":"408c18bb0d8a7c1b300e02fd7f6bb58369fdf4c6","ref":"refs/heads/main","pushedAt":"2024-09-11T18:06:15.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Enable GPU->CPU transfers (#5593)\n\n* Add experimental_exec_dynamic flag to Pipeline to enable new executor\r\n* Add DataNode.cpu() that triggers a GPU->CPU copy\r\n* Remove checks that prevented GPU->CPU transitions from Python and Pipeline class\r\n* Remove checks that prevented CPU operators from taking GPU inputs\r\n* Use old executor's graph lowering to run the checks\r\n* Add cpu->gpu tests\r\n\r\n* TODO: Improve input backend checks (#5631)\r\n* TODO: Add tensorflow support\r\n\r\n---------\r\n\r\nSigned-off-by: Michal Zientkiewicz ","shortMessageHtmlLink":"Enable GPU->CPU transfers (#5593)"}},{"before":"f1a9a4dfa584746799494e9a82e4ba075953a6c2","after":"4b1833a65eb213be1e361a9b9dafb8fe8143c62e","ref":"refs/heads/main","pushedAt":"2024-09-11T05:29:28.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Adds `enable_frame_num` to the experimental video reader (#5628)\n\n- adds an ability to output frame numbers in the experimental video\r\n reader\r\n- this allows making VideoReaderDecoderCpuTest.RandomShuffle_* test\r\n independent from the hardcoding of the expected frame order (in case\r\n random generator implementation changes)\r\n\r\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Adds enable_frame_num to the experimental video reader (#5628)"}},{"before":"2117f88bfde3d8cdb3358dd1ceec86f75e345c23","after":"f1a9a4dfa584746799494e9a82e4ba075953a6c2","ref":"refs/heads/main","pushedAt":"2024-09-10T11:01:49.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Executor2 class implementation & tests (#5528)\n\n* Add Executor2 class, implementing ExecutorBase interface and using exec2::ExecGraph\r\n* Use a queue of futures to represent the output buffer queue.\r\n* Add CI test jobs\r\n* Add an environment variable that allows the user to use executor 2.0 instead of the default AsyncPipelinedExecutor\r\n\r\n---------\r\n\r\nSigned-off-by: Michal Zientkiewicz ","shortMessageHtmlLink":"Executor2 class implementation & tests (#5528)"}},{"before":"4b19d4393c4fe5381c1ec47feef90a6ba04f53ad","after":"2117f88bfde3d8cdb3358dd1ceec86f75e345c23","ref":"refs/heads/main","pushedAt":"2024-09-08T22:27:44.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Executor 2.0: Per-operator stream assignment policy (#5620)\n\nAssignment algorithm:\r\n- Each node holds a set of stream ids that are considered ready after the node is done. Each \"ready\" stream id is associated with a use count. When the stream id is used, it's use count is bumped up by 1. The nodes' ready sets contain the use count at the time at which they were inserted into the ready set.\r\n- The stream for the current node (nodes are processed in topological order) is obtained by looking at the \"ready\" sets of the preceding nodes and taking the lowest id which have use count equal to one found int the ready set. If the ready sets were empty (or all streams have had their use count bumped up since insertion into the ready set), a new stream id is generated.\r\n- When the stream id is assigned to a node, it's use count is bumped up by 1 and it's inserted into the current node's ready set.\r\n- The ready set for the current node is a union of input nodes' ready set's + current stream id.\r\n\r\n----\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Executor 2.0: Per-operator stream assignment policy (#5620)"}},{"before":"b3a1889c9b7f17dee4ae52f142d3c61cdc0bee22","after":"4b19d4393c4fe5381c1ec47feef90a6ba04f53ad","ref":"refs/heads/main","pushedAt":"2024-09-06T09:54:03.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Fix multiple initialization attempts in optical flow operator. (#5624)\n\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Fix multiple initialization attempts in optical flow operator. (#5624)"}},{"before":"4c3ec3c68f68a5bdad8df8448fd91108ba7482e3","after":"b3a1889c9b7f17dee4ae52f142d3c61cdc0bee22","ref":"refs/heads/main","pushedAt":"2024-09-05T19:02:00.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Fix null pointer access when clearing incomplete workspace payload. (#5622)\n\n- Add Workspace::ArgumentInputPtr\r\n- use it when clearing an incomplete workspace.\r\n\r\n-----\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Fix null pointer access when clearing incomplete workspace payload. (#…"}},{"before":"22304f422aaad9a1fc035eae31d596f4f91ed9b3","after":"4c3ec3c68f68a5bdad8df8448fd91108ba7482e3","ref":"refs/heads/main","pushedAt":"2024-09-05T10:04:13.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Move to CUDA 12.6U1 (#5616)\n\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Move to CUDA 12.6U1 (#5616)"}},{"before":"6efdeef070ea5d7993c33048a2851c1719c9eb72","after":"22304f422aaad9a1fc035eae31d596f4f91ed9b3","ref":"refs/heads/main","pushedAt":"2024-09-04T13:52:42.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Executor 2.0: Stream assignment (#5602)\n\n* Add ExecNode stream assignment algorithms and tests.\r\nNOTE: Follow up #5620 implements per-operator assignment.\r\n---------\r\nSigned-off-by: Michal Zientkiewicz ","shortMessageHtmlLink":"Executor 2.0: Stream assignment (#5602)"}},{"before":"623c2585c16d8b46c19d1e960cb9c8851be4faf8","after":"6efdeef070ea5d7993c33048a2851c1719c9eb72","ref":"refs/heads/main","pushedAt":"2024-09-02T13:57:50.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Tasking: Test returning multiple outputs of type std::any. (#5529)\n\nTest handling of multiple outputs of type std::any.\r\nThe desired behavior is that passing an iterable object with element type std::any doesn't wrap any into another layer of any,\r\nbut rather forwards the contained object.\r\n\r\nWhen returning:\r\n```C++\r\nreturn vector{ 1, 2.5f, string(\"cat\") };\r\n```\r\nWhen getting the results:\r\n```C+++\r\nint integer = task->GetInputValue(0);\r\nfloat fraction = task->GetInputValue(1);\r\nstring mammal = task->GetInputValue(2);\r\n```\r\n\r\n----\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Tasking: Test returning multiple outputs of type std::any. (#5529)"}},{"before":"2f6e6272b0f8ed505b4fb611f85bdb659759d43d","after":"623c2585c16d8b46c19d1e960cb9c8851be4faf8","ref":"refs/heads/main","pushedAt":"2024-09-02T09:53:38.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"stiepan","name":"Kamil Tokarski","path":"/stiepan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11878086?s=80&v=4"},"commit":{"message":"Patch OSS vulnerabilities (#5612)\n\n* Use the most recent DALI_DEPS_VERSION\r\n* Reflect the FFmpeg update in conda build\r\n---------\r\n\r\nSigned-off-by: Kamil Tokarski ","shortMessageHtmlLink":"Patch OSS vulnerabilities (#5612)"}},{"before":"bdcf16077cb95849a98ee8a0b6b9d0c06fcd1cc7","after":"2f6e6272b0f8ed505b4fb611f85bdb659759d43d","ref":"refs/heads/main","pushedAt":"2024-09-02T09:15:59.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Executor 2.0: Graph lowering (#5595)\n\n* Add OpGraph->ExecGraph lowering.\r\n* Extend ExecGraph tests with\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Executor 2.0: Graph lowering (#5595)"}},{"before":"65d8d8b3e283d6bb28d416d85cc60ecda9a255e3","after":"bdcf16077cb95849a98ee8a0b6b9d0c06fcd1cc7","ref":"refs/heads/main","pushedAt":"2024-08-26T06:01:12.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Make DALI tests compatible with Python 3.12 (#5452)\n\n- adds additional replacements for functions used by nose\r\n and removed in python 3.12\r\n- remove numpy version pinning as the <1.26 is not compatible\r\n with python 3.12\r\n\r\nSigned-off-by: Janusz Lisiecki \r\nSigned-off-by: Kamil Tokarski \r\nCo-authored-by: Kamil Tokarski ","shortMessageHtmlLink":"Make DALI tests compatible with Python 3.12 (#5452)"}},{"before":"594a218b28a89e0c66a88092d4e090ae7c907270","after":"65d8d8b3e283d6bb28d416d85cc60ecda9a255e3","ref":"refs/heads/main","pushedAt":"2024-08-23T15:06:31.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Adjust the L3 perf test threshold for H100 runners (#5606)\n\n- adjusts RN50 and EfficientNet perf threshold for L3 tests\r\n- turn off SHARP for L3 tests\r\n\r\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Adjust the L3 perf test threshold for H100 runners (#5606)"}},{"before":"2d67119abe08ba2e8571d1bbdbb74532e906635c","after":"594a218b28a89e0c66a88092d4e090ae7c907270","ref":"refs/heads/main","pushedAt":"2024-08-19T10:26:32.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Add L1 image decoder DALI test (#5601)\n\n* Add L1 image decoder DALI test\r\n\r\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Add L1 image decoder DALI test (#5601)"}},{"before":"ad97f800f88d8076aabcfdad3f35ce5a425b58a6","after":"2d67119abe08ba2e8571d1bbdbb74532e906635c","ref":"refs/heads/main","pushedAt":"2024-08-13T06:48:56.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Bump nvImageCodec to v0.3.0 and nvJPEG2k to 0.8 (#5604)\n\n- bumps nvImageCodec to v0.3.0\r\n- bumps up nvJPEG2k to 0.8\r\n\r\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Bump nvImageCodec to v0.3.0 and nvJPEG2k to 0.8 (#5604)"}},{"before":"123f4f2eed2ff77433706c9a54320ebb77696115","after":"ad97f800f88d8076aabcfdad3f35ce5a425b58a6","ref":"refs/heads/main","pushedAt":"2024-08-12T10:11:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"stiepan","name":"Kamil Tokarski","path":"/stiepan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11878086?s=80&v=4"},"commit":{"message":"Update VERSION to 1.42.0dev\n\nSigned-off-by: Kamil Tokarski ","shortMessageHtmlLink":"Update VERSION to 1.42.0dev"}},{"before":null,"after":"040b354f8d3bad417f36a973e2249e4d46174273","ref":"refs/heads/release_v1.41","pushedAt":"2024-08-12T10:00:51.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"stiepan","name":"Kamil Tokarski","path":"/stiepan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11878086?s=80&v=4"},"commit":{"message":"Update VERSION to 1.41.0\n\nSigned-off-by: Kamil Tokarski ","shortMessageHtmlLink":"Update VERSION to 1.41.0"}},{"before":"bff5aef09c5d4c76743646dbb23afe721ec3052a","after":"123f4f2eed2ff77433706c9a54320ebb77696115","ref":"refs/heads/main","pushedAt":"2024-08-12T08:15:47.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Fixes problems with fetching LFS objects during nvImageCodec conda build (#5603)\n\n- conda does bare mirror first and then clones the code to the build dir\r\n it also fetches the LFS object, but it does that only for the built reference, if there are\r\n other objects they are left out. Then it does the full cone and checkout and that is why it\r\n complains about missing objects. Also, it doesn't allow running any post-clone hooks.\r\n see https://github.com/conda/conda-build/issues/1462\r\n\r\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Fixes problems with fetching LFS objects during nvImageCodec conda bu…"}},{"before":"7cdfa0e4aecf05ec56b0096f6302967db9bad086","after":"bff5aef09c5d4c76743646dbb23afe721ec3052a","ref":"refs/heads/main","pushedAt":"2024-08-09T16:29:14.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Executor 2.0: ExecGraph (#5587)\n\n## Overall design\r\nThe executor uses `dali::tasking` as the main run-time library. `dali::tasking` ensures that the tasks are executed in correct order.\r\nIt's the job of the executor to define the dependencies between the tasks and apply other constraints.\r\n\r\nThe heart of the system is `ExecGraph` - a graph structure that stores the detailed information about the execution of the pipeline's graph nodes.\r\n`ExecGraph` consists of `ExecNodes` connected with `ExecEdges`.\r\nThere are two kinds of graph nodes:\r\nOperator node - an `ExecNode` which stores an instance of a DALI `OperatorBase`. Its inputs and outputs correspond precisely to the ones in operator's `OpSpec`.\r\nOutput node - a node which gathers outputs of the operator nodes that comprise pipeline's output.. The inputs of this node are pipeline's outputs. It returns a `dali::Workspace` by value.\r\n\r\nThe `ExecGraph` is normally created by lowering the `graph::OpGraph`.\r\n\r\nThe life cycle of the graph is:\r\n* Construction (e.g. by lowering)\r\n* Topological sorting\r\n* Validation\r\n* Analysis & optimization\r\n* Usage (in a loop)\r\n * PrepareIteration\r\n * Launch\r\n\r\n## Graph structure\r\nExecGraph is a directed acyclic graph which stores the nodes and edges in linked lists. Each node has an array of input edges and an array of output descriptors.\r\nAn output descriptor aggregates a list of consumers and some properties of the output (device, pinnedness and similar).\r\nExecEdge is a structure used for navigating the graph. It links an output of a producer node to an input of a consumer node.\r\nThe graph has no dangling edges. All graph inputs start with an (inputless) ExternalSource node and all pipeline outputs contribute to the output node. Unused outputs have an output descriptor, but no outgoing edges.\r\n\r\nAfter construction the graph is sorted and analyzed. The sorting is a topological sort with an additional partitioning that guarantees that BatchSizeProviders appear first.\r\n\r\n## Implementing order of execution\r\nThe order of execution is implemented with dali::tasking dependency mechanisms.\r\nSee: http://dali-ci-01:7070/docs/17070353/doxygen/html/namespacedali_1_1tasking.html\r\n\r\n### Task state and integrity\r\nEach task has the current main task and the previous main task. Since operators are non-reentrant and potentially stateful, the current task succeeds the previous task. Simply adding main_task_->Succeed(prev_task_) ensures that the tasking::Scheduler will not begin the task until the previous iteration is complete.\r\n### Data dependencies\r\nEach task's main task subscribes to the outputs of the producers. This not only guarantees the order, but provides a mechanism by which the data is passed between operators. Each operator node returns one task output per one operator output.\r\n### Concurrency limit\r\nFor various reasons we may want to limit the concurrency of operators. Obviously an operator cannot run in parallel with itself due to reasons outlined above - but we may also want to limit the number of concurrently running different operators from various groups. For example, due to technical limitations of `dali::ThreadPool`, it's impossible to run multiple CPU operators simultaneously because concurrent submission of work to the threadpool results in a hang.\r\nConcurrency is limited with a tasking::Semaphore shared pointer stored in a node.\r\n### Output buffer limit\r\nWhen scheduling multiple iterations ahead, it's possible for \"bubbles\" to form - if an operator produces its data quickly but its consumers are slow, the operator node would create multiple output buffers which would live inside tasking framework as data being passed between tasks. In order to limit the number of active output buffers we need another semaphore - but this semaphore needs to be lowered until all consumers are done with the data. To achieve this an auxiliary (empty) task is scheduled to succeed all of the consumers and, upon completion, it raises the semaphore.\r\n\r\n#### Example\r\nIn this example the operator Op1 has a maximum of 2 output buffers. The iteration 1 proceeds without delay (only waiting for the previous iteration of each operator). In iteration 2, however, the operator Op1 has to wait before it allocates an output buffer. The blue boxes represent the operators' \"main\" tasks, the red boxes - the \"release_outputs\" task and the green boxes - semaphore operations.\r\n![image](https://github.com/user-attachments/assets/a3025551-96af-42e3-a238-544df2d66121)\r\n\r\n_NOTE: This diagram represents task life cycle, not thread activity - with tasking::Scheduler the worker threads never actually wait as long as there are tasks to execute. Resource (e.g. semaphore) acquisition follows \"'wait all\" semantics, so the extended \"acquire\" boxes are not an accurate representation of \"waiting on a semaphore\"._\r\n\r\n## Workspace lifecycle\r\nThe new executor follows a \"linear\" memory usage model - buffers are created as needed and thrown away as soon as they're no longer used. The memory pool is solely responsible for efficient memory recycling.\r\nDespite the buffers' being disposable, the workspace object contains some additional structure (e.g. mapping of argument input names to indices) which we don't need to recreate each time the operator runs. Because of that, each ExecNode has a Workspace object which stores the workspace. The workspace is removed from ExecNode at the beginning of the task body and returned to it when the task completes.\r\nLife cycle:\r\n- Get workspace from ExecNode\r\n- (run the task)\r\n- Clear workspace\r\n- Put workspace back in ExecNode\r\n\r\n_(*) Clearing the workspace means removing all TensorLists from it_\r\n\r\n## ExecNode Tasks\r\n### Operator task\r\nThe operator task performs the following operations:\r\n- get the inputs from parent tasks\r\n- wait for inputs in the operator's stream/order\r\n- put the inputs into the workspace\r\n- apply default input layouts, if necessary\r\n- compute the batch size\r\n- create the outputs\r\n- run operator's Setup\r\n- Resize the outputs, if necessary\r\n- run operator's Run\r\n- propagate metadata\r\n- record CUDA events\r\n- restore empty input layouts, if necessary\r\n\r\n### Output task\r\n- wait for the inputs\r\n- construct the \"output workspace\" where the outputs are workspace outputs (and the tasks's inputs)\r\n- move the output workspace to the task's return value\r\n\r\n---------\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Executor 2.0: ExecGraph (#5587)"}},{"before":"eb57df4faf33e2fae0224c415b2aeb32149a0919","after":"7cdfa0e4aecf05ec56b0096f6302967db9bad086","ref":"refs/heads/main","pushedAt":"2024-08-09T14:04:42.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Enable more Python types to be supported by the DALI python function (#5598)\n\n- adds support to DALI to and from DLPack conversion\r\n- extends types support for fn.ones, fn.full to include\r\n int8, uin32, uint64\r\n- adds tests\r\n\r\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Enable more Python types to be supported by the DALI python function (#…"}},{"before":"c7ca14eeeec4fb1aaeea429d6a922ad053c7700c","after":"eb57df4faf33e2fae0224c415b2aeb32149a0919","ref":"refs/heads/main","pushedAt":"2024-08-08T16:16:21.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jantonguirao","name":"Joaquin Anton","path":"/jantonguirao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3891217?s=80&v=4"},"commit":{"message":"Fix the --python-tag option passed to python setup.py bdist_wheel command (#5600)\n\nSigned-off-by: Joaquin Anton Guirao ","shortMessageHtmlLink":"Fix the --python-tag option passed to python setup.py bdist_wheel com…"}},{"before":"69b845f92009e7af01d955d2781099ef32a1668b","after":"c7ca14eeeec4fb1aaeea429d6a922ad053c7700c","ref":"refs/heads/main","pushedAt":"2024-08-08T13:53:44.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mzient","name":"Michał Zientkiewicz","path":"/mzient","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17044837?s=80&v=4"},"commit":{"message":"Remove usages of `std::call_once`. (#5599)\n\nstd::call_once has a bug in some versions of glibc.\r\nA custom implementation has been provided.\r\nUsages with static once_flag have been replaced with \"magic statics\".\r\nOptical flow doesn't need atomicity and the once semantics were not\r\nrequired at all.\r\nOther usages use the custom implementation.\r\n\r\nA stale warning about antialiasing (also guarded with call once) was\r\nremoved.\r\n\r\n---------\r\n\r\nSigned-off-by: Michał Zientkiewicz ","shortMessageHtmlLink":"Remove usages of std::call_once. (#5599)"}},{"before":"bccecb7bcb77863cf199feff07151314e9cdf41f","after":"69b845f92009e7af01d955d2781099ef32a1668b","ref":"refs/heads/main","pushedAt":"2024-08-08T11:32:13.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"JanuszL","name":"Janusz Lisiecki","path":"/JanuszL","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39967756?s=80&v=4"},"commit":{"message":"Move to CUDA 12.6 (#5596)\n\nSigned-off-by: Janusz Lisiecki ","shortMessageHtmlLink":"Move to CUDA 12.6 (#5596)"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEuwq9cQA","startCursor":null,"endCursor":null}},"title":"Activity · NVIDIA/DALI"}