btl/uct: add support for using another memory domain to form connections

The UCT BTL looks for a connect-to-iface interface in each memory domain to
form connections for connect-to-endpoint transports. For example, with ib the
btl will pick the UD transport as the means to set up RC. While there are
connection transports available (RDMACM), I chose to use UD and similar
transports to support networks that do not necessarily provide a connection
transport.

I am currently working on improving support for Open MPI on a RoCEv2 system
that does not provide support for UD (yet). This breaks the assumption that
there will always be a connect-to-iface transport available in all memory
domains. To fix this issue, this change updates the detection logic to locate
a suitable transport for making connections (tcp by default). If a memory
domain does not have a suitable connection transport, the alternate will be
used instead. This has been tested on our broken-UD system and works well.

If a connection-only transport is not needed, the extra transport module is
destroyed and the connection transport within the memory domain is used.

Signed-off-by: Nathan Hjelm <[email protected]>
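
The fallback described in the message above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `sketch_tl_t`, `sketch_find_conn_tl`, and `sketch_select_conn_tl` are hypothetical names, and the boolean field only stands in for the `UCT_IFACE_FLAG_CONNECT_TO_IFACE` capability check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a UCT transport layer; this is not
 * the real mca_btl_uct_tl_t. */
typedef struct {
    const char *name;
    bool connect_to_iface; /* stands in for UCT_IFACE_FLAG_CONNECT_TO_IFACE */
} sketch_tl_t;

/* Return the first transport in the memory domain that can connect directly
 * to a remote interface, or NULL if the domain has none. */
static const sketch_tl_t *sketch_find_conn_tl(const sketch_tl_t *tls, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (tls[i].connect_to_iface) {
            return &tls[i];
        }
    }
    return NULL;
}

/* Prefer a connection transport from the memory domain itself; otherwise
 * fall back to the alternate connection-only transport. */
static const sketch_tl_t *sketch_select_conn_tl(const sketch_tl_t *tls, size_t count,
                                                const sketch_tl_t *alternate)
{
    const sketch_tl_t *conn = sketch_find_conn_tl(tls, count);
    return conn != NULL ? conn : alternate;
}
```

With a domain that offers a UD-like transport, the sketch selects it and the alternate goes unused, which corresponds to the case where the extra module can be destroyed. With a domain that offers only a connect-to-endpoint transport, the alternate is returned.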
hjelmn committed Sep 23, 2024
1 parent e4b98d7 commit 4dc817c
Showing 6 changed files with 475 additions and 237 deletions.
23 changes: 22 additions & 1 deletion opal/mca/btl/uct/btl_uct.h
@@ -139,9 +139,15 @@ struct mca_btl_uct_component_t {

/** allowed UCT memory domains */
char *memory_domains;
mca_btl_uct_include_list_t memory_domain_list;

/** allowed transports */
char *allowed_transports;
mca_btl_uct_include_list_t allowed_transport_list;

/** transports to consider for forming connections */
char *connection_domains;
mca_btl_uct_include_list_t connection_domain_list;

/** number of worker contexts to create */
int num_contexts_per_module;
@@ -153,6 +159,10 @@ struct mca_btl_uct_component_t {

/** disable UCX memory hooks */
bool disable_ucx_memory_hooks;

/** alternate connection-only module that can be used if no suitable
* connection tl is found. this is usually a tcp tl. */
mca_btl_uct_module_t *conn_module;
};
typedef struct mca_btl_uct_component_t mca_btl_uct_component_t;

@@ -289,7 +299,8 @@ struct mca_btl_base_endpoint_t *mca_btl_uct_get_ep(struct mca_btl_base_module_t
opal_proc_t *proc);

 int mca_btl_uct_query_tls(mca_btl_uct_module_t *module, mca_btl_uct_md_t *md,
-uct_tl_resource_desc_t *tl_descs, unsigned tl_count);
+uct_tl_resource_desc_t *tl_descs, unsigned tl_count,
+bool evaluate_for_conn_only);
int mca_btl_uct_process_connection_request(mca_btl_uct_module_t *module,
mca_btl_uct_conn_req_t *req);

@@ -336,5 +347,15 @@ static inline bool mca_btl_uct_tl_requires_connection_tl(mca_btl_uct_tl_t *tl)
return !(MCA_BTL_UCT_TL_ATTR(tl, 0).cap.flags & UCT_IFACE_FLAG_CONNECT_TO_IFACE);
}

/**
* @brief Find the rank of `name` in the include list `list`.
*
* @param[in] name name to find
* @param[in] list list to search
*
* A negative result means the name is not present or the list is negated.
*/
int mca_btl_uct_include_list_rank (const char *name, const mca_btl_uct_include_list_t *list);

END_C_DECLS
#endif
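
The header only declares `mca_btl_uct_include_list_rank`; its implementation is not part of this file. The documented contract (position of the name in the list, negative when the name is absent or the list is negated) could look roughly like the sketch below. The `sketch_include_list_t` layout is an assumption made for illustration, since the real `mca_btl_uct_include_list_t` definition is not shown in this diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mca_btl_uct_include_list_t; the real layout is
 * not shown in this diff. */
typedef struct {
    const char **names; /* NULL-terminated array of names */
    bool include;       /* false when the list is negated (an exclusion list) */
} sketch_include_list_t;

/* Return the zero-based position of name in list, or -1 if the name is
 * absent or the list is negated. */
static int sketch_include_list_rank(const char *name, const sketch_include_list_t *list)
{
    if (list == NULL || list->names == NULL || !list->include) {
        return -1;
    }
    for (int i = 0; list->names[i] != NULL; ++i) {
        if (strcmp(name, list->names[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

A rank like this lets callers prefer earlier entries in a user-supplied list, for example picking the lowest-ranked transport when several candidates could form connections.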