1
linux/Documentation/infiniband/user_mad.txt
Roland Dreier 2be8e3ee8e IB/umad: Add P_Key index support
Add support for setting the P_Key index of sent MADs and getting the
P_Key index of received MADs.  This requires a change to the layout of
the ABI structure struct ib_user_mad_hdr, so to avoid breaking
compatibility, we default to the old (unchanged) ABI and add a new
ioctl IB_USER_MAD_ENABLE_PKEY that allows applications that are aware
of the new ABI to opt into using it.

We plan on switching to the new ABI by default in a year or so, and
this patch adds a warning that is printed when an application uses the
old ABI, to push people towards converting to the new ABI.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Hal Rosenstock <hal@xsigo.com>
2007-10-09 19:59:15 -07:00

149 lines
4.8 KiB
Plaintext

USERSPACE MAD ACCESS
Device files
Each port of each InfiniBand device has a "umad" device and an
"issm" device attached. For example, a two-port HCA will have two
umad devices and two issm devices, while a switch will have one
device of each type (for switch port 0).
Creating MAD agents
A MAD agent can be created by filling in a struct ib_user_mad_reg_req
and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file
descriptor for the appropriate device file. If the registration
request succeeds, a 32-bit id will be returned in the structure.
For example:
struct ib_user_mad_reg_req req = { /* ... */ };
ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req);
if (!ret)
my_agent = req.id;
else
perror("agent register");
Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT
ioctl. Also, all agents registered through a file descriptor will
be unregistered when the descriptor is closed.
Receiving MADs
MADs are received using read(). The receive side now supports
RMPP. The buffer passed to read() must be at least one
struct ib_user_mad + 256 bytes. For example:
If the buffer passed is not large enough to hold the received
MAD (RMPP), the errno is set to ENOSPC and the length of the
buffer needed is set in mad.length.
Example for normal MAD (non RMPP) reads:
struct ib_user_mad *mad;
mad = malloc(sizeof *mad + 256);
ret = read(fd, mad, sizeof *mad + 256);
if (ret != sizeof mad + 256) {
perror("read");
free(mad);
}
Example for RMPP reads:
struct ib_user_mad *mad;
mad = malloc(sizeof *mad + 256);
ret = read(fd, mad, sizeof *mad + 256);
if (ret == -ENOSPC)) {
length = mad.length;
free(mad);
mad = malloc(sizeof *mad + length);
ret = read(fd, mad, sizeof *mad + length);
}
if (ret < 0) {
perror("read");
free(mad);
}
In addition to the actual MAD contents, the other struct ib_user_mad
fields will be filled in with information on the received MAD. For
example, the remote LID will be in mad.lid.
If a send times out, a receive will be generated with mad.status set
to ETIMEDOUT. Otherwise when a MAD has been successfully received,
mad.status will be 0.
poll()/select() may be used to wait until a MAD can be read.
Sending MADs
MADs are sent using write(). The agent ID for sending should be
filled into the id field of the MAD, the destination LID should be
filled into the lid field, and so on. The send side does support
RMPP so arbitrary length MAD can be sent. For example:
struct ib_user_mad *mad;
mad = malloc(sizeof *mad + mad_length);
/* fill in mad->data */
mad->hdr.id = my_agent; /* req.id from agent registration */
mad->hdr.lid = my_dest; /* in network byte order... */
/* etc. */
ret = write(fd, &mad, sizeof *mad + mad_length);
if (ret != sizeof *mad + mad_length)
perror("write");
Transaction IDs
Users of the umad devices can use the lower 32 bits of the
transaction ID field (that is, the least significant half of the
field in network byte order) in MADs being sent to match
request/response pairs. The upper 32 bits are reserved for use by
the kernel and will be overwritten before a MAD is sent.
P_Key Index Handling
The old ib_umad interface did not allow setting the P_Key index for
MADs that are sent and did not provide a way for obtaining the P_Key
index of received MADs. A new layout for struct ib_user_mad_hdr
with a pkey_index member has been defined; however, to preserve
binary compatibility with older applications, this new layout will
not be used unless the IB_USER_MAD_ENABLE_PKEY ioctl is called
before a file descriptor is used for anything else.
In September 2008, the IB_USER_MAD_ABI_VERSION will be incremented
to 6, the new layout of struct ib_user_mad_hdr will be used by
default, and the IB_USER_MAD_ENABLE_PKEY ioctl will be removed.
Setting IsSM Capability Bit
To set the IsSM capability bit for a port, simply open the
corresponding issm device file. If the IsSM bit is already set,
then the open call will block until the bit is cleared (or return
immediately with errno set to EAGAIN if the O_NONBLOCK flag is
passed to open()). The IsSM bit will be cleared when the issm file
is closed. No read, write or other operations can be performed on
the issm file.
/dev files
To create the appropriate character device files automatically with
udev, a rule like
KERNEL="umad*", NAME="infiniband/%k"
KERNEL="issm*", NAME="infiniband/%k"
can be used. This will create device nodes named
/dev/infiniband/umad0
/dev/infiniband/issm0
for the first port, and so on. The InfiniBand device and port
associated with these devices can be determined from the files
/sys/class/infiniband_mad/umad0/ibdev
/sys/class/infiniband_mad/umad0/port
and
/sys/class/infiniband_mad/issm0/ibdev
/sys/class/infiniband_mad/issm0/port