Driver porting: Request Queues II

This article is part of the LWN Porting Drivers to 2.6 series.
This article continues the look at request queues in 2.6; if you've not read the first part in the request queue series, you may want to start there. Here we'll look at command pregeneration, tagged command queueing, and doing without a request queue altogether.

Command pregeneration

Traditionally, block drivers have prepared low-level hardware commands at the time a request is processed. There can be advantages to preparing commands at an earlier point, however. In 2.6, drivers which wish to prepare commands (or perform some other sort of processing) for requests before they hit the request function should set up a prep_rq_fn with this prototype:

    typedef int (prep_rq_fn) (request_queue_t *q, struct request *rq);

This function should perform preparatory work on the given request rq. The 2.6 request structure includes a 16-byte cmd field where a pregenerated command can be stored; rq->cmd_len should be set to the length of that command. The prep function should return BLKPREP_OK (process the request normally), BLKPREP_DEFER (which defers processing of the command for now), or BLKPREP_KILL (which terminates the request with a failure status).
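
As a rough illustration of that contract, here is a user-space sketch of a prep function. Nothing in it comes from the kernel headers: the structures are cut-down stand-ins, the BLKPREP_* values are placeholders, and the sector, is_write, capacity, and hw_busy fields, the opcode bytes, and the name sketch_prep_fn() are all invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder values; a real driver takes these from the kernel headers. */
#define BLKPREP_OK    0   /* process the request normally */
#define BLKPREP_KILL  1   /* terminate the request with a failure status */
#define BLKPREP_DEFER 2   /* put off processing of the request for now */
#define MAX_CMD_LEN   16  /* size of the cmd field described above */

/* Cut-down stand-in for the request structure. */
struct request {
    uint32_t sector;                 /* hypothetical: starting sector */
    int is_write;                    /* hypothetical: nonzero for a write */
    unsigned char cmd[MAX_CMD_LEN];  /* pregenerated command bytes */
    unsigned int cmd_len;            /* number of valid bytes in cmd */
};

/* Cut-down stand-in for the request queue. */
struct request_queue {
    uint32_t capacity;               /* hypothetical: device size in sectors */
    int hw_busy;                     /* hypothetical: device cannot take commands */
};

/* Build a made-up five-byte command: one opcode byte, then the sector. */
int sketch_prep_fn(struct request_queue *q, struct request *rq)
{
    if (rq->sector >= q->capacity)
        return BLKPREP_KILL;         /* request can never succeed */
    if (q->hw_busy)
        return BLKPREP_DEFER;        /* try again later */

    memset(rq->cmd, 0, sizeof(rq->cmd));
    rq->cmd[0] = rq->is_write ? 0x02 : 0x01;   /* invented opcodes */
    rq->cmd[1] = (rq->sector >> 24) & 0xff;
    rq->cmd[2] = (rq->sector >> 16) & 0xff;
    rq->cmd[3] = (rq->sector >> 8) & 0xff;
    rq->cmd[4] = rq->sector & 0xff;
    rq->cmd_len = 5;
    return BLKPREP_OK;
}
```

The point of the sketch is only the shape of the function: it examines the request, either fills in cmd and cmd_len or declines, and reports which of the three outcomes applies.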

To add your prep function to a request queue, call:

    void blk_queue_prep_rq(request_queue_t *q, prep_rq_fn *pfn);

The prep function is currently called out of elv_next_request() - immediately before the request is passed back to your driver. There is a possibility that, at some future point, the call to the prep function could happen earlier in the process.

Tagged command queueing

Tagged command queueing (TCQ) allows a block device to have multiple outstanding I/O requests, each identified by an integer "tag." TCQ can yield performance benefits; the drive generally knows best when it comes to figuring out which request should be serviced next. SCSI drivers in Linux have long supported TCQ, but each driver has included its own infrastructure for tag management. In 2.6, a simple tag management facility has been added to the block layer. The generic tag management code can make life easier, but it's important to understand how these functions interact with the request queue.

Drivers wishing to use tags should set things up with:

    int blk_queue_init_tags(request_queue_t *q, int depth,
                            struct blk_queue_tag *tags);

This call should be made after the queue has been initialized. Here, depth is the maximum number of tagged commands which can be outstanding at any given time. The tags argument is a pointer to a blk_queue_tag structure which will be used to track the outstanding tags. Normally you can pass tags as NULL, and the block subsystem will allocate and initialize the structure for you. If you wish to share a structure (and, thus, the tags it represents) with another device, however, you can pass a pointer to the blk_queue_tag structure in the first queue when initializing the second. This call performs memory allocation, and will return a negative error code if that allocation failed.

A call to:

    void blk_queue_free_tags(request_queue_t *q);

will clean up the TCQ infrastructure. This normally happens automatically when blk_cleanup_queue() is called, so drivers do not normally have to call blk_queue_free_tags() themselves.

To allocate a tag for a request, use:

    int blk_queue_start_tag(request_queue_t *q, struct request *rq);

This function will associate a tag number with the given request rq, storing it in rq->tag. The return value will be zero on success, or a nonzero value if there are no more tags available. This function will remove the request from the queue, so your driver must take care not to lose track of it, and must not try to dequeue the request itself. The queue lock must be held when calling blk_queue_start_tag().

blk_queue_start_tag() has been designed to work as the command prep function. If your driver would like to have tags automatically assigned, it can perform a call like:

    blk_queue_prep_rq(queue, blk_queue_start_tag);

Every request returned by elv_next_request() will then already have a tag associated with it.

If you need to know if a given request has a tag associated with it, use the macro blk_rq_tagged(rq). The return value will be nonzero if this request has been tagged.

When all transfers for a tagged request have been completed, the tag should be returned with:

    void blk_queue_end_tag(request_queue_t *q, struct request *rq);

Timing is important here: blk_queue_end_tag() must be called before end_that_request_last(), or unpleasant things will happen. Be sure to have the queue lock held when calling this function.

If you need to know which request is associated with a given tag, call:

    struct request *blk_queue_find_tag(request_queue_t *q, int tag);

The return value will be the request structure, or NULL if the given tag is not currently in use.
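
The tag lifecycle described so far (start, find, end) can be modeled in a few lines of user-space C. This is a toy model, not the block layer's implementation: the tag_map structure, the DEPTH constant, and the model_*() names are invented here, and the model leaves out both the request queue and the queue lock that the real functions depend on.

```c
#include <stddef.h>

#define DEPTH 4   /* hypothetical maximum number of outstanding tags */

/* Stand-in for the request structure; tag is -1 while untagged. */
struct request {
    int tag;
};

/* slot[t] points at the request currently holding tag t, or is NULL. */
struct tag_map {
    struct request *slot[DEPTH];
};

/* Models blk_queue_start_tag(): zero on success, nonzero if no tag is free. */
int model_start_tag(struct tag_map *m, struct request *rq)
{
    for (int t = 0; t < DEPTH; t++) {
        if (m->slot[t] == NULL) {
            m->slot[t] = rq;
            rq->tag = t;
            return 0;
        }
    }
    return 1;
}

/* Models blk_queue_find_tag(): the request holding the tag, or NULL. */
struct request *model_find_tag(struct tag_map *m, int tag)
{
    if (tag < 0 || tag >= DEPTH)
        return NULL;
    return m->slot[tag];
}

/* Models blk_queue_end_tag(): hand the request's tag back to the pool. */
void model_end_tag(struct tag_map *m, struct request *rq)
{
    if (rq->tag >= 0 && rq->tag < DEPTH && m->slot[rq->tag] == rq) {
        m->slot[rq->tag] = NULL;
        rq->tag = -1;
    }
}
```

With a depth of four, a fifth call to model_start_tag() fails until some earlier request has passed through model_end_tag(), which is the behavior a driver should expect from the real interface when the device's queue is full.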

In the real world, things occasionally go wrong. If a drive (or the bus it is attached to) goes into an error state and must be reset, all outstanding tagged requests will be lost. In such a situation, the driver should call:

    void blk_queue_invalidate_tags(request_queue_t *q);

This call will return all outstanding tags to the pool, and the associated I/O requests will be returned to the request queue so that they can be restarted.
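
To make the effect of invalidation concrete, here is a small user-space model of it. The structures and the name model_invalidate_tags() are made up for this sketch, and pushing each request onto the head of the pending list is a simplification chosen here; the order in which the real block layer requeues requests is not something this model tries to reproduce.

```c
#include <stddef.h>

#define DEPTH 4   /* hypothetical maximum number of outstanding tags */

/* Stand-in for the request structure. */
struct request {
    int tag;                  /* -1 while untagged */
    struct request *next;     /* link in the pending list */
};

/* Stand-in for a queue with tagging enabled. */
struct model_queue {
    struct request *slot[DEPTH];  /* requests currently holding tags */
    struct request *pending;      /* requests waiting to be (re)started */
};

/*
 * Models blk_queue_invalidate_tags(): every tag goes back to the pool,
 * and every request that held one goes back onto the pending list.
 */
void model_invalidate_tags(struct model_queue *q)
{
    for (int t = 0; t < DEPTH; t++) {
        struct request *rq = q->slot[t];

        if (rq == NULL)
            continue;
        q->slot[t] = NULL;        /* tag t is free again */
        rq->tag = -1;
        rq->next = q->pending;    /* requeue at the head of the list */
        q->pending = rq;
    }
}
```

After the call no tag is in use, so the driver can reset the hardware and then pull the same requests from the queue again as if they had never been started.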

Doing without a request queue

Some devices have no real need for a request queue. In particular, truly random-access devices, such as memory technology devices or ramdisks, can process requests quickly and do not benefit from sorting and merging of requests. Drivers for such devices may achieve better performance by bypassing much of the request queue structure and handling requests directly as they are generated.

As in 2.4, this sort of driver can set up a "make request" function. First, however, the request queue must still be created. The queue will not be used to handle the actual requests, but it contains other infrastructure needed by the block subsystem. If your driver will use a make request function, it should first create the queue with blk_alloc_queue():

    request_queue_t *blk_alloc_queue(int gfp_mask);

The gfp_mask argument describes how the requisite memory should be allocated, as usual. Note that this call can fail.

Once you have a request queue, you can set up the make request function; the prototype for this function has changed a bit from 2.4, however:

    typedef int (make_request_fn) (request_queue_t *q, struct bio *bio);

If the make request function can arrange for the transfer(s) described in the given bio, it should do so and return zero. "Stacking" drivers can also redirect the bio by changing its bi_bdev field and returning nonzero; in this case the bio will then be dispatched to the new device's driver (as was done in 2.4).

If the "make request" function performs the transfer itself, it is responsible for passing the bio to bio_endio() when the transfer is complete. Note that the "make request" function is not called with the queue lock held.
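
For a ramdisk-like device, a make request function amounts to a memory copy followed by completion. The following user-space sketch shows that shape only. The struct bio here is a drastically simplified stand-in (the real structure describes its data quite differently), model_bio_endio() merely plays the role of bio_endio() without matching its signature, and every name and constant is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512
#define RD_SECTORS  8          /* hypothetical: a very small ramdisk */

/* Simplified stand-in for the kernel's struct bio. */
struct bio {
    unsigned long sector;      /* starting sector of the transfer */
    void *buf;                 /* data buffer */
    size_t len;                /* transfer length in bytes */
    int is_write;              /* nonzero for a write */
    int done;                  /* set once the bio has been completed */
    int error;                 /* zero on success */
};

/* Stand-in for the queue; it just carries the ramdisk's storage. */
struct request_queue {
    unsigned char store[RD_SECTORS * SECTOR_SIZE];
};

/* Plays the role of bio_endio(): record the result and mark completion. */
static void model_bio_endio(struct bio *bio, int error)
{
    bio->error = error;
    bio->done = 1;
}

/* Perform the transfer directly, complete the bio, and return zero. */
int model_make_request(struct request_queue *q, struct bio *bio)
{
    size_t off;

    if (bio->sector >= RD_SECTORS) {
        model_bio_endio(bio, -1);          /* start is beyond the device */
        return 0;
    }
    off = (size_t)bio->sector * SECTOR_SIZE;
    if (bio->len > sizeof(q->store) - off) {
        model_bio_endio(bio, -1);          /* transfer runs off the end */
        return 0;
    }
    if (bio->is_write)
        memcpy(q->store + off, bio->buf, bio->len);
    else
        memcpy(bio->buf, q->store + off, bio->len);
    model_bio_endio(bio, 0);
    return 0;
}
```

Note that the function returns zero in the failure cases too: in this sketch the bio has been dealt with either way, and the error is reported through completion rather than through the return value.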

To arrange for your driver's function to be called, use:

    void blk_queue_make_request(request_queue_t *q, 
                                make_request_fn *func);

If and when your driver shuts down, be sure to return the request queue to the system with:

    void blk_put_queue(request_queue_t *queue);

As of 2.6.0-test3, this function is just another name for blk_cleanup_queue(), but such things could always change in the future.



  Driver porting: Request Queues II
(Posted May 25, 2003 14:10 UTC (Sun) by dwmw2) (Post reply)

Actually, the drivers which make memory technology devices (i.e. flash) pretend to be a block device by means of some kind of 'Translation Layer', from the most naïve and unsafe read/erase/modify/writeback of the 'mtdblock' driver to the more complicated pseudo-filesystem of the FTL and NFTL drivers, do benefit from request merging. Each block on the flash has a limited number of erase cycles, so it helps to combine requests which fall within the same erase block.

Copyright (©) 2003, Eklektix, Inc.
Linux (®) is a registered trademark of Linus Torvalds