Advanced Char Driver Operations

Sarah Diesburg

CIS 4930

Resources

LDD Chapter 6
Red font in slides where up-to-date code diverges from the book
LDD module source code for 3.2.x
  http://ww2.cs.fsu.edu/~diesburg/courses/dd/code.html

Resources

LXR – Cross-referenced Linux
  Go to http://lxr.linux.no/
  Click on Linux 2.6.11 and later
  Select your kernel version from the drop-down menu

Topics

Managing ioctl command numbers
Blocking/unblocking a process
Seeking on a device
Access control

ioctl

For operations beyond simple data transfers
  Eject the media
  Report error information
  Change hardware settings
  Self destruct
Alternatives
  Embedded commands in the data stream
  Driver-specific file systems

ioctl

User-level interface
  int ioctl(int fd, unsigned long request, ...);
Variable number of arguments
  Problematic for the system call interface
In this context, the dots stand for a single optional argument
  Traditionally a char *argp
  Just a way to bypass compile-time type checking
For more information, see the ioctl(2) man page

ioctl

Driver-level interface
  long (*unlocked_ioctl) (struct file *filp,
                          unsigned int cmd,
                          unsigned long arg);
  cmd is passed from the user unchanged
  arg can be an integer or a pointer
  Compiler does not type check
ioctl has changed from the LDD3 era
  Modified to remove the big kernel lock (BKL)
  http://lwn.net/Articles/119652/

Choosing the ioctl Commands

Need a numbering scheme to avoid mistakes
  E.g., issuing a command to the wrong device (changing the baud rate of an audio device)
Check include/linux/ioctl.h and directory Documentation/ioctl/

Choosing the ioctl Commands

A command number uses four bitfields
  Defined in <linux/ioctl.h>
  <direction, type, number, size>
direction: direction of data transfer
  _IOC_NONE
  _IOC_READ
  _IOC_WRITE
  _IOC_READ | _IOC_WRITE

Choosing the ioctl Commands

type (ioctl device type)
  8-bit (_IOC_TYPEBITS) magic number
  Associated with the device
number
  8-bit (_IOC_NRBITS) sequential number
  Unique within device
size: size of user data involved
  The width is either 13 or 14 bits (_IOC_SIZEBITS), depending on the architecture

Choosing the ioctl Commands

Useful macros to create ioctl command numbers
  _IO(type, nr)
  _IOR(type, nr, datatype)
  _IOW(type, nr, datatype)
  _IOWR(type, nr, datatype)
    size = sizeof(datatype)

Choosing the ioctl Commands

Useful macros to decode ioctl command numbers
  _IOC_DIR(nr)
  _IOC_TYPE(nr)
  _IOC_NR(nr)
  _IOC_SIZE(nr)
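The encoding and decoding macros can be tried from user space, because <linux/ioctl.h> is also exported to applications. The following is a minimal sketch, assuming a Linux system with the kernel headers installed. The DEMO_ names and demo_describe() are hypothetical and are not part of scull.

```c
#include <stdio.h>
#include <linux/ioctl.h>

/* Hypothetical command set for illustration; 'k' mirrors the scull magic. */
#define DEMO_IOC_MAGIC   'k'
#define DEMO_IOCRESET    _IO(DEMO_IOC_MAGIC, 0)
#define DEMO_IOCSQUANTUM _IOW(DEMO_IOC_MAGIC, 1, int)
#define DEMO_IOCGQUANTUM _IOR(DEMO_IOC_MAGIC, 5, int)
#define DEMO_IOCXQUANTUM _IOWR(DEMO_IOC_MAGIC, 9, int)

/* Split a command number back into its four bitfields and print them. */
void demo_describe(const char *name, unsigned int cmd)
{
    printf("%-18s dir=%u type='%c' nr=%u size=%u\n", name,
           (unsigned int)_IOC_DIR(cmd), (int)_IOC_TYPE(cmd),
           (unsigned int)_IOC_NR(cmd), (unsigned int)_IOC_SIZE(cmd));
}
```

Calling demo_describe("DEMO_IOCSQUANTUM", DEMO_IOCSQUANTUM) should report type 'k', number 1, and a size equal to sizeof(int). The numeric value of the direction field is architecture dependent, so compare it against _IOC_READ and _IOC_WRITE rather than against literal numbers.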

Choosing the ioctl Commands

The scull example

/* Use 'k' as magic number */
#define SCULL_IOC_MAGIC 'k'
/* Please use a different 8-bit number in your code */

#define SCULL_IOCRESET _IO(SCULL_IOC_MAGIC, 0)

Choosing the ioctl Commands

The scull example

/*

* S means "Set" through a ptr,

* T means "Tell" directly with the argument value

* G means "Get": reply by setting through a pointer

* Q means "Query": response is on the return value

* X means "eXchange": switch G and S atomically

* H means "sHift": switch T and Q atomically

*/

#define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)

#define SCULL_IOCSQSET _IOW(SCULL_IOC_MAGIC, 2, int)

#define SCULL_IOCTQUANTUM _IO(SCULL_IOC_MAGIC, 3)

#define SCULL_IOCTQSET _IO(SCULL_IOC_MAGIC, 4)

#define SCULL_IOCGQUANTUM _IOR(SCULL_IOC_MAGIC, 5, int)

Callout for X and H: set the new value and return the old value

Choosing the ioctl Commands

The scull example

#define SCULL_IOCGQSET _IOR(SCULL_IOC_MAGIC, 6, int)

#define SCULL_IOCQQUANTUM _IO(SCULL_IOC_MAGIC, 7)

#define SCULL_IOCQQSET _IO(SCULL_IOC_MAGIC, 8)

#define SCULL_IOCXQUANTUM _IOWR(SCULL_IOC_MAGIC, 9, int)

#define SCULL_IOCXQSET _IOWR(SCULL_IOC_MAGIC,10, int)

#define SCULL_IOCHQUANTUM _IO(SCULL_IOC_MAGIC, 11)

#define SCULL_IOCHQSET _IO(SCULL_IOC_MAGIC, 12)

#define SCULL_IOC_MAXNR 14

The Return Value

When the command number is not supported
  Return -EINVAL
  Or -ENOTTY (according to the POSIX standard)
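From user space, the driver's negative return value shows up as -1 from ioctl() with the code stored in errno. The following is a small sketch, assuming a Linux system. demo_ioctl_errno() is a hypothetical helper, and a pipe can stand in for a device that does not implement the command.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Issue a terminal-only command (TIOCGWINSZ) on fd.
 * Returns 0 if the command was accepted, otherwise the errno value.
 */
int demo_ioctl_errno(int fd)
{
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) == 0)
        return 0;
    return errno;
}
```

On the read end of a pipe this helper should return ENOTTY, which is the error an application sees when a command number is not supported by the target file.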

The Predefined Commands

Handled by the kernel first
  Will not be passed down to device drivers
Three groups
  For any file (regular, device, FIFO, socket)
    Magic number: 'T'
  For regular files only
  Specific to the file system type

Using the ioctl Argument

If it is an integer, just use it directly
If it is a pointer
  Need to check for valid user address
    int access_ok(int type, const void *addr,
                  unsigned long size);
  type: either VERIFY_READ or VERIFY_WRITE
  Returns 1 for success, 0 for failure
    On failure, the driver returns -EFAULT to the caller
  Defined in <linux/uaccess.h>
  Mostly called by memory-access routines

Using the ioctl Argument

The scull example

long scull_ioctl(struct file *filp,

unsigned int cmd, unsigned long arg) {

int err = 0, tmp;

int retval = 0;

/* check the magic number and whether the command is defined */

if (_IOC_TYPE(cmd) != SCULL_IOC_MAGIC) {

return -ENOTTY;

}

if (_IOC_NR(cmd) > SCULL_IOC_MAXNR) {

return -ENOTTY;

}

/* the concept of "read" and "write" is reversed here */

if (_IOC_DIR(cmd) & _IOC_READ) {

err = !access_ok(VERIFY_WRITE, (void __user *) arg,

_IOC_SIZE(cmd));

} else if (_IOC_DIR(cmd) & _IOC_WRITE) {

err = !access_ok(VERIFY_READ, (void __user *) arg,

_IOC_SIZE(cmd));

}

if (err) return -EFAULT;

Using the ioctl Argument

Data transfer functions optimized for the most-used data sizes (1, 2, 4, and 8 bytes)
  If the size mismatches
    Cryptic compiler error message: "conversion to non-scalar type requested"
    Use copy_to_user and copy_from_user instead
  #include <linux/uaccess.h>
put_user(datum, ptr)
  Writes to a user-space address
  Calls access_ok()
  Returns 0 on success, -EFAULT on error

Using the ioctl Argument

__put_user(datum, ptr)
  Does not check access_ok()
  Can still fail if the user-space memory is not writable
get_user(local, ptr)
  Reads from a user-space address
  Calls access_ok()
  Stores the retrieved value in local
  Returns 0 on success, -EFAULT on error
__get_user(local, ptr)
  Does not check access_ok()
  Can still fail if the user-space memory is not readable

Capabilities and Restricted Operations

Limit certain ioctl operations to privileged users
See <linux/capability.h> for the full set of capabilities
To check a certain capability, call
  int capable(int capability);
In the scull example
  if (!capable(CAP_SYS_ADMIN)) {
      return -EPERM;
  }
CAP_SYS_ADMIN: a catch-all capability for many system administration operations

The Implementation of the ioctl Commands

A giant switch statement…

switch(cmd) {

case SCULL_IOCRESET:

scull_quantum = SCULL_QUANTUM;

scull_qset = SCULL_QSET;

break;

case SCULL_IOCSQUANTUM: /* Set: arg points to the value */

if (!capable(CAP_SYS_ADMIN)) {

return -EPERM;

}

retval = __get_user(scull_quantum, (int __user *)arg);

break;

case SCULL_IOCTQUANTUM: /* Tell: arg is the value */

if (!capable(CAP_SYS_ADMIN)) {

return -EPERM;

}

scull_quantum = arg;

break;

case SCULL_IOCGQUANTUM: /* Get: arg is pointer to result */

retval = __put_user(scull_quantum, (int __user *) arg);

break;

case SCULL_IOCQQUANTUM: /* Query: return it (> 0) */

return scull_quantum;

case SCULL_IOCXQUANTUM: /* eXchange: use arg as pointer */

if (!capable(CAP_SYS_ADMIN)) {

return -EPERM;

}

tmp = scull_quantum;

retval = __get_user(scull_quantum, (int __user *) arg);

if (retval == 0) {

retval = __put_user(tmp, (int __user *) arg);

}

break;

case SCULL_IOCHQUANTUM: /* sHift: like Tell + Query */

if (!capable(CAP_SYS_ADMIN)) {

return -EPERM;

}

tmp = scull_quantum;

scull_quantum = arg;

return tmp;

default: /* redundant, as cmd was checked against MAXNR */

return -ENOTTY;

} /* switch */

return retval;

} /* scull_ioctl */

The Implementation of the ioctl Commands

Six ways to pass and receive arguments from the user space
  Need to know command number

int quantum;

ioctl(fd,SCULL_IOCSQUANTUM, &quantum); /* Set by pointer */

ioctl(fd,SCULL_IOCTQUANTUM, quantum); /* Set by value */

ioctl(fd,SCULL_IOCGQUANTUM, &quantum); /* Get by pointer */

quantum = ioctl(fd,SCULL_IOCQQUANTUM); /* Get by return value */

ioctl(fd,SCULL_IOCXQUANTUM, &quantum); /* Exchange by pointer */

/* Exchange by value */

quantum = ioctl(fd,SCULL_IOCHQUANTUM, quantum);

Device Control Without ioctl

Writing control sequences into the data stream itself
  Example: console escape sequences
Advantages:
  No need to implement ioctl methods
Disadvantages:
  Need to make sure that escape sequences do not appear in the normal data stream (e.g., cat a binary file)
  Need to parse the data stream

Blocking I/O

Needed when no data is available for reads
Needed when the device is not ready to accept data
  Output buffer is full

Introduction to Sleeping

A process is removed from the scheduler’s run queue
Certain rules
  Never sleep when running in an atomic context
    Atomic context: multiple steps must be performed without concurrent accesses
    Not while holding a spinlock, seqlock, or RCU lock
    Not while interrupts are disabled

Introduction to Sleeping

Okay to sleep while holding a semaphore
  Other threads waiting for the semaphore will also sleep
  Need to keep it short
  Make sure that it is not blocking the process that will wake it up
After waking up
  Make no assumptions about the state of the system
  The resource one is waiting for might be gone again
  Must check the wait condition again

Introduction to Sleeping

Wait queue: contains a list of processes waiting for a specific event
#include <linux/wait.h>
To initialize statically, call
  DECLARE_WAIT_QUEUE_HEAD(my_queue);
To initialize dynamically, call
  wait_queue_head_t my_queue;
  init_waitqueue_head(&my_queue);

Simple Sleeping

Call variants of wait_event macros
wait_event(queue, condition)
  queue = wait queue head
    Passed by value
  Waits until the boolean condition becomes true
  Puts the process into an uninterruptible sleep
    Usually is not what you want
wait_event_interruptible(queue, condition)
  Can be interrupted by signals
  Returns nonzero if sleep was interrupted
    Your driver should return -ERESTARTSYS

Simple Sleeping

wait_event_timeout(queue, condition, timeout)
  Wait for a limited time (in jiffies)
  Returns 0 when the timeout expires, regardless of how the condition evaluates
wait_event_interruptible_timeout(queue, condition, timeout)

Simple Sleeping

To wake up, call variants of wake_up functions
void wake_up(wait_queue_head_t *queue);
  Wakes up all processes waiting on the queue
void wake_up_interruptible(wait_queue_head_t *queue);
  Wakes up processes that perform an interruptible sleep

Simple Sleeping

Example module: sleepy

static DECLARE_WAIT_QUEUE_HEAD(wq);

static int flag = 0;

ssize_t sleepy_read(struct file *filp, char __user *buf,

size_t count, loff_t *pos) {

printk(KERN_DEBUG "process %i (%s) going to sleep\n",

current->pid, current->comm);

wait_event_interruptible(wq, flag != 0);

flag = 0;

printk(KERN_DEBUG "awoken %i (%s)\n", current->pid,

current->comm);

return 0; /* EOF */

}

Note: multiple threads can wake up right after wait_event_interruptible() returns, before flag is reset to 0

Simple Sleeping

Example module: sleepy

ssize_t sleepy_write(struct file *filp, const char __user *buf,

size_t count, loff_t *pos) {

printk(KERN_DEBUG "process %i (%s) awakening the readers...\n",

current->pid, current->comm);

flag = 1;

wake_up_interruptible(&wq);

return count; /* succeed, to avoid retrial */

}

Blocking and Nonblocking Operations

By default, operations block
  If no data is available for reads
  If no space is available for writes
Non-blocking I/O is indicated by the O_NONBLOCK flag in filp->f_flags
  Defined in <linux/fcntl.h>
  Only open, read, and write calls are affected
  Returns -EAGAIN immediately instead of blocking
  Applications need to distinguish non-blocking returns vs. EOFs
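The difference between "no data yet" and end-of-file is visible from user space. The following is a minimal sketch, assuming a POSIX system. The demo_ names are hypothetical, and an ordinary pipe can stand in for a device such as scullpipe.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum demo_status { DEMO_DATA, DEMO_EOF, DEMO_WOULD_BLOCK, DEMO_ERROR };

/* Turn on O_NONBLOCK without disturbing the other file status flags. */
int demo_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Read one byte and classify the outcome. */
enum demo_status demo_try_read(int fd, char *out)
{
    ssize_t n = read(fd, out, 1);

    if (n == 1)
        return DEMO_DATA;
    if (n == 0)
        return DEMO_EOF;              /* all writers have closed */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return DEMO_WOULD_BLOCK;      /* no data yet, try again later */
    return DEMO_ERROR;
}
```

A read that returns 0 means end-of-file, while a return of -1 with errno set to EAGAIN only means that the call would have blocked. The driver's -EAGAIN return value is what produces the second case.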

A Blocking I/O Example

scullpipe
  A read process
    Blocks when no data is available
    Wakes a blocking write when buffer space becomes available
  A write process
    Blocks when no buffer space is available
    Wakes a blocking read process when data arrives

A Blocking I/O Example

scullpipe data structure

struct scull_pipe {

wait_queue_head_t inq, outq; /* read and write queues */

char *buffer, *end; /* begin of buf, end of buf */

int buffersize; /* used in pointer arithmetic */

char *rp, *wp; /* where to read, where to write */

int nreaders, nwriters; /* number of openings for r/w */

struct fasync_struct *async_queue; /* asynchronous readers */

struct semaphore sem; /* mutual exclusion semaphore */

struct cdev cdev; /* Char device structure */

};

A Blocking I/O Example

static ssize_t scull_p_read(struct file *filp, char __user *buf,

size_t count, loff_t *f_pos) {

struct scull_pipe *dev = filp->private_data;

if (down_interruptible(&dev->sem)) return -ERESTARTSYS;

while (dev->rp == dev->wp) { /* nothing to read */

up(&dev->sem); /* release the lock */

if (filp->f_flags & O_NONBLOCK)

return -EAGAIN;

if (wait_event_interruptible(dev->inq, (dev->rp != dev->wp)))

return -ERESTARTSYS;

if (down_interruptible(&dev->sem)) return -ERESTARTSYS;

}

if (dev->wp > dev->rp)

count = min(count, (size_t)(dev->wp - dev->rp));

else /* the write pointer has wrapped */

count = min(count, (size_t)(dev->end - dev->rp));

if (copy_to_user(buf, dev->rp, count)) {

up (&dev->sem);

return -EFAULT;

}

dev->rp += count;

if (dev->rp == dev->end) dev->rp = dev->buffer; /* wrapped */

up (&dev->sem);

/* finally, awake any writers and return */

wake_up_interruptible(&dev->outq);

return count;

}

Advanced Sleeping

Uses low-level functions to effect a sleep
How a process sleeps
  1. Allocate and initialize a wait_queue_t structure (the queue element)
     DEFINE_WAIT(my_wait);
     Or
     wait_queue_t my_wait;
     init_wait(&my_wait);

  2. Add to the proper wait queue and mark the process as being asleep
     Change the state from TASK_RUNNING to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE
     Call
     void prepare_to_wait(wait_queue_head_t *queue,
                          wait_queue_t *wait,
                          int state);

  3. Give up the processor
     Double check the sleeping condition before going to sleep
     The wakeup thread might have changed the condition between steps 1 and 2
     if (/* sleeping condition */) {
         schedule(); /* yield the CPU */
     }

  4. Return from sleep
     Reset the task state to TASK_RUNNING (needed if schedule() was not called) and remove the process from the wait queue
     void finish_wait(wait_queue_head_t *queue,
                      wait_queue_t *wait);

Advanced Sleeping

scullpipe write method

/* How much space is free? */

static int spacefree(struct scull_pipe *dev) {

if (dev->rp == dev->wp)

return dev->buffersize - 1;

return ((dev->rp + dev->buffersize - dev->wp)

% dev->buffersize) - 1;

}
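The arithmetic in spacefree() is easier to check with plain offsets. The following is a user-space sketch of the same formula, assuming rp and wp are expressed as offsets from the start of the buffer. demo_spacefree() is a hypothetical name. One slot is always left unused, so that rp == wp can only mean "empty".

```c
#include <stddef.h>

/*
 * Same arithmetic as spacefree(), with rp and wp given as offsets
 * from the start of the buffer instead of pointers.
 */
int demo_spacefree(size_t rp, size_t wp, size_t buffersize)
{
    if (rp == wp)
        return (int)buffersize - 1;   /* empty */
    return (int)((rp + buffersize - wp) % buffersize) - 1;
}
```

With an 8-byte buffer, an empty pipe has 7 free bytes, and a pipe whose write offset sits one position behind the read offset has 0 free bytes, which is the "full" condition the write method sleeps on.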

Advanced Sleeping

static ssize_t

scull_p_write(struct file *filp, const char __user *buf,

size_t count, loff_t *f_pos) {

struct scull_pipe *dev = filp->private_data;

int result;

if (down_interruptible(&dev->sem)) return -ERESTARTSYS;

/* Wait for space for writing */

result = scull_getwritespace(dev, filp);

if (result)

return result; /* scull_getwritespace called up(&dev->sem) */

/* ok, space is there, accept something */

count = min(count, (size_t)spacefree(dev));

if (dev->wp >= dev->rp)

count = min(count, (size_t)(dev->end - dev->wp));

else /* the write pointer has wrapped, fill up to rp - 1 */

count = min(count, (size_t)(dev->rp - dev->wp - 1));

if (copy_from_user(dev->wp, buf, count)) {

up (&dev->sem); return -EFAULT;

}

dev->wp += count;

if (dev->wp == dev->end) dev->wp = dev->buffer; /* wrapped */

up(&dev->sem);

wake_up_interruptible(&dev->inq);

if (dev->async_queue)

kill_fasync(&dev->async_queue, SIGIO, POLL_IN);

return count;

}

Advanced Sleeping

/* Wait for space for writing; caller must hold device semaphore.

* On error the semaphore will be released before returning. */

static int scull_getwritespace(struct scull_pipe *dev,

struct file *filp) {

while (spacefree(dev) == 0) { /* full */

DEFINE_WAIT(wait);

up(&dev->sem);

if (filp->f_flags & O_NONBLOCK) return -EAGAIN;

prepare_to_wait(&dev->outq, &wait, TASK_INTERRUPTIBLE);

if (spacefree(dev) == 0) schedule();

finish_wait(&dev->outq, &wait);

if (signal_pending(current)) return -ERESTARTSYS;

if (down_interruptible(&dev->sem)) return -ERESTARTSYS;

}

return 0;

}

Task state: RUNNING
Queue: full

After prepare_to_wait(): task state changes from RUNNING to INTERRUPTIBLE
Queue: full

In schedule(): task state INTERRUPTIBLE (asleep)
Queue: full

Exclusive Waits

Avoid waking up all processes waiting on a queue
  Wakes up only one process
Call
  void prepare_to_wait_exclusive(wait_queue_head_t *queue,
                                 wait_queue_t *wait, int state);
  Sets the WQ_FLAG_EXCLUSIVE flag
  Adds the queue entry to the end of the wait queue
wake_up stops after waking the first process with the flag set

The Details of Waking Up

/* wakes up all non-exclusive waiters and at most one exclusive waiter */
void wake_up(wait_queue_head_t *queue);

/* same, but only for processes that perform an interruptible sleep */
void wake_up_interruptible(wait_queue_head_t *queue);

/* wake up to nr exclusive waiters */
void wake_up_nr(wait_queue_head_t *queue, int nr);
void wake_up_interruptible_nr(wait_queue_head_t *queue, int nr);

/* wake up all exclusive waiters */
void wake_up_all(wait_queue_head_t *queue);
void wake_up_interruptible_all(wait_queue_head_t *queue);

/* do not lose the CPU during this call */
void wake_up_interruptible_sync(wait_queue_head_t *queue);

Testing the scullpipe Driver

Window 1
% cat /dev/scullpipe
(blocks until data arrives)

Window 2
% ls -aF > /dev/scullpipe

Window 1 then prints
./
../
file1
file2

poll and select

Nonblocking I/Os often involve the use of poll, select, and epoll system calls
  Allow a process to determine whether it can read or write one or more open files without blocking
  Can block a process until any of a set of file descriptors becomes available for reading or writing
select introduced in BSD Unix
poll introduced in System V
epoll added in 2.5.45 for better scaling

poll and select

All three calls supported through the poll method
  unsigned int (*poll) (struct file *filp,
                        poll_table *wait);
  1. Call poll_wait on one or more wait queues that could indicate a change in the poll status
     If no file descriptors are available, wait
  2. Return a bit mask describing the operations that could be immediately performed without blocking

poll and select

poll_table defined in <linux/poll.h>
To add a wait queue into the poll_table, call
  void poll_wait(struct file *,
                 wait_queue_head_t *,
                 poll_table *);
Bit mask flags defined in <linux/poll.h>
  POLLIN
    Set if the device can be read without blocking

  POLLOUT
    Set if the device can be written without blocking
  POLLRDNORM
    Set if “normal” data is available for reading
    A readable device returns (POLLIN | POLLRDNORM)
  POLLWRNORM
    Same meaning as POLLOUT
    A writable device returns (POLLOUT | POLLWRNORM)
  POLLPRI
    High-priority data can be read without blocking

  POLLHUP
    Returned when a process reading the device sees the end-of-file
  POLLERR
    An error condition has occurred
  POLLRDBAND
    Out-of-band data is available for reading
    Associated with sockets
  POLLWRBAND
    Data with nonzero priority can be written to the device
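The same flags appear in user space in the revents field filled in by poll(2). The following is a minimal sketch, assuming a POSIX system. demo_poll_now() is a hypothetical helper, and a pipe can stand in for a pollable device.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Ask the kernel which of the requested events are ready on fd right now.
 * A timeout of 0 makes poll() return immediately instead of sleeping.
 * Returns the revents mask, or -1 if poll() itself failed.
 */
int demo_poll_now(int fd, short events)
{
    struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };

    if (poll(&pfd, 1, 0) < 0)
        return -1;
    return pfd.revents;
}
```

For an empty pipe, the read end should not report POLLIN and the write end should report POLLOUT. After one byte is written, the read end reports POLLIN. A driver's poll method decides these bits for its own device.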

poll and select

Example

static unsigned int scull_p_poll(struct file *filp,

poll_table *wait) {

struct scull_pipe *dev = filp->private_data;

unsigned int mask = 0;

down(&dev->sem);

poll_wait(filp, &dev->inq, wait);

poll_wait(filp, &dev->outq, wait);

if (dev->rp != dev->wp) /* circular buffer not empty */

mask |= POLLIN | POLLRDNORM; /* readable */

if (spacefree(dev)) /* circular buffer not full */

mask |= POLLOUT | POLLWRNORM; /* writable */

up(&dev->sem);

return mask;

}

poll and select

No end-of-file support
  scullpipe does not implement this
If it did…
  The reader could see an end-of-file when all writers close the file
    Check dev->nwriters in read and poll
  Problem when a reader opens the scullpipe before the writer
    Need blocking within open

Interaction with read and write

Reading from the device
  If there is data in the input buffer, return at least one byte
    poll returns POLLIN | POLLRDNORM
  If no data is available
    If O_NONBLOCK is set, return -EAGAIN
    poll must report the device unreadable until one byte arrives
  At the end-of-file, read returns 0, poll returns POLLHUP

Interaction with read and write

Writing to the device
  If there is space in the output buffer, accept at least one byte
    poll reports that the device is writable by returning POLLOUT | POLLWRNORM
  If the output buffer is full, write blocks
    If O_NONBLOCK is set, write returns -EAGAIN
    poll reports that the file is not writable
  If the device is full, write returns -ENOSPC

Interaction with read and write

In write, never wait for data transmission before returning
  Otherwise, select may block
To make sure the output buffer is actually transmitted, use the fsync call

Interaction with read and write

To flush pending output, call fsync
  int (*fsync) (struct file *file, loff_t, loff_t, int datasync);
  Should return only when the device has been completely flushed
  datasync: Used by file systems, ignored by drivers

The Underlying Data Structure

When the poll call completes, poll_table is deallocated with all wait queue entries removed
  epoll reduces this overhead of setting up and tearing down the data structure between every I/O

Asynchronous Notification

Polling
  Inefficient for rare events
A solution: asynchronous notification
  Application receives a signal whenever data becomes available
  Two steps
    Specify a process as the owner of the file (so that the kernel knows whom to notify)
    Set the FASYNC flag in the device via the fcntl command

Asynchronous Notification

Example (user space)

/* create a signal handler */

signal(SIGIO, &input_handler);

/* set current pid the owner of the stdin */

fcntl(STDIN_FILENO, F_SETOWN, getpid());

/* obtain the current file control flags */

oflags = fcntl(STDIN_FILENO, F_GETFL);

/* set the asynchronous flag */

fcntl(STDIN_FILENO, F_SETFL, oflags | FASYNC);

Asynchronous Notification

Some catches
  Not all devices support asynchronous notification
    Usually available for sockets and ttys
  Need to know which input file to process
    Still need to use poll or select

The Driver’s Point of View

1. When F_SETOWN is invoked, a value is assigned to filp->f_owner
2. When F_SETFL is executed to change the status of FASYNC
   The driver’s fasync method is called

static int

scull_p_fasync(int fd, struct file *filp, int mode) {

struct scull_pipe *dev = filp->private_data;

return fasync_helper(fd, filp, mode, &dev->async_queue);

}

fasync_helper adds or removes processes from the asynchronous list
  int fasync_helper(int fd, struct file *filp, int mode,
                    struct fasync_struct **fa);
3. When data arrives, send a SIGIO signal to all processes registered for asynchronous notification
   Near the end of write, notify blocked readers
   if (dev->async_queue)
       kill_fasync(&dev->async_queue, SIGIO, POLL_IN);
   Similarly for read, as needed

4. When the file is closed, remove the file from the list of asynchronous readers in the release method
   scull_p_fasync(-1, filp, 0);

The llseek Implementation

Implements lseek and llseek system calls
  Modifies filp->f_pos

loff_t scull_llseek(struct file *filp, loff_t off, int whence) {

struct scull_dev *dev = filp->private_data;

loff_t newpos;

switch(whence) {

case 0: /* SEEK_SET */

newpos = off;

break;

case 1: /* SEEK_CUR, relative to the current position */

newpos = filp->f_pos + off;

break;

case 2: /* SEEK_END, relative to the end of the file */

newpos = dev->size + off;

break;

default: /* can't happen */

return -EINVAL;

}

if (newpos < 0) return -EINVAL;

filp->f_pos = newpos;

return newpos;

}
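The position arithmetic can be exercised outside the kernel. The following is a user-space sketch of the same logic, assuming the current position and the device size are passed in as plain integers. demo_llseek() is a hypothetical name and not part of scull.

```c
#include <errno.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Compute the new file position the way scull_llseek does.
 * cur is the current position, size the device size in bytes.
 * Returns the new position, or -EINVAL for a bad whence or a negative result.
 */
long long demo_llseek(long long cur, long long size, long long off, int whence)
{
    long long newpos;

    switch (whence) {
    case SEEK_SET: newpos = off;        break;
    case SEEK_CUR: newpos = cur + off;  break;
    case SEEK_END: newpos = size + off; break;
    default:       return -EINVAL;
    }
    if (newpos < 0)
        return -EINVAL;
    return newpos;
}
```

As in the driver, seeking past the end of the device is allowed, and only a negative resulting position is rejected.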

The llseek Implementation

Does not make sense for serial ports and keyboard inputs
  Need to inform the kernel by calling nonseekable_open in the open method
    int nonseekable_open(struct inode *inode, struct file *filp);
  Replace the llseek method with no_llseek (defined in <linux/fs.h>) in your file_operations structure
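To an application, a non-seekable device behaves like a pipe: lseek fails with ESPIPE. The following is a small sketch, assuming a POSIX system. demo_is_seekable() is a hypothetical helper, and a pipe can stand in for a device whose llseek method is no_llseek.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd supports seeking, 0 if the kernel reports ESPIPE, -1 on other errors. */
int demo_is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return (errno == ESPIPE) ? 0 : -1;
}
```

Seeking by zero bytes from the current position does not move the file offset, so the probe is harmless on files that do support seeking.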

Access Control on a Device File

Prevents unauthorized users from using the device
Sometimes permits only one authorized user to open the device at a time

Single-Open Devices

Example: scullsingle

static atomic_t scull_s_available = ATOMIC_INIT(1);

static int scull_s_open(struct inode *inode, struct file *filp) {

struct scull_dev *dev = &scull_s_device;

if (!atomic_dec_and_test(&scull_s_available)) {

atomic_inc(&scull_s_available);

return -EBUSY; /* already open */

}

/* then, everything else is the same as before */

if ((filp->f_flags & O_ACCMODE) == O_WRONLY) scull_trim(dev);

filp->private_data = dev;

return 0; /* success */

}

Callout on atomic_dec_and_test: returns true if the decremented value is 0

Single-Open Devices

In the release call, marks the device idle

static int

scull_s_release(struct inode *inode, struct file *filp) {

atomic_inc(&scull_s_available); /* release the device */

return 0;

}
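The decrement-and-test pattern can be tried in user space with C11 atomics. This is a sketch by analogy only, assuming a C11 compiler. The demo_ names are hypothetical, and atomic_fetch_sub is used here in place of the kernel's atomic_dec_and_test.

```c
#include <errno.h>
#include <stdatomic.h>

/* 1 means the device is free, mirroring ATOMIC_INIT(1) in scullsingle. */
static atomic_int demo_available = 1;

/* Returns 0 on success or -EBUSY if someone else already holds the device. */
int demo_open(void)
{
    /* atomic_fetch_sub returns the old value: 1 means we took the only slot. */
    if (atomic_fetch_sub(&demo_available, 1) != 1) {
        atomic_fetch_add(&demo_available, 1);   /* undo our decrement */
        return -EBUSY;
    }
    return 0;
}

void demo_release(void)
{
    atomic_fetch_add(&demo_available, 1);       /* mark the device idle */
}
```

A first demo_open() succeeds, a second one fails with -EBUSY until demo_release() is called. Because the decrement and the test happen in one atomic step, two concurrent openers cannot both see the device as free.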

Restricting Access to a Single User (with Multiple Processes) at a Time

Example: sculluid
Includes the following in the open call

spin_lock(&scull_u_lock);

if (scull_u_count && /* someone is using the device */

(scull_u_owner != current_uid()) && /* not the same user (current->uid before 2.6.29) */

(scull_u_owner != current_euid()) && /* not the same effective uid (for su) */

!capable(CAP_DAC_OVERRIDE)) { /* not root override */

spin_unlock(&scull_u_lock);

return -EBUSY; /* -EPERM would confuse the user */

}

if (scull_u_count == 0) scull_u_owner = current_uid(); /* current->uid before 2.6.29 */

scull_u_count++;

spin_unlock(&scull_u_lock);

Page 87: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Restricting Access to a Single User (with Multiple Processes) at a Time

Includes the following in the release call

static int scull_u_release(struct inode *inode,

struct file *filp) {

spin_lock(&scull_u_lock);

scull_u_count--; /* nothing else */

spin_unlock(&scull_u_lock);

return 0;

}
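
The single-user policy is easier to follow when the credentials are passed in explicitly. Below is a single-threaded user-space sketch of the same decision; model_uid_open and model_uid_release are hypothetical names, the uid, euid, and privileged parameters are assumptions replacing what the driver reads from the calling process, and the spinlock is left out because nothing here runs concurrently.

```c
#include <errno.h>
#include <stdbool.h>

static int u_count;            /* number of opens, like scull_u_count */
static unsigned int u_owner;   /* uid of the current owner */

/* privileged stands in for capable(CAP_DAC_OVERRIDE). */
int model_uid_open(unsigned int uid, unsigned int euid, bool privileged)
{
    if (u_count &&             /* someone is using the device */
        u_owner != uid &&      /* not the same user */
        u_owner != euid &&     /* not the same effective uid */
        !privileged)           /* no override */
        return -EBUSY;

    if (u_count == 0)
        u_owner = uid;         /* first opener becomes the owner */
    u_count++;
    return 0;
}

void model_uid_release(void)
{
    u_count--;                 /* ownership lapses when this reaches 0 */
}
```

The owner is only reassigned when the count drops to zero, so a second user gets in once every process of the first user has closed the device.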

Page 88: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Blocking open as an Alternative to EBUSY (scullwuid)

A user might prefer to wait over getting errors

E.g., data communication channel

spin_lock(&scull_w_lock);

while (!scull_w_available()) {

spin_unlock(&scull_w_lock);

if (filp->f_flags & O_NONBLOCK) return -EAGAIN;

if (wait_event_interruptible(scull_w_wait,

scull_w_available()))

return -ERESTARTSYS; /* tell the fs layer to handle it */

spin_lock(&scull_w_lock);

}

if (scull_w_count == 0) scull_w_owner = current_uid(); /* current->uid before 2.6.29 */

scull_w_count++;

spin_unlock(&scull_w_lock);

Page 89: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Blocking open as an Alternative to EBUSY (scullwuid)

The release method wakes pending processes

static int scull_w_release(struct inode *inode,

struct file *filp) {

int temp;

spin_lock(&scull_w_lock);

scull_w_count--;

temp = scull_w_count;

spin_unlock(&scull_w_lock);

if (temp == 0)

wake_up_interruptible_sync(&scull_w_wait);

return 0;

}
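
The wait-then-recheck loop has a close user-space analogue in a mutex plus a condition variable. The sketch below is an illustration only: the model_w_* names are hypothetical, the uid is passed in as a parameter, the availability test is simplified to "unused or same uid", and there is no equivalent of -ERESTARTSYS because signals are not modeled.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t w_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  w_wait = PTHREAD_COND_INITIALIZER;
static int w_count;            /* number of opens */
static unsigned int w_owner;   /* uid of the current owner */

/* Simplified availability test; caller must hold w_lock. */
static bool w_available(unsigned int uid)
{
    return w_count == 0 || w_owner == uid;
}

int model_w_open(unsigned int uid, bool nonblock)
{
    pthread_mutex_lock(&w_lock);
    while (!w_available(uid)) {
        if (nonblock) {                 /* like O_NONBLOCK */
            pthread_mutex_unlock(&w_lock);
            return -EAGAIN;
        }
        /* Drops w_lock while sleeping and retakes it on wakeup. */
        pthread_cond_wait(&w_wait, &w_lock);
    }
    if (w_count == 0)
        w_owner = uid;
    w_count++;
    pthread_mutex_unlock(&w_lock);
    return 0;
}

void model_w_release(void)
{
    pthread_mutex_lock(&w_lock);
    int remaining = --w_count;
    pthread_mutex_unlock(&w_lock);
    if (remaining == 0)
        pthread_cond_broadcast(&w_wait);   /* wake pending openers */
}
```

As in the driver, a woken opener re-evaluates the condition under the lock before proceeding, since another waiter may have claimed the device first.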

Page 90: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Blocking open as an Alternative to EBUSY

Might not be the right semantics for interactive users

Blocking on cp vs. getting a return value of -EBUSY or -EPERM

Incompatible policies for the same device

One solution: one device node per policy

Page 91: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Cloning the Device on open

Allows the creation of private, virtual devices

E.g., one virtual scull device for each process with a different tty device number

Example: scullpriv

Page 92: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Cloning the Device on open

static int scull_c_open(struct inode *inode, struct file *filp) {

struct scull_dev *dev;

dev_t key;

if (!current->signal->tty) {

PDEBUG("Process \"%s\" has no ctl tty\n", current->comm);

return -EINVAL;

}

key = tty_devnum(current->signal->tty);

spin_lock(&scull_c_lock);

dev = scull_c_lookfor_device(key);

spin_unlock(&scull_c_lock);

if (!dev) return -ENOMEM;

.../* then, everything else is the same as before */

}

Page 93: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Cloning the Device on open

/* The clone-specific data structure includes a key field */

struct scull_listitem {

struct scull_dev device;

dev_t key;

struct list_head list;

};

/* The list of devices, and a lock to protect it */

static LIST_HEAD(scull_c_list);

static DEFINE_SPINLOCK(scull_c_lock); /* SPIN_LOCK_UNLOCKED is gone from recent kernels */

Page 94: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Cloning the Device on open

/* Look for a device or create one if missing */

static struct scull_dev *scull_c_lookfor_device(dev_t key) {

struct scull_listitem *lptr;

list_for_each_entry(lptr, &scull_c_list, list) {

if (lptr->key == key)

return &(lptr->device);

}

/* not found */

lptr = kmalloc(sizeof(struct scull_listitem), GFP_KERNEL);

if (!lptr) return NULL;

Page 95: Advanced Char Driver Operations Sarah Diesburg CIS 4930

Cloning the Device on open

/* initialize the device */

memset(lptr, 0, sizeof(struct scull_listitem));

lptr->key = key;

scull_trim(&(lptr->device)); /* initialize it */

sema_init(&(lptr->device.sem), 1); /* replaces init_MUTEX, which is gone from recent kernels */

/* place it in the list */

list_add(&lptr->list, &scull_c_list);

return &(lptr->device);

}
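
The lookup-or-create pattern does not depend on kernel lists. Here is a minimal user-space sketch with a hand-rolled singly linked list; the model_* names are hypothetical, struct model_dev is a placeholder for struct scull_dev, an unsigned int stands in for dev_t, and calloc replaces the kmalloc plus memset pair.

```c
#include <stdlib.h>

struct model_dev { int data; };          /* placeholder for struct scull_dev */

struct model_listitem {
    struct model_dev device;
    unsigned int key;                    /* stands in for dev_t */
    struct model_listitem *next;         /* simple singly linked list */
};

static struct model_listitem *model_list;   /* head of the list */

/* Look for a device with this key, or create one if missing. */
struct model_dev *model_lookfor_device(unsigned int key)
{
    struct model_listitem *lptr;

    for (lptr = model_list; lptr; lptr = lptr->next)
        if (lptr->key == key)
            return &lptr->device;

    /* not found: allocate a zeroed item and push it on the list */
    lptr = calloc(1, sizeof(*lptr));
    if (!lptr)
        return NULL;
    lptr->key = key;
    lptr->next = model_list;
    model_list = lptr;
    return &lptr->device;
}
```

Two calls with the same key return the same device, which is what gives every process on one tty a shared view while processes on other ttys get their own.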

Page 96: Advanced Char Driver Operations Sarah Diesburg CIS 4930

What’s going on?

[Diagram: scull_c_list is a struct list_head { struct list_head *next; struct list_head *prev; }; it links to the list member (also a struct list_head) embedded in each scull_listitem, next to struct scull_dev device and dev_t key.]
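
Because the list node is embedded inside scull_listitem, list_for_each_entry has to step back from the node to the enclosing structure. The sketch below shows that step in user space; model_item and model_container_of are hypothetical names, and the macro is a simplified version of the kernel's container_of built on the standard offsetof.

```c
#include <stddef.h>   /* offsetof */

struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

struct model_item {
    int device;              /* placeholder for struct scull_dev */
    unsigned int key;        /* stands in for dev_t */
    struct list_head list;   /* embedded list node */
};

/* Step back from a pointer to a member to the start of the structure
 * that contains it. */
#define model_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given only the embedded node, recover the item and read its key. */
unsigned int model_key_from_node(struct list_head *node)
{
    struct model_item *item =
        model_container_of(node, struct model_item, list);
    return item->key;
}
```

The list itself only ever stores pointers to list_head nodes; the surrounding item is recovered by pointer arithmetic each time an entry is visited.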