The 'poll' mechanism is a clear improvement over the 'select' mechanism. It does not use bit masks, so there is no fixed limit on the number of descriptors to wait on. User programs pass an array of 'struct pollfd' entries, each carrying the descriptor, the events of interest and the events that were returned. But it has performance problems similar to those of 'select': every time 'poll' returns, the program has to go through all the descriptors it gave to the 'poll' call to figure out which ones have events ready. So the scalability problem is taken care of, but the performance problem remains.
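The scan that 'poll' forces on the caller can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper name 'collect_ready' and its 'ready_fds' output array are assumptions made for this example.

```c
#include <poll.h>

/* Hypothetical helper: waits on nfds descriptors and copies the ready ones
 * into ready_fds, which must have room for nfds entries.
 * Returns the number of ready descriptors, 0 on timeout, -1 on error. */
int collect_ready(struct pollfd *fds, nfds_t nfds, int timeout_ms, int *ready_fds)
{
    int n = poll(fds, nfds, timeout_ms);
    if (n <= 0)
        return n;

    int found = 0;
    /* poll() reports how many entries are ready, not which ones, so every
     * slot has to be inspected, including the idle ones. */
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents != 0)
            ready_fds[found++] = fds[i].fd;
    }
    return found;
}
```

With thousands of mostly idle descriptors, this loop runs over the full array on every wakeup even when only one descriptor is ready.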
The 'epoll' mechanism solves both the scalability and the performance problems. Descriptors are added and removed by descriptor value, not by bit position. A user program can also associate its own data with a descriptor, and that data is handed back when 'epoll_wait' reports the descriptor as ready. More importantly, 'epoll_wait' returns an array containing only the descriptors that have some event ready. Because of this, user programs need not go through the complete 'fd' list to figure out the ready descriptors.
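Associating user data with a descriptor might look like the sketch below. 'struct conn', 'watch_conn' and 'conn_from_event' are hypothetical names invented for this illustration; only 'struct epoll_event', 'epoll_ctl' and the EPOLL* constants come from the epoll API.

```c
#include <sys/epoll.h>

/* Hypothetical per-connection state owned by the application. */
struct conn {
    int fd;
    const char *name;
};

/* Register c->fd for read events and attach c itself as the user data. */
int watch_conn(int epfd, struct conn *c)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.ptr = c;   /* handed back unchanged by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Recover the application state from an event filled in by epoll_wait. */
struct conn *conn_from_event(const struct epoll_event *ev)
{
    return ev->data.ptr;
}
```

The 'data' member is a union, so a program that stores a pointer there cannot also store the descriptor in it. That is why the sketch keeps 'fd' inside the structure.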
Usage is also simple. Create an epoll instance using epoll_create. Add, delete and modify descriptors and the events of interest using epoll_ctl, then call epoll_wait to wait for events to occur. epoll_wait outputs the descriptors that have events ready. The user program can walk through these ready descriptors and do whatever it needs to do.
It provides the following API functions.
- int epoll_create(int size): This function creates the epoll instance and is expected to be called by the user thread only once. The 'size' parameter was a hint for older Linux kernels to allocate memory internally. The value is no longer used, but it must still be greater than zero.
- int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event): This function adds a new descriptor with its events of interest (EPOLL_CTL_ADD), deletes an existing descriptor from the epoll instance (EPOLL_CTL_DEL), or modifies the events of interest for an existing descriptor (EPOLL_CTL_MOD). User programs can do this dynamically at any time.
- int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout): This function is used by user threads to wait for events on the descriptors that were added to the epoll instance identified by epfd. The call waits until some event is ready on any descriptor or until the timeout, given in milliseconds, expires. When the call returns, 'events' holds the ready descriptors, at most 'maxevents' of them, and the return value is their count.
- Use the usual 'close' call on epfd to close the epoll instance. This is typically done when the thread is exiting.
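Put together, the calls above might be used as in the following sketch. The helper name 'wait_for_readable' and the MAX_EVENTS batch size are assumptions for this example. A real program would normally create the epoll instance once and call epoll_wait in a loop, rather than creating and closing it on every wait.

```c
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 16   /* arbitrary batch size for this sketch */

/* Hypothetical helper: watches fds[0..count) for readability, waits once,
 * and returns the number of ready descriptors (0 on timeout, -1 on error). */
int wait_for_readable(const int *fds, int count, int timeout_ms)
{
    int epfd = epoll_create(1);   /* size is ignored but must be > 0 */
    if (epfd < 0)
        return -1;

    for (int i = 0; i < count; i++) {
        struct epoll_event ev = {0};
        ev.events = EPOLLIN;      /* interested in "data available to read" */
        ev.data.fd = fds[i];      /* user data: here simply the descriptor */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            close(epfd);
            return -1;
        }
    }

    struct epoll_event ready[MAX_EVENTS];
    int n = epoll_wait(epfd, ready, MAX_EVENTS, timeout_ms);

    /* Only the first n entries are filled in; idle descriptors never
     * appear, so there is no scan over the full list. */
    for (int i = 0; i < n; i++)
        printf("fd %d is readable\n", ready[i].data.fd);

    close(epfd);
    return n;
}
```

Calling it with the read ends of two pipes, only one of which has data written to it, should report a single ready descriptor regardless of how many idle descriptors were registered.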