Perlfect Solutions

Multiplexing filehandles with select() in perl.

The problem

I/O requests such as read() and write() are blocking requests. Suppose you have a line in a program that reads from STDIN on a terminal, like the following:

$input = <STDIN>;

What happens here is that the program's execution blocks until a line of input is available, i.e. until the user types something followed by a newline. In many cases this is the desired behavior. Now suppose you have a program that accepts requests through a socket, does some processing for each request, and then moves on to the next request.

01 use IO::Socket::INET;   # create the receiving socket
02 my $s = IO::Socket::INET->new(
03     LocalHost => 'thekla',
04     LocalPort => 7070,
05     Proto     => 'tcp',
06     Listen    => 16,
07     Reuse     => 1,
08 );
09 die "Could not create socket: $!\n" unless $s;
10
11 my ($ns, $buf);
12 while( $ns = $s->accept() ) {   # wait for and accept a connection
13     while( defined( $buf = <$ns> ) ) {   # read from the socket
14         # do some processing
15     }
16 }
17 close($s);

Although this is a perfectly valid way of handling incoming requests, it suffers from some serious problems, especially if requests arrive frequently and each one needs a lot of processing.

Clearly, the problem is that once a request has been accepted, we keep other requests hanging in the queue while we read the request message and process it. Reading from a socket is a blocking call, so if the client takes too long to transmit the request message, we just sit there waiting when we could be doing useful processing of other requests. Not only is this unacceptable in itself, but when the demand for request processing is high, the program may not be able to meet its operating requirements. Note also that a single client failure at a critical point (in the middle of an ongoing transmission) risks making the server block indefinitely.

What can we do about it?

What we need in situations like the above is a way to handle I/O independently and with some sort of apparent parallelism/multiprocessing. (We use sockets for this example, but the rules apply in general to any kind of filehandle.) There are two very common approaches to this.

One approach is to spawn a separate thread of control to handle each request. This can be done either at process level, using fork() to create a new process for each request, or at thread level, using perl's threading capabilities to create multiple threads within the same process. (Perl's support for threads was introduced in version 5.005.)
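For comparison, the process-level variant of this first approach can be sketched as follows. This is only a minimal illustration, not the article's method: pipes stand in for client sockets so that the script is self-contained, and handle_request() and the request strings are made-up placeholders.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Hypothetical stand-in for the real per-request processing.
sub handle_request {
    my ($request) = @_;
    return uc $request;
}

my @requests = ('alpha', 'beta', 'gamma');   # made-up requests
my @children;                                # [pid, read end of the child's pipe]

for my $request (@requests) {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: process one request, send the result back, and leave.
        close($reader);
        print {$writer} handle_request($request);
        close($writer);        # flushes the result into the pipe
        _exit(0);              # exit without running the parent's cleanup code
    }

    # Parent: keep only the read end and move on to the next request at once.
    close($writer);
    push @children, [$pid, $reader];
}

# Collect the results in request order and reap the children.
my @results;
for my $child (@children) {
    my ($pid, $reader) = @$child;
    local $/;                  # slurp everything the child wrote
    my $result = <$reader>;
    push @results, defined $result ? $result : '';
    close($reader);
    waitpid($pid, 0);
}

print "results: @results\n";
```

The parent never waits on a slow request before starting the next one, at the cost of one process per request.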

The other approach, which is the one we will discuss here, is to use select() to multiplex between several filehandles within a single thread of control, thus creating the effect of parallelism in the handling of I/O.

What does select() do?

The idea behind select() is to avoid blocking calls by making sure that a call will not block before attempting it. How do we do that? Suppose we have two filehandles, and we want to read data from them as it comes in. Let's call them A and B. Now, let's assume that A has no input pending yet, but B is ready to respond to a read() call. If we know this bit of information, we can try reading from B first, instead of A, knowing that our call will not block. select() gives us this bit of information. All we need to do is define sets of filehandles (one for reading, one for writing and one for errors) and call select() on them. As soon as at least one filehandle is ready, select() returns the filehandles that are ready to perform the operation for which they have been designated (depending on which set they are in).
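The A and B scenario can be tried out with a small script. The sketch below is an illustration under stated assumptions: two pipes stand in for the filehandles A and B, only B is given data, and the one-second timeout is an arbitrary choice so that the script cannot hang.

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# Two pipes stand in for the filehandles A and B from the text.
pipe(my $a_read, my $a_write) or die "pipe A failed: $!";
pipe(my $b_read, my $b_write) or die "pipe B failed: $!";

# Only B receives data; A stays silent.
$b_write->autoflush(1);
print {$b_write} "hello from B\n";

# Put both read ends into a set of handles to test for readability.
my $read_set = IO::Select->new($a_read, $b_read);

# Ask which handles can be read without blocking (wait at most 1 second).
my @ready = $read_set->can_read(1);

my @ready_names = map { $_ == $b_read ? 'B' : 'A' } @ready;
print "ready: @ready_names\n";

# Reading from a handle that select reported as ready returns at once.
my $data = '';
sysread($ready[0], $data, 1024) if @ready;
print "read: $data";
```

Reading from A at this point would block, because nothing has been written to it; select lets us skip it.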

Obviously this gives us the advantage of always picking a filehandle that will not block, thus avoiding the possibility of delaying the entire program for one lazy filehandle just because it happened to be the first we picked at random. Still, it does not guarantee that the selected filehandle is the best choice, because we still don't know how much data can be read, or how quickly it can take in data that we write to it. But it is definitely a big step forward from our initial program.

Using select()

We will now rewrite the example program from the beginning of this article using select(). Instead of calling perl's select directly we will use a wrapper module, IO::Select, that makes life easier for us.

... create socket as before ...
11 use IO::Select;
12 my $read_set = IO::Select->new();   # create handle set for reading
13 $read_set->add($s);                 # add the main socket to the set
14
15 while (1) {   # forever
16     # get a set of readable handles (blocks until at least one handle is ready)
17     my ($rh_set) = IO::Select->select($read_set, undef, undef, undef);
18     # take all readable handles in turn
19     foreach my $rh (@$rh_set) {
20         # if it is the main socket then we have an incoming connection and
21         # we should accept() it and then add the new socket to the $read_set
22         if ($rh == $s) {
23             my $ns = $rh->accept();
24             $read_set->add($ns);
25         }
26         # otherwise it is an ordinary socket and we should read and process the request
27         else {
28             my $buf = <$rh>;   # buffered line read; can still block until a full line arrives
29             if (defined $buf) {   # we get normal input
30                 # ... process $buf ...
31             }
32             else {   # the client has closed the socket
33                 # remove the socket from the $read_set and close it
34                 $read_set->remove($rh);
35                 close($rh);
36             }
37         }
38     }
39 }

We create an IO::Select object, $read_set, which is our set of handles to test for readability, and add all open handles to it. We start by adding the main socket and, each time a new connection is made and a new socket is returned for it, we add that socket to the set. Then we go into a loop where we ask select to give us a list of readable handles and we examine each one in turn. If it is the main socket then we want to call accept() to receive the incoming connection and add the new socket to the read set. Otherwise it must be an ordinary socket, in which case we read from it and process its input. If the read fails, that means the socket has been closed on the client side, so we close it, too, and remove it from the read set. So we work our way continuously through the incoming requests, making sure that a call for I/O on any filehandle will progress, since select() tells us it will.

As we already mentioned earlier, this method does not guarantee progress as it only tests whether a handle is ready to respond to I/O. The question still remains, whether the handle we pick from the ready ones is the one that will respond faster to I/O, and how much data there is available for reading or how much data it is ready to receive. So it is still possible to block a bit after the point where we picked the handle. Also, we did not take into account the impact on performance that the actual processing of requests will have. We might just be printing incoming data to a file, but then again, each request might need heavy processing that would slow down the entire handle processing loop. But these are issues that must be considered in the context of the individual application.
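One way to reduce the risk of blocking after a handle has been picked is to read with sysread(), which returns whatever data is currently available instead of waiting for a full line, and to assemble lines ourselves. The sketch below is one possible arrangement, not the only one: %pending, read_lines() and service_once() are hypothetical names introduced for the illustration, and a socketpair stands in for a real client connection so the script runs on its own.

```perl
use strict;
use warnings;
use IO::Select;
use Socket;

my %pending;   # hypothetical per-handle store of bytes that do not yet form a full line

# Read whatever is available on $rh and return a reference to the list of
# complete lines received so far. Returns undef on end of file or read error.
sub read_lines {
    my ($rh) = @_;
    my $n = sysread($rh, my $chunk, 4096);
    return undef unless $n;                # 0 means EOF, undef means error
    $pending{$rh} .= $chunk;
    my @lines;
    while ($pending{$rh} =~ s/^(.*?)\n//) {
        push @lines, $1;                   # one complete line, newline stripped
    }
    return \@lines;
}

# One pass over the readable handles; returns the complete lines collected.
sub service_once {
    my ($read_set, $timeout) = @_;
    my @lines;
    for my $rh ($read_set->can_read($timeout)) {
        my $got = read_lines($rh);
        if (defined $got) {
            push @lines, @$got;
        }
        else {                             # peer closed: forget the handle
            delete $pending{$rh};
            $read_set->remove($rh);
            close($rh);
        }
    }
    return @lines;
}

# Demonstration: a connected socket pair plays the roles of client and server.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# The client sends one and a half requests in a single write.
syswrite($client, "first request\nsecond req");

my $read_set = IO::Select->new($server);
my @first = service_once($read_set, 1);
print "complete lines so far: @first\n";
```

The half-finished second request stays in %pending until the rest of it arrives, so a slow client delays only its own request rather than the whole loop.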

Comments

David   

Posted at 11:33pm on Wednesday, June 27th, 2007

Excellent tutorial, thank you!

max   

Posted at 6:45pm on Friday, July 13th, 2007

great read

Vijay   

Posted at 8:20am on Wednesday, July 18th, 2007

A very good tutorial, very clearly explained. Thank you.

Anis   

Posted at 4:10am on Friday, July 20th, 2007

Excellent thanks..

Shahid Khan   

Posted at 1:06am on Thursday, October 18th, 2007

SO useful information

Jonathan Perkin   

Posted at 6:09am on Tuesday, January 29th, 2008

The last argument to select() should really be undef, so that it blocks until ready. A timeout of 0 means continuously check, so it chews up 100% CPU.

Wilko   

Posted at 1:24pm on Monday, February 4th, 2008

Very good article, thank you. Thanks to Jonathan P as well! I was maxing out the CPU whilst the server was waiting for incoming connections. Changing the 0 to undef worked a treat. Thanks again

DimeCadmium   

Posted at 1:02pm on Tuesday, March 4th, 2008

A better way (IMO) to get the error message (more details): $@
Also, I use:
new IO::Socket::INET(...) or die "No socket: $!/$@\n";

alpha   

Posted at 10:56pm on Friday, March 14th, 2008

You should not mix buffered input, i.e. <>, with select. Use {select/sysread/syswrite} or {print/<>/read/write}

He Man   

Posted at 4:42am on Wednesday, March 19th, 2008

NICE....

Rick   

Posted at 4:49pm on Friday, April 25th, 2008

I believe line 28 is a blocking IO statement. If the other end of the connection went away during an IO, this entire app will wait on that line. I tested this via Telnet as the other end - when I type my first character, the IO::Select detects it and then it blocks at line 28 until I hit carriage return. Does anyone have a solution to this issue?

MattCarter   

Posted at 11:39am on Wednesday, July 30th, 2008

As alpha pointed out, the IO::Select example above has a serious flaw: The diamond operator (<>) (the shortcut for readline()) does buffered I/O. The buffer that perl uses for the diamond operator is NOT visible to IO::Select. So, the above code will hang in the subsequent select call if multiple lines arrive simultaneously. To avoid this problem, the perl program must use unbuffered IO calls like sysread(...).

Ankit Kapoor   

Posted at 12:06pm on Sunday, November 9th, 2008

Xcellent Tutorial!

Frank   

Posted at 8:27am on Tuesday, March 24th, 2009

Is it the same also for a UDP socket? My udp socket doesn't accept the connect(), I deleted the line with the connect command, and I modified $read_set->add($ns); to $read_set->add(my main socket);
but it doesn't work.
HELP
tnx for your tutorial!

Daniel   

Posted at 10:46am on Saturday, June 6th, 2009

Very nice tutorial. it inspired me very much

Vetrivel   

Posted at 8:13pm on Thursday, September 10th, 2009

nice to have tutorial

Travis   

Posted at 5:28am on Tuesday, October 20th, 2009

Excellent tutorial, this was EXACTLY what I was looking for.

Colin   

Posted at 10:39am on Sunday, November 22nd, 2009

It was mentioned a couple of times that mixing unbuffered reading/writing with buffered reading/writing is a bad idea. This may be a ridiculous question, but how would that look if this program was re-written using entirely un-buffered IO calls (sysread, syswrite etc.)?

sri   

Posted at 6:23pm on Thursday, February 4th, 2010

Good tut.
So is there any solution to the concerns of the last para.
Thanks

Jagadeesan   

Posted at 10:59pm on Thursday, March 25th, 2010

Excellent !!!! Thank you very much

saurabh verma   

Posted at 3:03am on Tuesday, April 20th, 2010

Well , I've been looking out for a similar kind of article , and this one really helped me understand multiplexing filehandles

stefan   

Posted at 10:45am on Saturday, May 8th, 2010

thanks, great tutorial. that's what I have been looking for

Brett G   

Posted at 2:12pm on Tuesday, August 31st, 2010

I'm no expert, but I think the following code properly converts the buffered call (to the readline diamond op <>) to buffered sysread command (reference 'alpha' and 'MattCarter' posts)

$status = sysread($rh,$buf,512);
if ($status>0) { # we get normal input
# ... process $buf ...
}

Brett G   

Posted at 2:14pm on Tuesday, August 31st, 2010

OOPS...make that "UN-buffered sysread call" in my last post.

Christina   

Posted at 3:01pm on Wednesday, May 11th, 2011

Walking in the presence of giants here. Cool thinking all around!

Manoj Hirwani   

Posted at 12:50am on Saturday, May 14th, 2011

It's a really nice tutorial, thanks a lot!!

Anonymous   

Posted at 7:06am on Thursday, March 15th, 2012

This was very useful

denv   

Posted at 4:16pm on Wednesday, June 27th, 2012

Thx to u for notes!
Specially thx to Jonathan Perkin with undef!

Richard   

Posted at 8:37pm on Wednesday, August 22nd, 2012

Deep thinking - adds a new dimension to it all.

enzo   

Posted at 1:14am on Tuesday, November 13th, 2012

you can also use epoll or anyevent, have a look at cpan

https://metacpan.org/search?q=anyevent
https://metacpan.org/search?q=io::epoll

z-man   

Posted at 4:07pm on Tuesday, November 27th, 2012

Thank you for explaining it so clearly - what a difference a few well-placed comments make!

Also thanks to others for the sysread() notes.

Suggested Reading

Advanced Perl Programming, among various other very interesting subjects, dedicates a chapter to socket programming and takes a very clear and to-the-point approach to the issue. It includes a short discussion of select() and its use to manipulate sockets. It is also a good book to have in general if you're seriously interested in perl programming.