神刀安全网

Linux Programming – The Basics of Pipes

Have you ever wondered how pipes work? When you type ls | grep stuff or ls | wc -l, some sort of magic seems to happen, but how? I was curious too, so I did some digging.

The shell provides this functionality with a pipe. A pipe is one of many IPC (inter-process communication) mechanisms available on Linux. When you create a pipe, the kernel sets aside a buffer in kernel space; since kernel 2.6.11 the default capacity of that buffer is 65,536 bytes. The pipe() call fills a two-element array with 2 file descriptors: one can be read from and the other written to. When you write to the pipe, your data is copied from user space to kernel space; when another process reads it back, it is copied once again from kernel space to user space.

Pipes are meant to be used between related processes, so it is necessary to have a basic understanding of forking a process on Linux. Check out my blog post on the subject. Let's have a look at some code.

    #include <sys/wait.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>

    #define BUF_SIZE 10

    int main(int argc, char *argv[])
    {
        int pfd[2];             /* pfd[0] is the read end, pfd[1] the write end */
        char buf[BUF_SIZE];
        ssize_t numRead;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s string\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        if (pipe(pfd) == -1) {
            fprintf(stderr, "Error creating pipe\n");
            exit(EXIT_FAILURE);
        }

        switch (fork()) {
        case -1:
            fprintf(stderr, "Error forking\n");
            exit(EXIT_FAILURE);

        case 0:                 /* child: reads from the pipe */
            if (close(pfd[1]) == -1) {
                fprintf(stderr, "Error closing write end of pipe\n");
                exit(EXIT_FAILURE);
            }
            for (;;) {
                numRead = read(pfd[0], buf, BUF_SIZE);
                if (numRead == -1) {
                    fprintf(stderr, "Error reading from pipe\n");
                    exit(EXIT_FAILURE);
                }
                if (numRead == 0)   /* EOF: every write end is closed */
                    break;
                if (write(STDOUT_FILENO, buf, numRead) != numRead) {
                    fprintf(stderr, "Failed to write entire buffer\n");
                    exit(EXIT_FAILURE);
                }
            }
            write(STDOUT_FILENO, "\n", 1);
            if (close(pfd[0]) == -1) {
                fprintf(stderr, "Error closing read end of pipe from child\n");
                exit(EXIT_FAILURE);
            }
            _exit(EXIT_SUCCESS);

        default:                /* parent: writes to the pipe */
            if (close(pfd[0]) == -1) {
                fprintf(stderr, "Error closing read end of pipe from parent\n");
                fprintf(stderr, "%s\n", strerror(errno));
                exit(EXIT_FAILURE);
            }
            if (write(pfd[1], argv[1], strlen(argv[1])) != (ssize_t) strlen(argv[1])) {
                fprintf(stderr, "Error writing to pipe\n");
                exit(EXIT_FAILURE);
            }
            if (close(pfd[1]) == -1) {  /* lets the child see EOF */
                fprintf(stderr, "Error closing write end of pipe\n");
                exit(EXIT_FAILURE);
            }
            wait(NULL);
            exit(EXIT_SUCCESS);
        }
    }

First we check that the user entered a command line argument; any string will suffice. Next, we create the pipe with pipe() and pass our 2 element array. The kernel will set aside some memory for our pipe and fill the array with 2 file descriptors, one for reading and one for writing. Then we switch on fork(). In the child we close the write file descriptor and loop, reading from the read end of the pipe and printing the data to STDOUT until we receive EOF. In the parent we close the read file descriptor, write the string to the pipe, and close the write descriptor before waiting on the child to exit.

Note that it's important to close the write descriptor: the child only receives EOF once every write descriptor for the pipe has been closed, otherwise it will be stuck waiting forever. This is also why the child closes its own copy of the write end before it starts reading.

It's also worth noting that a pipe is unidirectional in nature. If you want to set up bidirectional communication, you could use two pipes: one for communicating from parent to child, and one for communicating from child to parent. There are, however, other IPC mechanisms that can achieve similar results, such as shared memory.

One more important note before we move on: reads on a pipe are destructive. A read "consumes" the data, and even though you can have multiple readers, once a read completes that data is not available to other readers.

So then, how is it that the shell uses a pipe to facilitate things like ls | grep stuff or ls | wc -l? Let's have a look at some code.

    #include <sys/wait.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int pfd[2];             /* pfd[0] is the read end, pfd[1] the write end */

        if (pipe(pfd) == -1) {
            fprintf(stderr, "Error creating pipe\n");
            exit(EXIT_FAILURE);
        }

        /* First child: runs ls with its STDOUT connected to the pipe. */
        switch (fork()) {
        case -1:
            fprintf(stderr, "Error forking\n");
            exit(EXIT_FAILURE);
        case 0:
            if (close(pfd[0]) == -1) {
                fprintf(stderr, "Error closing read end of pipe\n");
                _exit(EXIT_FAILURE);
            }
            if (dup2(pfd[1], STDOUT_FILENO) == -1) {
                fprintf(stderr, "Error duplicating write end FD to STDOUT\n");
                _exit(EXIT_FAILURE);
            }
            if (close(pfd[1]) == -1) {
                fprintf(stderr, "Error closing unused write descriptor\n");
                _exit(EXIT_FAILURE);
            }
            execlp("ls", "ls", (char *) NULL);
            /* Only reached if exec fails. STDOUT is the pipe now, so
               errors must go to stderr. */
            fprintf(stderr, "Error execing ls\n");
            _exit(EXIT_FAILURE);
        default:
            break;
        }

        /* Second child: runs wc -l with its STDIN connected to the pipe. */
        switch (fork()) {
        case -1:
            fprintf(stderr, "Error creating second child\n");
            exit(EXIT_FAILURE);
        case 0:
            if (close(pfd[1]) == -1) {
                fprintf(stderr, "Error closing write descriptor\n");
                _exit(EXIT_FAILURE);
            }
            if (dup2(pfd[0], STDIN_FILENO) == -1) {
                fprintf(stderr, "Error duplicating read descriptor\n");
                _exit(EXIT_FAILURE);
            }
            if (close(pfd[0]) == -1) {
                fprintf(stderr, "Error closing unneeded read descriptor\n");
                _exit(EXIT_FAILURE);
            }
            execlp("wc", "wc", "-l", (char *) NULL);
            fprintf(stderr, "Error execing wc\n");
            _exit(EXIT_FAILURE);
        default:
            break;
        }

        /* The parent uses neither end; closing both lets wc see EOF. */
        if (close(pfd[0]) == -1) {
            fprintf(stderr, "Error closing descriptor\n");
            exit(EXIT_FAILURE);
        }
        if (close(pfd[1]) == -1) {
            fprintf(stderr, "Error closing descriptor\n");
            exit(EXIT_FAILURE);
        }

        if (wait(NULL) == -1) {
            fprintf(stderr, "Error waiting on child\n");
            exit(EXIT_FAILURE);
        }
        if (wait(NULL) == -1) {
            fprintf(stderr, "Error waiting on child\n");
            exit(EXIT_FAILURE);
        }
        exit(EXIT_SUCCESS);
    }

It turns out we have the UNIX philosophy of "everything is a file" to thank for what is one of the most useful bits of functionality in Linux. Piping commands on the shell is simply a matter of redirecting the STDOUT of one process to the STDIN of another.

First we create the pipe, and once again the kernel sets aside a buffer and fills our array with 2 file descriptors. In our first fork, we close the read end of the pipe. The real magic is calling dup2(), which duplicates the write end of the pipe onto STDOUT_FILENO, so that anything this process writes to standard out is now written to the pipe. We then close the original write descriptor, since the copy on STDOUT_FILENO is all we need. Then we exec the ls program, which inherits our crafty file descriptor setup.

In the second fork, we close the write end of the pipe, call dup2() to duplicate the read end of the pipe onto STDIN_FILENO, and exec wc. Now when this program reads from STDIN it will be reading from the pipe. Finally, the parent closes both of its own descriptors, otherwise wc would never see EOF, and waits for both children.

Hacker challenge – Write a program that takes 2 command line arguments, each the name of a program to run. Connect the STDOUT of the first program to the write end of a pipe, and connect the STDIN of the second program to the read end of the pipe.

Pro Tip – Pipes only work between related processes. Research how to use a FIFO (named pipe), which works in a similar fashion to a pipe but is opened through a name in the filesystem, allowing communication between unrelated processes. If you find you never receive EOF, make sure every process has closed all the file descriptors it does not need, as well as the ones it has finished reading from or writing to.

