The C-Unix Interface
Overview
This summary is largely extracted from "C through UNIX", K. E. Martin, 1992, W.C.Brown Publishers,
and Chapter 10 of Glass.
- We have already seen the compiler cc with its options (including -g and -p),
the debugger dbx and the lint and make packages, as well as a few other
utilities for compiling and analyzing C programs.
- A more interesting question is the use of UNIX commands as _parts_ of C programs, which we
explore here. It is now time to expand upon the various tools that can be used in the construction of a C
program.
- These fall into three categories:
- Built-in functions, called UNIX C functions, that cannot stand alone, but must be part of a C
program.
- Type I: _System_calls_, which are part of UNIX itself.
- There are about 60 of them, and they should be standard on every UNIX or lookalike. They provide the interface
between application programs, the shell, and UNIX utilities on one side and the kernel on the other; the kernel performs
low-level operating system services and communicates with the hardware and peripherals. System calls are
available to C programs automatically, without the necessity of declaration.
- Type II: _Library_functions_
- These are add-ons to the operating system, typically using system calls to perform their tasks. These
may not be standard across UNIX systems, although in fact they often are. Some but not all such
functions must be declared (as extern) in the C program to make them available, or must be
invoked with compiler or command-line options.
- UNIX commands, or sequences of UNIX commands, stored in a file as a shell script, can be called
from a C program, and are available through the function _system_.
- User-written functions (which we have already used).
- Note that the converse is also true. Compiled C programs can be used as commands in a shell script
or mixed with UNIX commands and operators on the command line.
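Calling a UNIX command from C through system can be sketched as follows. This is a minimal illustration, not taken from the source: the helper name run_command is hypothetical, and the sketch assumes a POSIX system where <sys/wait.h> supplies the macros that decode the status returned by system.

```c
#include <stdlib.h>     /* system */
#include <sys/wait.h>   /* WIFEXITED, WEXITSTATUS */

/* Run a shell command line in a subshell and return its exit status,
 * or -1 if the subshell could not be started or ended abnormally.
 * run_command is a hypothetical helper name, not a library function. */
int run_command(const char *command_line)
{
    int status = system(command_line);   /* blocks until the command ends */

    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A caller could pass any command line the shell accepts, for example run_command("ls -l | wc -l"), and test the returned status the same way a shell script tests $?.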
UNIX C functions in the manual
- UNIX type I C functions can be found in Section 2 of the manual (so that the man page for
such a function carries a note like open(2) ).
- UNIX type II C functions are contained in Sections 2, 3C, 3F, 3M, 3S, and 3X. Some are loaded
automatically, the others (except those in 3F and 3X) by #include <stdio.h> or
#include <math.h> .
- The functions in 3X are loaded by other _include_ files; those in 3F are available to the Fortran
compiler for included Fortran function calls (used for example for low-level graphics).
Interfaces with the UNIX system
Handling errors
- errno is a global variable which holds the error code set by the most recent failing system call or
library function (remember that such a function typically returns 0 on successful completion and -1 on
error; the value then left in errno indicates the cause, often a state of the file system or environment).
- To use errno, it should be declared with
extern int errno;
- After this declaration, a statement like the following will print its integer value.
printf("%d\n",errno);
- By adding the following to the header, one can access the symbolic error codes (e.g., EROFS for
attempting to write to a file on a read-only file system), and use them in tests (in loops and conditionals),
although printing will still print only the integer value.
#include <errno.h>
- By also using the following library function, one can have an error message, prefixed by _s_
(typically the name of the function causing the error, or in which the call was made, or the line number).
void perror (char *s);
- The error message is located in an array
extern char *sys_errlist [];
- Note that if perror is used, errno does not need to be declared, and errno.h
does not need to be included.
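The error-handling pieces above can be put together in a short sketch. It is an illustration under stated assumptions: try_open is a hypothetical helper name, and the sketch includes <errno.h> rather than declaring errno by hand, which is what current systems expect.

```c
#include <errno.h>    /* errno and symbolic codes such as ENOENT */
#include <fcntl.h>    /* open, O_RDONLY */
#include <stdio.h>    /* perror */
#include <unistd.h>   /* close */

/* Try to open a file read-only. Returns 0 on success, otherwise the
 * errno value left by open, after printing a message with perror.
 * try_open is a hypothetical helper name used for illustration. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;   /* save first: later calls may overwrite errno */
        perror(path);        /* prints the path, a colon, and the system message */
        return saved;
    }
    close(fd);
    return 0;
}
```

The returned value can be compared against the symbolic codes, for instance if (try_open(name) == ENOENT), instead of against raw integers.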
File manipulation
- access can determine whether a given file/directory exists, or can be read, executed
(searched in case of a directory), or written by the program. The syntax is
int access (char *path, int amode);
- path is the path to the file/directory, and amode is one of 00 (exists), 01 (execute),
02 (write), 04 (read). I don't know for sure, but I suspect that superuser can use them.
- access returns 0 for success, and -1 otherwise.
- access uses the permission bits.
- chmod can be used inside a C program to (attempt to) change the permissions on a file,
more or less as from the command line (using octal integer rather than +/- notation), but written as a function call. This
is a typical example of a UNIX C function duplicating a UNIX command. Thus the following
sets the permissions on the file named myfile to let me read, write, and execute, and everyone else read and
execute.
chmod ("myfile", 00755);
- For the record, 04000 and 02000 set the user id and group id on execution, and 01000 is the "sticky bit".
- getuid , getgid , geteuid , getegid get the real and effective user and group id's. There
are also functions setuid, etc. (see Glass, p 420).
- access can be used to guard chmod or other file manipulations (see next item) to
catch likely errors before they occur.
- There are two different ways to manipulate files inside a C program: using the low-level functions
creat , open , close , read , write , lseek , or functions from <stdio.h> (for which
see the appendix to Deitel & Deitel, or the manual, or try at UNIX command line cat
/usr/include/stdio.h).
- In addition to open/close/read/write/print functions, these functions include: ctermid and
cuserid (which get terminal and user ids); ferror , clearerr , and feof
(which handle I/O errors and EOF); fileno (which returns the file descriptor behind a stream); and
setbuf (which assigns a buffer to a stream). Related are setjmp (which saves a point to jump back to)
and longjmp (which takes the long jump), declared in <setjmp.h> .
- popen opens a pipe from within a program between the program and a UNIX shell
command. It has the syntax
FILE *popen (char *command, char *type);
- FILE is a structure defined in <stdio.h> (remember case is significant),
command is a shell command, and type is either "r" (the program reads the command's output) or "w"
(the program writes to the command's input).
- This might be useful if one wanted to pipe to different commands from different copies (see
fork) of the program.
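Two of the file-manipulation ideas above, guarding chmod with access and reading a command's output through popen, can be sketched as follows. The helper names make_executable and first_line_of are hypothetical, and the feature-test macro on the first line is an assumption about the build: it asks the headers to declare POSIX functions such as popen even under strict compiler modes.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations (popen, pclose) */
#include <stdio.h>      /* FILE, popen, pclose, fgets */
#include <sys/stat.h>   /* chmod */
#include <unistd.h>     /* access, F_OK */

/* Set rwxr-xr-x on a file, but only after access() says the file exists.
 * Returns 0 on success, -1 otherwise. Hypothetical helper name. */
int make_executable(const char *path)
{
    if (access(path, F_OK) != 0)
        return -1;                 /* no such file: skip the chmod */
    return chmod(path, 0755);      /* octal, as on the command line */
}

/* Read the first line of a shell command's output through a pipe opened
 * with popen in "r" mode. Returns 0 on success, -1 otherwise.
 * Hypothetical helper name. */
int first_line_of(const char *command, char *line, int size)
{
    FILE *fp = popen(command, "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fgets(line, size, fp) != NULL);
    pclose(fp);                    /* a popen stream is closed with pclose */
    return ok ? 0 : -1;
}
```

Note that the access check and the chmod call are two separate steps, so the file could still change in between; the return value of chmod should be tested in any case.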
Multiprocessing
- The UNIX C function system causes a subshell to be created in which the command
argument to system is executed (during which ordinarily the program sleeps).
- The function execl (one of the exec functions; see Glass, p 416) takes the path of a
program and a list of arguments; it replaces the calling process with that program, which is roughly
equivalent to a return from main followed by execution of the command.
- Multiprocessing in C is enabled by the fork function, which takes no arguments.
- The process which called fork is the _parent_, and the new process is its _child_.
- fork returns the child's process id to the parent, and 0 to the child (or -1 to the parent if no
child could be created).
- getpid always returns the process id of the current process; getppid returns the
process id of the parent (traditionally 1, the id of init, if the original parent has already terminated).
- Nominally, the parent and the child run the same process, but in most applications subsequent
commands use the return value, or the value obtained by getpid, to differentiate and have the two
processes run different code, or use variants of exec to cause the child to execute another process
instead.
- Synchronization with children can be enforced through the wait function. The parent can
kill the child using the kill function:
int kill (int pid, int sig);
- sig is a signal (see signal in the manual). Recall that a process can also terminate itself
and return a status via exit.
- Note _zombie_processes_ (Glass, p 414).
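The fork, exec, and wait pattern described in this section can be sketched as below. This is a minimal illustration rather than the source's own code: spawn_and_wait is a hypothetical helper name, /bin/sh is assumed to exist, and waitpid is used as the variant of wait that names the child to wait for.

```c
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* fork, execl, _exit */

/* Fork a child that replaces itself with "/bin/sh -c <command_line>",
 * then wait for it. Returns the child's exit status, or -1 on failure.
 * spawn_and_wait is a hypothetical helper name. */
int spawn_and_wait(const char *command_line)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;                       /* fork failed: no child exists */

    if (pid == 0) {                      /* child: fork returned 0 */
        execl("/bin/sh", "sh", "-c", command_line, (char *)NULL);
        _exit(127);                      /* reached only if execl failed */
    }

    /* parent: fork returned the child's process id; waiting for the
     * child also keeps it from lingering as a zombie */
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The two branches on the return value of fork are what let the parent and the child run different code after the call.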
System information
C programs can use functional variants of command-line UNIX commands to determine system
information; these can be used, among other purposes, to initialize random number generators.
- The following are obvious:
char *getenv (char *variable);
char *getlogin ();
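- The following returns the current time as the number of seconds since 00:00:00 GMT, January 1, 1970;
the first line shows the usual call with a null argument, the last the declaration: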
long time ((long *) 0);
/* “long” is an integer type */
/* with more significant bits than “int” */
long time (long *tloc);
- The following decodes the result of time.
char *ctime (long *clock);
- The following return various properties of the file, stored in *buffer.
int stat (char *filename, struct stat *buffer);
int fstat (int file_id, struct stat *buffer);
- The following returns various properties of a directory, stored in *buf.
int getdents (int file_id, struct direct * buf,
int structSize);
- chown
and fchown can be used (mostly by superuser) to change the owner of a file.
- The following can be used to make a special file:
int mknod (char *fileName, int type, int device);
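A few of these system-information calls can be sketched together. The helper names file_size and report_system_info are hypothetical, HOME is only an example of an environment variable, and the sketch uses the time_t type and getenv from current headers where the prototypes above use long.

```c
#include <stdio.h>      /* printf */
#include <stdlib.h>     /* getenv, srand */
#include <sys/stat.h>   /* stat, struct stat */
#include <time.h>       /* time, ctime, time_t */

/* Return the size of a file in bytes as reported by stat, or -1 on error.
 * file_size is a hypothetical helper name. */
long file_size(const char *filename)
{
    struct stat buffer;

    if (stat(filename, &buffer) == -1)
        return -1;
    return (long)buffer.st_size;
}

/* Print the current time and the HOME environment variable, and seed the
 * random number generator from the clock. Returns the value from time().
 * report_system_info is a hypothetical helper name. */
long report_system_info(void)
{
    time_t now = time(NULL);            /* seconds since the epoch */
    const char *home = getenv("HOME");  /* NULL if HOME is not set */

    printf("time: %s", ctime(&now));    /* ctime's text ends in a newline */
    printf("home: %s\n", home != NULL ? home : "(unset)");
    srand((unsigned)now);               /* clock value as a seed */
    return (long)now;
}
```

Seeding from the clock gives a different random sequence on each run, which is the initialization use mentioned at the start of this section.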
Pipes and sockets
- There are two types of pipes: named and unnamed pipes.
- Unnamed pipes
- Unnamed pipes have fixed endpoints.
- See the Deitel/Deitel and Glass books for ways to assign one file's descriptor to another.
- The endpoints are created via the pipe command:
int pipe (int fd [2]);
- The command takes an array of two file descriptors.
- The first is the read end of the pipe, the second is the write end.
- A read of the first "file" is actually a read of the pipe buffer, a write of the second
"file" is actually a write to the buffer.
- The buffer is FIFO (queue), so the read is from the front and the write is to the rear of the buffer.
- Glass, page 441--443, describes the rules for access to a pipe. In particular, "Since access to an
unnamed pipe is via [file descriptors], only the process that creates a pipe and its descendants may use [it].
... Unnamed pipes are usually used for communication between a parent process and its child." Any
process that knows an unnamed pipe’s file descriptor fd can either read or write to the pipe.
- Unnamed pipes should be closed when their accessing processes are done with them; a pipe
cannot be accessed once the creating process and all its descendants have terminated.
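The parent-and-child use of an unnamed pipe can be sketched as follows. It is a minimal illustration: pipe_from_child is a hypothetical helper name, and the sketch assumes a message short enough to arrive in a single read.

```c
#include <string.h>     /* strlen */
#include <sys/types.h>  /* pid_t, ssize_t */
#include <sys/wait.h>   /* waitpid */
#include <unistd.h>     /* pipe, fork, read, write, close, _exit */

/* Send msg from a child process to its parent through an unnamed pipe.
 * The parent stores the text in out (NUL-terminated) and returns the
 * number of bytes read, or -1 on error. Hypothetical helper name. */
ssize_t pipe_from_child(const char *msg, char *out, size_t out_size)
{
    int fd[2];                  /* fd[0] = read end, fd[1] = write end */
    pid_t pid;
    ssize_t n;

    if (out_size == 0 || pipe(fd) == -1)
        return -1;

    pid = fork();               /* the child inherits both descriptors */
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {             /* child: writer */
        close(fd[0]);           /* close the end it does not use */
        if (write(fd[1], msg, strlen(msg)) == -1)
            _exit(1);
        close(fd[1]);
        _exit(0);
    }

    close(fd[1]);               /* parent: reader, so close the write end */
    n = read(fd[0], out, out_size - 1);
    close(fd[0]);
    waitpid(pid, NULL, 0);      /* collect the child */
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

Each process closes the end it does not use; otherwise the reader would never see end-of-file, because a write end would still be open somewhere.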
- Named pipes
- Named pipes are created with mknod, with mode set by chmod (see Glass, page
445).
- Named pipes must be opened for access, and use read and write like files.
- A named pipe is closed with close like any other file; because it persists in the file system, it
should be removed with unlink when it is no longer needed.
- Unlike unnamed pipes, a given process should open a named pipe only for read access, or only for
write access, but not both.
- Details on use of named pipes are given in Glass, pages 446--448.
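A named pipe shared by a parent and a child can be sketched as below. Two things here are assumptions rather than the source's method: the pipe is created with mkfifo, the POSIX convenience function, instead of mknod, and fifo_from_child is a hypothetical helper name. Error handling is kept minimal.

```c
#include <fcntl.h>      /* open, O_RDONLY, O_WRONLY */
#include <string.h>     /* strlen */
#include <sys/stat.h>   /* mkfifo */
#include <sys/types.h>  /* pid_t, ssize_t */
#include <sys/wait.h>   /* waitpid */
#include <unistd.h>     /* fork, read, write, close, unlink, _exit */

/* Create a named pipe at path, let a child write msg into it, and read
 * the text back in the parent. Returns the number of bytes read, or -1
 * on error. The pipe is removed again before returning.
 * fifo_from_child is a hypothetical helper name. */
ssize_t fifo_from_child(const char *path, const char *msg,
                        char *out, size_t out_size)
{
    pid_t pid;
    int fd;
    ssize_t n = -1;

    if (out_size == 0 || mkfifo(path, 0600) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {                       /* child: write-only access */
        int wfd = open(path, O_WRONLY);   /* blocks until a reader opens */
        if (wfd == -1 || write(wfd, msg, strlen(msg)) == -1)
            _exit(1);
        close(wfd);
        _exit(0);
    }

    fd = open(path, O_RDONLY);            /* parent: read-only access */
    if (fd != -1) {
        n = read(fd, out, out_size - 1);
        close(fd);
    }
    waitpid(pid, NULL, 0);                /* collect the child */
    unlink(path);                         /* remove the name from the file system */
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

Unlike the unnamed case, the two processes need not be related: any process that can open the path with suitable permissions could take either role.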
- Sockets
- Sockets are in some ways similar to named pipes, but connect processes which may reside on
different machines.
- A socket has three attributes: a domain, a type, and a protocol.
- domain indicates where the processes at either end reside.
- type indicates the form in which communication occurs; the most typical, and also the
"type" for a pipe, is a stream of bytes/characters.
- protocol specifies the implementation of the connection; there is a standard which applies
by default.
- Sockets are declared in a separate header file, <sys/socket.h> , which also needs the header file
<sys/types.h> . Additional header files may be required, depending on the domain.
- More information, including information on their use for the Internet, is contained in Glass, Chapter
10.
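The three socket attributes can be seen in a small sketch. It uses socketpair, which is not mentioned in the notes above and is chosen here only because it yields two connected sockets on one machine without any addressing; socket_from_child is a hypothetical helper name. The domain is AF_UNIX (both ends on the same machine), the type is SOCK_STREAM (a byte stream), and the protocol is 0 (the default for that domain and type).

```c
#include <string.h>      /* strlen */
#include <sys/socket.h>  /* socketpair, AF_UNIX, SOCK_STREAM */
#include <sys/types.h>   /* pid_t, ssize_t */
#include <sys/wait.h>    /* waitpid */
#include <unistd.h>      /* fork, read, write, close, _exit */

/* Send msg from a child to its parent over a pair of connected sockets.
 * Returns the number of bytes read into out (NUL-terminated), or -1 on
 * error. socket_from_child is a hypothetical helper name. */
ssize_t socket_from_child(const char *msg, char *out, size_t out_size)
{
    int sv[2];                    /* two connected socket descriptors */
    pid_t pid;
    ssize_t n;

    /* domain AF_UNIX, type SOCK_STREAM, protocol 0 (default) */
    if (out_size == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {               /* child: uses sv[1] only */
        close(sv[0]);
        if (write(sv[1], msg, strlen(msg)) == -1)
            _exit(1);
        close(sv[1]);
        _exit(0);
    }

    close(sv[1]);                 /* parent: uses sv[0] only */
    n = read(sv[0], out, out_size - 1);
    close(sv[0]);
    waitpid(pid, NULL, 0);        /* collect the child */
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

Communication between different machines would use another domain and an explicit address, with the additional header files that domain requires; see Glass, Chapter 10.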