getrlimit and setrlimit Functions
Every process has a set of resource limits, some of which can be queried and changed by the getrlimit and setrlimit functions.
#include <sys/resource.h>
int getrlimit(int resource, struct rlimit *rlptr);
int setrlimit(int resource, const struct rlimit *rlptr);
These two functions are defined in the XSI option in the Single UNIX Specification.
Each call to these two functions specifies a single resource and a pointer to the
following structure:
struct rlimit {
    rlim_t rlim_cur;    /* soft limit: current limit */
    rlim_t rlim_max;    /* hard limit: maximum value for rlim_cur */
};
Three rules govern the changing of the resource limits.
1. A process can change its soft limit to a value less than or equal to its hard limit.
2. A process can lower its hard limit to a value greater than or equal to its soft limit. This lowering of the hard limit is irreversible for normal users.
3. Only a superuser process can raise a hard limit.
An infinite limit is specified by the constant RLIM_INFINITY.
#include "apue.h"
#include <sys/resource.h>

/* Stringify the resource name and pass both the name and its value. */
#define doit(name) pr_limits(#name, name)

static void pr_limits(char *, int);

int main(void)
{
#ifdef RLIMIT_AS
    doit(RLIMIT_AS);
#endif
    doit(RLIMIT_CORE);
    doit(RLIMIT_CPU);
    doit(RLIMIT_DATA);
    doit(RLIMIT_FSIZE);
#ifdef RLIMIT_MEMLOCK
    doit(RLIMIT_MEMLOCK);
#endif
#ifdef RLIMIT_MSGQUEUE
    doit(RLIMIT_MSGQUEUE);
#endif
#ifdef RLIMIT_NICE
    doit(RLIMIT_NICE);
#endif
    doit(RLIMIT_NOFILE);
#ifdef RLIMIT_NPROC
    doit(RLIMIT_NPROC);
#endif
#ifdef RLIMIT_NPTS
    doit(RLIMIT_NPTS);
#endif
#ifdef RLIMIT_RSS
    doit(RLIMIT_RSS);
#endif
#ifdef RLIMIT_SBSIZE
    doit(RLIMIT_SBSIZE);
#endif
#ifdef RLIMIT_SIGPENDING
    doit(RLIMIT_SIGPENDING);
#endif
    doit(RLIMIT_STACK);
#ifdef RLIMIT_SWAP
    doit(RLIMIT_SWAP);
#endif
#ifdef RLIMIT_VMEM
    doit(RLIMIT_VMEM);
#endif
    return 0;
}

/* Print the soft and hard limits of one resource on a single line. */
static void pr_limits(char *name, int resource)
{
    struct rlimit limit;
    unsigned long long lim;

    if (getrlimit(resource, &limit) < 0)
        err_sys("getrlimit error for %s", name);
    printf("%-14s  ", name);
    if (limit.rlim_cur == RLIM_INFINITY) {
        printf("(infinite)  ");
    } else {
        lim = limit.rlim_cur;       /* widen: rlim_t varies in size */
        printf("%10llu  ", lim);
    }
    if (limit.rlim_max == RLIM_INFINITY) {
        printf("(infinite)");
    } else {
        lim = limit.rlim_max;
        printf("%10llu", lim);
    }
    putchar('\n');
}
Print the current resource limits
Note that we’ve used the ISO C string-creation operator (#) in the doit macro, to generate the string value for each resource name. When we say
doit(RLIMIT_CORE);
the C preprocessor expands this into
pr_limits("RLIMIT_CORE",RLIMIT_CORE);
Process Control
Introduction
We now turn to the process control provided by the UNIX System. This includes the creation of new processes, program execution, and process termination. We also look at the various IDs that are the property of the process (real, effective, and saved; user and group IDs) and how they’re affected by the process control primitives. Interpreter files and the system function are also covered. We conclude the chapter by looking at the process accounting provided by most UNIX systems.
Process Identifiers
Every process has a unique process ID, a non-negative integer. Because the process ID is the only well-known identifier of a process that is always unique, it is often used as a piece of other identifiers, to guarantee uniqueness.
Although unique, process IDs are reused. As processes terminate, their IDs become candidates for reuse. Most UNIX systems implement algorithms to delay reuse, however, so that newly created processes are assigned IDs different from those used by processes that terminated recently. This prevents a new process from being mistaken for the previous process to have used the same ID.
There are some special processes, but the details differ from implementation to implementation.
Process ID 0 is usually the scheduler process and is often known as the swapper. No program on disk corresponds to this process, which is part of the kernel and is known as a system process.
Process ID 1 is usually the init process and is invoked by the kernel at the end of the bootstrap procedure. The program file for this process was /etc/init in older versions of the UNIX System and is /sbin/init in newer versions. This process is responsible for bringing up a UNIX system after the kernel has been bootstrapped. init usually reads the system-dependent initialization files (the /etc/rc* files or /etc/inittab and the files in /etc/init.d) and brings the system to a certain state, such as multiuser.
The init process never dies. It is a normal user process, not a system process within the kernel like the swapper, although it does run with superuser privileges.
In Mac OS X 10.4, the init process was replaced with the launchd process, which performs the same set of tasks as init, but has expanded functionality.
Each UNIX System implementation has its own set of kernel processes that provide operating system services. On some virtual memory implementations of the UNIX System, process ID 2 is the pagedaemon. This process is responsible for supporting the paging of the virtual memory system.
#include <unistd.h>
pid_t getpid(void);
pid_t getppid(void);
uid_t getuid(void);
uid_t geteuid(void);
gid_t getgid(void);
gid_t getegid(void);
fork Function
#include <unistd.h>
pid_t fork(void);
The new process created by fork is called the child process. This function is called once but returns twice. The only difference in the returns is that the return value in the child is 0, whereas the return value in the parent is the process ID of the new child. The reason the child’s process ID is returned to the parent is that a process can have more than one child, and there is no function that allows a process to obtain the process IDs of its children. The reason fork returns 0 to the child is that a process can have only a single parent, and the child can always call getppid to obtain the process ID of its parent. (Process ID 0 is reserved for use by the kernel, so it’s not possible for 0 to be the process ID of a child.)
Both the child and the parent continue executing with the instruction that follows
the call to fork. The child is a copy of the parent.
Modern implementations don’t perform a complete copy of the parent’s data, stack, and heap, since a fork is often followed by an exec. Instead, a technique called copy-on-write (COW) is used. These regions are shared by the parent and the child and have their protection changed by the kernel to read-only. If either process tries to modify these regions, the kernel then makes a copy of that piece of memory only, typically a "page" in a virtual memory system.
Variations of the fork function are provided by some platforms. All four platforms discussed in this book support the vfork(2) variant discussed in the next section.
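#include "apue.h"

int globvar = 6;                    /* external variable in initialized data */
char buf[] = "a write to stdout\n";

int main(void)
{
    int var;                        /* automatic variable on the stack */
    pid_t pid;

    var = 88;
    if (write(STDOUT_FILENO, buf, sizeof(buf) - 1) != sizeof(buf) - 1)
        err_sys("write error");
    printf("before fork\n");        /* we don't flush stdout */

    if ((pid = fork()) < 0) {       /* assignment, not comparison */
        err_sys("fork error");
    } else if (pid == 0) {          /* child */
        globvar++;                  /* modify the child's copies only */
        var++;
    } else {                        /* parent */
        sleep(2);
    }

    printf("pid=%ld, glob=%d, var=%d\n", (long)getpid(), globvar, var);
    return 0;
}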
Example of fork function
In general, we never know whether the child starts executing before the parent, or vice versa. The order depends on the scheduling algorithm used by the kernel. If it’s required that the child and parent synchronize their actions, some form of interprocess communication is required.
The write function is not buffered. Because write is called before the fork, its data is written once to standard output. The standard I/O library, however, is buffered. Standard output is line buffered if it’s connected to a terminal device; otherwise, it’s fully buffered.
File Sharing
If the parent’s standard output is redirected, the child’s standard output is redirected as well. Indeed, one characteristic of fork is that all file descriptors that are open in the parent are duplicated in the child. We say "duplicated" because it’s as if the dup function had been called for each descriptor. The parent and the child share a file table entry for every open descriptor.
There are two normal cases for handling the descriptors after a fork.
1. The parent waits for the child to complete. In this case, the parent does not need to do anything with its descriptors. When the child terminates, any of the shared descriptors that the child read from or wrote to will have their file offsets updated accordingly.
2. Both the parent and the child go their own ways. Here, after the fork, the parent closes the descriptors that it doesn’t need, and the child does the same thing. This way, neither interferes with the other’s open descriptors. This scenario is often found with network servers.
Besides the open files, numerous other properties of the parent are inherited by the child:
- Real user ID, real group ID, effective user ID, and effective group ID
- Supplementary group IDs
- Process group ID
- Session ID
- Controlling terminal
- The set-user-ID and set-group-ID flags
- Current working directory
- Root directory
- File mode creation mask
- Signal mask and dispositions
- The close-on-exec flag for any open file descriptors
- Environment
- Attached shared memory segments
- Memory mappings
- Resource limits
The differences between the parent and child are
- The return values from fork are different.
- The process IDs are different.
- The two processes have different parent process IDs: the parent process ID of the child is the parent; the parent process ID of the parent doesn’t change.
- The child’s tms_utime, tms_stime, tms_cutime, and tms_cstime values are set to 0 (these times are discussed in Section 8.17).
- File locks set by the parent are not inherited by the child.
- Pending alarms are cleared for the child.
- The set of pending signals for the child is set to the empty set.
There are two uses for fork:
1. When a process wants to duplicate itself so that the parent and the child can each execute different sections of code at the same time. This is common for network servers: the parent waits for a service request from a client. When the request arrives, the parent calls fork and lets the child handle the request. The parent goes back to waiting for the next service request to arrive.
2. When a process wants to execute a different program. This is common for shells. In this case, the child does an exec (which we describe in Section 8.10) right after it returns from the fork.
The Single UNIX Specification does include spawn interfaces in the advanced real-time option group. These interfaces are not intended to be replacements for fork and exec, however. They are intended to support systems that have difficulty implementing fork efficiently, especially systems without hardware support for memory management.
vfork Function
The function vfork has the same calling sequence and same return values as fork, but the semantics of the two functions differ.
The vfork function originated with 2.9BSD. Some consider the function a blemish, but all the platforms covered in this book support it. In fact, the BSD developers removed it from the 4.4BSD release, but all the open source BSD distributions that derive from 4.4BSD added support for it back into their own releases. The vfork function was marked as an obsolescent interface in Version 3 of the Single UNIX Specification and was removed entirely in Version 4. We include it here for historical reasons only. Portable applications should not use it.
The vfork function was intended to create a new process for the purpose of executing a new program (step 2 at the end of the previous section), similar to the method used by the bare-bones shell from Figure 1.7. The vfork function creates the new process, just like fork, without copying the address space of the parent into the child, as the child won’t reference that address space; the child simply calls exec (or exit) right after the vfork. Instead, the child runs in the address space of the parent until it calls either exec or exit. This optimization is more efficient on some implementations of the UNIX System, but leads to undefined results if the child modifies any data (except the variable used to hold the return value from vfork), makes function calls, or returns without calling exec or exit.
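#include "apue.h"

int globvar = 6;                    /* external variable in initialized data */

int main(void)
{
    int var;                        /* automatic variable on the stack */
    pid_t pid;

    var = 88;
    printf("before vfork\n");       /* we don't flush stdio */
    if ((pid = vfork()) < 0) {      /* assignment, not comparison */
        err_sys("vfork error");
    } else if (pid == 0) {          /* child */
        globvar++;                  /* modify the parent's variables */
        var++;
        _exit(0);                   /* child terminates; it must not return */
    }

    /* parent continues here */
    printf("pid=%ld, glob=%d, var=%d\n", (long)getpid(), globvar, var);
    return 0;
}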
Example of vfork function
_exit does not perform any flushing of standard I/O buffers. If we call exit instead, the results are indeterminate. Depending on the implementation of the standard I/O library, we might see no difference in the output, or we might find that the output from the first printf in the parent has disappeared.
If the child calls exit, the implementation flushes the standard I/O streams. If this is the only action taken by the library, then we will see no difference from the output generated if the child called _exit. If the implementation also closes the standard I/O streams, the memory representing the FILE object for the standard output will be cleared out. Because the child is borrowing the parent’s address space, when the parent resumes and calls printf, no output will appear and printf will return -1. Note that the parent’s STDOUT_FILENO is still valid, as the child gets a copy of the parent’s file descriptor array.
Most modern implementations of exit do not bother to close the streams. Because the process is about to exit, the kernel will close all the file descriptors open in the process. Closing them in the library simply adds overhead without any benefit.