PLOG(1)

NAME

plog - Process resource logger.

SYNOPSIS

plog -h | [ -p pid -i interval | -l | -q pid|dir ] [-d runtime directory] [-t timeout]

DESCRIPTION

Plog is the client used to register a process for monitoring by a plogsrvd(1). There may be multiple such servers running; see -d under INVOCATION OPTIONS below.

INVOCATION OPTIONS

Requests for logging require the -p and -i arguments; alternatively, there are the -q, -l, and -h parameters, which do not require -p and -i.

-h

Show usage message and version number then exit.

-p PID

The process id of the process to log.

-i INTERVAL

The interval, in minutes, at which to sample the process while logging.

-l

List all processes currently being logged by the plogsrvd.

-q 'pid' or 'dir'

Query the plogsrvd for its process id or runtime directory. The latter is only useful if you are not sure what the default is, and obviously makes no sense in conjunction with -d.

-d RUNTIME DIRECTORY

This is the directory used by the plogsrvd. If not specified, plog assumes the content of the PLOGDIR environment variable or /var/local/plog.
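
The lookup order described above can be sketched in Python as follows. This is only an illustration, not plog's own code; the function name and the treatment of an empty PLOGDIR as unset are assumptions.

```python
import os

# Default runtime directory named in this manual page.
DEFAULT_PLOG_DIR = "/var/local/plog"


def resolve_runtime_dir(cli_dir=None, environ=None):
    """Pick the runtime directory: -d argument, then $PLOGDIR, then the default.

    Treating an empty PLOGDIR as unset is an assumption of this sketch.
    """
    env = os.environ if environ is None else environ
    if cli_dir:
        return cli_dir
    return env.get("PLOGDIR") or DEFAULT_PLOG_DIR
```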

-t TIMEOUT

This is the number of seconds before plog times out on a request. Requests are normally answered almost instantly, and the default is 3, but you can use anything from 1 to 300.
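
When plog is driven from a script, a logging request might be assembled as in the sketch below. The helper name is hypothetical; only the -p, -i, -d, and -t switches and the 1 to 300 second range come from this page.

```python
def build_plog_argv(pid, interval_minutes, runtime_dir=None, timeout=None):
    """Build an argument list for a logging request (plog -p PID -i INTERVAL)."""
    argv = ["plog", "-p", str(pid), "-i", str(interval_minutes)]
    if runtime_dir is not None:
        argv += ["-d", runtime_dir]
    if timeout is not None:
        # The documented range for -t is 1 to 300 seconds.
        if not 1 <= timeout <= 300:
            raise ValueError("timeout must be between 1 and 300 seconds")
        argv += ["-t", str(timeout)]
    return argv
```

The resulting list can be handed to subprocess.run(); the exit codes it may return are listed under EXIT STATUS and ERRORS.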

ABOUT LOG FILES

The logs initiated by a plog request are kept in the relevant plogsrvd's runtime directory. Plog receives a response informing you of the full path to this file; the name is the basename of the process plus its pid (e.g., 'bash.123'). If a log file with this name already exists, .2, .3, etc. are appended.
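
The naming rule can be illustrated with a short sketch. It is a guess at the scheme from the description above, not plogsrvd's actual code, and the function name is hypothetical.

```python
import os


def next_log_name(existing, command, pid):
    """Return the log file name that would follow the scheme described above.

    `existing` is a collection of file names already in the runtime directory.
    """
    base = f"{os.path.basename(command)}.{pid}"
    if base not in existing:
        return base
    # Name already taken: try .2, .3, ... until a free one is found.
    n = 2
    while f"{base}.{n}" in existing:
        n += 1
    return f"{base}.{n}"
```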

The first line of the log file is the command line that was used to invoke the process. The next line is the column headers. Here are their meanings (see also proc(5), and VIRTUAL ADDRESS SPACE VS. PHYSICAL MEMORY, below), together with a format appropriate for parsing:

Date Time

This is the time of the sample, in the form DD-MM [H]H:MM.
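
Because the timestamp carries no year, a parser has to be told which year to assume. A minimal sketch, with a hypothetical function name and the year passed in by the caller:

```python
import re
from datetime import datetime

# DD-MM [H]H:MM, as described above.
_STAMP = re.compile(r"^(\d{2})-(\d{2}) (\d{1,2}):(\d{2})$")


def parse_sample_time(text, year):
    """Parse a 'DD-MM [H]H:MM' sample time; the year must be supplied."""
    match = _STAMP.match(text.strip())
    if match is None:
        raise ValueError(f"not a DD-MM [H]H:MM timestamp: {text!r}")
    day, month, hour, minute = (int(g) for g in match.groups())
    return datetime(year, month, day, hour, minute)
```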

CPU_time

This is the total amount of processor time actively used by the process since it started, rounded down to the second. If it is less than 60, it will simply be suffixed with 's'. If it is one minute or more but less than 1 hour it will be in [M]M:SS format, and longer than that, H:MM:SS.
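
The three CPU_time forms can be converted to whole seconds along these lines. This is a sketch based only on the formats described above; the function name is hypothetical.

```python
def parse_cpu_time(field):
    """Convert a CPU_time field ('42s', 'M:SS', or 'H:MM:SS') to seconds."""
    field = field.strip()
    if field.endswith("s"):
        # Under one minute: plain seconds with an 's' suffix.
        return int(field[:-1])
    parts = [int(p) for p in field.split(":")]
    if len(parts) == 2:
        minutes, seconds = parts
        return minutes * 60 + seconds
    if len(parts) == 3:
        hours, minutes, seconds = parts
        return hours * 3600 + minutes * 60 + seconds
    raise ValueError(f"unrecognized CPU_time field: {field!r}")
```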

#th

This is the number of threads the process currently has. It is a simple integer >= 1.

Virtual

This is the virtual address space in use by the process. It is not necessarily an accurate measure of how much real memory it consumes (see VIRTUAL ADDRESS SPACE VS. PHYSICAL MEMORY, below). This and the metrics for Resident, PSS, Dat+Stck, and Prv&Writ are integers with a 'k' suffix if the amount is < 4096 kiB. Otherwise it is a fixed precision decimal number (two places) suffixed with 'M' for MiB.
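
A parser for these memory fields might normalize both suffixes to kiB, as in this sketch (the function name is hypothetical):

```python
def parse_mem_kib(field):
    """Convert a memory field ('123k' or '12.34M') to kiB as a float."""
    field = field.strip()
    if field.endswith("k"):
        # Integer number of kiB.
        return float(int(field[:-1]))
    if field.endswith("M"):
        # Two-place decimal number of MiB; 1 MiB = 1024 kiB.
        return float(field[:-1]) * 1024.0
    raise ValueError(f"unrecognized memory field: {field!r}")
```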

Resident

Resident Set Size (RSS). This is the amount of real physical memory used by the process. However, it includes space shared by other processes. The format is the same as for Virtual.

Share

This is the percentage of the process's Resident Set Size shared by other processes. If there is a value for PSS (the next column), or PSS is 'n/a', the value is computed by parsing /proc/[pid]/smaps. If PSS is blank, then the value was taken from /proc/[pid]/statm and should be identical to that reported by top(1). The smaps figure will often be very similar, but should be considered more accurate and dynamic. It is expressed as an integer with '%' at the end.
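
For the statm case, the figure can be approximated from the second (resident) and third (shared) fields of /proc/[pid]/statm, both counted in pages. The sketch below assumes the percentage is truncated to an integer; how plog itself rounds is not stated here.

```python
def share_percent_from_statm(statm_text):
    """Return shared pages as an integer percentage of resident pages.

    statm fields are: size resident shared text lib data dt (see proc(5)).
    Truncating rather than rounding is an assumption of this sketch.
    """
    fields = statm_text.split()
    resident = int(fields[1])
    shared = int(fields[2])
    if resident == 0:
        return 0
    return shared * 100 // resident
```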

PSS

This is the Proportional Set Size (PSS). It is a newer metric and may not be reported by all kernels, so the field may be blank (indicating there is no readable /proc/[pid]/smaps) or 'n/a' (indicating there is no PSS data in smaps). If present, it is all the unshared RSS, plus the shared regions, but the value for each shared region is divided by the number of processes sharing it (including this one). Hence, it is a good measure of the real memory load incurred. The format is the same as for Virtual.
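
A total like this can be obtained by summing the per-mapping 'Pss:' lines in smaps. The following is a minimal sketch of that idea, not plog's implementation; the function name is hypothetical, and returning None mirrors the 'n/a' case where smaps carries no PSS data.

```python
def sum_smaps_field(smaps_text, field):
    """Sum one per-mapping field (e.g. 'Pss' or 'Rss') from smaps text, in kB.

    Returns None if the field never appears.
    """
    prefix = field + ":"
    total = 0
    found = False
    for line in smaps_text.splitlines():
        # Lines look like 'Pss:                  11 kB'.
        if line.startswith(prefix):
            total += int(line.split()[1])
            found = True
    return total if found else None
```

Reading the text itself is a matter of opening /proc/[pid]/smaps, which may fail with a permission error for processes owned by another user.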

Dat+Stck

This is the virtual address space accounted for by the 'data' section of a process's executable (some of which may be shared), plus the stack and heap space. The format is the same as for Virtual.

Prv&Writ

This is all the virtual address space marked as private and writable; pmap(1) will also report this when invoked with the -d switch. This will often be very close to the Dat+Stck figure. The format is the same as for Virtual.

Minor

Minor page faults (a page is a region of memory) are normal events; they happen when a virtually mapped region is actually used for the first time, meaning the Linux kernel must do some juggling to provide the virtual region with real, physical memory (see VIRTUAL ADDRESS SPACE VS. PHYSICAL MEMORY, below). Such minor faults are a processor expense, however, and excessive minor faults in a short period of time can affect a process's performance. This is normal during startup. Beyond that, a possible cause would be allocating a large area and then randomly accessing parts of it. Unfortunately, a complete discussion is beyond the scope of this document, but you will find more information online if you search for things like 'linux minor page faults' and 'GNU mallopt'.

This number includes faults incurred by the process's waited-on children (see proc(5)). It is a simple integer >= 0.

Major

These are the major page faults (also reported by top(1)), plus the number of major faults incurred by the process's waited on children. Major page faults are more expensive, time wise, than minor ones, because they involve loading of data from storage into memory (for example, the first time some code or data from an executable object is accessed, the relevant parts of the file must be loaded from disk into memory; this doesn't mean there will always be at least one major fault, since such data may already be cached by the kernel). It is a simple integer >= 0.

VIRTUAL ADDRESS SPACE VS. PHYSICAL MEMORY

It is important to understand the difference between virtual address space and physical memory in interpreting some of the above statistics. As the name implies, virtual address space is not real; it's basically a map of all the memory currently allocated to a process. The limit on the size of this map is the same for each process (generally, 2-4 GB), and it is not accumulated (i.e., you may have dozens or hundreds of processes, each with its own 2-4 GB virtual address space, on a system that only actually has 512 MB of physical memory).

Data cannot actually be stored in or retrieved from virtual address space; real data requires real, physical memory. It is the kernel's job to manage one in relation to the other. Virtual space stats (Virtual, Dat+Stck, and Prv&Writ) are useful for considering the structure of a process and the relationship to physical memory use, but with regard to the amount of RAM actually used, the physical memory stats (Resident, Share, and PSS) are what count.

TERMINATING LOGGING

There are three ways to stop logging:

1) Terminate the process being logged.
2) Terminate the plogsrvd doing the logging.
3) Remove the log file from the plogsrvd's runtime directory.

EXIT STATUS

If plog prints a usage message, either because you ask for one or because you gave it incorrect parameters, it returns 11. When otherwise successful, it returns 0. Other codes are shown in parentheses with the corresponding ERROR below; for a definitive list see client.c in the source package.

ERRORS

All errors are printed to the standard error stream. The following fatal errors are given with the corresponding exit status in parentheses:

$PLOGDIR is too long (1)

Unix local sockets must have a full path less than 108 bytes long. This includes the filename. Plogsrvd places this socket file in its runtime directory, so that path is limited to 96 bytes. You can use a symlink, however.
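
A script can check a candidate directory against this limit before invoking plog. In the sketch below the function name is hypothetical, and treating exactly 96 bytes as acceptable is an assumption.

```python
import os

# Limit on the runtime directory path stated above.
MAX_RUNTIME_DIR_BYTES = 96


def runtime_dir_fits(path):
    """Return True if the encoded path is within the 96-byte limit.

    Whether plog accepts exactly 96 bytes is an assumption of this sketch.
    """
    return len(os.fsencode(path)) <= MAX_RUNTIME_DIR_BYTES
```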

Create socket failed... (2)

Plog could not create a local unix socket; there should be a further explanation.

Cannot bind to [socket_path]... (3)

The unix local socket could not be bound to the specified file; there should be a further explanation.

No plog server found (4)

There is no plogsrvd active for the runtime directory (see -d under INVOCATION OPTIONS, above).

[socket file] is a stale socket with no plogsrvd running (5)

The same as above, except there is a leftover local socket file in the plogsrvd runtime directory (see plogsrvd(1)).

Permission denied on [socket file] (6)

See COMMUNICATION MODE in plogsrvd(1).

Poll(revents=[N]) error... (7)

Unlikely, but if it happens, this too should come with a further explanation. Try again. If you notice this happening frequently, report it via the plog website (http://cognitivedissonance.ca/cogware/plog).

Request timed out (8)

See -t under INVOCATION OPTIONS, above.

Out of memory! (9)

This is almost certainly not plog's fault, but it could still happen.

Received SIGPIPE (10)

This is only fatal after three such events, which should be very unusual.

[N]+ /tmp/.plog sockets, exiting (12)

See COMMUNICATION MODE in plogsrvd(1).

Recvfrom() error (13)

Akin to error #7 above.

Sendto() returned [N]... (100+)

Also akin to #7. The status code is the actual errno from the sendto() call plus 100.
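
A wrapper script might turn plog's exit status back into a message along these lines. The table simply restates the codes listed on this page; the function name is hypothetical, and client.c remains the definitive list.

```python
import os

# Codes as listed under EXIT STATUS and ERRORS on this page.
EXIT_MEANINGS = {
    0: "success",
    1: "$PLOGDIR is too long",
    2: "Create socket failed",
    3: "Cannot bind to socket path",
    4: "No plog server found",
    5: "Stale socket with no plogsrvd running",
    6: "Permission denied on socket file",
    7: "Poll error",
    8: "Request timed out",
    9: "Out of memory",
    10: "Received SIGPIPE",
    11: "Usage message printed",
    12: "Too many /tmp/.plog sockets",
    13: "Recvfrom() error",
}


def describe_exit_status(status):
    """Return a short description of a plog exit status."""
    if status >= 100:
        # Codes of 100 or more carry an errno value offset by 100.
        err = status - 100
        return f"Sendto() failed: errno {err} ({os.strerror(err)})"
    return EXIT_MEANINGS.get(status, "unknown status")
```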

SEE ALSO

plogsrvd(1)
proc(5)
top(1)

COPYRIGHT

Copyright (C) 2011, 2017 M. Eriksen. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation (http://www.gnu.org/licenses/fdl.html).