Searching for strings
The search for a given string in a file can be done with the grep program (a case-insensitive search can be enabled with the -i option). Let's search for test in the file myfile:
grep -i "test" myfile
Instead of a single filename we can also use wildcards. If we want to perform recursive searches we use the -r flag.
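For example, a recursive, case-insensitive search through the current directory for the same term might look like this (the pattern and the starting directory are placeholders):
grep -ri "test" .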
Another possibility is to output lines after the match. This can be done with the -A option. By default this option is set to 0, i.e. no lines after the matching one are printed. The following command will print the three lines following each match:
grep -A 3 "test" myfile
While -A means after, -B means before and -C means context around the match. These two options are used in the same way as the -A option.
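For instance, to print the three lines before each match or three lines of surrounding context in myfile:
grep -B 3 "test" myfile
grep -C 3 "test" myfile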
In order to print only the name of the file in which the string has been matched, we have to specify the -l parameter.
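A small example, assuming we search all .txt files in the current directory and only want the names of the files that contain the pattern:
grep -l "test" *.txt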
If we are not interested in finding strings in files, but rather strings in filenames, then we should use the find program. An example would be:
find -iname "myfilename"
Here the -iname option specifies a case-insensitive search for the given filename.
Changing contents on the fly
The stream editor program (sed) is a powerful tool that lets us change file contents on the fly. Let's have a look at a simple example:
sed 's/.$//' filename
In this example we see two features. First of all, the syntax builds upon regular expressions: the first argument contains the matching expression and the replacement, separated by slashes (/), with the command before the first and the options after the third, i.e. last, slash. The second argument is the filename of the input stream. If we do not specify an output stream (or pipe the result), the output is redirected to the standard output (the shell). The example above removes the last character of every line.
A more complicated example is the following:
sed '/./=' thegeekstuff.txt | sed 'N; s/\n/ /'
Here we add line numbers to all non-empty lines. Another possibility to change file contents is the awk program. It allows us, for instance, to remove all duplicate lines from a file:
awk '!($0 in array) { array[$0]; print }' myfile
AWK is also a complete programming language, so it is possible to do very complex things with very few words.
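As a small illustration (the file numbers.txt and the use of the first column are just assumptions), the following one-liner sums up the values in the first column and prints the total at the end:
awk '{ sum += $1 } END { print sum }' numbers.txt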
Extract files
Creating and extracting tarballs is a common task, and for everyday use we only need to know three basic commands. Create a new tar archive (here named myarchive.tar) with the contents of the relative (local) directory dirname:
tar cvf myarchive.tar dirname/
Extract from an existing tar archive (here named myarchive.tar) to the current directory:
tar xvf myarchive.tar
And of course sometimes we want to have a look at the contents of a tarball first. In such cases we list the contents of an existing tar archive:
tar tvf myarchive.tar
A generic stopwatch
If we want a simple and straightforward way to measure the performance of any program, we can use the time command. However, we should note that bash provides its own built-in version of this command, which behaves differently. To get the external program we should therefore call it the following way:
/usr/bin/time
There is a list of possible arguments, but the simplest case is to pass just the target program, i.e. the application that should be measured, as an argument.
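In its simplest form this might look as follows (the measured command is just an example); the output reports the elapsed, user and system time:
/usr/bin/time sleep 2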
DNS lookup
The nslookup
tool can be used to make all sorts of DNS lookups. The first and most important command snippet
is the following:
nslookup redhat.com
This little snippet gives us the IPs, names and addresses of redhat.com. However, sometimes we want to run a specific type of query against the DNS system. One example would be:
nslookup -query=mx redhat.com
Here we additionally specified the -query parameter with the value mx (Mail eXchange). The answer might look like the following:
Server: 192.168.19.2
Address: 192.168.19.2#53
Non-authoritative answer:
redhat.com mail exchanger = 10 mx2.redhat.com.
redhat.com mail exchanger = 5 mx1.redhat.com.
Authoritative answers can be found from:
mx2.redhat.com internet address = 66.187.233.33
mx1.redhat.com internet address = 209.132.183.28
Here we see the mail exchange servers as set in the DNS system of redhat.com, together with their preferences (5 and 10); lower numbers are preferred.
Additionally we can write a lot of other queries, for example:
- soa, start of authority, which provides the authoritative information about the domain
- ns, name server, maps a domain name to a list of DNS servers
- any, to view all the available DNS records
We can also do a reverse lookup by entering an IP instead of a name. Other popular features include the specification of a port and changing the timeout interval to wait for a reply. Examples of such commands are:
nslookup -port=56 redhat.com
nslookup -timeout=10 redhat.com
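For completeness, a reverse lookup (here reusing one of the IPs from the answer above) and one of the query types from the list might look like this:
nslookup 209.132.183.28
nslookup -query=soa redhat.com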
Disk usage of files and folders
We can use the du command to retrieve information about file and folder sizes. An important parameter is -a, which shows the disk usage of all files and directories below the current location. Without it we only get information about the directories.
In order to interpret the sizes more easily we can add unit information (K for kilobytes, M for megabytes and so on). By using the -h parameter we enable human-readable output, i.e. output that includes units.
Sometimes we are only interested in the total sum. To display only the total we use the -s parameter.
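For example, to print only the human-readable total of a directory (dirname is a placeholder):
du -sh dirname/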
Since the final count (and every other value) is determined by the number of allocated blocks, we could be interested in how many blocks would be used with a different block size. Changing the block size is possible: all we need to do is use du --block-size=2048, where 2048 is the size of one block in bytes. If we combine some of the options already discussed we might end up with the following command:
du -ahc --block-size=2048
Here we display all entries in a human-readable form and also print the grand total in the output using -c.
Additionally we can tell the program to display everything in bytes (instead of blocks) and with their modification time. We can
also customize the display style or exclude certain files using a certain mask. One example would be:
du -cbha --exclude="*.txt"
Disk usage of file systems
The df command offers similar options to the du command. By default the program already gives us some valuable information on the file systems, their mount points, their disk usage, and various other things. By using the -a option we can display information about all file systems. Again we can specify the block size, here by using -B:
df -B 100
Similarly the option -h is used to tell the program that all sizes should be displayed with units, i.e. making the output human-readable. The grand total can be retrieved by using the --total parameter:
df -h --total
So far we used df to print sizes in terms of blocks. If information in terms of inodes is desired, the option -i should be used. In computing, an inode (index node) is a data structure that stores all the information about a file system object (file, directory, device node, socket, pipe, etc.), except its data content and file name.
Additionally we might want to get information about the type of file system. This is possible by using the option -T
. An
example that shows the number of inodes and the type of file system is the following:
df -Ti
As with du we can also exclude certain items from the list. Here our exclusion (or inclusion) rule is mainly focused on the type of file system. We can create a white list (only include file systems of the following type) by using the -t parameter, or a black list (exclude file systems of the following type) by using the -x option.
# only show file systems with type ext2
df -t ext2
# exclude all file systems with type ext2
df -x ext2
Information on symbols
To gather information on the symbols that are used in an object file or an executable, we can use the nm
command. By
default we are already getting a lot of interesting information from this program. We get:
- The virtual address of the symbol
- A character which depicts the symbol type; if the character is lower case the symbol is local, if it is upper case the symbol is global (external)
- Of course the name of the symbol
There are various characters that identify symbol types. A short list includes the following:
- A Global absolute symbol
- a Local absolute symbol
- B Global bss symbol
- b Local bss symbol
- D Global data symbol
- d Local data symbol
- f Source file name symbol
- L Global thread-local symbol (TLS)
- l Static thread-local symbol (TLS)
- T Global text symbol
- t Local text symbol
- U Undefined symbol
Let's have a short look at the default (very trivial) syntax of this little helper:
nm myobject.o
nm someexecutable
The default argument (if we do not specify any object or executable) is a.out. Combined with wildcards and piped to grep we can search a set of objects or executables for a set of names. Let's have a look at one example:
nm -A ./*.o | grep func
Here the -A option prefixes every line with the file name, so we search through all objects in the current directory and can still see where each symbol comes from. Additionally we only print out those results in which the name func is found.
Sometimes we have a lot of results and therefore need a way to sort them. We can use the flag -n, so that the output is sorted with the undefined symbols first and then according to the addresses. Sorting can help in the process of debugging a problem. Another way of sorting is by using --size-sort, which sorts the results by their size.
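For example, applied to the object file used above:
nm -n myobject.o
nm --size-sort myobject.o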
Information on addresses is usually not enough - what we additionally care about are sizes. By using the -S option we additionally get the size of each defined symbol. Consider the following example, which searches for dmw in all objects of the current directory:
nm -S ./*.o | grep dmw
If we want to get information only about the external symbols of an object or executable we can use the -g flag.
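Again using the object file from above as an example:
nm -g myobject.o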
Downloading files
By using the program cURL we can transfer data using the URL syntax. cURL supports various protocols like DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet and TFTP. Downloading a single file is as easy as:
curl http://www.google.de
Here the output is written to the command line. If we want to save the content of the file we just have to redirect it to a specific file:
curl http://www.google.de > index.html
However, curl also provides more direct ways to do this by using flags like -o or -O. While the first one expects a filename to be chosen by the user (via the command line arguments), the second one chooses a filename automatically. The choice usually depends on the filename specified in the URL.
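Reusing the URLs from this section, the two variants might look like this:
curl -o index.html http://www.google.de
curl -O http://www.gnu.org/software/gettext/manual/gettext.html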
cURL also allows us to download multiple files. All we have to do is specify the files separately, like:
curl -O URL1 -O URL2 -O URL3 -o FILENAME4 URL4 ...
The program understands the protocols it supports quite well, so it also knows about status codes of the HTTP protocol. By default redirects are not followed, i.e. in general we will not get the same result as in the browser (in the browser, going to google.com results in a redirect to a localized page such as google.de). We can, however, specify the -L option to follow HTTP redirects:
curl -L http://www.google.com
If a previous download (of a large file) stopped for some reason, then we can continue it by using the -C flag, which takes an offset or - to let curl figure out the offset itself. It is important to use the same file parameters (the same name for a manual choice, or the automatic choice again) for this to work.
Maybe the file download did not work due to some bandwidth limitation of your provider (some people just have a limited quota per day, so fully using their bandwidth might exceed that quota too early). Here we can limit the bandwidth of the download by using the --limit-rate flag:
curl --limit-rate 1000B -C - -O http://www.gnu.org/software/gettext/manual/gettext.html
In this example we set the bandwidth to 1000 bytes per second. We also continue a previous download and let the program choose the corresponding file name (gettext.html in this case).
Another nice use-case is the usage of -z
to start the download only if the file has been modified after a particular time.
By using a negative date, i.e. the date starts with a minus sign, we will start the download only if the file has been modified before a
particular time. Here is an example for starting the download only if the file has been modified after the given date (31st of December
2010):
curl -z 31-Dec-10 ftp://example.com/somefile
Some URLs are protected by an HTTP username / password scheme. Again, this can be handled with cURL parameters. Here we just use the -u option to enter username and password separated by a colon:
curl -u username:password URL
This is also needed to log in to a secured FTP server. Additionally it is possible to upload files to the server by using the option -T. We can either upload a single file (simply by specifying the local path to the file and the URL of the directory) or multiple files. Both ways are displayed below:
curl -u ftpuser:ftppass -T file ftp://example.com/
curl -u ftpuser:ftppass -T "{file1,file2,...,fileN}" ftp://example.com/
More information can be obtained by using the verbose mode (option -v) or the trace option (--trace). The latter enables a lot of interesting output to be displayed by cURL.
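For example (the trace file name is just a chosen placeholder):
curl -v http://www.google.de
curl --trace curl_trace.txt http://www.google.de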
Another really important option can be set with -x
. Here we have the ability to specify a proxy server to be used for the
request. The proxy server will then execute the request itself:
curl -x proxyserver.test.com:3128 http://www.google.de
Recursive downloads with wget
An alternative to curl is the wget program. While cURL builds upon the popular libcurl library, which provides an API for uploads and downloads over various protocols, wget is just a command-line tool without any API. There is one main advantage of using wget:
- wget supports recursive download, while curl doesn't.
On the other hand cURL supports a lot more protocols that wget lacks support for, for example SCP, SFTP, TFTP, Telnet, LDAP(S), FILE, POP3, IMAP, SMTP, RTMP and RTSP. However, both can be used to download files using FTP and HTTP(S) and to send HTTP POST requests.
The following example downloads the file and stores it under the same name as on the remote server:
wget http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
This is actually a difference to cURL, where we had to specify the -O flag to save the file under the same name as on the remote server (otherwise the transfer was redirected to the standard output, i.e. the console). The -O flag is also present in wget, but here it allows us to specify a new file name for the downloaded file.
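For instance, to store the tarball from above under a different local name (the name is a placeholder):
wget -O my-strx25.tar.bz2 http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2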
Quite similar is also the --limit-rate
flag:
wget --limit-rate=200k http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
With -c a previously cancelled download will be resumed. Additionally we can perform a download in the background by using -b.
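A combined sketch, reusing the tarball URL from above:
wget -c -b http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2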
A nice feature of wget is the possibility to send a custom user agent string to the server. Let's have a look:
wget --user-agent="Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.10.289 Version/12.00" URL
We can use this feature to mask our download as if it were performed by a (popular) web browser. Using the --spider option we can test various scenarios (a simple example follows the list):
- Checking the status before scheduling a download.
- Monitoring whether a website is available or not at certain intervals.
- Going through links from a list (like our bookmarks) to check which entries are still available.
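A minimal example of such a check (the URL is a placeholder):
wget --spider http://www.example.com/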
A great feature is the option to download a complete webpage (including external sources). This is the recursive part of the program. This can be done by entering the following command:
wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
The --mirror
flag activates the mirroring mode, while -p
downloads all external files that are included in the given
HTML page. With --convert-links
all references to external sources (images, scripts, ...) in the document will be converted to the
downloaded local version. The -P
just states that we are specifying a directory as target, not a file.
The last scenario that is easy to imagine and solve with wget is the task of downloading only certain file types, using the flags -r and -A. Usually we want to scan a webpage for a certain type of linked document and then download those linked resources.
wget -r -A.pdf http://example.com/some-page-with-pdf(s)
Here we scan the webpage for all files with the extension pdf and download the files that have been found.
Information about the process or user using a file
The fuser command allows us to identify which processes are using a particular file or directory. The most basic invocation has just a directory (which can also be the current directory, given as .) as its argument.
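The most basic invocation might therefore simply be:
fuser .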
If we perform this command we see that the output consists of process IDs followed by a character. This character indicates the type of access. The type of access can be any one of the following:
- c current directory
- e executable being run
- f open file (usually omitted)
- F open file for writing (usually omitted)
- r root directory
- m mapped file or shared library
To display detailed information in the output we have to use the verbose option -v. If we use the program on an executable instead of a directory or ordinary file, we can see which user is running the program. Another option is to look at other resources (namespaces) with the option -n. The following command looks for TCP port 5000:
fuser -v -n tcp 5000
Here we would get information about the process (name and ID) and the user running the process that is using this resource. We can also kill all processes that are using the requested file or resource. If the program socket_serv should be killed, we could do that like this:
fuser -v -k socket_serv
This is just another way to kill a process. Other ways involve kill (by PID), xkill (by clicking on a window), killall (by name) and pkill (by pattern). With fuser we can also kill processes interactively. The statement for doing this is simply:
fuser -v -k -i socket_serv
Suppose we want to forcefully delete a file that is still being used by many processes. In that case, we can use this utility to kill all the processes (or selected processes) that are using that file.
Change owner and group
The concept of owners and groups for files is fundamental to Linux. Every file is associated with an owner and a group. We can use the chown and chgrp commands to change the owner or the group of a particular file or directory.
To change the owner of a specific file one has to enter the following command:
chown OWNER FILE
This changes the owner of the file FILE to OWNER. If we want to change only the group of the file, we can do that by placing a colon in front of the group and omitting the owner:
chown :GROUP FILE
This looks quite nice and has one direct consequence: if we want to change both the owner and the group of the file, we can do that by separating the owner from the group with a colon, like:
chown OWNER:GROUP FILE
When the chown command is issued on a symbolic link to change the owner or the group, it is the referent of the symbolic link that is affected. This means that the owner and group of the original file are changed. This is the default behavior of the program. To change the link itself instead, we have to use the special -h flag.
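A short sketch (OWNER and the link name are placeholders):
chown -h OWNER somelink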
What if we want to change the ownership only if the file is currently owned by a specific user? This is possible, too, using the following syntax:
chown --from=CURRENT_OWNER OWNER FILE
Here we use the --from option to set our constraint. We can also use it with groups, or with users and groups together. The syntax is the same as in the first examples, i.e. a colon separates the user from the group.
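For example, constraining on both the current owner and the current group (all names are placeholders):
chown --from=CURRENT_OWNER:CURRENT_GROUP OWNER:GROUP FILE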
Sometimes we want to copy the owner and group from one file to another. Using the --reference option we can use any file we want as the source. Consider the following example:
chown --reference=SOURCE_FILE TARGET_FILE
Another often used option is -R, which changes the ownership of files recursively. Other quite popular options include -H, which (in combination with -R) follows command-line arguments that are symbolic links to directories, and -v, which enables the verbose mode.