Basics of a Linux System¶
Introduction¶
In this chapter we will look at some very basic concepts of an operating system and of the programs that run on it.
- What is a computer program, how is a .c file converted to an executable, and what are the steps involved?
- What are libraries? What are shared libraries and static libraries?
- What are system calls?
- What is a kernel?
- What does the block diagram of the system look like?
Programs and Compilation¶
Your program is a set of instructions to the computer which your computer needs to follow in order to get some work done for you.
For running a program on a Linux System these are the steps involved.
- Write the program.
- Pre-process the program. Run gcc -E hello_world.c > pre.c.
- Compile the pre-processed code to assembly. Run gcc -S pre.c. You will get a file pre.s.
- Assemble the assembly code. Run gcc -c pre.s. You will get a file pre.o.
- Run the linker on the object file. Run gcc pre.o. You will get a file named a.out.
These steps look simple and straightforward, but a lot goes on under the hood, hidden behind the gcc command.
What is gcc¶
gcc is a compiler driver: a program which takes the source code of another program as input and converts it into a file in the ELF format. ELF (Executable and Linkable Format) is the file format of the executables, object files and shared libraries used on Linux machines.
Stages of compilation¶
gcc takes your code through several stages while compiling it. The sequence is PREPROCESSING -> COMPILATION -> ASSEMBLING -> LINKING
Preprocessing¶
- This stage expands the preprocessor directives and macros in the C file into plain C code which can be compiled. See the file pre.c. Here the #include directive has been expanded and the whole of stdio.h has been copied into the C file.
Compilation¶
- This stage converts the pre-processed C code into assembly for the instruction set of the CPU. See the file pre.s. Here you will only see assembly instructions.
Assembling¶
- Here the assembly code is converted into machine code (the opcodes of the assembly instructions) and stored in an object file, pre.o.
Linking¶
- Here the code is linked with the libraries present on the system. Note that the printf function is defined neither in your code nor in stdio.h. It is only declared in the header file; its compiled code is stored in a shared library on the system.
Hands-On¶
- Write the code
#include <stdio.h>

int main(void) {
    printf("\n\nHello World\n");
    return 0;
}
- Pre-process the file
gcc -E hello_world.c > pre.c
- Read the pre.c file to understand what has been done in the pre-processing stage.
- Compile the pre.c file to assembly
gcc -S pre.c
- You will get a file pre.s
- Read the file to see the assembly code
- Assemble the pre.s file
gcc -c pre.s
- You will get a file pre.o
- Read the file with objdump -D pre.o
- You will get to see the full contents of the file
- Link the file
Now this is a bit tricky, as calling ld with the right options is required. We will see how gcc does it.
- Run gcc hello_world.c -v to see what gcc does. This is very specific to the flavor of Linux because of the folder paths it has. The same command may not run on your machine. My flavor is
$ uname -a
Linux rishi-office 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
rishi@rishi-office:~/publications/doc_syscalls/code_system_calls/00$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.2 LTS"
- Here is the output of the command gcc hello_world.c -v. We are focusing only on the last few lines.
/usr/lib/gcc/x86_64-linux-gnu/5/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/5/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper -plugin-opt=-fresolution=/tmp/cc8bF6fB.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --sysroot=/ --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -z relro /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/5/../../.. /tmp/cchjP9PO.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/5/crtend.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o
- You will get something like the above. This is the exact command run during the linking step: gcc internally calls collect2, which in turn invokes ld. Read more about it at http://gcc.gnu.org/onlinedocs/gccint/Collect2.html
- We will replace the object file name in the above string and then run the command. The new command is
ld -plugin /usr/lib/gcc/x86_64-linux-gnu/5/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper -plugin-opt=-fresolution=/tmp/cc1PIEfF.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --sysroot=/ --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -z relro /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/5/crtend.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o pre.o -o pre.elf
- The difference is marked with >>>>> <<<<<
/usr/lib/gcc/x86_64-linux-gnu/5/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/5/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper -plugin-opt=-fresolution=/tmp/cc8bF6fB.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --sysroot=/ --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -z relro /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib >>>>>>!!!-L/usr/lib/gcc/x86_64-linux-gnu/5/../../.. /tmp/cchjP9PO.o <<<<<!!! -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/5/crtend.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o
- Run the command after replacing the object file in the above command.
- You will get your pre.elf file
- Run it with ./pre.elf
$ ./pre.elf
Hello World
- Using the following Makefile you can do the above steps one by one and see the results for yourself.
C_FILE=hello_world.c
PRE_FILE=pre.c
COMP_FILE=pre.s
ASSEMBLE_FILE=pre.o
ELF_FILE=pre.elf
GCC=gcc
LINK=ld -plugin /usr/lib/gcc/x86_64-linux-gnu/5/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper -plugin-opt=-fresolution=/tmp/cc1PIEfF.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --sysroot=/ --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -z relro /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/5/crtend.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o
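# None of these targets produces a file named after itself.
.PHONY: preprocess compile assemble link clean

# Recipe lines must start with a tab character.
preprocess:
	$(GCC) -E $(C_FILE) -o $(PRE_FILE)
compile: preprocess
	$(GCC) -S $(PRE_FILE) -o $(COMP_FILE)
assemble: compile
	$(GCC) -c $(COMP_FILE) -o $(ASSEMBLE_FILE)
link: assemble
	$(LINK) $(ASSEMBLE_FILE) -o $(ELF_FILE)
clean:
	rm -rf $(PRE_FILE) $(COMP_FILE) $(ASSEMBLE_FILE) $(ELF_FILE)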
Libraries¶
A library is an archive of compiled code. The code is compiled and kept in a
format that lets any other program use it just by linking to it. For this,
the program only needs to have the function declared, so that the compilation
stage knows that the function's code will be linked in at a later stage.
In the linking phase the linker attaches the function's code, present in the
library, to the places in the compiled code where the function is called.
Two phrases in the above paragraphs matter here: attaching and later stage.
An executable is said to be statically linked if the later stage is the last
stage of the build (linking) and the attaching is done at that stage.
An executable is said to be dynamically linked if the later stage is the time
of program execution and the attaching is also done at the time of program
execution. This is the role of the loader (the dynamic linker).
Static Library¶
In the above section we have understood that we can compile some code and keep
it as a library on the system, then link (read as attach) that code to new
programs. When we link the code at compile time we call the result a
statically linked executable. This increases the size of the executable, as
the library code the program uses gets copied into it. The benefit is that the
executable becomes self-sufficient and can generally run on any other Linux
machine of the same architecture, without needing the library to be installed
there.
System Calls¶
System calls are APIs which the kernel provides to user space applications. A system call passes some arguments to kernel space and the kernel acts on those arguments.
For example, the open() system call opens a file so that further read and
write operations can be done on the file. The return value of the open
system call is a file descriptor on success, or -1 on error (with errno set to
the error status). A successful return value allows the user space application
to use the file descriptor for further reads and writes.
System calls get executed in kernel space. Kernel space runs in an elevated privileged mode. There is a switch of privilege mode whenever a system call is made, and hence it's a bad idea to make system calls without considering the time taken to switch to the elevated privileged mode.
For example, let's say that you want to copy a file. One way of copying the file is to read each character of the file and, for every character read, write that character to another file. This makes two system calls for every character copied. As this is expensive in terms of time, it's a bad design.
Let us see a small demonstration of this.
/*
 * slow_write.c: copy src_file to copied_file.txt one byte at a time.
 * The setup target of the Makefile builds src_file by concatenating
 * /etc/passwd 10000 times, so that we have a good amount of data to
 * run our test on.
 */
#include <stdlib.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#define BLOCK_SIZE 1

int main(void)
{
    const char *src_file = "src_file";
    const char *dest_file = "copied_file.txt";
    int dest_fd, src_fd;
    ssize_t read_byte, write_byte;
    char read_buf[BLOCK_SIZE];

    /* O_TRUNC empties any copy left over from an earlier run. */
    dest_fd = open(dest_file, O_WRONLY|O_CREAT|O_TRUNC, S_IRWXU|S_IRWXG|S_IROTH);
    if (dest_fd < 0) {
        perror("\nError opening the destination file");
        exit(1);
    } else {
        fprintf(stderr, "\nSuccessfully opened the destination file..");
    }
    src_fd = open(src_file, O_RDONLY);
    if (src_fd < 0) {
        perror("\nError opening the source file");
        exit(1);
    } else {
        fprintf(stderr, "Successfully opened the source file.");
    }
    /*
     * We will start the copy process byte by byte:
     * one read() and one write() system call per byte.
     */
    while (1) {
        read_byte = read(src_fd, read_buf, BLOCK_SIZE);
        if (read_byte < 0) {
            perror("Error reading file");
            exit(1);
        }
        if (read_byte == 0) {
            fprintf(stdout, "Reached the EOF for src file\n");
            break;
        }
        /* Write only the bytes that were actually read. */
        write_byte = write(dest_fd, read_buf, read_byte);
        if (write_byte < 0) {
            perror("Error writing file");
            exit(1);
        }
    }
    close(src_fd);
    close(dest_fd);
    return 0;
}
What should be done instead is to read a block (a set of characters) and then write that block to the other file. This reduces the number of system calls and thus improves the overall performance of the file copy program.
/*
 * fast_write.c: copy src_file to copied_file.txt in blocks of 4096 bytes.
 * The setup target of the Makefile builds src_file by concatenating
 * /etc/passwd 10000 times, so that we have a good amount of data to
 * run our test on.
 */
#include <stdlib.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#define BLOCK_SIZE 4096

int main(void)
{
    const char *src_file = "src_file";
    const char *dest_file = "copied_file.txt";
    int dest_fd, src_fd;
    ssize_t read_byte, write_byte;
    char read_buf[BLOCK_SIZE];

    /* O_TRUNC empties any copy left over from an earlier run. */
    dest_fd = open(dest_file, O_WRONLY|O_CREAT|O_TRUNC, S_IRWXU|S_IRWXG|S_IROTH);
    if (dest_fd < 0) {
        perror("\nError opening the destination file");
        exit(1);
    } else {
        fprintf(stderr, "\nSuccessfully opened the destination file..");
    }
    src_fd = open(src_file, O_RDONLY);
    if (src_fd < 0) {
        perror("\nError opening the source file");
        exit(1);
    } else {
        fprintf(stderr, "Successfully opened the source file.");
    }
    /*
     * We will start the copy process block by block:
     * one read() and one write() system call per BLOCK_SIZE bytes.
     */
    while (1) {
        read_byte = read(src_fd, read_buf, BLOCK_SIZE);
        if (read_byte < 0) {
            perror("Error reading file");
            exit(1);
        }
        if (read_byte == 0) {
            fprintf(stdout, "Reached the EOF for src file\n");
            break;
        }
        /*
         * Write only the bytes that were actually read. The last block
         * is usually shorter than BLOCK_SIZE, and writing the full
         * buffer would append stale data to the copy.
         */
        write_byte = write(dest_fd, read_buf, read_byte);
        if (write_byte < 0) {
            perror("Error writing file");
            exit(1);
        }
    }
    close(src_fd);
    close(dest_fd);
    return 0;
}
.PHONY: all run clean setup

# Recipe lines must start with a tab character.
all:
	gcc -o elf.slow_write slow_write.c -Wall
	gcc -o elf.fast_write fast_write.c -Wall
run: setup all
	time -p ./elf.slow_write
	time -p ./elf.fast_write
clean:
	rm -f src_file elf.slow_write elf.fast_write copied_file.txt
setup:
	rm -f src_file
	for i in `seq 1 10000`; do cat /etc/passwd >> src_file; done
Kernel¶
The kernel is an important component of any operating system. It is the only
layer which interacts directly with the hardware. So in order to get any work
done by your hardware you need to ask the kernel to do it.
This asking is done through system calls. At the assembly level, on x86-64,
this is the syscall instruction. When you make a system call, a function in
the kernel is invoked and it gets the work done. The arguments we passed are
handed over to the kernel and the corresponding kernel function is invoked.
For functions that need hardware interaction, the kernel interacts with the hardware through the device driver of that hardware.
Conclusion¶
In this chapter we have seen some of the important concepts and steps required to take a program from a .c file to an executable format on a Linux machine. This chapter also introduced us to the concepts of system calls and libraries.
References¶
- https://stackoverflow.com/questions/14163208/how-to-link-c-object-files-with-ld
- For further reading, refer to the 1st chapter, Getting Started, of Beginning Linux Programming by Neil Matthew and Richard Stones.