6.1.5. Trial-and-Error Method
Until recently, the only way to populate the contents of your root file system was to use the trial-and-error method. Perhaps the process can be automated by creating a set of scripts for this purpose, but the knowledge of which files are required for a given functionality still had to come from the developer. Tools such as Red Hat Package Manager (rpm) can be used to install packages on your root file system. rpm has reasonable dependency resolution within given packages, but it is complex and involves a steep learning curve. Furthermore, using rpm does not lend itself easily to building small root file systems because it has limited capability to strip unnecessary files from the installation, such as documentation and unused utilities in a given package.
6.1.6. Automated File System Build Tools
The leading vendors of embedded Linux distributions ship very capable tools designed to automate the task of building root file systems in Flash or other devices. These tools are usually graphical in nature, enabling the developer to select files by application or functionality. They have features to strip unnecessary files such as documentation and other unneeded files from a package, and many have the capability to select at the individual file level. These tools can produce a variety of file system formats for later installation on your choice of device. Contact your favorite embedded Linux distribution vendor for details on these powerful tools.
6.2. Kernel's Last Boot Steps
In the previous chapter, we introduced the steps the kernel takes in the final phases of system boot. The final snippet of code from .../init/main.c is reproduced in Listing 6-2 for convenience.
Listing 6-2. Final Boot Steps from main.c
...
if (execute_command) {
run_init_process(execute_command);
printk(KERN_WARNING 'Failed to execute %s. Attempting defaults...
', execute_command);
}
run_init_process('/sbin/init');
run_init_process('/etc/init');
run_init_process('/bin/init');
run_init_process('/bin/sh');
panic('No init found. Try passing init= option to kernel.');
This is the final sequence of events for the kernel thread called init spawned by the kernel during the final stages of boot. The run_init_process() is a small wrapper around the execve() function, which is a kernel system call with a rather interesting behavior. The execve() function never returns if no error conditions are encountered in the call. The memory space in which the calling thread is executing is overwritten by the called program's memory image. In effect, the called program directly replaces the calling thread, including inheriting its Process ID (PID).
The structure of this initialization sequence has been unchanged for a long time in the development of the Linux kernel. In fact, Linux version 1.0 contained similar constructs. Essentially, this is the start of user space[50] processing. As you can see from Listing 6-2, unless the Linux kernel is successful in executing one of these processes, the kernel will halt, displaying the message passed in the panic() system call. If you have been working with embedded systems for any length of time, and especially if you have experience working on root file systems, you are more than familiar with this kernel panic() and its message! If you search on Google for this panic() error message, you will find page after page of hits for this FAQ. When you complete this chapter, you will be an expert at troubleshooting this common failure.
Notice a key ingredient of these processes: They are all programs that are expected to reside on a root file system that has a similar structure to that presented in Listing 6-1. Therefore we know that we must at least satisfy the kernel's requirement for an init process that is capable of executing within its own environment.
In looking at Listing 6-2, this means that at least one of the run_init_process() function calls must succeed. You can see that the kernel tries to execute one of four programs in the order in which they are encountered. As you can see from the listing, if none of these four programs succeeds, the booting kernel issues the dreaded panic() function call and dies right there. Remember, this snippet of code from .../init/main.c is executed only once on bootup. If it does not succeed, the kernel can do little but complain and halt, which it does through the panic() function call.
6.2.1. First User Space Program
On most Linux systems, /sbin/init is spawned by the kernel on boot. This is why it is attempted first from Listing 6-2. Effectively, this becomes the first user space program to run. To review, this is the sequence:
1. Mount the root file system
2. Spawn the first user space program, which, in this discussion, becomes init
In our example minimal root file system from Listing 6-2, the first three attempts at spawning a user space process would fail because we did not provide an executable file called init anywhere on the file system. Recall from Listing 6-1 that we had a soft link called sh that pointed back to busybox. You should now realize the purpose for that soft link: It causes busybox to be executed by the kernel as the initial process, while also satisfying the common requirement for a shell executable from userspace.[51]
6.2.2. Resolving Dependencies
It is not sufficient to simply include an executable such as init on your file system and expect it to boot. For every process you place on your root file system, you must also satisfy its dependencies. Most processes have two categories of dependencies: those that are needed to resolve unresolved references within a dynamically linked executable, and external configuration or data files that an application might need. We have a tool to find the former, but the latter can be supplied only by at least a cursory understanding of the application in question.
An example will help make this clear. The init process is a dynamically linked executable. To run init, we need to satisfy its library dependencies. A tool has been developed for this purpose: ldd. To understand what libraries a given application requires, simply run your cross-version of ldd on the binary:
$ ppc_4xxFP-ldd init
libc.so.6 => /opt/eldk/ppc_4xxFP/lib/libc.so.6
ld.so.1 => /opt/eldk/ppc_4xxFP/lib/ld.so.1
$