[转载] Runtime kernel kmem patching
VX Heavens
[Home]
Runtime kernel kmem patching
Silvio Cesare
November 1998
[Back to index]
Introduction
The kernel symbol list
Direct kernel memory access in Linux
Building symbol information from kmem
Patching kernel structures
Patching the kernel to run inserted kernel code
Patching the kernel to insert new code
Patching new code into the Linux kernel
Object code (LKM) inserting of kernel code
Object code (LKM) linking to kernel
Bootstrap loading of object code (LKM)
The provided implementation
Conclusion
Introduction
This paper documents runtime (on the fly) kernel patching on a running syste
m under Linux using direct access to kernel memory. The same algorithms may
equally be applicable to other systems. Examples of kernel patching for use
by an attacker is provided showing patching of kernel structures to remove a
lkm's visibility to lsmod and even the addition of kernel code ala loadable
kernel modules (lkm) to a running system without native lkm support in the
kernel. Discussion of rebuilding the appropriate sections of the system symb
ol map (System.map) is provided and implemented.
Linux kernel programming skills of a rudimentary level are required for the
purposes of this paper. For the sections on object code linking use ELF, int
roductory knowledge of the ELF specs while is assumed has the appropriate pa
rts used in this paper explained in further detail to those things directly
applicable. It is strongly suggested that the interested reader look at the
ELF specs. This is especially true if full understanding of the implementati
on is to be achieved.
The kernel symbol list
Linux provides a means to locate every symbol used in the kernel whether the
y are exported or not. This is available by System.map which is created at c
ompile time by gathering the symbol information in the object files used to
compile the kernel.
# cat /System.map
00100000 T _stext
00100000 T stext
00100000 t startup_32
001000bc t isnew
00100118 t is486x
00100183 t n6x86
.
.
.
Thus we can see we can identify the symbol by its name. The address we can u
se to directly look at kernel memory via /dev/kmem.
Kernel symbols that are exported, that is, those symbols that are globally v
isible and maintained in the kernel and available to user-land via /proc/ksy
sm and get_kernel_syms (1) do not require the use of System.map for the reas
ons just stated. This however is only available when lkm support is compiled
in the kernel.
Direct kernel memory access in Linux
Direct access to memory in Linux is provided by two device files. /dev/mem i
s a file mapping of linear memory. /dev/kmem is a file mapping of all virtua
l memory available, that is, linear memory plus swap space.
To use the device files as memory requires opening of the device and using i
t as normal. Random access requires seeking to the required position (memory
address) in the file before reading or writing.
Building symbol information from kmem
System.map is not required to run Linux. Neither is /proc/ksyms when lkm's a
ren't supported natively in the kernel. However, symbol information is vital
for patching the kernel using kmem. A variety of approaches are available h
owever to rebuild symbol information from a running kernel using kmem.
The simplest approach is to copy the first few bytes at a symbol's address a
nd use this as a key to locate the symbol in kmem by string searching. This
approach is ideal for locating symbols which represent code, ie functions. T
he machine code instructions used as the key however will possibly change be
tween architectures and kernel versions. Thus a key lists for function symbo
ls must be maintained for each arch and kernel version. Likewise, if the fun
ction uses compile time configurable code, this approach will fail unless we
use a key for that specific configuration. In general however, this is high
ly successful, and a key list construction and searching may be easily autom
ated to find a large number of symbols.
This approach requires that symbol information be available on the machine w
here the key lists are being built. This information can use System.map or /
proc/ksyms, the later being the preferred source as its common on many syste
ms that System.map is not current as its often not copied in kernel compilat
ions.
The implementation of constructing the key lists is build-ksyms-keys.sh whic
h uses a minimum of 15 bytes as the key length, but may increase if the symb
ol is not able to be identified uniquely with the current key length.
A problem is present with this method however. Its common for a large number
of functions to reference other symbols which are not known. A solution to
this problem requires that we identify which part of the key is random so we
can safely ignore it. To do this, we can build a key list initially which a
ssumes the entire key is to be used, then we update the key list on a new ke
rnel. By looking at what changed in between kernels, we can use this as what
to ignore. Also, this gives us a reasonable guess at how useful a key is an
d if a key is not at all stable and should thus be discarded.
Locating structures in memory may also be approached using a search key tech
nique. For this we try to make known as much information in the structure as
possible so we have a feasible key to work with. The key does not have to b
e a string however, ie, we can search for a structure knowing fields that ar
e dispersed within it. A practical example that is implemented in the provid
ed source code is locating a module's data structure.
From /usr/include/linux/module.h
struct module {
struct module *next;
struct module_ref *ref; /* the list of modules that refer to me */
struct symbol_table *symtab;
const char *name;
int size; /* size of module in pages */
void* addr; /* address of module */
int state;
void (*cleanup)(void); /* cleanup routine */
};
The module's name 'name' is a pointer to a string. Thus to locate the module
structure we can make two passes in memory. In the first pass we search for
all matches to a string, that is the name of the module. We then use the ad
dress of these strings in our second pass as the value of 'name' in the modu
le structure.
To make the searching practical, we try to fill in as much information that
is known into the partially complete module struct which is out search key.
The size of the module 'size' is available in a module listing and we can us
e this to make our key more useful.
# lsmod
Module Pages Used by
ppp 5 1 (autoclean)
slhc 2 [ppp] 1 (autoclean)
serial 8 1
ipx 4 0
rarp 1 0
smbfs 7 0
nfs 13 4
#
The pages number indicates the size of the module in pages. Thus we can use
this in our search for the module structure. The 'state' of the module may b
e assume to be MOD_RUNNING, that is, a module that is currently running (as
opposed to one that needs deletion, or is being initialized). The 'ref' may
be assumed NULL, that is, no modules refer to this one. Also 'cleanup' may b
e assumed not NULL.
This information gives us a key that is practical and will uniquely identify
module structures in memory most of the time (but not all the time).
Another method of locating a symbol is that if it is a constant distance to
a known symbol location, then the unknown symbol may be found as a function
of the known symbol.
From /usr/src/linux-2.0.35/kernel/sched.c
.
.
.
struct task_struct * task[NR_TASKS] = {&init_task, };
struct kernel_stat kstat = { 0 };
.
.
.
'task' is the symbol we are trying to locate. This symbol is not in /proc/ks
yms, however 'kstat' is. Thus we can see the following.
ADDR(task) == ADDR(kstat) - NR_TASKS*sizeof(struct task_struct *)
It is possible however, that we have no information to use directly as a key
to identify the structure, that is, the structure has from our position ran
dom or unknown data, nor can we use known symbols as reference points. If th
is happens, we may be able to locate the structure indirectly by finding cod
e that references the structure we are looking for.
From /usr/src/linux-2.0.35/arch/i386/kernel/entry.S
.
.
.
* Stack layout in 'ret_from_system_call':
* ptrace needs to have all regs on the stack.
* if the order here is changed, it needs to be
* updated in fork.c:copy_process, signal.c:do_signal,
* ptrace.c and ptrace.h
*
* 0(%esp) - %ebx
* 4(%esp) - %ecx
* 8(%esp) - %edx
* C(%esp) - %esi
* 10(%esp) - %edi
* 14(%esp) - %ebp
* 18(%esp) - %eax
* 1C(%esp) - %ds
* 20(%esp) - %es
* 24(%esp) - %fs
* 28(%esp) - %gs
* 2C(%esp) - orig_eax
* 30(%esp) - %eip
* 34(%esp) - %cs
* 38(%esp) - %eflags
* 3C(%esp) - %oldesp
* 40(%esp) - %oldss
*/
.
.
.
ALIGN
1: call SYMBOL_NAME(syscall_trace)
movl ORIG_EAX(%esp),%eax
call *SYMBOL_NAME(sys_call_table)(,%eax,4)
movl %eax,EAX(%esp) # save the return value
#ifdef __SMP__
GET_PROCESSOR_OFFSET(%eax)
movl SYMBOL_NAME(current_set)(,%eax),%eax
It is seen that sys_call_table is referenced in the code. If we are able to
locate this particular code, we can extract the address of the sys_call_tabl
e. It is preferable to use as many instructions as possible to increase the
efficiency of the key. It is pointless searching with keys so small they ide
ntify many possible matches.
By using the adjacent instructions to that which references the sys_call_tab
le we give ourselves a good key to use. These instructions do not reference
anything not known so they can be used. Likewise we don't use the __SMP__ de
pendent code.
The final code we use then is as follows.
movl ORIG_EAX(%esp),%eax
call *SYMBOL_NAME(sys_call_table)(,%eax,4)
movl %eax,EAX(%esp) # save the return value
By compiling this code in a dummy program and using objdump to view the mach
ine code we can extract our search key. The search key can be thought of as
a partially incomplete string.
The entry.S code does not change often and is seen in many kernels. Thus we
do not have to use different keys for each kernel version. Naturally this is
only applicable to this specific architecture, however the same principle m
ay be used for others.
Patching kernel structures
The kernel symbol information and direct access to kernel memory provides us
with the ability to modify kernel structures. The mem devices provide us wi
th direct memory access and the symbol information gives us addresses to use
for location of kernel structures.
The first example is a simple application of how to modify a kernel structur
e. kroot is the supplied program that changes the uid of a process to become
superuser. This is not very practical in real life except when the permissi
ons of the mem devices are insecure - which does happen and has even appeare
d as kernel bugs in the past. It does serve as a good example.
The algorithm is as follows.
open /dev/kmem
seek to the 'task' symbol address.
read the task table into memory
locate the appropriate task we wish to modify
modify the task to reflect a superuser uid
seek to the specific task in the file
write the modified task
To locate the 'task' symbol address we use the methods given above.
A more practical example is also given. Patching a module kernel structure t
o remove the lkm from module listing's.
From /usr/src/linux-2.0.35/kernel/module.c
.
.
.
/*
* Called by the /proc file system to return a current list of modules.
*/
int get_module_list(char *buf)
{
.
.
.
q = mp->name;
if (*q == '\0' && mp->size == 0 && mp->ref == NULL)
continue; /* don't list modules for kernel syms */
.
.
.
It can be seen that a lkm is not listed of the name is empty, size is 0 and
the ref is NULL.
The implemtation of the lkm 'zapper' uses this information to patch the modu
le struct's size and ref members and patches the name to be an empty string.
The reverse can also be done provided the size and ref pointer is saved in
the 'zapping' process.
Patching the kernel to run inserted kernel code
Linux provides a system call table represented by the symbol sys_call_table.
The table is an array of fixed size and each element of the array is a poin
ter to the system call, number of which is the index in the array. If no sys
tem call is implemented for that syscall number, the element is a NULL point
er. Typically many tens of elements in the syscall table are NULL representi
ng the syscall slot available for use.
Provided we know the address of the new kernel code, we can patch the syscal
l table and fill an empty slot to point to our new code. To run the kernel c
ode, we make a call to the new syscall from user-land using the int 0x80 mec
hanism that Linux employs and the new code is thus run.
Patching the kernel to insert new code
/dev/kmem gives us direct access to kernel memory, so to insert new code sim
ply means we overwrite part of memory. Importantly to leave the kernel in a
stable state we cannot overwrite internal kernel structures that are allocat
ed or being used.
Its possible to use the kmalloc pool in kernel space however we cant be cert
ain if the memory is already being used by kmalloc. Likewise, even if we loo
k at the allocation block headers to find free memory, the operations are no
t atomic so it leaves to possible racing with the kernel.
The linear layout of kernel memory looks like this.
[compile time kernel memory reserve]
[kmalloc pool]
The kmalloc pool limits are defined in the kernel by the symbols memory_star
t and memory_end.
[compile time kernel memory reserve]
memory_start
[kmalloc pool]
memory_end
The kmalloc pool however is aligned to the next page border of memory_start,
so in practice this is what happens.
Key:
[..] a complete page
K kmalloc pool
P padding
R reserve (compile time kernel text/data etc)
[RRRRRRRRRRRR]
...
[RRRRRRRRRRRR]
[RRRR
memory_start
PPPPPPPP]
[KKKKKKKKKKKK]
...
[KKKKKKKKKKKK]
memory_end
This padding can provide us with an ideal positing for new kernel code, as i
t is totally safe to use as the kernel is not allowed to use this memory.
This is the ideal situation where we can use System.map or another technique
to locate memory_start. It is preferable however if we can not to use a sym
bol who's address is not known at runtime without location.
From /usr/src/linux-2.0.35/arch/i386/mm/init.c
.
.
.
/*
* BAD_PAGE is the page that is used for page faults when linux
* is out-of-memory. Older versions of linux just did a
* do_exit(), but using this instead means there is less risk
* for a process dying in kernel mode, possibly leaving a inode
* unused etc..
*
* BAD_PAGETABLE is the accompanying page-table: it is initialized
* to point to BAD_PAGE entries.
.
.
.
From /usr/src/linux-2.0.35/arc/i386/kernel/head.S
.
.
.
org 0x3000
ENTRY(empty_bad_page)
org 0x4000
ENTRY(empty_bad_page_table)
org 0x5000
ENTRY(empty_zero_page)
.
.
.
This all starts at 0x100000 for a compressed kernel image. Thus empty_bad_pa
ge is at 0x100300 and is 4096 bytes in size. Running out of memory is not a
common occurrence and we can use the empty_bad_page for our own code without
too many fears.
Patching new code into the Linux kernel
Patching new code into the kernel requires two things. It requires that the
code be in kernel space memory and it requires a method of calling that code
The above text has provided has with methods to fulfil both these requirem
ents. The algorithm is thus.
insert new code into memory employing the methods stated earlier
patch the sys_call_table so a syscall points to the new code
make a syscall to call our new code
Object code (LKM) inserting of kernel code
Inserting object code follows the same principles as inserting code directly
, however as stated, kernel symbol references, require that the correct addr
ess be used by looking at the kernel symbol table. Binding symbol's to addre
sses is the process of linking, which so far we have been doing manually. Fo
r non trivial code manual linking isn't a viable solution, and linking must
be added to the algorithms we are using.
Object code (LKM) linking to kernel
ELF is the typical standard used to represent object code on Linux. The pape
r will thus only refer to linking using ELF objects.
An object code file is referred to as relocatable code when using ELF becaus
e that summarizes what it is. It is not fixed to any memory position. It is
the responsibility of linking that makes an executable image out of a reloca
table object and binds symbols to addresses.
Linking of code is done by relocating the code to a fixed positing. For the
most part, the object code does not need to be changed heavily.
Consider the following C code.
{
char s[] = "I wonder where I'm located...");
printk(KERN_INFO "%s\n", s);
}
The string 's' being part of the relocatable text section in the object has
no known absolute position in memory at compile time. Likewise, printk, is a
n externally defined symbol and its address is also not known at compile tim
e.
Relocation sections in the ELF object are used for describing what needs to
be modified (relocated) in the object. In the above case, relocation entries
would be made for printk's reference and the string's reference.
The format for an ELF relocatable object (object code) is as follows.
ELF header
Program header table
Section 1
Section n
Section header table
From /usr/include/elf.h
/* The ELF file header. This appears at the start of every ELF file. */
#define EI_NIDENT (16)
typedef struct
{
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf32_Half e_type; /* Object file type */
Elf32_Half e_machine; /* Architecture */
Elf32_Word e_version; /* Object file version */
Elf32_Addr e_entry; /* Entry point virtual address */
Elf32_Off e_phoff; /* Program header table file offset
*/
Elf32_Off e_shoff; /* Section header table file offset
*/
Elf32_Word e_flags; /* Processor-specific flags */
Elf32_Half e_ehsize; /* ELF header size in bytes */
Elf32_Half e_phentsize; /* Program header table entry size *
/
Elf32_Half e_phnum; /* Program header table entry count
*/
Elf32_Half e_shentsize; /* Section header table entry size *
/
Elf32_Half e_shnum; /* Section header table entry count
*/
Elf32_Half e_shstrndx; /* Section header string table index
*/
} Elf32_Ehdr;
/* Conglomeration of the identification bytes, for easy testing as a word.
*/
#define ELFMAG "\177ELF"
#define SELFMAG 4
/* Legal values for e_machine (architecture). */
#define EM_386 3 /* Intel 80386 */
#define EM_486 6 /* Intel 80486 */
/* Legal values for e_version (version). */
#define EV_NONE 0 /* Invalid ELF version */
#define EV_CURRENT 1 /* Current version */
From the ELF specifications.
String Table
String table sections hold null-terminated character sequences, commonly cal
led strings. The object file uses these strings to represent symbol and sect
ion names. One references a string as an index into the string table section
The first byte, which is index zero, is defined to hold a null character.
Likewise, a string tables last byte is defined to hold a null character, ens
uring null termination for all strings. A string whose index is zero specifi
es either no name or a null name, depending on the context. An empty string
table section is permitted; its section headers sh_size member would contain
zero. Non-zero indexes are invalid for an empty string table."
..
Sections
An object file's section header table lets one locate all the file's section
s. The section header table is an array of Elf32_Shdr structures as describe
d below. A section header table index is a subscript into this array. The EL
F headers e_shoff member gives the byte offset from the beginning of the fil
e to the section header table; e_shnum tells how many entries the section he
ader table contains; e_shentsize gives the size in bytes of each entry."
From /usr/include/elf.h
/* Section header. */
typedef struct
{
Elf32_Word sh_name; /* Section name (string tbl index) *
/
Elf32_Word sh_type; /* Section type */
Elf32_Word sh_flags; /* Section flags */
Elf32_Addr sh_addr; /* Section virtual addr at execution
*/
Elf32_Off sh_offset; /* Section file offset */
Elf32_Word sh_size; /* Section size in bytes */
Elf32_Word sh_link; /* Link to another section */
Elf32_Word sh_info; /* Additional section information */
Elf32_Word sh_addralign; /* Section alignment */
Elf32_Word sh_entsize; /* Entry size if section holds table
*/
} Elf32_Shdr;
From the ELF specifications.
Symbol Table
An object file's symbol table holds information needed to locate and relocat
e a program's symbolic definitions and references. A symbol table index is a
subscript into this array. Index 0 both designates the first entry in the t
able and serves as the undefined symbol index. The contents of the initial e
ntry are specified later in this section."
/* Symbol table entry. */
typedef struct
{
Elf32_Word st_name; /* Symbol name (string tbl index) */
Elf32_Addr st_value; /* Symbol value */
Elf32_Word st_size; /* Symbol size */
unsigned char st_info; /* Symbol type and binding */
unsigned char st_other; /* No defined meaning, 0 */
Elf32_Section st_shndx; /* Section index */
} Elf32_Sym;
#define SHN_UNDEF 0 /* No section, undefined symbol. */
/* How to extract and insert information held in the st_info field. */
#define ELF32_ST_BIND(val) (((unsigned char) (val)) >> 4)
#define ELF32_ST_TYPE(val) ((val) & 0xf)
#define ELF32_ST_INFO(bind, type) (((bind) << 4) + ((type) & 0xf))
/* Legal values for ST_BIND subfield of st_info (symbol binding). */
#define STB_LOCAL 0 /* Local symbol */
#define STB_GLOBAL 1 /* Global symbol */
#define STB_WEAK 2 /* Weak symbol */
#define STB_NUM 3 /* Number of defined types. */
#define STB_LOPROC 13 /* Start of processor-specific */
#define STB_HIPROC 15 /* End of processor-specific */
From the ELF specifications.
"A relocation section references two other sections: a symbol table and a se
ction to modify. The section headers sh_info and sh_link members, described
in ``Sections'' above, specify these relationships. Relocation entries for d
ifferent object files have slightly different interpretations for the r_offs
et member.
In relocatable files, r_offset holds a section offset. That is, the relocati
on section itself describes how to modify another section in the file; reloc
ation offsets designate a storage unit within the second section."
From /usr/include/elf.h
/* Relocation table entry without addend (in section of type SHT_REL). */
typedef struct
{
Elf32_Addr r_offset; /* Address */
Elf32_Word r_info; /* Relocation type and symbol index
*/
} Elf32_Rel;
/* How to extract and insert information held in the r_info field. */
#define ELF32_R_SYM(val) ((val) >> 8)
#define ELF32_R_TYPE(val) ((val) & 0xff)
#define ELF32_R_INFO(sym, type) (((sym) << 8) + ((type) & 0xff))
These selected paragraphs and sections from the ELF specifications and heade
r files give us a good high level concept of how a relocatable ELF file can
be linked to produce an image capable of being executed.
The process of linking the lkm is as follows.
Identify the file as being in relocatable ELF format
Load each relevant section into memory
For each PROGBITS section set the section address in memory
For each REL (relocation) section, carry out the relocation
Assemble the executable image by copying the sections into their respective
positions in memory
An extra step must also be carried out if we know where the initialization a
nd cleanup code of a lkm is.
Identify the file as being in relocatable ELF format
Load each relevant section into memory
For each PROGBITS section set the section address in memory
For each REL (relocation) section, carry out the relocation
Assemble the executable image by copying the sections into their respective
positions in memory
Locate and print the address of 'init_module' and 'cleanup_module'
The relocation step may be expanded into the following algorithm.
Evaluate the target section of the relocation entry
Evaluate the symbol table section of the relocation entry
Evaluate the location in the section that the relocation is to apply
Evaluate the address of the symbol that is used in the relocation
Apply the relocation
The actual relocation is best presented by looking at the source. For more i
nformation on the relocation types refer to the ELF specifications. Note tha
t we ignore the global offset table completely and any relocation types of i
ts nature.
switch (ELF32_R_TYPE(rel->r_info)) {
case R_386_NONE:
break;
case R_386_PLT32:
case R_386_PC32:
*loc -= dot; /* *loc += addr - dot */
case R_386_32:
*loc += addr;
break;
Evaluating the address of the symbol used in the relocation can also be expa
nded. If the symbol is local, that is STB_LOCAL, then the symbol is in the c
urrent objects symbol table - note also, that the symbol is only visible to
the current object. If the symbol is STB_GLOBAL, then the symbol is external
to the object and the symbol's address is known in this instance to be a ke
rnel symbol, thus we can use System.map or the methods presented earlier in
determining the symbol's address.
In this instance, probably the best source for more detailed information on
link editing is the source provided.
Bootstrap loading of object code (LKM)
In practice, real life lkm's will not fit into the kmalloc pool padding and
likewise may be greater than a single page if we use the empty_bad_page. The
solution to this is to bootstrap load the lkm by inserting a bootstrap load
er into the padding which allocates memory using the normal kernel methods o
f kmalloc, copying the actual lkm to this allocated memory and executing thi
s. The bootstrap loader can be made sufficiently small and our lkm's can be
of any size.
In practice, the only code we are actually required to insert into the kerne
l to enable bootstrapping is an allocation module that will return a memory
address to an allocated block of memory that can fit the real module we want
to load and run.
The provided implementation
The source included provides implementations for all the algorithms discusse
d. Front end shell scripts are provided to leave the internal details unknow
n to the user.
Conclusion
The algorithms and implementations discuss provide ample demonstration that
it is possible to use direct access to memory to provide things which would
appear only naively possible through native support in the kernel.
The supplied programs can provide practical instances for an attacker to use
in his or her array, and show that access to kernel memory is a very useful
concept not only in idea but in practice.
[Source kmem-src.tgz]
[Back to index]