
Linux Security Module and SELinux

This section gives a high-level overview of the LSM and SELinux internal kernel structure and workings as enabled in kernel 3.14. A more detailed view can be found in "Implementing SELinux as a Linux Security Module", which was used extensively to develop this section (together with the SELinux kernel source code). The major areas covered are:

  1. How the LSM and SELinux modules work together.
  2. The major SELinux internal services.
  3. The fork and exec system calls are followed through as an example to tie in with the transition process covered in the Domain Transition section.
  4. The SELinux filesystem /sys/fs/selinux.
  5. The /proc filesystem area most applicable to SELinux.

The LSM Module

The LSM is the Linux security framework that allows 3rd party access control mechanisms to be linked into the GNU / Linux kernel. Currently there are five 3rd party services that utilise the LSM:

  1. SELinux - the subject of this Notebook.
  2. AppArmor is a MAC service based on pathnames and does not require labeling or relabeling of filesystems. See http://wiki.apparmor.net for details.
  3. Simplified Mandatory Access Control Kernel (SMACK). See http://www.schaufler-ca.com/ for details.
  4. Tomoyo is a name based MAC. Details can be found at http://sourceforge.jp/projects/tomoyo/docs.
  5. Yama extends the DAC support for ptrace. See Documentation/security/Yama.txt for further details.

The basic idea behind LSM is to:

  • Insert security function hooks and security data structures in the various kernel services to allow access control to be applied over and above that already implemented via DAC. The types of service that have hooks inserted are shown in Table 1, with example task and program execution hooks shown in the Fork System Call Walk-through and Process Transition Walk-through sections.
  • Allow registration and initialisation services for the 3rd party security modules.
  • Allow process security attributes to be available to userspace services by extending the /proc filesystem with a security namespace as shown in Table 2. These are located at:
/proc/<self | pid>/attr/<attr>
/proc/<self | pid>/task/<tid>/attr/<attr>
Where <pid> is the process id, <tid> is the thread id and <attr> is the entry described in Table 2.
  • Support filesystems that use extended attributes (SELinux uses security.selinux as explained in the Labeling Extended Attribute Filesystems section).
  • Consolidate the Linux capabilities into an optional module.

It should be noted that the LSM does not provide any security services itself, only the hooks and structures for supporting 3rd party modules. If no 3rd party module is loaded, the capabilities module becomes the default module thus allowing standard DAC access control.
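
The hook-and-registration idea can be illustrated with a toy model. The Python below is only a sketch under stated assumptions: the `HookTable` class, the `deny_untrusted` hook and the task dictionaries are hypothetical and do not mirror the kernel's actual data structures. It shows a default hook that adds no checks beyond DAC, and a registered module replacing it.

```python
# Toy model of a hook table; not the kernel's real structures.

def cap_task_create(task):
    """Default hook: adds no checks beyond DAC, so always permits (0)."""
    return 0


class HookTable:
    def __init__(self):
        # With no module registered, the default hooks stay in place.
        self.ops = {"task_create": cap_task_create}

    def register(self, module_ops):
        """Let a security module replace the hooks it implements."""
        self.ops.update(module_ops)

    def security_task_create(self, task):
        """What a kernel service calls; dispatches to the active hook."""
        return self.ops["task_create"](task)


def deny_untrusted(task):
    # Hypothetical module hook: refuse tasks not marked as trusted.
    return 0 if task.get("trusted") else -13  # -13 stands in for -EACCES


lsm = HookTable()
before = lsm.security_task_create({"trusted": False})
lsm.register({"task_create": deny_untrusted})
after = lsm.security_task_create({"trusted": False})
```

In this model the same call site (`security_task_create`) gives a different answer once a module is registered, which is the property the LSM framework provides to kernel services.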

Table 1: LSM Hooks - These are the kernel services into which the LSM has inserted security hooks and structures to allow access control to be managed by 3rd party modules (see ./linux-3.14/include/linux/security.h).
Program execution Filesystem operations Inode operations
File operations Task operations Netlink messaging
Unix domain networking Socket operations XFRM operations
Key Management operations IPC operations Memory Segments
Semaphores Capability Sysctl
Syslog Audit

Table 2: /proc Filesystem attribute files - These files are used by the kernel services and libselinux (for userspace) to manage setting and reading of security contexts within the LSM defined data structures.
/proc/self/attr/ File Name Permissions Function
current -rw-rw-rw- Contains the current process security context.
exec -rw-rw-rw- Used to set the security context for the next exec call.
fscreate -rw-rw-rw- Used to set the security context of a newly created file.
keycreate -rw-rw-rw- Used to set the security context for keys that are cached in the kernel.
prev -r--r--r-- Contains the previous process security context.
sockcreate -rw-rw-rw- Used to set the security context of a newly created socket.
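
As a minimal sketch of how a userspace program might locate and read these files directly (libselinux normally does this on an application's behalf), the Python below builds the paths described earlier and reads one entry. The helper names `attr_path` and `read_attr` are hypothetical, and the read returns `None` when the file is missing or no security module supplies a value.

```python
# Attribute file names listed in Table 2.
ATTRS = ("current", "exec", "fscreate", "keycreate", "prev", "sockcreate")


def attr_path(attr, pid="self", tid=None):
    """Build /proc/<self|pid>/attr/<attr> or the per-thread variant."""
    if attr not in ATTRS:
        raise ValueError("unknown attribute: %s" % attr)
    base = "/proc/%s" % pid
    if tid is not None:
        base = "%s/task/%s" % (base, tid)
    return "%s/attr/%s" % (base, attr)


def read_attr(attr, pid="self", tid=None):
    """Return the context string, or None if it cannot be read."""
    try:
        with open(attr_path(attr, pid, tid), "rb") as f:
            data = f.read()
    except OSError:
        return None
    # The string may be terminated by a NUL byte or a newline.
    text = data.rstrip(b"\x00\n").decode("utf-8", "replace")
    return text or None


if __name__ == "__main__":
    print(read_attr("current"))
```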

The major kernel source files (relative to ./linux-3.14/security) that form the LSM are shown in Table 3. However, there is one major header file (include/linux/security.h) that describes all the LSM security hooks and structures.

Table 3: The core LSM source modules.
Name Function
capability.c Some capability functions that were in various kernel modules have been consolidated into these source files.
inode.c This allows the 3rd party security module to initialise a security filesystem. In the case of SELinux this would be /sys/fs/selinux that is defined in the selinux/selinuxfs.c source file.
security.c Contains the LSM framework initialisation services that will set up the hooks described in security.h and those in the capability source files. It also provides functions to initialise 3rd party modules.
lsm_audit.c Contains common LSM audit functions.
min_addr.c Minimum VM address protection from userspace for DAC and LSM.

The SELinux Module

This section does not go into detail of all the SELinux module functionality, as "Implementing SELinux as a Linux Security Module" does this (although it is a bit dated). Instead it attempts to highlight the way some areas work by using the fork and transition process example described in the Domain Transition section.

The major kernel SELinux source files (relative to ./linux-3.14/security/selinux) that form the SELinux security module are shown in Table 4. The diagrams shown in High Level SELinux Architecture and The Main LSM / SELinux Modules can be used to see how some of these kernel source modules fit together.

Table 4: The core SELinux source modules - The .h files and those in the include directory have a number of useful comments.
Name Function
avc.c Access Vector Cache functions and structures. The function calls are for the kernel services, however they have been ported to form the libselinux userspace library.
exports.c Exported SELinux services for SECMARK (as there is SELinux specific code in the netfilter source tree).
hooks.c Contains all the SELinux functions that are called by the kernel resources via the security_ops function table (they form the kernel resource object managers). There are also support functions for managing process exec's, managing SID allocation and removal, interfacing into the AVC and Security Server.
netif.c These manage the mapping between labels and SIDs for the net* language statements when they are declared in the active policy.
netlabel.c The interface between NetLabel services and SELinux.
netlink.c Manages the notification of policy updates to resources including userspace applications via libselinux.
selinuxfs.c The selinuxfs pseudo filesystem (/sys/fs/selinux) that imports/exports security policy information to/from userspace services. The services exported are shown in the SELinux Filesystem section.
xfrm.c Contains the IPSec XFRM (transform) hooks for SELinux.
include/classmap.h classmap.h contains all the kernel security classes and permissions. initial_sid_to_string.h contains the initial SID contexts. These are used to build the flask.h and av_permissions.h kernel configuration files when the kernel is being built (using the genheaders script defined in the selinux/Makefile).

These files are built this way now to support the new dynamic security class mapping structure to remove the need for fixed class to SID mapping.

ss/avtab.c Access vector table functions for inserting / deleting entries.
ss/conditional.c Supports boolean statement functions and implements a conditional AV table to hold entries.
ss/ebitmap.c Bitmaps to represent sets of values, such as types, roles, categories, and classes.
ss/hashtab.c Hash table.
ss/mls.c Functions to support MLS.
ss/policydb.c Defines the structure of the policy database. See the "SELinux Policy Module Primer" article for details on the structure.
ss/services.c This contains the supporting services for kernel hooks defined in hooks.c, the AVC and the Security Server. For example, the security_transition_sid function that computes the SID for a new subject / object, shown in the Main LSM / SELinux Modules diagram.

ss/sidtab.c The SID table contains the security context indexed by its SID value.
ss/status.c Interface for selinuxfs/status. Used by the libselinux selinux_status_*(3) functions.
ss/symtab.c Maintains associations between symbol strings and their values.

Fork System Call Walk-through

This section walks through the fork(2) system call shown in the domain transition diagram, starting at the kernel hooks that link to the SELinux services. The way the SELinux hooks are initialised into the LSM security_ops function table is also described.

Using the hooks for a fork diagram, the major steps to check whether the unconfined_t process has the fork permission are:

  1. kernel/fork.c has a hook that links it to the LSM function security_task_create(), which is called to check access permissions.
  2. Because the SELinux module has been initialised as the security module, the security_ops table has been set to point to the SELinux selinux_task_create() function in hooks.c.
  3. The selinux_task_create() function checks whether the task has permission via the current_has_perm(current, PROCESS__FORK) function.
  4. This will result in a call to the AVC via the avc_has_perm() function in avc.c that checks whether the permission has been granted or not. First (via avc_has_perm_noaudit()) the cache is checked for an entry. Assuming that there is no entry in the AVC, then the security_compute_av() function in services.c is called.
  5. The security_compute_av() function will search the SID table for source and target entries, and if found will then call the context_struct_compute_av() function. The context_struct_compute_av() function carries out many checks to validate whether access is allowed. The steps are (assuming the access is valid):
    1. Initialise the AV structure so that it is clear.
    2. Check the object class and permissions are correct. It also checks the status of the allow_unknown flag (see the SELinux Filesystem, /etc/selinux/semanage.conf file and Reference Policy Build Options - build.conf UNK_PERMS sections).
    3. Check whether there are any type enforcement rules (ALLOW, AUDIT_ALLOW, AUDIT_DENY).
    4. Check whether any conditional statements are involved via the cond_compute_av() function in conditional.c.
    5. Remove permissions that are defined in any constraint via the constraint_expr_eval() function call (in services.c). This function will also check any MLS constraints.
    6. context_struct_compute_av() checks if a process transition is being requested (it is not). If it were, then the TRANSITION and DYNTRANSITION permissions are checked and whether the role is changing.
    7. Finally check whether there are any constraints applied via the typebounds rule.
  6. Once the result has been computed it is returned to the kernel/fork.c system call via the initial selinux_task_create() function. In this case the fork call is allowed.
  7. The End.
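
The cache-then-compute pattern in steps 4 and 5 can be sketched as a toy model. This is an illustration only: the `Avc` class, the `POLICY` dictionary and the `PROCESS_FORK` bit value are hypothetical stand-ins, type names are used in place of SIDs, and none of the constraint, conditional or typebounds checks are modelled.

```python
# Toy model only: the policy and names below are illustrative,
# not kernel data structures.

PROCESS_FORK = 0x1  # hypothetical bit standing in for PROCESS__FORK

# (source type, target type, class) -> allowed permission bits
POLICY = {("unconfined_t", "unconfined_t", "process"): PROCESS_FORK}


class Avc:
    def __init__(self, policy):
        self.policy = policy
        self.cache = {}
        self.misses = 0

    def compute_av(self, ssid, tsid, tclass):
        """Stand-in for the security server: look the triple up in policy."""
        return self.policy.get((ssid, tsid, tclass), 0)

    def has_perm(self, ssid, tsid, tclass, requested):
        """Check the cache first; on a miss ask the security server."""
        key = (ssid, tsid, tclass)
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = self.compute_av(*key)
        allowed = self.cache[key]
        return (allowed & requested) == requested


avc = Avc(POLICY)
first = avc.has_perm("unconfined_t", "unconfined_t", "process", PROCESS_FORK)
second = avc.has_perm("unconfined_t", "unconfined_t", "process", PROCESS_FORK)
denied = avc.has_perm("user_t", "user_t", "process", PROCESS_FORK)
```

The second identical query is answered from the cache, so only the first query for each (source, target, class) triple reaches `compute_av`.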

Process Transition Walk-through

This section walks through the execve(2) system call, checking whether a process transition to the ext_gateway_t domain is allowed and, if so, obtaining a new SID for the context (unconfined_u:message_filter_r:ext_gateway_t) as shown in the domain transition diagram.

The process starts with the Linux operating system issuing a do_execve[1] call from the CPU specific architecture code to execute a new program (for example, from arch/ia64/kernel/process.c). The do_execve() function is located in the fs/exec.c source code module and does the loading and final exec as described below.

do_execve() has a number of calls to security_bprm_* functions that are a part of the LSM (see include/linux/security.h), and are hooked by SELinux during the initialisation process (in security/selinux/hooks.c). Table 5 briefly describes these security_bprm functions that are hooks for validating program loading and execution (although see security.h for greater detail).

Table 5: The LSM / SELinux Program Loading Hooks
LSM / SELinux Function Name Description

selinux_bprm_set_creds
Set up security information in the bprm->security field based on the file to be exec'ed contained in bprm->file. SELinux uses this hook to check for domain transitions and whether the appropriate permissions have been granted, and to obtain a new SID if required.

selinux_bprm_committing_creds
Prepare to install the new security attributes of the process being transformed by an execve operation. SELinux uses this hook to close any unauthorised files, clear parent signal and reset resource limits if required.

selinux_bprm_committed_creds
Tidy up after the installation of the new security attributes of a process being transformed by an execve operation. SELinux uses this hook to check whether signal states can be inherited if a new SID has been allocated.

Called when loading libraries to check AT_SECURE flag for glibc secure mode support. SELinux uses this hook to check the process class noatsecure permission if appropriate.

This hook is not used by SELinux.

Therefore starting at the do_execve() function and using the Process Transition diagram, the following major steps will be carried out to check whether the unconfined_t process has permission to transition the secure_server executable to the ext_gateway_t domain:

  1. The executable file is opened, a call issued to the sched_exec() function and the bprm structure is initialised with the file parameters (name, environment and arguments).
  2. Via the prepare_binprm() function call the UID and GIDs are checked and a call issued to security_bprm_set_creds() that will carry out the following:
  3. Call the cap_bprm_set_creds() function in commoncap.c, which will set up credentials based on any configured capabilities. If setexeccon(3) has been called prior to the exec, then that context will be used; otherwise the security_transition_sid() function in services.c is called. This function will then call security_compute_sid() to check whether a new SID needs to be computed. This function will (assuming there are no errors):
    1. Search the SID table for the source and target SIDs.
    2. Set the SELinux user identity.
    3. Set the source role and type.
    4. Check whether a type_transition rule exists in the AV table and / or the conditional AV table (see the Main LSM / SELinux Modules diagram).
    5. If there is a type_transition, also check for a role_transition (there is a role change in the ext_gateway.conf policy module) and set the role.
    6. Check for any MLS attributes by calling mls_compute_sid() in mls.c. This also checks whether MLS is enabled and, if so, sets up the MLS contexts.
    7. Check whether the contexts are valid by calling compute_sid_handle_invalid_context(), which will also log an audit message if the context is invalid.
    8. Finally obtain a SID for the new context by calling sidtab_context_to_sid() in sidtab.c, which will search the SID table (see the Main LSM / SELinux Modules diagram) and insert a new entry if okay, or log a kernel event if invalid.
  4. The selinux_bprm_set_creds() function will continue by checking, via the avc_has_perm() function (in avc.c), whether the file class file_execute_no_trans is set (in this case it is not). The process class transition and file class file_entrypoint permissions are therefore checked (in this case they are allowed), so the new SID is set and, after checking various other permissions, control is passed back to the do_execve function.
  5. The exec_binprm function will ultimately commit the credentials calling the SELinux selinux_bprm_committing_creds and selinux_bprm_committed_creds.
  6. Various strings are copied (args etc.) and a check is made to see if the exec succeeded or not (in this case it did), therefore the security_bprm_free() function is ultimately called to free the bprm security structure.
  7. The End.
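
The decision made in steps 3 and 4 can be sketched as a toy model. Everything here is a simplification under stated assumptions: the rule tables and the `exec_domain` helper are hypothetical, the file type `secure_services_exec_t` is an assumed label for the secure_server executable, and user, role, MLS and the other permission checks are left out.

```python
# Toy model; rule tables and helper names are hypothetical.

TYPE_TRANSITIONS = {
    # (source domain, executable file type) -> new domain
    ("unconfined_t", "secure_services_exec_t"): "ext_gateway_t",
}

ALLOW = {
    ("unconfined_t", "ext_gateway_t", "process", "transition"),
    ("ext_gateway_t", "secure_services_exec_t", "file", "entrypoint"),
}


def allowed(source, target, tclass, perm):
    return (source, target, tclass, perm) in ALLOW


def exec_domain(current, file_type, exec_context=None):
    """Return the domain the new program runs in, or None if denied."""
    # A context set beforehand (as setexeccon(3) does) takes priority;
    # otherwise use a type_transition rule, else stay in the same domain.
    new = exec_context or TYPE_TRANSITIONS.get((current, file_type), current)
    if new == current:
        ok = allowed(current, file_type, "file", "execute_no_trans")
    else:
        ok = (allowed(current, new, "process", "transition")
              and allowed(new, file_type, "file", "entrypoint"))
    return new if ok else None


result = exec_domain("unconfined_t", "secure_services_exec_t")
```

With these toy rules the call yields `ext_gateway_t`; removing either entry from `ALLOW` makes the same call return `None`.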

SELinux Filesystem

Table 6 shows the information contained in the SELinux filesystem (selinuxfs) /sys/fs/selinux (or /selinux on older systems) where the SELinux kernel exports information regarding its configuration and active policy. selinuxfs is a read/write interface used by SELinux library functions for userspace SELinux-aware applications and object managers. Note: while it is possible for userspace applications to read/write to this interface, it is not recommended - use the libselinux library.

Table 6: selinux filesystem Information
selinuxfs Directory and File Names
This is the root directory where the SELinux kernel exports relevant information regarding its configuration and active policy for use by the libselinux library.
Compute access decision interface that is used by the security_compute_av(3), security_compute_av_flags(3), avc_has_perm(3) and avc_has_perm_noaudit(3) functions.

The kernel security server (see services.c) converts the contexts to SIDs and then calls the security_compute_av_user function to compute the new SID that is then converted to a context string.

Requires security {compute_av} permission.

0 = Check requested protection applied by kernel.

1 = Check protection requested by application. This is the default.

These apply to the mmap and mprotect kernel calls. Default value can be changed at boot time via the checkreqprot= parameter.

Requires security {setcheckreqprot} permission.

Commit new boolean values to the kernel policy.

Requires security {setbool} permission.

Validate context interface used by the security_check_context(3) function.

Requires security {check_context} permission.

Compute create labeling decision interface that is used by the security_compute_create(3) and avc_compute_create(3) functions.

The kernel security server (see services.c) converts the contexts to SIDs and then calls the security_transition_sid_user function to compute the new SID that is then converted to a context string.

Requires security {compute_create} permission.

These two files export deny_unknown (read by security_deny_unknown(3) function) and reject_unknown status to user space.

These are taken from the handle-unknown parameter set[2] in the /etc/selinux/semanage.conf file when policy is being built and are set as follows:


0:0 = Allow unknown object class / permissions. This will set the returned AV with all 1's.

1:0 = Deny unknown object class / permissions (the default). This will set the returned AV with all 0's.

1:1 = Reject loading the policy if it does not contain all the object classes / permissions.

Disable SELinux until next reboot.
Get or set enforcing status.

Requires security {setenforce} permission.

Load policy interface.

Requires security {load_policy} permission.

Compute polyinstantiation membership decision interface that is used by the security_compute_member(3) and avc_compute_member(3) functions.

The kernel security server (see services.c) converts the contexts to SIDs and then calls the security_member_sid function to compute the new SID that is then converted to a context string.

Requires security {compute_member} permission.

Returns 1 if MLS policy is enabled or 0 if not.
The SELinux equivalent of /dev/null for file descriptors that have been redirected by SELinux.
Interface to upload the current running policy in kernel binary format. This is useful to check the running policy using apol(1), dispol/sedispol etc. (e.g. cat /sys/fs/selinux/policy > current-policy then load it into the required tool).
Returns supported policy version for kernel. Read by security_policyvers(3) function.
Compute relabeling decision interface that is used by the security_compute_relabel(3) function.

The kernel security server (see services.c) converts the contexts to SIDs and then calls the security_change_sid function to compute the new SID that is then converted to a context string.

Requires security {compute_relabel} permission.

This can be used to obtain enforcing mode and policy load changes with much less overhead than using the libselinux netlink / callbacks. This was added for Object Managers that have high volumes of AVC requests so they can quickly check whether to invalidate their cache or not.

The status structure indicates the following:

version - Version number of the status structure. This will increase as other entries are added.

sequence - This is incremented for each event with an even number meaning that the events are stable. An odd number indicates that one of the events is changing and therefore the userspace application should wait before reading the status of any event.

enforcing - 0 = Permissive mode, 1 = enforcing mode.

policyload - This contains the policy load sequence number and should be read and stored, then compared to detect a policy reload.

deny_unknown - 0 = Allow and 1 = Deny unknown object classes / permissions. This is the same as the deny_unknown entry above.

Compute reachable user contexts interface that is used by the security_compute_user(3) function.

The kernel security server (see services.c) converts the contexts to SIDs and then calls the security_get_user_sids function to compute the user SIDs that are then converted to context strings.

Requires security {compute_user} permission.

This directory contains information regarding the kernel AVC that can be displayed by the avcstat command.
Shows the kernel AVC lookups, hits, misses etc.
The default value is 512, however caching can be turned off (but performance suffers) by:
echo 0 > /selinux/avc/cache_threshold

Requires security {setsecparam} permission.

Shows the number of kernel AVC entries, longest chain etc.
This directory contains one file for each boolean defined in the active policy.
Each file contains the current and pending status of the boolean (0 = false or 1 = true). The getsebool(8), setsebool(8) and sestatus(8) -b commands use this interface via the libselinux library functions.
This directory contains one file for each initial SID defined in the active policy. The file name is the initial SID name with the contents containing its security context.
Each file contains the initial context of the initial SID as defined in the active policy (e.g. any_socket was assigned system_u:object_r:unconfined_t).
This directory contains the policy capabilities that have been configured by default in the kernel via the policycap statement in the active policy. These are generally new features that can be enabled by using the policycap statement in policy. Their default values are false.
If true SECMARK and peer labeling are always enabled even if there are no SECMARK, NetLabel or Labeled IPsec rules configured. This forces checking of the packet class to protect the system should any rules fail to load or they get maliciously flushed. Requires kernel 3.14 minimum.
If true the following network_peer_controls are enabled:

node: sendto recvfrom

netif: ingress egress

peer: recv

If true the open permissions are enabled by default on the following object classes: dir, file, fifo_file, chr_file, blk_file.
Available in kernel 3.4 to allow finer control of ptrace (this will be named correctly one day). Requires policy support and the security class permission ptrace_child.
This directory contains a list of classes and their permissions as defined by the policy (for the Reference Policy the order in the security_classes and access_vectors files).
Each class has its own directory where each one is named using the appropriate class statement from the policy (i.e. class appletalk_socket). Each directory contains the following:
This file contains the allocated class number (e.g. appletalk_socket is the 56th entry in the policy security_classes file).
This directory contains one file for each permission defined in the policy.
Each file is named by the permission assigned in the policy and contains a number that represents its position in the list (e.g. accept is the 14th permission listed in the policy access_vector file for the appletalk_socket and therefore contains '14').
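
The even / odd sequence rule described for the status structure in Table 6 can be sketched as a retry loop. This is a model under stated assumptions: the structure is treated as five native-endian unsigned 32-bit values in the order listed, `read_page` is a hypothetical callable that returns the current bytes, and the pages are simulated rather than read from the kernel. Real applications would normally use the libselinux selinux_status_*(3) functions instead.

```python
import struct

# Five native-endian unsigned 32-bit fields, in the order listed in Table 6.
# The exact binary layout is an assumption of this sketch.
STATUS = struct.Struct("=5I")
FIELDS = ("version", "sequence", "enforcing", "policyload", "deny_unknown")


def parse_status(buf):
    return dict(zip(FIELDS, STATUS.unpack_from(buf)))


def read_stable(read_page, max_tries=100):
    """Retry until the sequence is even and unchanged across the read."""
    for _ in range(max_tries):
        before = parse_status(read_page())
        if before["sequence"] % 2:      # odd: an update is in progress
            continue
        after = parse_status(read_page())
        if after["sequence"] == before["sequence"]:
            return before
    raise RuntimeError("status page never became stable")


# Simulated pages standing in for successive reads of the status file.
pages = iter([
    STATUS.pack(1, 3, 0, 7, 1),   # mid-update: sequence is odd
    STATUS.pack(1, 4, 1, 8, 1),   # stable
    STATUS.pack(1, 4, 1, 8, 1),   # unchanged on re-check
])
status = read_stable(lambda: next(pages))
```

A caller would keep the returned `policyload` value and compare it on later reads to detect a policy reload, as described for that field.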


  1. Kernel SIDs are not passed to userspace, only the context strings.
  2. The /proc filesystem exports the process security context string to userspace via /proc/<self|pid>/attr and /proc/<self|pid>/task/<tid>/attr/<attr> interfaces.


  1. This function call will pass over the file name to be run and its environment + arguments.
  2. This is also set in the UNK_PERMS entry of the Reference Policy build.conf file (see the Reference Policy Build Options section). The entry in semanage.conf will over-ride the build.conf entry.