The best data sanitization guide for a secure WordPress site

Security is an essential part of any software. That’s why I decided to prepare a series of articles that analyze the most widespread vulnerabilities and the programming techniques that help you make your code safer. Let’s get started with the vulnerability that is most popular in the web development world: code injection.

Code injection

Code injection is the exploitation of a computer bug that is caused by processing invalid data. The injection is used by an attacker to introduce (or “inject”) code into a vulnerable computer program and change the course of execution. The result of successful code injection can be disastrous, for example, by allowing computer viruses or computer worms to propagate.

Code injection vulnerabilities occur when an application sends untrusted data to an interpreter. Injection flaws are most often found in SQL, LDAP, XPath, NoSQL queries, OS commands, XML parsers, SMTP headers, program arguments, etc. Injection flaws tend to be easier to discover when examining source code than via testing.

Wikipedia

All injections work in roughly the same way, so I see no reason to delve into each of them. Instead, let’s take the most popular one, SQL injection, and dive into the code:

<?php

$search = $_POST['search'];

global $wpdb;

$posts = $wpdb->get_results(
	'SELECT ID, post_title, post_content FROM ' . $wpdb->posts . '
	WHERE post_type="post"
	AND post_content LIKE "%' . $search . '%"'
);

wp_send_json_success( $posts );

Can you see the place for an injection? It’s the $search = $_POST['search']; line. When a user enters a word or phrase, everything is fine. But what happens if someone enters SQL code here? This is where the fun begins.

With this code, an attacker can read practically any data from the database. Don’t believe me? Let’s start with a simple example and send %" OR 1 = "1 in the search field. The query now returns all records from the posts table, including other post types, private posts, etc.

In addition, we can get other data, for example, user logins and passwords, if we send %" UNION SELECT 0, user_pass, user_login FROM wp_users WHERE user_pass LIKE "%. Although we get a hash instead of the actual password, a weak password can be recovered from the hash with password-guessing programs such as John the Ripper.
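To see why these payloads work, here is a minimal sketch in plain PHP that only builds the query string the way the vulnerable code does, without running it. The helper name security_demo_build_query and the hardcoded wp_posts table name are assumptions for illustration, and the sketch assumes the payload reaches the query unmodified:

```php
<?php

// Hypothetical helper: concatenates the search term exactly like the vulnerable code.
function security_demo_build_query( $search ) {
	return 'SELECT ID, post_title, post_content FROM wp_posts'
		. ' WHERE post_type="post"'
		. ' AND post_content LIKE "%' . $search . '%"';
}

// Payload 1: closes the LIKE string and adds a condition that matches every row.
$all_rows = security_demo_build_query( '%" OR 1 = "1' );

// Payload 2: closes the LIKE string and appends a second SELECT via UNION.
$union = security_demo_build_query(
	'%" UNION SELECT 0, user_pass, user_login FROM wp_users WHERE user_pass LIKE "%'
);

echo $all_rows, PHP_EOL; // ends with: LIKE "%%" OR 1 = "1%"
echo $union, PHP_EOL;    // ends with: ... FROM wp_users WHERE user_pass LIKE "%%"
```

In both cases the user-supplied text stops being a value and becomes part of the SQL statement itself.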

Despite the scale of the problem, the recipe for preventing injections is a single, plain one: sanitize user input.

Sanitize early (user input)

When dealing with user input, you must always sanitize the data once and as early as possible. Although most users will use your code in good faith, even one malicious request can cause you a lot of trouble.

How can you tell that you are dealing with user input? Fortunately, in PHP, it’s super-duper easy. All user input arrives through the superglobal variables, such as $_REQUEST, $_POST, $_GET, $_FILES, $_COOKIE, and in some cases $_ENV, $_SERVER, and $_SESSION.

All developers love it when their IDE reminds them to do specific actions, and sanitizing isn’t an exception. The WordPress team maintains the WordPress Coding Standards, which, in addition to checking the coding style, remind you to sanitize and unslash input.

What exactly should we do with user input? You can use PHP or WordPress sanitizing and filtering functions.

Sanitize functions

PHP functions that start with filter_* and a few more:

  • filter_input( int $type, string $var_name, int $filter = FILTER_DEFAULT, array|int $options = 0 ): mixed
  • filter_input_array( int $type, array|int $options = FILTER_DEFAULT, bool $add_empty = true ): array|false|null
  • filter_var( mixed $value, int $filter = FILTER_DEFAULT, array|int $options = 0 ): mixed
  • filter_var_array( array $array, array|int $options = FILTER_DEFAULT, bool $add_empty = true ): array|false|null
  • (int) $var
  • (float) $var
  • intval( mixed $value, int $base = 10 ): int
  • floatval( mixed $value ): float
  • etc.
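As a rough illustration of how a few of these PHP functions behave, here is a minimal sketch. The $raw array and its keys are assumptions standing in for values that would normally come from $_GET or $_POST:

```php
<?php

// Hypothetical raw values, as they might arrive in $_GET or $_POST.
$raw = [
	'post_id' => '42abc',
	'page'    => '3',
	'email'   => 'user@example.com',
	'price'   => '19.99',
];

// A cast keeps the leading digits and silently drops the rest.
$post_id = (int) $raw['post_id'];

// FILTER_VALIDATE_INT is stricter: it returns an int or false.
$page     = filter_var( $raw['page'], FILTER_VALIDATE_INT, [ 'options' => [ 'min_range' => 1 ] ] );
$bad_page = filter_var( $raw['post_id'], FILTER_VALIDATE_INT );

// FILTER_VALIDATE_EMAIL returns the address or false.
$email = filter_var( $raw['email'], FILTER_VALIDATE_EMAIL );

// floatval() converts a numeric string to a float.
$price = floatval( $raw['price'] );

var_dump( $post_id, $page, $bad_page, $email, $price );
```

Note the difference between casting and validating: the cast turns '42abc' into 42, while the validating filter rejects the same value.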

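WordPress functions that start with sanitize_* or wp_filter_*, and a few more (not every wp_filter_* function is a sanitizer; wp_filter_object_list, for example, only filters an array of objects):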

  • absint( mixed $maybeint ): int
  • esc_url_raw( string $url, string[] $protocols = null ): string
  • sanitize_bookmark( stdClass|array $bookmark, string $context = 'display' ): stdClass|array
  • sanitize_bookmark_field( string $field, mixed $value, int $bookmark_id, string $context ): mixed
  • sanitize_category( object|array $category, string $context ): object|array
  • sanitize_category_field( string $field, mixed $value, int $cat_id, string $context ): mixed
  • sanitize_file_name( string $filename ): string
  • sanitize_email( string $email ): string
  • sanitize_hex_color( string $color ): string|void
  • sanitize_hex_color_no_hash( string $color ): string|null
  • sanitize_html_class( string $class, string $fallback = '' ): string
  • sanitize_key( string $key ): string
  • sanitize_meta( string $meta_key, mixed $meta_value, string $object_type, string $object_subtype = '' ): mixed
  • sanitize_mime_type( string $mime_type ): string
  • sanitize_option( string $option, string $value ): string
  • sanitize_post( object|WP_Post|array $post, string $context = 'display' ): object|WP_Post|array
  • sanitize_post_field( string $field, mixed $value, int $post_id, string $context = 'display' ): mixed
  • sanitize_sql_orderby( string $orderby ): string|false
  • sanitize_term( array|object $term, string $taxonomy, string $context = 'display' ): array|object
  • sanitize_term_field( string $field, string $value, int $term_id, string $taxonomy, string $context ): mixed
  • sanitize_text_field( string $str ): string
  • sanitize_textarea_field( string $str ): string
  • sanitize_title( string $title, string $fallback_title = '', string $context = 'save' ): string
  • sanitize_title_for_query( string $title ): string
  • sanitize_title_with_dashes( string $title, string $raw_title = '', string $context = 'display' ): string
  • sanitize_user( string $username, bool $strict = false ): string
  • sanitize_user_field( string $field, mixed $value, int $user_id, string $context ): mixed
  • wp_filter_comment( array $commentdata ): array
  • wp_filter_content_tags( string $content, string $context = null ): string
  • wp_filter_nohtml_kses( string $data ): string
  • wp_filter_object_list( array $list, array $args = [], string $operator = 'and', bool|string $field = false ): array
  • wp_filter_oembed_iframe_title_attribute( string $result, object $data, string $url ): string
  • wp_filter_oembed_result( string $result, object $data, string $url ): string
  • wp_filter_pre_oembed_result( null|string $result, string $url, array $args ): null|string
  • wp_kses( string $string, array[]|string $allowed_html, string[] $allowed_protocols = [] ): string
  • etc.

That is a massive number of functions, so how do you find the proper one for your case? Choose the strictest function that still fits your data. In other words, look at your data and think logically. Your best friends here are your brain and the autocomplete in PhpStorm. Besides, feel free to dive into how these functions work. Let’s look at a few examples:

<?php

// Stop early when the nonce is missing or invalid.
if ( ! isset( $_POST['_wpnonce'] ) || ! wp_verify_nonce( sanitize_key( $_POST['_wpnonce'] ), 'security::example' ) ) {
	return;
}

$post_title   = isset( $_POST['post_title'] ) ? sanitize_text_field( wp_unslash( $_POST['post_title'] ) ) : '';
$post_content = isset( $_POST['post_content'] ) ? sanitize_textarea_field( wp_unslash( $_POST['post_content'] ) ) : '';
$email        = isset( $_POST['email'] ) ? sanitize_email( $_POST['email'] ) : '';
$post_id      = isset( $_GET['post_id'] ) ? absint( $_GET['post_id'] ) : 0;
$referer_url  = ! empty( $_SERVER['HTTP_REFERER'] ) ? esc_url_raw( wp_unslash( $_SERVER['HTTP_REFERER'] ) ) : '';
$orderby      = ! empty( $_COOKIE['my_plugin_orderby'] ) && strtoupper( sanitize_text_field( wp_unslash( $_COOKIE['my_plugin_orderby'] ) ) ) === 'ASC' ?
	'ASC' : 'DESC';
$admin_color  = ! empty( $_SESSION['my_plugin_admin_color'] ) ? sanitize_hex_color( wp_unslash( $_SESSION['my_plugin_admin_color'] ) ) : '#181818';

For quick reference, I prepared a list of what to use and when:

  • numbers – (int)..., (float)..., absint( ... )
  • nonces – sanitize_key( ... )
  • single-line field – sanitize_text_field( wp_unslash( ... ) )
  • multi-line field – sanitize_textarea_field( wp_unslash( ... ) )
  • email – sanitize_email( ... )
  • urls – esc_url_raw( wp_unslash( ... ) ) for saving; esc_url( ... ) is meant for output
  • file names – sanitize_file_name( wp_unslash( ... ) )

Why do we use wp_unslash sometimes?

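On every request, WordPress core adds slashes to the values in $_GET, $_POST, $_COOKIE, and $_SERVER. This is a legacy of PHP’s old magic quotes behavior that WordPress keeps for backward compatibility. In two words, if a user submits <script>alert( 'Bingo' );</script>, your code receives <script>alert( \'Bingo\' );</script>. These slashes are not a protection you can rely on, so remove them with wp_unslash before sanitizing; otherwise, the extra backslashes end up in your data. Functions such as absint and sanitize_key discard backslashes anyway, which is why they appear without wp_unslash in the examples.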
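The round trip can be sketched in plain PHP. This is only an approximation: it assumes addslashes() is a fair stand-in for what WordPress does to an incoming string value and stripslashes() for what wp_unslash does to a string, and the variable names are made up for illustration:

```php
<?php

// What the user typed into the form.
$typed = "<script>alert( 'Bingo' );</script>";

// Roughly what the code receives from the superglobal (stand-in: addslashes()).
$received = addslashes( $typed );

// Roughly what wp_unslash() gives back for a string (stand-in: stripslashes()).
$unslashed = stripslashes( $received );

echo $received, PHP_EOL;  // <script>alert( \'Bingo\' );</script>
echo $unslashed, PHP_EOL; // <script>alert( 'Bingo' );</script>
```

The unslashed value is the original input again, markup included, so it still has to go through a sanitizing function afterwards.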

How to deal with the WordPress database?

One more thing that requires your attention is dealing with the database. Of course, you can use WordPress functions, e.g., get_posts, get_terms, get_users, etc., to do it. However, when you deal with your own tables and custom queries, sanitizing alone is not enough to guarantee that they are free of SQL injections. The following code sanitizes and unslashes the input, but the SQL injection from the first part of this article still works:

<?php

$search = sanitize_text_field( wp_unslash( $_POST['search'] ) );

global $wpdb;

$posts = $wpdb->get_results(
	'SELECT ID, post_title, post_content FROM ' . $wpdb->posts . '
	WHERE post_type="post"
	AND post_content LIKE "%' . $search . '%"' // SQL Injection here.
);

wp_send_json_success( $posts );

The code above is still vulnerable. Why? A SQL injection can be delivered in different ways:

  • Multi-queries (close the previous query and write one more)
  • Subqueries (use a subquery as a value for the main query)
  • Quotes (by closing a quote and adding code after)
  • The UNION statement (add a new query via UNION SELECT ...)

Firstly, multi-queries can be injected only when the query runs through a function that accepts several statements per call, such as mysqli_multi_query (or its object-oriented counterpart). Fortunately, wpdb doesn’t use it. To prevent multi-queries (SELECT * FROM wp_posts; SELECT * FROM wp_users), always use wpdb for your custom queries.

Secondly, subqueries can be injected into numeric values and into the IN statement. What is the best defense? Here are two rules: sanitize numeric values via (int), (float), absint( ... ), etc., and don’t use changeable strings inside the IN statement (yep, it’s hard, but I don’t have another recipe).
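For a numeric IN list, one possible approach is to cast every element and generate one %d placeholder per value. The sketch below is plain PHP and stops before the database call; the $raw_ids array and the hardcoded wp_posts table name are assumptions for illustration:

```php
<?php

// Hypothetical raw input, e.g. the elements of $_POST['ids'].
$raw_ids = [ '5', '12', '7 OR 1=1', '-3', 'abc' ];

// Same idea as absint(): cast to int and drop the sign.
$ids = array_map(
	static function ( $id ) {
		return abs( (int) $id );
	},
	$raw_ids
);

// Drop zeros (values that were not numbers) and duplicates, then reindex.
$ids = array_values( array_unique( array_filter( $ids ) ) );

// One %d placeholder per remaining ID.
$placeholders = implode( ', ', array_fill( 0, count( $ids ), '%d' ) );
$sql          = "SELECT ID, post_title FROM wp_posts WHERE ID IN ( $placeholders )";

echo $sql, PHP_EOL;

// In WordPress, the template and the IDs would then go through
// $wpdb->prepare( $sql, $ids ) before being passed to $wpdb->get_results().
// An empty $ids array should skip the query entirely, since IN ( ) is invalid SQL.
```

Only integers ever reach the query, so the injected "OR 1=1" part is discarded by the cast.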

Last but not least, quotes and the UNION statement. Use the dedicated wpdb methods for creating, updating, replacing, and deleting rows, and wrap the query in the wpdb::prepare method in all other cases. A few examples of the proper use of the wpdb methods follow the summary below.

To summarize all these rules for custom queries:

  • Always use wpdb
  • Always sanitize values
  • Numerical values should be sanitized with extra care
  • Don’t use changeable string values inside the IN statements
  • For creating, updating, replacing, and deleting use special wpdb methods:
    • $wpdb->insert( ... );
    • $wpdb->update( ... );
    • $wpdb->replace( ... );
    • $wpdb->delete( ... );
  • For reading and any other custom query, always use $wpdb->prepare inside:
    • $wpdb->query( $wpdb->prepare( … ) );
    • $wpdb->get_results( $wpdb->prepare( … ) );
    • $wpdb->get_var( $wpdb->prepare( … ) );
    • $wpdb->get_col( $wpdb->prepare( … ) );
    • $wpdb->get_row( $wpdb->prepare( … ) );

<?php

global $wpdb;

$wpdb->insert(
	$wpdb->posts,
	[
		'post_title' => sanitize_text_field( wp_unslash( $_POST['post_title'] ) ),
	],
	[
		'%s',
	]
);

$wpdb->update(
	$wpdb->posts,
	[
		'post_title' => sanitize_text_field( wp_unslash( $_POST['post_title'] ) ),
	],
	[
		'ID' => absint( $_POST['post_id'] ),
	],
	[
		'%s', // Format of the data values.
	],
	[
		'%d', // Format of the WHERE values.
	]
);

// get_results() returns the matching rows; query() would only return their count.
$posts = $wpdb->get_results(
	$wpdb->prepare(
		"SELECT * FROM {$wpdb->posts} WHERE ID > %d AND post_title LIKE %s",
		absint( $_POST['post_id'] ),
		'%' . $wpdb->esc_like( sanitize_text_field( wp_unslash( $_POST['search'] ) ) ) . '%'
	)
);

Custom sanitize functions

When you work on a large project, you will most likely have your own functions for data sanitization. In the example below, we have a reasonably large function that handles sort order.

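/**
 * Sanitizes a sort order, allowing only known values.
 *
 * @param string $string  Raw (slashed) user input.
 * @param string $default Fallback order when the input is not allowed.
 *
 * @return string One of the allowed sort orders.
 */
function security_sanitize_orderby( $string, $default = 'ASC' ) {

	$string          = strtoupper( trim( sanitize_text_field( wp_unslash( $string ) ) ) );
	$allowed_orderby = [
		'RAND()',
		'ASC',
		'DESC',
	];
	$default         = in_array( $default, $allowed_orderby, true ) ? $default : 'ASC';

	// Return the input only when it exactly matches an allowed value.
	return in_array( $string, $allowed_orderby, true ) ? $string : $default;
}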
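The same allowlist idea can be applied to the column name in an ORDER BY clause. The following is a minimal sketch in plain PHP, not part of WordPress: the helper security_demo_orderby_column, the list of allowed columns, and the raw values are all assumptions for illustration:

```php
<?php

/**
 * Hypothetical helper: returns the column only if it is on the allowlist.
 */
function security_demo_orderby_column( $column, array $allowed, $default ) {
	$column = strtolower( trim( (string) $column ) );

	return in_array( $column, $allowed, true ) ? $column : $default;
}

// Assumed allowlist for this example.
$allowed = [ 'post_date', 'post_title', 'comment_count' ];

$good = security_demo_orderby_column( ' Post_Title ', $allowed, 'post_date' );
$bad  = security_demo_orderby_column( 'post_title; DROP TABLE wp_posts', $allowed, 'post_date' );

// Sort direction is restricted the same way: anything but DESC becomes ASC.
$direction = 'DESC' === strtoupper( 'desc' ) ? 'DESC' : 'ASC';

$order_clause = "ORDER BY $good $direction";

echo $order_clause, PHP_EOL; // ORDER BY post_title DESC
echo $bad, PHP_EOL;          // post_date
```

Because the result is always one of a few hardcoded strings, nothing the user types can reach the query text.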

However, adding a function isn’t enough: who wants to keep muting PHPCS rules? Therefore, let’s register our new function inside our coding standards.

<?xml version="1.0"?>
<ruleset name="MyCodingStandards">
	<!-- ... -->
	<rule ref="WordPress.Security.ValidatedSanitizedInput">
		<properties>
			<property name="customSanitizingFunctions" type="array">
				<element value="security_sanitize_orderby"/>
			</property>
			<property name="customUnslashingSanitizingFunctions" type="array">
				<element value="security_sanitize_orderby"/>
			</property>
		</properties>
	</rule>
	<!-- ... -->
</ruleset>

Above, we added two properties: customSanitizingFunctions and customUnslashingSanitizingFunctions. The first one registers your function as a sanitizing function. The second one tells PHPCS that the function also unslashes its input, so you can skip wp_unslash before calling it.

To wrap up, always sanitize user input so that your data stays under your control. Sanitize it once and as early as possible. Even one injection can cost you a lot of sleepless nights. In addition, if you work on an open-source project, be twice as attentive to sanitizing.
