
Strings and Patterns – PHP Certification Exam Series [3]

by Edward Chung, PMP, PMI-ACP, ITIL Foundation · April 5, 2017

Introduction: This is the third part of my study notes for the Zend PHP Certification Exam. You can read more about my PHP Certification exam journey here.

Article Highlights

String Basics

delimited by single or double quotes
double quotes are used when variables need to be parsed (interpolated) or escape sequences such as \n are needed; there is no significant difference in processing speed
strlen($string); // count the string length, i.e. the number of bytes (rather than characters)
str_word_count($string, $format, $charlist); // count the number of words; $format: 0 – return the number of words, 1 – return an array containing all the words, 2 – return an associative array whose keys are the positions of the words in the string; $charlist – additional characters to be considered part of a word (e.g. with "3", fri3nd is counted as one word)
count_chars($string,1); // count the occurrences of each byte; the second argument is the mode: 0 returns an array of all 256 byte values with their counts (including zero counts), 1 returns only the byte values that occur at least once
soundex($string); // calculate the soundex key of a string (a string 4 characters long, starting with a letter), for matching words by pronunciation, e.g. soundex("Euler") == soundex("Ellery"); // E460
metaphone($string,$phonemes); // calculate the metaphone key of a string (more accurate than soundex() as it knows the basic rules of English pronunciation; the generated keys are of variable length); $phonemes restricts the length of the key returned
trim ($string,$optional_char_list); // trim whitespace (space/tab/newline/carriage return/NUL/vertical tab) from the beginning and end; if $optional_char_list is present, trim the characters listed in it instead
ltrim ($string,$optional_char_list); // trim whitespaces from the beginning
rtrim ($string,$optional_char_list); // same as chop(), trim whitespaces from the end
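
The basics above can be tried out with a short sketch. The sample strings and variable names here are illustrative assumptions, not part of the original notes; the accented word is written with explicit UTF-8 bytes so the byte count does not depend on the file encoding.

```php
<?php
// "café" with explicit UTF-8 bytes: the é occupies two bytes (0xC3 0xA9).
$word = "caf\xC3\xA9";
$byteLength = strlen($word);                        // counts bytes, not characters

$sentence     = "Hello fri3nd, how are you";
$defaultCount = str_word_count($sentence);          // the digit splits "fri3nd" in two
$withDigit    = str_word_count($sentence, 0, '3');  // '3' now counts as a word character
$positions    = str_word_count("one two", 2);       // keys are the word positions

$trimmed  = trim("  \tpadded\n");                   // default: whitespace at both ends
$stripped = rtrim("path///", "/");                  // custom character list, right side only

echo $byteLength, " ", $defaultCount, " ", $withDigit, "\n";
echo $trimmed, " ", $stripped, "\n";
```
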

Comparing Strings

$string1 == $string2 // compare with data type conversion
$string1 === $string2 //compare with data type check
strcmp($string1,$string2) // case-sensitive comparison, returns < 0 if str1 is less than str2; > 0 if str1 is greater than str2, and 0 if they are equal (=== 0)
strcasecmp($string1,$string2) // case-insensitive comparison
strncasecmp($string1,$string2,$length) // case-insensitive comparison of the first $length characters
similar_text($string1,$string2[,$percentage]) // returns the number of matching characters; if the third argument is passed in, the percentage of similarity is assigned to that variable ($percentage)
levenshtein($string1,$string2) // Levenshtein distance between the strings, i.e. the minimum number of characters to be replaced/inserted/deleted to transform $string1 into $string2; used in fuzzy matching, e.g. for guessing misspelt words. e.g. levenshtein("ca","cn"); // 1
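
A minimal sketch of the comparison functions above; the sample values are assumptions chosen to make the differences visible.

```php
<?php
$loose  = ("10" == "1e1");    // true: both are numeric strings, so they are compared as numbers
$strict = ("10" === "1e1");   // false: the strings differ byte for byte

$order  = strcmp("apple", "banana");                      // negative: "apple" sorts first
$same   = strcasecmp("HELLO", "hello");                   // 0: equal when case is ignored
$prefix = strncasecmp("Hello World", "hello there", 5);   // 0: the first 5 characters match

$common = similar_text("World", "Word", $percent);        // number of matching characters
$edits  = levenshtein("kitten", "sitting");               // number of single-character edits

echo $common, " ", round($percent, 1), "% ", $edits, "\n";
```

Because strcmp() returns 0 for equal strings, the result should be compared with === 0 rather than used directly as a boolean.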

Formatting Strings

localeconv (); // get an array containing localized numeric and monetary formatting information
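nl_langinfo($item); // get a single piece of locale information (returned as a string) selected by the $item constant; not available on Windows
setlocale (LC_ALL, "zh-hk"); // set the locale for locale-aware functions
number_format ($number); // returns a formatted number
money_format ($format, $number); // returns a number formatted as a currency string; undefined on Windows, deprecated in PHP 7.4 and removed in PHP 8.0
quotemeta ($string); // adds a backslash character (\) before every character that is among these: . \ + * ? [ ^ ] ( $ )
htmlspecialchars ($string); // convert &, ", ', <, > into HTML entities, e.g. " becomes &quot; (single quotes are converted only when the ENT_QUOTES flag is in effect, which is the default from PHP 8.1)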
htmlspecialchars_decode ($string); // decode the above HTML entities to special characters
htmlentities ($string,FLAGS); // converts all applicable to HTML entities (European accents, etc.), only necessary if your pages use encodings such as ASCII or LATIN-1 instead of UTF-8.

html_entity_decode ($string); // converts all HTML entities in the string
get_html_translation_table (HTML_ENTITIES); // returns the translation table that is used internally for htmlspecialchars() (HTML_SPECIALCHARS, the default) or htmlentities() (HTML_ENTITIES)
strip_tags ($string,$allowable_tags); // remove all tags [opening&closing] (except $allowable_tags) from the string
nl2br ($string); // insert <br /> before each newline
wordwrap ($string,$length,$break_symbol,$breakwords); // wrap the string to $length characters per line by inserting $break_symbol (default "\n") at word boundaries; if $breakwords is TRUE, words longer than $length are broken as well (default FALSE)
ucfirst ($string); // uppercase the first character
lcfirst ($string); // lowercase the first character
strtoupper ($string); // uppercase the string
strtolower ($string); // lowercase the string
ucwords ($string); // uppercase the first character of each word
bin2hex ($string); hex2bin ($string); // convert binary data into hexadecimal data (ASCII representation), vice versa
convert_cyr_string ($string, $from, $to); // convert from one Cyrillic character set to another (deprecated in PHP 7.4, removed in PHP 8.0)
hebrev ($string); // convert logical Hebrew text to visual text (for rtl reading direction)
hebrevc ($string); // convert logical Hebrew text to visual text with newline conversion
chr($ascii); // return the one-character string for the given byte value, e.g. chr(10) -> "\n"
ord($string); // return the byte value of the first character, e.g. ord("\n") -> 10
convert_uuencode($string); // encodes a string using the uuencode algorithm, translate strings into printable characters, about 35% larger than original
convert_uudecode($string); // decodes
base64_encode($string); // encodes using the base64 algorithm, about 33% larger than original
base64_decode($string); // decodes
quoted_printable_encode($string); // PHP >= 5.3; encodes an 8-bit string in Quoted-Printable Content-Transfer-Encoding for use in MIME
quoted_printable_decode($string); // decodes
print ($string); or print "$string"; // always returns 1
printf ($format, $args, …); // prints a formatted string and returns the length of the outputted string, e.g. $len = printf('Hello %s', 'Edward'); prints Hello Edward, $len == 12
sprintf ($format, $args); // returns a formatted string
vprintf ($format, $array_args); // prints a formatted string, accepts an array as argument
vsprintf (); // returns a formatted string, accepts an array as argument
fprintf ($handle, $format, $args); // sends a formatted string to a resource, e.g. write to an open file stream
sscanf ($string, $format, $optional_args); // read from $string and parse it to the required $format as an array, if $optional_args are supplied, the values of the $format parts will be assigned to the $optional_args
fscanf ($handle, $format, $optional_args); // read from a resource, e.g. file stream, each call reads 1 line
str_pad ($string,$length,$optional_padding_character); // pad a string to a certain $length with the padding character
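
A few of the formatting functions above, sketched with made-up inputs. The flags passed to htmlspecialchars() are spelled out explicitly here as an assumption, so that the result does not depend on the PHP version's defaults.

```php
<?php
$raw      = '<a href="x">Tom & Jerry</a>';
$escaped  = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');       // safe to echo into HTML
$restored = htmlspecialchars_decode($escaped, ENT_QUOTES);     // back to the original markup

$plain   = strip_tags('<p>Hello <b>world</b></p>', '<b>');     // keeps only <b> tags
$breaks  = nl2br("a\nb");                                      // <br /> is inserted before the newline
$wrapped = wordwrap("The quick brown fox", 10, "\n", true);    // wraps at word boundaries
$padded  = str_pad("42", 5, "0", STR_PAD_LEFT);                // pad on the left to 5 characters
$label   = sprintf('%s has %d items', 'Cart', 3);              // returns instead of printing
$encoded = base64_encode("Hello");
$decoded = base64_decode($encoded);

echo $escaped, "\n", $wrapped, "\n", $padded, "\n", $label, "\n";
```
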

Examples

$num = 5; $location = 'tree';
$format = 'The %2$s contains %1$04d monkeys'; // 2$ => the second argument, 04d => zero-padded to 4 digits
printf($format, $num, $location); // The tree contains 0005 monkeys (echo printf(...) would also print the returned length)
$s = 'monkey'; $t = 'many monkeys';
printf("[%s]\n",$s); // standard string output [monkey]
printf("[%10s]\n",$s); // right-justification with spaces [    monkey]
printf("[%-10s]\n",$s); // left-justification with spaces [monkey    ]
printf("[%010s]\n",$s); // zero-padding works on strings too [0000monkey]
printf("[%'#10s]\n",$s); // use the custom padding character '#' [####monkey]
printf("[%10.10s]\n",$t); // width 10 with a cutoff (precision) of 10 characters [many monke]

$number = 1234.56;
setlocale(LC_MONETARY, 'en_US');
$money_format = money_format('%i', $number) . "\n"; // USD 1,234.56
$english_format_number = number_format($number, 2, '.', ','); // 1,234.56

$test = "string 1234 string 5678";
$result = sscanf($test, "%s %d %s %d"); // $result = array('string', 1234, 'string', 5678); %d yields integers
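
The examples above print directly. As a complementary sketch, the same ideas can be captured in variables with sprintf(), vsprintf() and the by-reference form of sscanf(). The input strings and variable names below are assumptions for illustration.

```php
<?php
// Argument swapping: 2$ picks the second argument, 1$ the first.
$swapped = sprintf('The %2$s contains %1$04d monkeys', 5, 'tree');

// vsprintf() takes the arguments as one array.
$date = vsprintf('%04d-%02d-%02d', [2017, 4, 5]);

// number_format() with default and custom separators.
$us = number_format(1234.567, 2);             // comma for thousands, dot for decimals
$eu = number_format(1234.567, 2, ',', '.');   // separators swapped

// With extra arguments, sscanf() assigns by reference and returns the number of values assigned.
$assigned = sscanf('age: 25 name: Alice', 'age: %d name: %s', $age, $name);

echo $swapped, "\n", $date, "\n", $us, " ", $eu, "\n";
```
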

Printf Formatting Specifiers

% – a literal percent character. No argument is required, i.e. %%
b – the argument is treated as an integer, and presented as a binary number.
c – the argument is treated as an integer, and presented as the character with that ASCII value.
d – the argument is treated as an integer, and presented as a (signed) decimal number, e.g. %10d pads the number to a minimum width of 10 characters
e – the argument is treated as scientific notation (e.g. 1.2e+2).
u – the argument is treated as an integer, and presented as an unsigned decimal number.
f – the argument is treated as a float, and presented as a floating-point number (locale aware). .10f – 10 decimal places
F – the argument is treated as a float, and presented as a floating-point number (non-locale aware). Available since PHP 4.3.10 and PHP 5.0.3.
o – the argument is treated as an integer, and presented as an octal number.
s – the argument is treated as and presented as a string.
x – the argument is treated as an integer and presented as a hexadecimal number (with lowercase letters).
X – the argument is treated as an integer and presented as a hexadecimal number (with uppercase letters).
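
A quick sketch of several specifiers from the list above, using sprintf() so the results can be inspected. The input values are arbitrary; %F is used for the float so the output does not depend on the locale.

```php
<?php
$binary  = sprintf('%b', 5);         // binary
$octal   = sprintf('%o', 8);         // octal
$hexLow  = sprintf('%x', 255);       // hexadecimal, lowercase
$hexUp   = sprintf('%X', 255);       // hexadecimal, uppercase
$char    = sprintf('%c', 65);        // character with that ASCII value
$float   = sprintf('%.2F', 3.14159); // two decimal places, not locale aware
$sci     = sprintf('%.1e', 120);     // scientific notation with one decimal
$width   = sprintf('%5d', 42);       // minimum width 5, padded with spaces
$zeros   = sprintf('%05d', 42);      // minimum width 5, padded with zeros
$signed  = sprintf('%+d', 42);       // force a sign
$percent = sprintf('%d%%', 50);      // literal percent character

echo implode(' | ', [$binary, $octal, $hexLow, $hexUp, $char, $float, $sci, $zeros, $signed, $percent]), "\n";
```
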

Escape Sequences (Control Characters)

\n linefeed (LF or 0x0A (10) in ASCII)
\r carriage return (CR or 0x0D (13) in ASCII)
\t horizontal tab (HT or 0x09 (9) in ASCII)
\\ backslash
\$ dollar sign
\" double-quote
\[0-7]{1,3} the sequence of characters matching the regular expression is a character in octal notation
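
These escape sequences are only interpreted inside double quotes. The following sketch, with an assumed $name variable, contrasts the two quoting styles.

```php
<?php
$name = 'World';

$single = 'Hello $name\n';   // no interpolation; \n stays as a backslash followed by n
$double = "Hello $name\n";   // $name is interpolated; \n becomes a linefeed
$octal  = "\101";            // octal 101 is decimal 65, the letter A
$dollar = "Cost: \$5";       // escaped dollar sign, no variable lookup
$quoted = "She said \"hi\""; // escaped double quotes

echo strlen($single), " ", strlen($double), " ", $octal, "\n";
```
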

Functions

substr($string, $start, $length)
– returns the substring from the start position with the given length
– a negative start to count from the end
– a negative length to count from the end
– returns FALSE on failure before PHP 8.0; from PHP 8.0 an empty string is returned instead
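
A short sketch of how positive and negative arguments to substr() behave, using an arbitrary sample string.

```php
<?php
$text = 'Hello World';

$fromSix   = substr($text, 6);       // from position 6 to the end
$firstFive = substr($text, 0, 5);    // 5 characters from the start
$lastFive  = substr($text, -5);      // negative start counts from the end
$dropTail  = substr($text, 0, -6);   // negative length stops 6 characters before the end
$middle    = substr($text, -5, 3);   // 3 characters, starting 5 from the end
$outside   = substr('abc', 5);       // start beyond the string: FALSE or '' depending on the PHP version

echo $fromSix, " ", $firstFive, " ", $lastFive, " ", $dropTail, " ", $middle, "\n";
```
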

substr_compare($string1, $string2, $offset, $length, $case_insensitivity)
– binary-safe comparison of two strings from an offset, up to $length characters
– return 0 if they are equal

substr_count($haystack, $needle, $offset)
– count the number of substring occurrences

substr_replace($string, $replacement, $start, $length)
– replace text within a portion of a string

strstr($haystack, $needle, TRUE | FALSE) // TRUE returns the part before $needle; FALSE (default) returns $needle and everything after it
stristr($haystack, $needle, TRUE | FALSE) // case-insensitive
strchr($haystack, $needle, TRUE | FALSE ) // alias of strstr()
– return the part of $haystack after (and including) or before the first occurrence of $needle
– e.g. $email = 'name@example.com'; $domain = strstr($email, '@'); // $domain == '@example.com'
$user = strstr($email, '@', TRUE); // $user == 'name'

strpos($haystack, $needle, $offset) // position of the first occurrence
stripos($haystack, $needle, $offset) // case-insensitive
strrpos($haystack, $needle, $offset) // position of the last occurrence
– find the position of a substring in a string (optionally skipping the first $offset characters); positions start at 0 and FALSE is returned if $needle is not found, so compare the result with === FALSE
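
A common pitfall with these functions is that a match at position 0 is falsy. The sketch below, on an assumed sample string, shows the positions returned and why a strict comparison is needed.

```php
<?php
$haystack = 'hello world';

$first       = strpos($haystack, 'o');        // first "o"
$afterOffset = strpos($haystack, 'o', 5);     // search starts at position 5
$last        = strrpos($haystack, 'o');       // last "o"
$caseless    = stripos($haystack, 'WORLD');   // case-insensitive search
$atStart     = strpos($haystack, 'hello');    // match at position 0
$missing     = strpos($haystack, 'z');        // no match

$strictCheck = ($atStart !== false);  // correct: position 0 still counts as found
$looseCheck  = ($atStart == false);   // misleading: 0 == false is true

var_dump($first, $last, $atStart, $missing);
```
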

strpbrk($string, $charlist) // case sensitive
– return the portion of the string starting from the first occurrence of any character in $charlist, or FALSE if none is found

str_replace($search, $replace, $subject)
str_ireplace($search, $replace, $subject) // case-insensitive
– find and replace, $search can be an array, an optional $count for the 4th argument returns the number of replacements

strtr($string, $from, $to)
strtr($string, $replace_array)
– translate characters or replace substrings
– e.g. echo strtr("baab", "ab", "01"); // 1001
– e.g. $trans = array("ab" => "01"); echo strtr("baab", $trans); // ba01
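
str_replace() and strtr() look interchangeable but differ in one important way: str_replace() with arrays applies the replacements one after another, while strtr() with an array makes a single pass and never re-replaces its own output. A sketch with assumed inputs:

```php
<?php
// Sequential: 'a' becomes 'b' first, then every 'b' (including the new one) becomes 'c'.
$sequential = str_replace(['a', 'b'], ['b', 'c'], 'ab');

// Single pass: each original character is translated exactly once.
$singlePass = strtr('ab', ['a' => 'b', 'b' => 'c']);

// The optional 4th argument receives the number of replacements made.
$replaced = str_replace('o', '0', 'foo boo', $count);

// Case-insensitive variant.
$caseless = str_ireplace('HELLO', 'Bye', 'hello world');

// With an array, strtr() tries the longest keys first.
$longest = strtr('Hi all', ['Hi' => 'Hello', 'Hi all' => 'Greetings']);

echo $sequential, " ", $singlePass, " ", $replaced, " ", $count, "\n";
```
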

str_repeat($string, $times)
– repeat a string a number of $times

str_split($string,$maximum_length)
– split the string into an array, each with a length equals or below the maximum length

strspn($string,$charlist,$start,$length)
– returns the length of the initial segment of $string that contains only characters from $charlist

strcspn($string,$charlist,$start,$length)
– returns the length of the initial segment of $string that does not contain any characters from $charlist
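
strspn() and strcspn() are easiest to understand side by side. A minimal sketch with illustrative inputs:

```php
<?php
$digits    = strspn("42 apples", "0123456789");  // length of the leading run of digits
$firstWord = strcspn("hello world", " ");        // length before the first space
$window    = strspn("foo", "o", 1, 2);           // examine only 2 characters, starting at position 1
$beforeCd  = strcspn("abcd", "cd");              // length before the first "c" or "d"

// A typical use: split off a numeric prefix.
$input  = "42 apples";
$number = substr($input, 0, strspn($input, "0123456789"));

echo $digits, " ", $firstWord, " ", $window, " ", $beforeCd, " ", $number, "\n";
```
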

strrev($string)
– reverse a given string

str_shuffle($string)
– shuffle the string

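parse_str($string,$array)
– parse the query string into variables stored in $array, e.g. $str = "first=value&arr[]=foo+bar&arr[]=baz"; parse_str($str, $array); // $array['first'] == 'value'; omitting $array (which sets variables in the current scope) is deprecated in PHP 7.2 and no longer possible from PHP 8.0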

parse_url($url)
– parse the url to get the information in an associative array
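
A sketch of both parsers. The URL is a made-up example; the query string is the one used in the notes.

```php
<?php
// parse_str() decodes a query string into an array (second argument).
parse_str("first=value&arr[]=foo+bar&arr[]=baz", $query);

// parse_url() splits a URL into its components.
$parts = parse_url("https://example.com:8080/path/page.php?id=5#top");

echo $query['first'], " ", $query['arr'][0], "\n";
echo $parts['scheme'], " ", $parts['host'], " ", $parts['port'], " ", $parts['path'], "\n";
```

The two are often combined: the 'query' component returned by parse_url() can be passed to parse_str().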

str_getcsv($input,$delimiter,…)
– PHP >= 5.3, parse a CSV string into an array

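get_magic_quotes_gpc()
– return 1 if magic_quotes_gpc is on, 0 if it is off; always returns FALSE from PHP 5.4, where magic quotes were removed (the function itself was removed in PHP 8.0)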

addslashes($string)
– quote string with backslashes before ", ', \ and NUL
stripslashes($string)
– unquote a quoted string
addcslashes($string, $charList_to_add)
– quote string with backslashes before characters listed in $charList_to_add; a range such as 'A..z' also covers the characters that lie between Z and a in the ASCII table, i.e. [\]^_`
stripcslashes($string)
– unquote a string quoted with addcslashes()

strtok($string,$delimiter)
– split and return the $string into a smaller chunk based on the $delimiter (not included)
– subsequent calls only require strtok($delimiter); the delimiter is not included in the returned token
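
The two call forms of strtok() are typically used together in a loop. A minimal sketch, with an assumed input string and a space or tab as the delimiters:

```php
<?php
$tokens = [];

// The first call passes the string; later calls pass only the delimiters.
$token = strtok("one two\tthree", " \t");
while ($token !== false) {
    $tokens[] = $token;
    $token = strtok(" \t");
}

echo implode(',', $tokens), "\n";
```
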

chunk_split($string, $length, $end)
– split the string into smaller chunks of $length characters (default 76), each followed by $end (default "\r\n"), e.g. to format base64_encode() output for email

crypt($string,$optional_salt)
– one-way hashing of $string; returns a hashed string using the standard Unix DES-based algorithm or alternative algorithms that may be available on the system (from PHP 5.3, PHP contains its own implementation of the algorithms), e.g. $hashed_password = crypt('mypassword', $salt); if (hash_equals($hashed_password, crypt($user_input, $hashed_password))) { echo "Password verified!"; } (the salt is mandatory from PHP 8.0)
– deliberately slow, for better protection against brute-force attacks
– CRYPT_MD5 vs md5() // crypt() with the CRYPT_MD5 algorithm applies MD5 many times over, md5() only once

hash($algo,$data)
– returns the hash according to the $algo selected
md5($string) // not for password
md5_file($filename)
– returns the hash as a 32-character hexadecimal number
crc32($string) // not for password
– generates the 32-bit cyclic redundancy checksum polynomial of a string, usually to validate the integrity of data being transmitted
sha1($string) // not for password
sha1_file($filename)
– returns the hash as a 40-character hexadecimal number
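
A sketch of the checksum and hash functions above. The temporary file is an assumption made only to show that md5_file() and sha1_file() take a file name rather than an open stream.

```php
<?php
$data = 'hello';

$md5    = md5($data);              // 32 hexadecimal characters
$sha1   = sha1($data);             // 40 hexadecimal characters
$sha256 = hash('sha256', $data);   // 64 hexadecimal characters
$crc    = crc32($data);            // an integer checksum, not a hex string

// hash() with the 'md5' algorithm gives the same result as md5().
$sameMd5 = (hash('md5', $data) === $md5);

// The *_file() variants hash the contents of the named file.
$tmp = tempnam(sys_get_temp_dir(), 'hash');
file_put_contents($tmp, $data);
$fileMd5  = md5_file($tmp);
$fileSha1 = sha1_file($tmp);
unlink($tmp);

echo strlen($md5), " ", strlen($sha1), " ", strlen($sha256), "\n";
```
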

str_rot13($string)
– shifts every letter by 13 places in the alphabet, encode and decode with the same function

mbstring(Multibyte String)

PHP strings are plain byte sequences with no built-in encoding; mbstring's internal encoding defaults to UTF-8 (via default_charset) from PHP 5.6
While there are many languages in which every necessary character can be represented by a one-to-one mapping to an 8-bit value, there are also several languages which require so many characters for written communication that they cannot be contained within the range a mere byte can code (A byte is made up of eight bits. Each bit can contain only two distinct values, one or zero. Because of this, a byte can only represent 256 unique values (two to the power of eight)). Multibyte character encoding schemes were developed to express more than 256 characters in the regular bytewise coding system.
not a default module, must be enabled with the configure option (--enable-mbstring=all)
to use function overloading (automatically use the mb_ counterpart of some built-in functions), set mbstring.func_overload in php.ini to a positive value that represents a combination of bitmasks specifying the categories of functions to be overloaded: 1 to overload the mail() function, 2 for string functions, 4 for regular expression functions, etc. (function overloading is deprecated in PHP 7.2 and removed in PHP 8.0)
handles encoding conversion
mb_check_encoding ($string, $encoding); // verifies whether the string is valid for the specified encoding
mb_internal_encoding ("UTF-8"); // set internal character encoding to UTF-8
mb_strlen ($string); // count the number of characters (rather than bytes)
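
The difference between byte-oriented and multibyte functions can be sketched as follows. Since mbstring is not a default module, the mb_ calls are guarded with function_exists(); the sample word is written with explicit UTF-8 bytes, and mb_substr() is included as an additional illustration.

```php
<?php
$word  = "caf\xC3\xA9";            // "café": the é takes two bytes in UTF-8
$bytes = strlen($word);            // byte count
$rawCut = substr($word, 3, 1);     // cuts the é in half: only its first byte

if (function_exists('mb_strlen')) {
    $valid   = mb_check_encoding($word, 'UTF-8');    // well-formed UTF-8
    $broken  = mb_check_encoding("\xC3", 'UTF-8');   // a truncated sequence is invalid
    $chars   = mb_strlen($word, 'UTF-8');            // character count
    $safeCut = mb_substr($word, 3, 1, 'UTF-8');      // the whole é
    echo $bytes, " bytes, ", $chars, " characters\n";
}
```
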

PCRE(Perl Compatible Regular Expressions)

multi-byte string compatible
delimiter – used in the beginning and end of each pattern, can be manually assigned, usually “/”, “#”, “~”, “!” or use brackets: {pattern}
greediness – by default each quantifier matches as many characters as possible

Meta-characters

\ general escape character
[] a class
| or
() a sub-pattern
[^] negate the class, ^ must be the first character inside the brackets
[-] range

Character Classes

\d Digits 0-9 [:digit:]
\D Anything not a digit
\w Any alphanumeric character or an underscore (_) [:word:]
\W Anything not an alphanumeric character or an underscore
\s Any whitespace (spaces, tabs, newlines) [:space:]
\S Any non-whitespace character
. Any character except for a newline
[:alnum:] letters and digits (POSIX class names are used inside a class, e.g. [[:alnum:]])
[:alpha:] letters
[:lower:] lower case letters
[:upper:] upper case letters

Anchors

^ Start of the subject (or of each line in multiline mode)
$ End of the subject, or just before a trailing newline (in multiline mode, also matches before every \n)

Positioners

\b word boundary
\B not a word boundary
\A Start of a string
\Z End of a string or newline at end
\z End of a string
\G first matching position in subject
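
The anchors and positioners differ mainly in how they treat word edges and trailing newlines. A minimal sketch with assumed subjects:

```php
<?php
$wholeWord  = preg_match('/\bcat\b/', 'the cat sat');   // "cat" as a separate word
$insideWord = preg_match('/\bcat\b/', 'concatenate');   // no word boundary around "cat"
$midWord    = preg_match('/\Bcat\B/', 'concatenate');   // "cat" strictly inside a word

$subject = "abc\n";
$upperZ  = preg_match('/abc\Z/', $subject);   // \Z tolerates a final newline
$lowerZ  = preg_match('/abc\z/', $subject);   // \z requires the very end of the string

$lines     = "one\ntwo";
$startOnly = preg_match_all('/\A\w+/m', $lines);   // \A ignores multiline mode
$eachLine  = preg_match_all('/^\w+/m', $lines);    // ^ matches at every line start

echo $wholeWord, $insideWord, $midWord, $upperZ, $lowerZ, $startOnly, $eachLine, "\n";
```
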

Quantifiers

? Occurs 0 or 1 time
* Occurs 0 or more times
+ Occurs 1 or more times
{n} Occurs exactly n times
{0,n} Occurs at most n times (write the lower bound explicitly; PCRE has traditionally treated {,n} as literal text)
{m,} Occurs m or more times
{m,n} Occurs between m and n times
Combination of ? with * or + makes the pattern non-greedy, i.e. *? or +?
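
Greedy and non-greedy quantifiers can be compared on a small HTML-like subject (an assumed example):

```php
<?php
$subject = '<b>bold</b> text';

preg_match('/<.+>/', $subject, $greedy);     // .+ runs to the last ">"
preg_match('/<.+?>/', $subject, $lazy);      // .+? stops at the first ">"
preg_match('/<.+>/U', $subject, $ungreedy);  // the U modifier flips the default

$exact = preg_match('/^\d{3}-\d{4}$/', '555-1234');   // {n}: exactly n times
$range = preg_match('/^a{2,3}$/', 'aaaa');            // {m,n}: four is too many

echo $greedy[0], " | ", $lazy[0], " | ", $ungreedy[0], "\n";
```
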

Unicode Character Properties (for UTF-8)

\p{xx} a character with the xx property
\P{xx} a character without the xx property
\X an extended Unicode sequence

Pattern Modifiers

i – Case insensitive search
m – Multiline, $ and ^ will match at newlines
s – Makes the dot metacharacter match newlines
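x – Extended mode: whitespace in the pattern is ignored and # starts a comment
U – Makes the engine un-greedy
u – Turns on UTF-8 support
e – Used with preg_replace(), evaluates the replacement string as PHP code (deprecated in PHP 5.5 and removed in PHP 7.0; use preg_replace_callback() instead)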

Example

$pattern = '/^\s+/i';
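
The effect of each modifier is easiest to see by running the same pattern with and without it. A sketch with assumed subjects:

```php
<?php
$text = "Hello\nWorld";

$caseless = preg_match('/hello/i', $text);        // i: case is ignored
$noMulti  = preg_match('/^World$/', $text);       // without m, ^ only matches at the very start
$multi    = preg_match('/^World$/m', $text);      // m: ^ and $ match at line breaks
$noDotAll = preg_match('/Hello.World/', $text);   // without s, the dot does not match a newline
$dotAll   = preg_match('/Hello.World/s', $text);  // s: the dot matches a newline too

// x: whitespace is ignored and # starts a comment inside the pattern.
$extended = preg_match('/
    \d{3}   # three digits
    -       # a literal hyphen
    \d{4}   # four digits
/x', '555-1234');

// u: the subject is treated as UTF-8, so a two-byte character counts as one.
$accent   = "\xC3\xA9";
$asBytes  = preg_match('/^.$/', $accent);
$asUtf8   = preg_match('/^.$/u', $accent);

echo $caseless, $noMulti, $multi, $noDotAll, $dotAll, $extended, $asBytes, $asUtf8, "\n";
```
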

Functions

preg_match ($pattern, $subject, $matches, $flags, $offset); // perform a regular expression match, stop once matched, return 1 if matched, 0 if not matched, FALSE if error occurs
preg_match_all ($pattern, $subject, $matches); // perform a global regular expression match, returns the number of full matches
preg_grep ($pattern, $array); // returns the array consisting of the elements of the input array that match the given pattern, keys preserved, like preg_filter except without replacement
preg_filter ($pattern, $replace, $subject); // like preg_replace(), but returns only the subjects where there was a match; $subject can be an array
preg_replace ($pattern, $replace, $subject); // returns the $subject with every match replaced; subjects without a match are returned unchanged
preg_replace_callback ($pattern, $callback, $subject) // transform using a callback function
$array = preg_split ($pattern, $string); // the array contains the $string split with $pattern
preg_quote ($string, $optional_delimiter); // escape the characters in the string that have a special meaning in a PCRE pattern (and the delimiter, if given)
preg_last_error (); // return the error code of the last regex execution, e.g. PREG_NO_ERROR, PREG_BAD_UTF8_OFFSET_ERROR
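
The functions above can be sketched together on a few small inputs. All subjects and patterns here are illustrative assumptions.

```php
<?php
// preg_match(): first match only; $m[0] is the full match, $m[1..] the sub-patterns.
$found = preg_match('/(\d{4})-(\d{2})-(\d{2})/', 'Date: 2017-04-05', $m);

// preg_match_all(): every match; returns how many were found.
$total = preg_match_all('/\d+/', 'a1 b22 c333', $all);

// preg_replace(): replace every match.
$squeezed = preg_replace('/\s+/', ' ', "a   b\t\nc");

// preg_replace_callback(): compute each replacement in PHP.
$doubled = preg_replace_callback('/\d+/', function ($match) {
    return (string) ((int) $match[0] * 2);
}, 'a1 b22');

// preg_split(): split on a pattern instead of a fixed string.
$pieces = preg_split('/[\s,]+/', "a, b  c,d");

// preg_grep(): filter an array, keeping the original keys.
$numeric = preg_grep('/^\d+$/', ['1', 'a', '22']);

// preg_quote(): make arbitrary text safe to embed in a pattern.
$quoted = preg_quote('Hello.World?');

$status = preg_last_error();   // PREG_NO_ERROR after a successful call

echo $m[1], " ", $total, " ", $squeezed, " ", $doubled, " ", implode('|', $pieces), "\n";
```
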



Ajay Aggarwal says:
October 20, 2017 at 20:55
Hi, I am preparing for Zend PHP certification.
But I am not able to find the latest PHP 7 study material.
And it's very tough to go through the PHP manual, practice and remember all the functions, as there are thousands of them.
Can you please guide me on how to prepare for the exam in less time?
And have you found PHP certification mock tests anywhere?
- Edward Chung says:
  October 23, 2017 at 14:23
  Hi Ajay,
  Really sorry that I do not have any shortcuts. At the time of my exam prep, I read through the PHP documentation website page by page – it was hard work.
  Hope you can find an easier way.
  Wish you PHP Certification success!
