Friday, February 1, 2013

New Features in PHP 5.4

by Rasmus Lerdorf

The LAMP stack has new competition, but features in this release have PHP pushing the envelope once again.
Published April 2012
Almost exactly eight years ago I wrote an article for the Oracle Technology Network called, “Do You PHP?”. In that article, I talked about PHP's stubborn, function-over-form approach to solving the "Web problem" and the fight to keep things simple. We were getting close to releasing PHP 5.0 at the time. Now here we are almost a decade later with a shiny new PHP 5.4.0 release, and while much has happened during that time, there are also many things that haven't changed at all.
One thing that hasn't changed is that the ecosystem is as important as it ever was. Solving the Web problem is about much more than choosing a scripting language. It is about the entire ecosystem around it. The LAMP stack has been strong for nearly 15 years now and it is still popular, but we are starting to see strong alternatives. PHP-FPM with nginx has been gaining popularity rapidly because of the much improved support starting in PHP 5.3 and further streamlined in 5.4. The M, or database, part of the stack is also starting to become much more diverse than it was 8 years ago. The various NoSQL solutions and MySQL Cluster provide a richer set of options than just throwing everything into a MyISAM table.
A number of interesting technologies have appeared and PHP extensions have been written to make them easily accessible. One of my favorites is libevent, which you can use to write high-performance event-driven applications in PHP. Another is ZeroMQ, which is a high-level socket library. Much like SQLite removed any and all need to ever write another raw file format and associated parser, ZeroMQ has removed any reason to play around with socket protocols and associated socket handling code. You can even combine libevent with ZeroMQ for an awesome, high-performance, standalone event-driven server. (See this example if you are interested in that.) I also really like SVM (Support Vector Machine), a machine-learning algorithm that you can throw a lot of problems at without needing to become a machine-learning geek.
There are also a number of extensions that have become well-established over the last couple of years. Gearman, in particular, has become popular and shows up as part of the common stack people deploy. Gearman lets you dispatch jobs to be done by workers asynchronously. The workers can be spread out over many servers and they can even further dispatch more jobs MapReduce-style.
After the release of PHP 5.0 in 2004, 5.1 followed in 2005 with the new DateTime implementation, PDO, and performance improvements. PHP 5.2 came in 2006 and brought an improved memory manager, JSON support, and input filtering. At that point we started the push to PHP 6, which was a super-ambitious plan to completely rewrite everything around the ICU (International Components for Unicode) library. It turned out to be a little too ambitious - we couldn't get enough developers excited about it, and instead ended up rolling all the various features that had been sitting around waiting for PHP 6 into a PHP 5.3 release in 2009. This 3-year gap between 5.2 and 5.3 also meant that 5.3 brought a lot of new things to PHP: namespaces, late static binding, closures, garbage collection, restricted goto, mysqlnd (MySQL native driver), much better Windows performance, and many other things.
In hindsight, it probably would have made sense to call this release PHP 6, but PHP 6 had become synonymous with the Unicode effort to the point that books had even been written about it, so we didn't feel we could release PHP 6 without a major push toward Unicode. We did introduce an ICU extension called “intl” that also compiles against PHP 5.2, which gives you access to much of ICU's functionality. The mbstring extension has improved steadily over time, which means that pretty much any Unicode-related problem has a solution; it just isn't cleanly integrated into the language itself.
This brings us to the PHP 5.4 release in 2012. Again, an almost 3-year gap between releases, which is something we want to improve upon. I would like to get back to annual releases with fewer new features in each.
Here are the major features you will see when you upgrade to 5.4:

Memory and Performance Improvements

A number of internal structures have been made smaller or eliminated entirely, which has resulted in 20-50% memory savings in large PHP applications. Performance is up by 10-30% (heavily dependent on what your code is doing) through a number of optimizations: various common code paths are inlined, $GLOBALS is now initialized just-in-time (JIT), the ‘@’ operator is faster, run-time class/function/constant caches have been added, run-time string constants are now interned, constant access is faster through pre-computed hashes, empty arrays are quicker and consume less memory, and unserialize() and FastCGI request handling are faster. Many more memory and performance tweaks were made throughout the code.
Some early tests have shown that Zend Framework runs 21% faster and uses 23% less memory under 5.4, for example, and Drupal uses 50% less memory and runs about 7% faster.

Traits

Traits are probably the most talked-about new feature in PHP 5.4 - think of them as compiler-assisted copy-and-paste. Traits are a feature of Scala as well. Other languages may refer to them as "mixins" - or they may not name them at all but have an extended interface mechanism that allows the interface to contain an actual implementation of its methods.
In contrast to mixins, traits in PHP include explicit conflict resolution if multiple traits implement the same methods.

trait Singleton {
    public static function getInstance() { ... }
}

class A {
    use Singleton;
    // ...
}

class B extends ArrayObject {
    use Singleton;
    // ...
}

// Singleton method is now available for both classes
A::getInstance();
B::getInstance();

See php.net/traits for many more examples including the conflict resolution syntax, method precedence, visibility, and support for constants and properties in traits. Also, for more of the theory behind the concept, you can read Nathan Schärli’s dissertation, “Traits: Composing Classes from Behavioral Building Blocks.”
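
As a minimal sketch of the conflict resolution mentioned above, assume two made-up traits, English and French, that both define hello(). The insteadof operator picks which implementation wins, and as keeps the other one reachable under an alias:

```php
<?php
// Hypothetical traits; both define a method named hello().
trait English {
    public function hello() { return 'Hello'; }
}

trait French {
    public function hello() { return 'Bonjour'; }
}

class Greeter {
    use English, French {
        English::hello insteadof French; // hello() resolves to the English version
        French::hello as bonjour;        // the French version stays reachable via an alias
    }
}

$g = new Greeter;
echo $g->hello(), "\n";   // Hello
echo $g->bonjour(), "\n"; // Bonjour
```

Without the insteadof line, PHP refuses to compile the class because the two hello() methods collide.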

Short Array Syntax

A simple, but very popular syntax addition:
$a = [1, 2, 3];
$b = ['foo' => 'orange', 'bar' => 'apple'];

That is, you now no longer need to use the ‘array’ keyword to define an array.

Function Array De-referencing

Another oft-requested syntax addition. Function calls that return an array can now be de-referenced directly:
function fruits() {
    return ['apple', 'banana', 'orange'];
}
echo fruits()[0]; // Outputs: apple

Instance Method Call

Related to function array dereferencing, you can now call a method on object instantiation. And as in previous versions, you can of course still chain method calls, so you can now write code like this:
class foo {
    public $x = 1;

    public function getX() {
        return $this->x;
    }
    public function setX($val) {
        $this->x = $val;
        return $this;
    }
}

$X = (new foo)->setX(20)->getX();
echo $X; // 20

Unless your constructor is doing something useful, though, the instantiated object is simply thrown away, so perhaps you should be using a static method call here instead. If we combine this with the short array syntax and function array de-referencing, we can write some really convoluted code:
class foo extends ArrayObject {
    public function __construct($arr) {
        parent::__construct($arr);
    }
}

echo (new foo( [1, [4, 5], 3] ))[1][0];

At a glance, can you tell what the output will be? Here we pass a two-dimensional array to the constructor, which just hands it on to ArrayObject. We then pick out the second element of the outer array and the first element inside it, so this would output “4”.

Closure Binding

Closures were introduced in PHP 5.3, but in 5.4 we have refined how they interact with objects. For example:
class Foo {
  private $prop;
  function __construct($prop) {
    $this->prop = $prop;
  }
  public function getPrinter() {
    return function() { echo ucfirst($this->prop); };
  }
}

$a = new Foo('bar');
$func = $a->getPrinter();
$func(); // Outputs: Bar

Note that the closure accesses $this->prop, which is a private property. By default, closures in PHP use early-binding - which means that variables inside the closure will have the value they had when the closure was defined. This can be switched to late-binding by using references. However, closures may also be re-bound:
$a = new Foo('bar');
$b = new Foo('pickle');
$func = $a->getPrinter();
$func(); // Outputs: Bar
$func = $func->bindTo($b);
$func(); // Outputs: Pickle

Here we have re-bound the closure from the $a instance to the instance in $b. If you don’t want your closure to ever have access to the object instance, you can declare it static:
class Foo {
  private $prop;
  function __construct($prop) {
    $this->prop = $prop;
  }
  public function getPrinter() {
    return static function() { echo ucfirst($this->prop); };
  }
}

$a = new Foo('bar');
$func = $a->getPrinter();
$func(); // Fatal error: Using $this when not in object context
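
The early-binding default described earlier in this section, and the switch to late binding through references, can be sketched as follows. The variable names are made up for illustration:

```php
<?php
$count = 1;

// Early binding: $count is copied when the closure is defined.
$byValue = function () use ($count) { return $count; };

// Late binding: the closure holds a reference to the outer $count.
$byReference = function () use (&$count) { return $count; };

$count = 2;

echo $byValue(), "\n";     // 1
echo $byReference(), "\n"; // 2
```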

Objects as Functions

The magic method “__invoke”, available since PHP 5.3, lets you call an object as if it were a function. It can be used like this:
class MoneyObject {
    private $value;
    function __construct($val) {
        $this->value = $val;
    }
    function __invoke() {
        return sprintf('$%.2f',$this->value);
    }
}
$Money = new MoneyObject(11.02/5*13);
echo $Money(); // Outputs: $28.65

Built-in Web Server (CLI)

The CLI server is a tiny Web server implementation that you can run from the command line:
% php -S localhost:8080
PHP 5.4.0 Development Server started at Sun Mar 11 13:27:09 2012
Listening on localhost:8080
Document root is /home/rasmus
Press Ctrl-C to quit.

CLI Server is not intended for use as a production Web server; we are going to be using it for running some of our PHP regression tests, and I can see other unit testing mechanisms and probably IDEs also making use of it. It does have some really useful features for day-to-day debugging of your code from the command line. By default it uses the current directory as the DocumentRoot, and it handles static file requests as well. The default directory index file is “index.php”, so you can fire it up in a directory full of .php, .css, .jpg, etc. files and it will just work. For more complex applications that might use mod_rewrite to send all requests to a front-controller or router, you can wrap that router with a simple little script and start the CLI Server like this:
% php -S localhost:8080 /path/to/router.php
PHP 5.4.0 Development Server started at Sun Mar 11 13:28:01 2012
Listening on localhost:8080
Document root is /tmp/web
Press Ctrl-C to quit.

The router.php script might look like this:
<?php
if (preg_match('!\.php$!', $_SERVER["REQUEST_URI"])) {
    require basename($_SERVER["REQUEST_URI"]);
} else if (strpos($_SERVER["REQUEST_URI"], '.')) {
    return false; // serve the requested file as-is.
} else {
    Framework::Router($_SERVER["REQUEST_URI"]);
}

This wrapper loads up direct .php requests, passes any other requests that contain a “.” through to the static file handler, and hands everything else to the framework’s router. You can run Drupal and Symfony directly from the command line like this.

Native Session Handler Interface

As a small convenience feature, there is now a session handler interface you can implement. Now you can just pass an instance of your session handling object to session_set_save_handler() instead of having to pass it six ugly function names:
SessionHandler implements SessionHandlerInterface {
  public int close ( void )
  public int destroy ( string $sessionid )
  public int gc ( int $maxlifetime )
  public int open ( string $save_path , string $sessionid )
  public string read ( string $sessionid )
  public int write ( string $sessionid , string $sessiondata )
}
session_set_save_handler(new MySessionHandler);
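
For illustration, here is a minimal sketch of such a handler object. Everything in it is an assumption made for the example: the class name FileSessionHandler, the one-file-per-session layout, and the "sess_" file prefix. The #[\ReturnTypeWillChange] lines are ordinary comments before PHP 8; on PHP 8.1 and later they silence the deprecation notice for the untyped method signatures used here.

```php
<?php
// Hypothetical handler: stores each session as one file inside $dir.
class FileSessionHandler implements SessionHandlerInterface {
    private $dir;

    public function __construct($dir) {
        $this->dir = rtrim($dir, '/');
    }

    #[\ReturnTypeWillChange]
    public function open($savePath, $sessionName) {
        // Create the storage directory on first use.
        return is_dir($this->dir) || mkdir($this->dir, 0700, true);
    }

    #[\ReturnTypeWillChange]
    public function close() {
        return true;
    }

    #[\ReturnTypeWillChange]
    public function read($id) {
        // An unknown session id yields an empty string.
        $file = $this->path($id);
        return is_file($file) ? (string) file_get_contents($file) : '';
    }

    #[\ReturnTypeWillChange]
    public function write($id, $data) {
        return file_put_contents($this->path($id), $data) !== false;
    }

    #[\ReturnTypeWillChange]
    public function destroy($id) {
        $file = $this->path($id);
        if (is_file($file)) {
            unlink($file);
        }
        return true;
    }

    #[\ReturnTypeWillChange]
    public function gc($maxLifetime) {
        // Remove session files that have not been written for $maxLifetime seconds.
        foreach (glob($this->dir . '/sess_*') ?: array() as $file) {
            if (filemtime($file) + $maxLifetime < time()) {
                unlink($file);
            }
        }
        return true;
    }

    private function path($id) {
        // Strip anything that is not a plausible session id character.
        return $this->dir . '/sess_' . preg_replace('/[^A-Za-z0-9,-]/', '', $id);
    }
}
```

An instance would be registered by passing it to session_set_save_handler() before session_start() is called, in place of MySessionHandler above.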

JsonSerializable Interface

You can now control what happens if someone tries to json_encode() your object by implementing the JsonSerializable interface:
class Foo implements JsonSerializable {
    private $data = 'Bar';
    public function jsonSerialize() {
        return array('data'=>$this->data);
    }
}
echo json_encode(new Foo); // Outputs: {"data":"Bar"}

Binary Notation

To go along with PHP’s native support for hexadecimal and octal there is now also binary notation:
$mask = 0b010101;
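
Binary literals read most naturally with bit flags. A small sketch, with made-up PERM_* constants:

```php
<?php
// Hypothetical permission flags, one bit each.
const PERM_READ    = 0b001;
const PERM_WRITE   = 0b010;
const PERM_EXECUTE = 0b100;

$mask = PERM_READ | PERM_EXECUTE;        // 0b101

var_dump(($mask & PERM_WRITE) !== 0);    // bool(false)
var_dump(($mask & PERM_READ) !== 0);     // bool(true)
printf("%03b\n", $mask);                 // 101
echo 0b010101, "\n";                     // 21
```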

Improved Error Messages

Error messages are slightly improved.
Before:
% php -r 'class abc foo' 
Parse error: syntax error, unexpected T_STRING, expecting '{' 
in Command line code on line 1

After:
% php -r 'class abc foo'
Parse error: syntax error, unexpected 'foo' (T_STRING), expecting '{' 
in Command line code on line 1

Perhaps somewhat subtle, but the difference is the value of the stray token “foo” is shown in the error message now.

Array to String Conversion Notice

If you have ever used PHP you have probably ended up with the word “Array” randomly appearing in your page because you tried to output an array directly. Whenever an array is directly converted to a string, chances are it is a bug and there is now a notice for that:
$a = [1,2,3];
echo $a;

Notice: Array to string conversion in example.php on line 2
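
If the array output was intentional, say so explicitly. Two common options, shown here as a small sketch:

```php
<?php
$a = [1, 2, 3];

echo implode(', ', $a), "\n"; // 1, 2, 3
echo json_encode($a), "\n";   // [1,2,3]
```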

Removed Features

We finally pulled the trigger on a number of features that have been marked as deprecated for years. These include allow_call_time_pass_reference, define_syslog_variables, highlight.bg, register_globals, register_long_arrays, magic_quotes, safe_mode, zend.ze1_compatibility_mode, session.bug_compat42, session.bug_compat_warn and y2k_compliance.
Of these, magic_quotes is probably the biggest risk. Despite all the things that are wrong with magic_quotes, naively written applications that don’t do anything to protect themselves from SQL injection were protected by it in previous versions. Upgrading to PHP 5.4 without verifying that proper SQLi-protection measures have been taken could lead to security vulnerabilities.
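
One widely used protection measure is a prepared statement, which keeps user input out of the SQL text. The sketch below assumes the pdo_sqlite extension is available and uses an in-memory database with a made-up users table so that it stays self-contained; the same pattern applies to the other PDO drivers.

```php
<?php
// In-memory SQLite keeps the sketch self-contained; table and data are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('alice')");

// Input that would break out of a naively concatenated query string.
$input = "x' OR '1'='1";

// The placeholder sends the value separately from the SQL text.
$stmt = $db->prepare('SELECT COUNT(*) FROM users WHERE name = ?');
$stmt->execute(array($input));

// No row matches: the input is compared as a literal string.
echo $stmt->fetchColumn(), "\n"; // 0
```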

Other Changes and Features

  • There is a new “callable” typehint for when a method takes a callback as an argument.
  • htmlspecialchars() and htmlentities() now have better support for Asian characters and they default to UTF-8 instead of ISO-8859-1 if the PHP default_charset isn’t explicitly set in the php.ini file.
  • <?= (the short echo syntax) is now always available regardless of the value of the short_open_tag ini setting. This should make templating system authors happy.
  • Session ids are now generated with entropy from /dev/urandom (or equivalent) by default instead of being an option you explicitly have to enable, as in previous versions.
  • mysqlnd, the bundled MySQL Native Driver library, is now used by default for the various extensions that talk to MySQL unless explicitly overridden through ./configure at compile-time.
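
The “callable” typehint from the list above accepts anything PHP can call: a function name, a closure, or an object with an __invoke method. A short sketch, where applyTo() and Doubler are made-up names:

```php
<?php
// Hypothetical helper: the callable hint rejects arguments that cannot be called.
function applyTo(array $items, callable $fn) {
    return array_map($fn, $items);
}

// Hypothetical invokable object, accepted wherever a callable is expected.
class Doubler {
    public function __invoke($n) { return $n * 2; }
}

echo implode(',', applyTo(['a', 'b'], 'strtoupper')), "\n";                     // A,B
echo implode(',', applyTo([1, 2, 3], new Doubler)), "\n";                       // 2,4,6
echo implode(',', applyTo([1, 2, 3], function ($n) { return $n + 1; })), "\n";  // 2,3,4
```
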
There are probably another 100 small changes and features. An upgrade from PHP 5.3 to 5.4 should be extremely smooth, but read the migration guide to make sure. If you are upgrading from an earlier version you likely have a bit more work to do. Check the previous migration guides to get started.

What's Next for PHP?

We don’t have a long term roadmap for PHP. PHP moves with the Web. We don’t know what will be the important Web trends and technologies in 5-10 years, but we do know that PHP will be there with our take on how to approach them.

In the shorter term, we discuss PHP development on the “internals” mailing list, and as consensus starts to form around a large feature it evolves into an RFC. You can find the RFCs at wiki.php.net/rfc. Once a good set of new features has been voted on and properly implemented and tested, we start down the path toward a new release.

PHP has grown and evolved along with the Web and has kept a steady market share of approximately a third of all the Web sites in the world. This includes not only some of the biggest names on the Web, but also a large percentage of the smallest. It is that latter fact that sets PHP apart for me: Scaling up is a natural and even expected characteristic and one that appeals strongly to engineers, but scaling down is less natural and in some cases harder. When you strike the right balance and achieve both in the same codebase allowing dorm room hacks to grow to billion-dollar companies, then you really have something.
