Plugin Tutorial: Data Storage

Many plugins need to store data. As CMSimple_XH is a flat file based system, this is usually done in flat files too. As there is no central data repository 1) in CMSimple_XH, you're free to store your plugin's data in a format and place of your choice. Some notes about common and useful practices are given in the following.

Location

Traditionally many plugins store their data directly in the plugin folder or in a subfolder thereof (say data/ or content/). This is convenient for the developer and for installing the plugin, but it makes backups harder for the user, and existing data may accidentally be overwritten when the plugin is updated.

An alternative is to store data in the content/ folder of CMSimple_XH. This is particularly reasonable if you want to have separate data files for each language or subsite 2). In this case you have to make sure that you don't overwrite files that are managed by the core (content.htm, pagedata.php and their backups) or by other plugins.

However, as there is currently no agreed standard on where to store plugin data, it may be best to offer a configuration option, so that users can decide for themselves where the data should be stored.
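Such a configuration option could be evaluated roughly as follows. This is only a minimal sketch: the function name myplugin_dataFolder() and the idea that an empty option value selects a default folder are assumptions made for illustration, not an established convention.

```php
<?php
/**
 * Returns the folder for the plugin's data files, with a trailing slash.
 *
 * $configured is the value of a (hypothetical) configuration option;
 * if it is empty, $default is used instead.
 */
function myplugin_dataFolder($configured, $default)
{
    $folder = trim($configured) !== '' ? trim($configured) : $default;
    $folder = rtrim($folder, '/') . '/'; // normalize to exactly one trailing slash
    if (!is_dir($folder)) {
        mkdir($folder, 0777, true); // create missing folders recursively
    }
    return $folder;
}
```

In a plugin this might be called with a value from $plugin_cf (for instance a hypothetical option named folder_data) and with a subfolder of the plugin folder as the default.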

Format

Flat files don't pose any restriction on the format of the data you store inside them. It's up to you to choose a suitable one. Common formats which are generally supported by PHP are CSV, PHP includes, JSON, XML and serialized data.

Of course you're not restricted to storing your plugin's data in flat files. You may well use a database of some kind. PHP provides several interfaces, which may or may not be available on a shared hosting server, so you should document which PHP version or extension your plugin requires.

CSV

CSV files store tabular data, so they can be used as a poor man's relational database, each file storing a single table. The CSV format has the advantage of being human readable and editable, and it can easily be imported into many applications such as spreadsheets. The drawback is that splitting the rows and columns at run time takes quite some time, so CSV is not the fastest storage format.

Note that the name CSV (comma separated values) is slightly misleading. Basically the column delimiter can be any character (or string) you prefer. This allows you to process the contents of such files with implode()/explode(), as long as neither the column delimiter nor the row delimiter is contained in any of the values. If you can't guarantee that, you can use fgetcsv() to read the file (which may be the fastest solution anyway), and fputcsv() to write it. The latter function is available since PHP 5.1, so if you want to support older versions, you can use a fallback (there should be several available on the web, and writing your own shouldn't be too hard either). Note, however, that for historic reasons PHP's CSV escaping strategy isn't exactly the same as that of many other applications and libraries.
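A minimal sketch of reading and writing such a file with fgetcsv() and fputcsv() might look like this. The function names and the choice of a semicolon as column delimiter are assumptions for illustration; passing the escape character explicitly to fputcsv() requires a newer PHP version than the function itself.

```php
<?php
// Writes an array of rows (each row an array of strings) to a CSV file.
function myplugin_writeCsv($filename, array $rows)
{
    $handle = fopen($filename, 'w');
    if ($handle === false) {
        return false;
    }
    foreach ($rows as $row) {
        fputcsv($handle, $row, ';', '"', '\\'); // semicolon as column delimiter
    }
    return fclose($handle);
}

// Reads all rows of a CSV file; returns an empty array if it can't be opened.
function myplugin_readCsv($filename)
{
    $rows = array();
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        return $rows;
    }
    while (($row = fgetcsv($handle, 0, ';', '"', '\\')) !== false) {
        if ($row !== array(null)) { // fgetcsv() reports blank lines this way
            $rows[] = $row;
        }
    }
    fclose($handle);
    return $rows;
}
```

Values that contain the delimiter or double quotes are quoted by fputcsv() and restored by fgetcsv(), which is what the implode()/explode() approach can't do.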

PHP Includes

Using plain PHP include files to store data is very versatile and quite fast. The files are human readable and editable (though not as intuitive as CSV files). Reading the data is done with include(); writing the data back is a bit more demanding, but var_export() caters for most of the details.

The greatest issue with this kind of data storage is that an attacker may be able to insert arbitrary PHP code into the files, which will be executed when you include them. So you have to be particularly careful that this can't happen; otherwise your plugin suffers from an arbitrary code execution vulnerability.

Another issue arises with regard to opcode caches. CMSimple_XH 1.5.9 caters for the OPcache extension, which is shipped with PHP 5.5 and later, by simply disabling it completely, which was a quick workaround for the problem. Since CMSimple_XH 1.6 you are responsible for invalidating cached files after saving:

if (function_exists('opcache_invalidate')) {
    opcache_invalidate($filename);
}
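Putting the pieces together, saving and loading an include file could be sketched as follows. The function names are hypothetical, and the sketch assumes that the data are a plain array which var_export() can represent.

```php
<?php
// Saves an array as a PHP file that returns the array when included.
function myplugin_saveInclude($filename, array $data)
{
    $code = "<?php\n\nreturn " . var_export($data, true) . ";\n";
    if (file_put_contents($filename, $code, LOCK_EX) === false) {
        return false;
    }
    if (function_exists('opcache_invalidate')) {
        opcache_invalidate($filename, true); // force invalidation of the cached file
    }
    return true;
}

// Loads the array; returns an empty array if the file is missing or unusable.
function myplugin_loadInclude($filename)
{
    $data = is_file($filename) ? include $filename : array();
    return is_array($data) ? $data : array();
}
```

As var_export() quotes and escapes all string values, user input stored this way ends up as data, not as code. This holds only as long as the file is never written by other means.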

JSON

JSON (JavaScript Object Notation) is very suitable for storing human readable and editable data. It's quite fast and easy to handle with PHP's json_encode()/json_decode(). Note, however, that these functions are not available in PHP < 5.2, so you might consider providing a fallback for older PHP versions (e.g. from the CMBUtils).
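A minimal sketch of a JSON based storage, with hypothetical function names, might look like this:

```php
<?php
// Saves an array as JSON; returns whether this succeeded.
function myplugin_saveJson($filename, array $data)
{
    $json = json_encode($data);
    if ($json === false) {
        return false; // e.g. the data contain invalid UTF-8
    }
    return file_put_contents($filename, $json, LOCK_EX) !== false;
}

// Loads the array; returns an empty array if the file is missing or malformed.
function myplugin_loadJson($filename)
{
    if (!is_readable($filename)) {
        return array();
    }
    $data = json_decode(file_get_contents($filename), true); // true: arrays, not objects
    return is_array($data) ? $data : array();
}
```

Falling back to an empty array for malformed files is a design choice of this sketch; as the files are editable by hand, you may prefer to report such errors to the user.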

XML

XML (eXtensible Markup Language) is sometimes used for data storage, even if it might not be the best idea to do so. Due to the widespread use of XML, particularly on the web, PHP offers several extensions which deal with this format. SimpleXML is probably the most convenient of these, but it's not available for PHP 4 (and might not be available in early versions of PHP 5.x). A faster and generally available alternative is the XML Parser, but this is hard to deal with for any non-trivial XML format, and constructing the XML has to be done “manually”.
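For a simple list of entries, a SimpleXML based round trip could be sketched as follows. The element names (entries, entry, title) and the function names are made up for this example.

```php
<?php
// Converts a list of entries (arrays with 'id' and 'title') to an XML string.
function myplugin_entriesToXml(array $entries)
{
    $xml = new SimpleXMLElement('<entries/>');
    foreach ($entries as $entry) {
        $node = $xml->addChild('entry');
        $node->addAttribute('id', $entry['id']);
        $node->title = $entry['title']; // property assignment escapes &, < and >
    }
    return $xml->asXML();
}

// Converts such an XML string back to a list of entries.
function myplugin_xmlToEntries($string)
{
    $entries = array();
    $xml = simplexml_load_string($string);
    if ($xml === false) {
        return $entries; // malformed XML
    }
    foreach ($xml->entry as $node) {
        $entries[] = array(
            'id' => (string) $node['id'],
            'title' => (string) $node->title,
        );
    }
    return $entries;
}
```

Note that all values come back as strings, so you have to cast them yourself if you need other types.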

Serialized Data

This is the fastest and most convenient way to store and retrieve arbitrary PHP values (it is also how PHP stores session variables by default). Its biggest disadvantage is that the files are hard for humans to read and nearly impossible to write by hand (particularly when encoded as UTF-8).
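A minimal sketch with serialize() and unserialize() follows; the function names and the $default parameter are assumptions for illustration. As unserialize() can instantiate objects, it should only ever be fed files that your plugin has written itself.

```php
<?php
// Saves an arbitrary PHP value in serialized form.
function myplugin_saveSerialized($filename, $value)
{
    return file_put_contents($filename, serialize($value), LOCK_EX) !== false;
}

// Loads the value; returns $default if the file is missing or corrupt.
function myplugin_loadSerialized($filename, $default = array())
{
    if (!is_readable($filename)) {
        return $default;
    }
    $contents = file_get_contents($filename);
    $value = @unserialize($contents);
    // unserialize() returns false on failure, but false may also be the stored value
    if ($value === false && $contents !== serialize(false)) {
        return $default;
    }
    return $value;
}
```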

Concurrency

Web sites may be used by many visitors simultaneously, so you have to cater for concurrent access to your data files. An exception may be made for data that will only be accessed from the back-end, but even in this case it may be reasonable to cater for potential concurrent access. For files that will never be written (a rare case), you can ignore this issue.

The typical way to cater for concurrent file access is to deploy some kind of file locking. PHP offers flock() as a simple solution to do so. Note that flock() has some limitations so it's not absolutely foolproof. However it may still be good enough for most purposes.

CMSimple_XH 1.6.3 is going to introduce XH_lockFile(). For now that is only a simple wrapper around flock(), but it might be augmented. Consider using this function, if available.
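As an illustration, a read-modify-write cycle under an exclusive lock could be sketched like this. The example is a hypothetical hit counter and uses plain flock() to stay independent of the CMSimple_XH version; the 'c+' mode of fopen() is not available in very old PHP versions.

```php
<?php
// Increments the counter stored in $filename and returns the new value.
function myplugin_incrementCounter($filename)
{
    $handle = fopen($filename, 'c+'); // create if missing, but don't truncate
    if ($handle === false) {
        return false;
    }
    if (!flock($handle, LOCK_EX)) { // wait until we have exclusive access
        fclose($handle);
        return false;
    }
    $count = (int) stream_get_contents($handle) + 1;
    ftruncate($handle, 0); // replace the old contents
    rewind($handle);
    fwrite($handle, (string) $count);
    fflush($handle); // make sure the data are written before unlocking
    flock($handle, LOCK_UN);
    fclose($handle);
    return $count;
}
```

The important point is that the lock is held from before reading until after writing; locking only the write would still allow two requests to read the same old value.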

Another strategy is to make use of rename(), which is implemented as an atomic operation. This way you make sure that either all modifications to a single file are done, or none. Note that rename() doesn't work on Windows with PHP < 5.3 (see https://bugs.php.net/bug.php?id=41985).

CMSimple_XH 1.6 introduced XH_renameFile() which is supposed to always work on Windows. Use this function instead of rename(), if available.
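The rename strategy could be sketched as follows: the new contents are written to a temporary file in the same folder, which then replaces the data file. The function name is hypothetical, and plain rename() is used here only to keep the sketch self-contained.

```php
<?php
// Replaces the contents of $filename in a single step.
function myplugin_writeAtomically($filename, $contents)
{
    // the temporary file has to be on the same file system as the target
    $temp = tempnam(dirname($filename), 'tmp');
    if ($temp === false) {
        return false;
    }
    if (file_put_contents($temp, $contents) === false
        || !rename($temp, $filename)
    ) {
        unlink($temp); // don't leave the temporary file behind
        return false;
    }
    return true;
}
```

Readers of the file see either the complete old contents or the complete new contents. Note that tempnam() creates the file with restrictive permissions, so you may have to adjust these afterwards.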

Particularly if you store your plugin's data across multiple files, you have to find a way to keep these files in sync.

Page Data

CMSimple_XH introduced a way to store data, which are specific to every single page, the so called page data. These data are managed by CMSimple_XH, and you should use the page data API to access them. The page data API consists of two parts, which are explained in the following.

General Interface

The general interface allows reading and writing arbitrary page data, which are stored as a one-dimensional associative array of string values. The page data are managed via the global object $pd_router. To register a new page data field, simply call $pd_router->add_interest() and pass the name of the field as the only parameter. As the page data are shared between all plugins, the “golden rule” applies:

You are not alone!

So always prefix the page data field names of your plugins with a unique string, such as the name of the plugin.

To retrieve the page data array of a single page, use $pd_router->find_page() with the numeric index of the page as the only parameter. The data of the current page are available via the global $pd_current. An example:

$time = $pd_current['last_edit']; // sets $time to the last edit date of the current page, which is a UNIX timestamp, e.g. 1317313940

To modify the page data via the general interface, you have to call $pd_router->update($page_index, $array). However, this does not trigger the page data being written to disk. So after you've finished modifying the page data, call $pd_router->model->save(). Afterwards you should exit CMSimple_XH after setting an appropriate Location header to guarantee proper synchronization of the modified page data for the core and other plugins. An example:

$pd_router->add_interest('my_field'); // register the new field
$pageIndex = 2; // we're interested in the 3rd page
$pageData = $pd_router->find_page($pageIndex); // get the page data of the 3rd page
$pageData['my_field'] = 'a string'; // set my_field of page 3 to 'a string'
$pd_router->update($pageIndex, $pageData); // write the modified page data back
$pd_router->model->save(); // trigger writing the page data to disk
header('Location: ' . $sn, true, 303); // set Location header
exit; // exit CMSimple_XH

Since CMSimple_XH 1.6 the page data are stored inside content.htm, so you can use XH_saveContents() to store them. Actually, this is preferred over $pd_router->model->save(), as the model property of the PageDataRouter object should be regarded as private.

Note that modifying the page data via the general interface is necessary only in rare cases, and may better be avoided otherwise.

Tab Interface

The tab interface is the usual way to give a user control over the values of the page data of the current page by adding a tab above the editor in which the user can view, modify and save the page data of your plugin. To do so you have to add a PHP file to your plugin, which is by convention placed directly in the plugin folder and called MY_PLUGIN_view.php (where you have to replace MY_PLUGIN with the actual name of your plugin of course). Inside this file you have to define a function named the same as the file without the .php extension, which expects a single array parameter: the page data of the current page. The function has to return some (X)HTML, which consists mainly of a “form” element with inputs for the page data fields and a submit button. A simplified example:

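function my_plugin_view($page)
{
    global $sn, $su;

    $url = "$sn?$su";
    // escape the stored value, so it can't break out of the attribute
    $value = htmlspecialchars($page['my_field'], ENT_QUOTES, 'UTF-8');
    return <<<EOS
<form action="$url" method="post">
    <input type="text" name="my_field" value="$value">
    <input type="submit" class="submit" name="save_page_data">
</form>
EOS;
}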

Note that the inputs have to be named after the respective page data field they represent, and that the submit button has to be named “save_page_data”, to let CMSimple_XH process the form submission as expected. To be able to use checkboxes you can use a trick to force a value to be submitted when the checkbox is not checked: add a hidden field with the same name to the form before the checkbox:

tag('input type="hidden" name="'.$field.'" value="0"')
.  tag('input type="checkbox" name="'.$field.'" value="1"' . $checked)
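The trick works because PHP keeps only the last value when a name occurs more than once in the submitted data. The following sketch demonstrates this; it builds the HTML as plain strings instead of calling tag(), and the function name is hypothetical.

```php
<?php
// Returns a checkbox preceded by a hidden fallback field of the same name.
function myplugin_checkbox($field, $isChecked)
{
    $name = htmlspecialchars($field, ENT_QUOTES, 'UTF-8');
    $checked = $isChecked ? ' checked="checked"' : '';
    return '<input type="hidden" name="' . $name . '" value="0">'
        . '<input type="checkbox" name="' . $name . '" value="1"' . $checked . '>';
}

// What PHP makes of the two possible submissions:
parse_str('my_field=0&my_field=1', $checkedSubmission); // box checked: both fields are sent
parse_str('my_field=0', $uncheckedSubmission);          // box not checked: only the hidden field
```

Here $checkedSubmission['my_field'] is '1' and $uncheckedSubmission['my_field'] is '0', so the field always has a defined value. The order matters: if the hidden field came after the checkbox, its value would always win.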

Since CMSimple_XH 1.6 the forms of the page data tabs are submitted via AJAX. If you want to force a normal form submission, you have to add onsubmit="return true" to the form element.

To actually display the tab above the editor, you have to call $pd_router->add_tabs() with two parameters: the label of the tab (which you'll usually get from $plugin_tx) and the path of the file.


1)
except the page data; see below
2)
otherwise you have to make use of $sl to distinguish the files
 
Except where otherwise noted, content on this wiki is licensed under the following license: GNU Free Documentation License 1.3