Composr Tutorial: Importing data into Composr

Written by Chris Graham
[Image: Importing will generally use the contents of database tables designed for one product to create equivalent data suitable for Composr]

An integrated importer system is provided, which allows merging of content stored in any supported data format into a running site. This merging process allows combination of any number of supported websites, to form a single combined website.

The importer functionality can be reached from:
Admin Zone > Tools > Import


Importers

At the time of writing, the following software importers are available:
And the following special ones:

Support for other versions and software can be implemented by a Composr partner or other developer; such importers are not officially maintained by the core developers.

Data format importing

We try to support import of neutral data formats into Composr: for example, CSV spreadsheet files, or all the downloads in a directory. These features may help ease a transition and are also linked from the importers list (as shortcuts). Neutral formats are an important part of our approach because having an importer for everything is clearly not viable.

Getting a new importer written for you

New importers can be written by the community (they are not officially supported or maintained by the core developer team). The cost is highly dependent on the software involved and the developer you hire. Importers can be developed by any skilled PHP developer (instructions are in the Code Book). A developer quoting to write an importer would probably not be quoting to do a full site conversion – make sure you are clear in what you ask for.

Memory limits

Importers may use a lot of memory in order to transfer large amounts of data, so you may need to raise the memory limit on your server, or import on a different server and then copy your site over. Information on PHP memory limits is included in our FAQ.

Using importers

[Image: A configuration file may be required. This screenshot illustrates what they are and where they tend to be.]

[Image: Choosing an importer session]

[Image: The list of importers]

[Image: Import setup]

[Image: Choice of import actions]


When you have chosen a product to import from, Composr will ask you for some details. Importers work by connecting to the database of the product being imported. In addition, some require the presence of a configuration file for the product at an accessible path on your server, and will auto-detect database settings from this file. It is recommended that you leave your old site installed and running, although perhaps at a moved location, so that the importer can find all the associated files that it may want to import.

It is strongly recommended that you back up your site files and database before running an importer, in case the importer fails in some way (perhaps an incomplete or unsatisfactory import, or duplication of data by a poorly written third-party importer).

We also recommend that if you are importing data that could contain HTML or XML tags, such as forum posts, the “Subject to a more liberal HTML filter” privilege be turned on for all users. Otherwise imported HTML may not be legible once subjected to Composr's own particular HTML inclusion-list rules.

The importer system is designed to be robust, and is programmed as 're-entrant' code; this means that if an import is halted, via failure, timeout, or cancellation, it can continue from where it left off. This is of particular use if there is an incompatibility between your data and the importer that a programmer needs to fix (such a situation is not unlikely, due to the wide variation in data for any single product across different versions and deployments). Hopefully an import will go completely smoothly, but it is inherently a complex process.
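The re-entrant behaviour described above can be pictured with a minimal sketch. This is not Composr's actual code: the function `run_import`, the step names, and the simulated timeout are all hypothetical, and the sketch only assumes that completed steps are recorded so that a later run can skip them.

```python
def run_import(steps, done):
    """Run each step not yet recorded in `done`, recording it on success."""
    for name, action in steps:
        if name in done:
            continue  # finished in an earlier run, so skip it
        action()  # may raise; everything recorded so far is kept
        done.add(name)


log = []
attempts = {"count": 0}


def flaky_posts():
    """Hypothetical step that fails on its first attempt only."""
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("simulated timeout")
    log.append("posts")


steps = [
    ("forums", lambda: log.append("forums")),
    ("topics", lambda: log.append("topics")),
    ("posts", flaky_posts),
]

done = set()
try:
    run_import(steps, done)  # first run halts part-way through
except RuntimeError:
    pass
run_import(steps, done)  # second run resumes at the failed step
print(log)
```

In the sketch, the forums and topics are imported exactly once even though the import was started twice, which is the property that makes resuming safe.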

Sometimes an importer will list further actions that must be taken after import has finished. The following forms of further action are commonly required:
  • stats recalculation (especially for forum importers)
  • moving of on-disk files from the imported product's upload directory to Composr's (this is sometimes done automatically, depending on how the importer was written).

Advanced information

Import sessions

The importer system uses a concept of 'import sessions'. These are built on top of the Composr login sessions, and are an important feature in allowing you to merge multiple sites into Composr: they keep the progress and 'ID remap table' from each import separate. The 'choose your session' screen exists so that if your Composr session is lost, you can still resume a previous import.
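As a rough illustration of why separate sessions matter, the sketch below keeps one 'ID remap table' per import. The class `ImportSession` and its fields are hypothetical and are not taken from Composr's code; the only assumption is that each session maps old IDs from its source to the new IDs created in Composr.

```python
from dataclasses import dataclass, field


@dataclass
class ImportSession:
    """Hypothetical per-import state: finished features and an ID remap table."""
    source: str
    finished: set = field(default_factory=set)
    id_remap: dict = field(default_factory=dict)  # (kind, old_id) -> new_id

    def remember(self, kind, old_id, new_id):
        self.id_remap[(kind, old_id)] = new_id

    def lookup(self, kind, old_id):
        return self.id_remap.get((kind, old_id))


# Two source sites can both have a member with ID 2; each session keeps
# its own mapping, so the two members never get confused with each other.
site_a = ImportSession("forum A")
site_b = ImportSession("forum B")
site_a.remember("member", 2, 101)
site_b.remember("member", 2, 245)
print(site_a.lookup("member", 2), site_b.lookup("member", 2))
```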

Features, content and dependencies

Importers define a list of features they can import, along with a dependency system to ensure that a feature can only be imported once any features that it is dependent upon have already been imported (for example, forum posts are always dependent on forum topics, and forum topics are always dependent on forums).

On the screen listing each feature to import, features are listed in order according to dependencies. A feature might first depend on importing other features listed above it in the list but will never depend on any of the features listed below it. That way, unless you skip features, you shouldn't run into any dependency errors.
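The ordering rule can be sketched with a topological sort. The feature names come from the example above; the `depends_on` dictionary and the use of Python's standard `graphlib` module are illustrative assumptions, not a description of how Composr itself computes the order.

```python
from graphlib import TopologicalSorter

# Each feature maps to the set of features it depends upon.
depends_on = {
    "forums": set(),
    "forum topics": {"forums"},
    "forum posts": {"forum topics"},
}

# static_order() yields every feature after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order)


def can_import(feature, already_imported):
    """A feature is importable once all of its dependencies are imported."""
    return depends_on[feature] <= already_imported
```

Importing in the listed order never breaks a dependency; skipping "forum topics" and then trying "forum posts" is the kind of case where `can_import` would return false.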

Cache-rebuild

Composr is designed so that forms of redundancy, such as parsed Comcode, and various forms of tally, can be recalculated dynamically as Composr runs. Knowing this, and in order to remove load from the importer itself, these tasks are therefore usually omitted by the importer.

Converting to Conversr

[Image: Composr forum drivers are specially coded PHP files stored in the sources/forum directory (or sources_custom/forum).]

[Image: The forum driver can be changed using the Installation Options editor, but you should not do this unless you know exactly what you are doing.]

If you have been running Composr and a third-party forum, and wish to switch to using a complete Composr solution (Composr with Conversr), this is possible if there is a forum importer for your current forum product. The opportunity is presented to move to Conversr as the last importable feature of a forum import, and the function will 'jump' forum drivers for you and remap any usergroup and user IDs. It is still strongly advised to check your permissions after performing this to ensure extra access wasn't accidentally opened up to users.

Converting an existing Composr website manually (experts-only)

If you have installed Composr, and interfaced to a third-party forum, but want to switch to Conversr without an import (because your forum is essentially empty still), then it is possible but we would discourage it for anyone other than an expert user with time on their hands.

The complexity is due to the member and usergroup IDs Composr uses being tied to the member and usergroup IDs of the third-party forum, and these being different after changing to Conversr.

To do this you need to:
  1. Lock down the website so only your IP address can access it (outside the scope of Composr – use .htaccess files, for example).
  2. Use the Installation Options editor to set a different forum driver.
  3. Reset all your permissions on the website.
  4. Clean up the database manually. Any Composr systems that reference users will reference different users after switching, because member IDs will have changed: for example, point transactions and admin logs will reference the wrong users. You will likely want to delete such records (point transaction records, for instance) to erase the problem.
  5. Re-open access.

Specifics of importers

This section covers limitations specific to individual importers.

Discussion Forum importing

Compatibility notes in general:
  • personal/private messages will be glued together to form Private Topics. This is a very useful feature, and really cleans up an inbox.

phpBB

Compatibility notes:
  • phpBB uses a very strange usergroup configuration, so it is necessary to check your usergroups, permissions and usergroup membership after import. Forum permissions will not import properly.
  • phpBB uses its own HTML entities / tags in post and poll content. Check your forum posts and topic polls to ensure they are rendering correctly after import. You may need to make some manual edits.

vBulletin

Compatibility notes:
  • the vBulletin calendar recurrence system is very different to the Composr calendar recurrence system, so recurrences may not be imported perfectly
  • forms of rating, such as topic rating, karma, and 'goes to coventry', are not imported. However reputation is imported as points.
  • attachments, photos and avatars are extracted from the database to the appropriate Composr uploads directory. It is best to use the live database for the import, because there is a MySQL/vBulletin bug in some MySQL versions that causes binary database data to be corrupted in SQL dumps.

Invision Board

Compatibility notes:
  • Many Invision Board options will not be imported
  • attachments, photos and avatars are moved to the appropriate Composr upload directory

Simple Machines Forum

Compatibility notes:
  • SMF has an advanced IP and member banning system. Please consider the following nuances:
    • The Composr software does not support banning by e-mail (members can easily create more e-mails) or hostname (hostname should be banned at the server or firewall level). Those triggers will be ignored.
    • The Composr software does not support member bans that expire; they will be imported instead as probation time.
    • Non-expiring partial bans on members in SMF where they are restricted on posting but not logging in will be imported as a 'permanent' probation (due to the 32-bit integer problem, the expiration will be 19 January, 2038. See tracker issue #3046).
    • IP triggers in SMF will always be treated as full bans when imported into the Composr software; those IP addresses will not have any access to the site. Expiration will be respected if there is one.
    • IP ranges are not supported in the Composr software; any IP triggers specified as a range in SMF will be ignored. IP range bans can be done on the web server level and are much more efficient than through the Composr software.
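The 19 January 2038 date mentioned above follows from the largest value a signed 32-bit integer can hold when it is interpreted as a Unix timestamp. The short check below assumes that this is how the expiration is stored; the variable names are illustrative only.

```python
from datetime import datetime, timezone

# Largest value of a signed 32-bit integer, read as seconds since 1970-01-01 UTC.
MAX_INT32 = 2**31 - 1

latest = datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)
print(latest.isoformat())  # 2038-01-19T03:14:07+00:00
```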

Composr merge (advanced)

When merging with another copy of Composr, make sure the other copy is running the same major and minor version.

The 'cms merge' importer can merge multiple Composr websites together that either:
  1. each run on Conversr (and thus, Conversr data gets merged)
  2. or, each share a forum database (what we call a "multi-site-network" situation)

The 'cms merge' importer cannot:
  • Work with anything other than Composr data (third-party forum data cannot be merged, for instance)
  • Merge a Composr site into a Composr site that does not use the same forum database
    • unless you are highly technically proficient and capable of manually changing member and usergroup IDs, using a tool such as phpMyAdmin (because these IDs could not be mapped correctly for data that used a 'foreign' forum)
    • or unless both sites run on Conversr (because in this situation, the importer can import everything, and correctly remap any member and usergroup IDs)
  • Import Conversr data directly into a third-party forum
    • because the imported data would end up in the Conversr database tables, regardless of whether they are currently being used for the Composr site's active forum.

Remember:
  • you must specify to import from a Composr database, not a forum database.

Other limitations:
  • The importer does not cover Comcode pages fully, only their metadata. However these are just .txt files (<zone>/pages/comcode_custom/<language>/<page>.txt) that may be copied from one install to another.
  • Shopping orders aren't imported; make sure there are no outstanding orders at the point of importing.
  • Commandr-fs GUIDs and filenames aren't imported, so these will change post-import.
  • Due to a complex cyclic dependency, usergroup custom fields won't import.
  • Comcode ownership is not imported if multi-lang-content is on.

Please note that URL and page-link links will not be altered during the import, meaning it is likely they will need updating (because resource IDs change). For example, if a link somewhere linked to download #5, it might need to be changed to link to download #123.
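If you know which old IDs became which new IDs, a small script can help with updating such links. The sketch below is only an illustration: the link pattern `downloads:entry:<id>`, the function name, and the remap values are assumptions made for the example, so check the real format of the links on your own site before adapting it.

```python
import re

# Assumed, illustrative link format; adjust the pattern to your site's links.
LINK_PATTERN = re.compile(r"(downloads:entry:)(\d+)")


def remap_download_links(text, remap):
    """Replace old download IDs in `text` using the old-to-new mapping `remap`."""
    def swap(match):
        old_id = int(match.group(2))
        new_id = remap.get(old_id, old_id)  # leave unknown IDs untouched
        return f"{match.group(1)}{new_id}"

    return LINK_PATTERN.sub(swap, text)


updated = remap_download_links(
    "See site:downloads:entry:5 and site:downloads:entry:9",
    {5: 123},
)
print(updated)  # See site:downloads:entry:123 and site:downloads:entry:9
```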

HTML website importer (advanced)

The HTML website importer is a special importer that can import an HTML site that is stored on disk. It is designed for migrating existing static HTML websites into Composr.

It is a very advanced tool that is suitable only for programmers able to tweak the code.

The importer will try and establish what your GLOBAL_HTML_WRAP.tpl template should be, but it cannot be perfect at this. It is also not able to extract panels or menus in a particularly clever way (they all go as static markup in the aforementioned template files), so you should consider your imported site as a base that will require some cleaning.

If you do not have access to the files of your site, other than from the live server, you can download a website using the 'wget' tool. This tool exists on most Linux installs by default, but can also be installed for Mac and Windows.
You run wget using a command like:

Code (Bash)

wget -nc -r <yoururl>
 
and the files from the URL's domain will be saved neatly in a directory named after the domain name, underneath the command prompt's current working directory. wget works by spidering/crawling your website for files, so it can only download what it finds by following the links that exist on it. Note that it also is not able to find files referenced by your CSS (e.g. background images).

The HTML website importer will try to do the following:
  • Create zones
  • Create Comcode pages
  • Copy over PHP files as pages (mini-modules)
  • Create the GLOBAL_HTML_WRAP.tpl template
  • Try and fix links and file paths to be workable Composr links
  • Copy over other files that are referenced (such as image files), to underneath uploads/website_specific, and fix the URLs accordingly
  • Work out your website name
  • Extract meta keywords and description, for each page
When you run the importer you will only get an option to import 'Comcode pages'; all the above things are subsumed within that.

The importer uses a sophisticated algorithm to detect what your header and footer are. It isn't 100% perfect however (it is very CPU intensive, and may lock onto markup similarities between comparison pages that should not be universal). If you have a header.txt and/or footer.txt file in your source directory, the importer will consider these the header/footer instead, and use them when it comes to stripping down the pages.
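To give a feel for the idea, here is a deliberately naive stand-in for header/footer detection: it treats the longest text shared at the start of every page as the header, and the longest text shared at the end as the footer. This is not the importer's actual algorithm, and the function names and sample pages are invented for the example.

```python
import os


def common_header_footer(pages):
    """Return the longest prefix and suffix shared by every page."""
    header = os.path.commonprefix(pages)
    # Reverse what is left of each page so that a shared suffix becomes a prefix.
    reversed_rest = [page[len(header):][::-1] for page in pages]
    footer = os.path.commonprefix(reversed_rest)[::-1]
    return header, footer


def strip_page(page, header, footer):
    """Remove the detected header and footer, leaving the page body."""
    body = page[len(header):]
    return body[:len(body) - len(footer)] if footer else body


pages = [
    "<html><nav>Menu</nav><p>Home</p><footer>c</footer></html>",
    "<html><nav>Menu</nav><p>About us</p><footer>c</footer></html>",
]
header, footer = common_header_footer(pages)
print([strip_page(page, header, footer) for page in pages])  # ['Home', 'About us']
```

Note how the opening `<p>` tag ends up in the detected header simply because both sample pages happen to start their content with it. That is a small instance of locking onto markup similarities that should not be universal, and it is the sort of case where supplying header.txt and footer.txt gives better results.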

After importing

[Image: After importing some data a success screen is shown. Often special messages will be included on this screen.]

If the importer you used copied all relevant files, like avatars, photos and attachments, into Composr's directories, then you can remove the imported product's directory entirely.
However, it is advisable to keep the directory, database, and import session around for a few weeks, just in case any data was not correctly imported and extra maintenance is required to put things right: importing is a technically complex process, so it is always best to keep your options open.

Additional help

As importing may not always go smoothly, you may want to arrange to have a professional developer help with the process. You may wish to contact a developer before import, so that someone can be prepared to assist, or perform the whole process themselves.
