Composr Tutorial: Importing data into Composr
Written by Chris Graham
An integrated importer system is provided, which allows merging of content stored in any supported data format into a running site. This merging process allows combination of any number of supported websites, to form a single combined website.
The importer functionality can be reached from:
Admin Zone > Tools > Import
Importers
At the time of writing, the following software importers are available:
- Invision Board 2.0.x
- MyBB 1.8.x
- phpBB 3.3.x
- Simple Machines Forum 2.1.x
- vBulletin 3.0.x / 3.5.x
- WordPress
- HTML website importer
- Merge from another copy of the latest version of Composr
Data format importing
We try to support importing neutral data formats into Composr: for example, CSV spreadsheet files, or all the downloads in a directory. These features may help ease a transition and are also linked from the importers list (as shortcuts). Neutral formats are an important part of our approach because having an importer for everything is clearly not viable.
Getting a new importer written for you
New importers can be written. It costs around $1000 to develop an importer, but the cost is highly dependent on the software involved. Importers can be developed by any skilled PHP developer (instructions are in the Code Book). A developer quoting to write an importer would probably not be quoting to do a full site conversion – make sure you are clear about what you are asking for.
Memory limits
Importers may use a lot of memory in order to transfer large amounts of data, so you may need to raise the memory limit on your server, or run the import on a different server and then copy your site over. Information on PHP memory limits is included in our FAQ.
Using importers
When you have chosen a product to import from, Composr will ask you for some details. Importers work by connecting to the database of the product being imported. In addition, some require the presence of a configuration file for the product at an accessible path on your server, and will auto-detect database settings from this file. It is recommended that you leave your old site installed and running, although perhaps at a moved location, so that the importer can find all the associated files that it may want to import.
It is strongly recommended that you back up your site files and database before running an importer, in case the importer fails in some way (perhaps an incomplete or unsatisfactory import, or duplication of data by a poorly written third-party importer).
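A minimal sketch of such a backup from the command line, assuming a MySQL database and shell access to the server. The function names, paths and arguments are hypothetical placeholders and are not part of Composr; `tar` and `mysqldump` are the standard tools.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: archive the site files into a timestamped tarball
# and print the path of the archive that was written.
backup_files() {
    local site_dir="$1" out_dir="$2" stamp archive
    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="${out_dir}/site-files-${stamp}.tar.gz"
    tar -czf "$archive" -C "$site_dir" .
    echo "$archive"
}

# Hypothetical helper: dump one MySQL database to a .sql file.
# -p with no value makes mysqldump prompt for the password.
backup_database() {
    local db_name="$1" db_user="$2" out_dir="$3"
    mysqldump --single-transaction -u "$db_user" -p "$db_name" \
        > "${out_dir}/${db_name}.sql"
}
```

These might be called as `backup_files /path/to/site /path/to/backups` and `backup_database your_db your_user /path/to/backups`. Keep the backup directory outside the site directory so that the archive does not try to include itself.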
We also recommend that if you are importing data that could contain HTML or XML tags, such as forum posts, the “Subject to a more liberal HTML filter” privilege be turned on for all users. Otherwise imported HTML may not be legible once subjected to Composr's own particular HTML inclusion-list rules.
The importer system is designed to be robust, and is programmed as 're-entrant' code; this means that if an import is halted, via failure, timeout, or cancellation, it can continue from where it left off. This is of particular use if there is an incompatibility between your data and the importer that a programmer needs to fix (such a situation is not unlikely, due to the wide variation in data for any single product across different versions and deployments). Hopefully an import will go completely smoothly, but it is inherently a complex process.
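The idea of re-entrant processing can be illustrated with a small sketch. This is not Composr's actual implementation: the function name and the progress file are hypothetical, and real import work is replaced by an `echo`. Each finished step is recorded, so a rerun skips what is already done.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sketch: run each named step once, remembering completed
# steps in a progress file so that a later run can resume.
run_steps() {
    local progress_file="$1"; shift
    touch "$progress_file"
    local step
    for step in "$@"; do
        # -x matches the whole line, -F treats the step name literally
        if grep -qxF "$step" "$progress_file"; then
            echo "skip $step"
            continue
        fi
        echo "import $step"            # real import work would happen here
        echo "$step" >> "$progress_file"
    done
}
```

Calling `run_steps progress.txt forums topics` and later `run_steps progress.txt forums topics posts` would import only `posts` on the second run, because the first two steps are already recorded.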
Sometimes an importer will list further actions that must be taken after import has finished. The following forms of further action are commonly required:
- stats recalculation (especially for forum importers)
- moving of on-disk files from the imported product's upload directory to Composr's (this is sometimes done automatically, depending on how the importer was written).
Advanced information
Import sessions
The importer system uses a concept of 'import sessions'. These are built on top of the Composr login sessions, and are an important feature in allowing you to merge multiple sites into Composr: they keep the progress and 'ID remap table' from each import separate. The 'choose your session' screen exists so that if your Composr session is lost, you can still resume a previous import.
Features, content and dependencies
Importers define a list of features they can import, along with a dependency system to ensure that a feature can only be imported once any features that it is dependent upon have already been imported (for example, forum posts are always dependent on forum topics, and forum topics are always dependent on forums).
On the screen listing each feature to import, features are listed in order according to dependencies. A feature might first depend on importing other features listed above it in the list but will never depend on any of the features listed below it. That way, unless you skip features, you shouldn't run into any dependency errors.
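The ordering described above is a topological sort. As a rough illustration only (this is not how Composr computes its list, and the `import_order` function name is hypothetical), the standard `tsort` utility produces such an order from the forum example:

```shell
#!/usr/bin/env bash
set -eu

# Each input line is "prerequisite dependent": the left-hand feature must be
# imported before the right-hand one. tsort prints an order that respects this.
import_order() {
    tsort
}

import_order <<'EOF'
forums topics
topics posts
EOF
```

For this chain the only valid order is forums, then topics, then posts, which matches the rule that a feature never depends on anything listed below it.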
Cache-rebuild
Composr is designed so that forms of redundancy, such as parsed Comcode, and various forms of tally, can be recalculated dynamically as Composr runs. Knowing this, and in order to remove load from the importer itself, these tasks are therefore usually omitted by the importer.
Converting to Conversr
If you have been running Composr and a third-party forum, and wish to switch to using a complete Composr solution (Composr with Conversr), this is possible if there is a forum importer for your current forum product. The opportunity is presented to move to Conversr as the last importable feature of a forum import, and the function will 'jump' forum drivers for you and remap any usergroup and user IDs. It is still strongly advised to check your permissions after performing this to ensure extra access wasn't accidentally opened up to users.
Converting an existing Composr website manually (experts-only)
If you have installed Composr, and interfaced to a third-party forum, but want to switch to Conversr without an import (because your forum is essentially empty still), then it is possible but we would discourage it for anyone other than an expert user with time on their hands.
The complexity is due to the member and usergroup IDs Composr uses being tied to the member and usergroup IDs of the third-party forum, and these being different after changing to Conversr.
To do this you need to:
- Lock down the website so only your IP address can access it (outside the scope of Composr – use .htaccess files, for example).
- Use the Installation Options editor to set a different forum driver.
- Reset all your permissions on the website.
- Clean up data that references users. Any Composr system that references users will reference different users after switching, as member IDs will have changed: for example, point transactions and admin logs will reference the wrong users. You will therefore likely want to do some manual cleaning up of the database (such as deleting point transaction records, to erase the problem).
- Re-open access.
Specifics of importers
This section covers limitations of particular importers.
Discussion Forum importing
Compatibility notes in general:
- personal/private messages will be glued together to form Private Topics. This is a very useful feature, and really cleans up an inbox.
phpBB
Compatibility notes:
- phpBB uses a very strange usergroup configuration, so it is necessary to check your usergroups, permissions and usergroup membership after import. Forum permissions will not import properly.
- phpBB uses its own HTML entities / tags in post and poll content. Check your forum posts and topic polls to ensure they are rendering correctly after import. You may need to make some manual edits.
vBulletin
Compatibility notes:
- the vBulletin calendar recurrence system is very different to the Composr calendar recurrence system, so recurrences may not be imported perfectly
- forms of rating, such as topic rating, karma, and 'goes to coventry', are not imported. However reputation is imported as points.
- attachments, photos and avatars are extracted from the database to the appropriate Composr uploads directory. It is best to use the live database for the import, because there is a MySQL/vBulletin bug in some MySQL versions that causes binary database data to be corrupted in SQL dumps.
Invision Board
Compatibility notes:
- Many Invision Board options will not be imported
- attachments, photos and avatars are moved to the appropriate Composr upload directory
Simple Machines Forum
Compatibility notes:
- SMF has an advanced IP and member banning system. Please consider the following nuances:
- The Composr software does not support banning by e-mail (members can easily create more e-mails) or hostname (hostname should be banned at the server or firewall level). Those triggers will be ignored.
- The Composr software does not support member bans that expire; they will be imported instead as probation time.
- Non-expiring partial bans on members in SMF where they are restricted on posting but not logging in will be imported as a 'permanent' probation (due to the 32-bit integer problem, the expiration will be 19 January, 2038. See tracker issue #3046).
- IP triggers in SMF will always be treated as full bans when imported into the Composr software; those IP addresses will not have any access to the site. Expiration will be respected if there is one.
- IP ranges are not supported in the Composr software; any IP triggers specified as a range in SMF will be ignored. IP range bans can be done on the web server level and are much more efficient than through the Composr software.
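The 19 January 2038 date quoted above for 'permanent' probation follows from the largest value a signed 32-bit timestamp can hold. A quick check, assuming GNU `date` (on BSD or macOS, `date -u -r 2147483647` is the equivalent form):

```shell
#!/usr/bin/env bash
set -eu

# Largest value of a signed 32-bit integer: 2^31 - 1
max_ts=$(( (1 << 31) - 1 ))
echo "$max_ts"                                   # prints 2147483647

# Interpret it as seconds since 1970-01-01 UTC (GNU date syntax)
date -u -d "@${max_ts}" '+%Y-%m-%d %H:%M:%S'     # prints 2038-01-19 03:14:07
```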
Composr merge (advanced)
When merging with another copy of Composr, make sure the other copy is running the same major and minor version.
The 'cms merge' importer can merge multiple Composr websites together that either:
- each run on Conversr (and thus, Conversr data gets merged)
- or, each share a forum database (what we call a "multi-site-network" situation)
The 'cms merge' importer cannot:
- Work with anything other than Composr data (third-party forum data cannot be merged, for instance)
- Merge a Composr site into a Composr site that does not use the same forum database
- unless you are highly technically proficient and capable of manually changing member and usergroup IDs, using a tool such as phpMyAdmin (because these IDs could not be mapped correctly for data that used a 'foreign' forum)
- or unless both sites run on Conversr (because in this situation, the importer can import everything, and correctly remap any member and usergroup IDs)
- Import Conversr data directly into a third-party forum
- because the imported data would end up in the Conversr database tables, regardless of whether they are currently being used for the Composr site's active forum.
Remember:
- you must specify to import from a Composr database, not a forum database.
Other limitations:
- The importer does not cover Comcode pages fully, only their metadata. However these are just .txt files (<zone>/pages/comcode_custom/<language>/<page>.txt) that may be copied from one install to another.
- Shopping orders aren't imported; make sure there are no outstanding orders at the point of importing.
- Commandr-fs GUIDs and filenames aren't imported, so these will change post-import.
- Due to a complex cyclic-dependency, usergroup custom fields won't import
- Comcode ownership is not imported if multi-lang-content is on
Please note that URL and page-link links will not be altered during the import, meaning it is likely they will need updating (because resource IDs change). For example, if a link somewhere linked to download #5, it might need to be changed to link to download #123.
HTML website importer (advanced)
The HTML website importer is a special importer that can import an HTML site that is stored on disk. It is designed for migrating existing static HTML websites into Composr.
It is a very advanced tool that is suitable only for programmers able to tweak the code.
The importer will try to establish what your GLOBAL_HTML_WRAP.tpl template should be, but it cannot be perfect at this. It is also not able to extract panels or menus in a particularly clever way (they all go as static markup in the aforementioned template), so you should consider your imported site as a base that will require some cleaning.
If you do not have access to the files of your site, other than from the live server, you can download a website using the 'wget' tool. This tool exists on most Linux installs by default, but can also be installed for Mac and Windows.
You run wget using a command like:
Code (Bash)
wget -nc -r <yoururl>
The HTML website importer will try to do the following:
- Create zones
- Create Comcode pages
- Copy over PHP files as pages (mini-modules)
- Create the GLOBAL_HTML_WRAP.tpl template
- Try and fix links and file paths to be workable Composr links
- Copy over other files that are referenced (such as image files), to underneath uploads/website_specific, and fix the URLs accordingly
- Work out your website name
- Extract meta keywords and description, for each page
The importer uses a sophisticated algorithm to detect what your header and footer are. It isn't 100% perfect however (it is very CPU intensive, and may lock onto markup similarities between comparison pages that should not be universal). If you have a header.txt and/or footer.txt file in your source directory, the importer will consider these the header/footer instead, and use them when it comes to stripping down the pages.
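To give a feel for what header detection involves, here is a deliberately naive sketch that prints the leading lines two pages have in common. It is an illustration only, not the importer's algorithm, and the `common_header` function name is hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Naive illustration only: print the leading lines that two pages share.
# The real importer's detection is more involved than this.
common_header() {
    awk '
        NR == FNR { first[FNR] = $0; next }                  # remember page one
        (FNR in first) && first[FNR] == $0 { print; next }   # still identical
        { exit }                                             # first difference
    ' "$1" "$2"
}
```

Running `common_header page1.html page2.html` on two pages that start with the same markup prints that shared markup and stops at the first line that differs. A line-by-line comparison like this is easily fooled by small per-page differences, which is one reason supplying header.txt and footer.txt yourself can be more reliable.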
After importing
If the importer you used copied all relevant files, like avatars, photos and attachments, into Composr's directories, then you can remove the imported product's directory entirely.
However, it is advisable to keep the directory, database, and import session around for a few weeks – just in case any data was not correctly imported and extra maintenance is required to put things right. Importing is a technically complex process, so it is always best to keep your options open.
Additional help
As importing may not always go smoothly, you may want to arrange to have a professional developer help with the process. If you run a website for which you can justify temporarily hiring professional help, you may wish to contact a developer before the import, so that someone is prepared to assist, or to perform the whole process themselves.