Opened 11 months ago

Closed 4 months ago

#1884 closed enhancement (fixed)

Strip phpBB bbcode, bbcode_uid and magic urls during import

Reported by: netweb
Owned by: netweb
Priority: normal
Milestone: 2.3
Component: Importers
Version: 2.0
Severity: normal
Keywords: has-patch
Cc:

Description

Currently, during an import from phpBB, all data is copied raw from the source database tables. Ideally, phpBB-specific BBCode markup and 'magic_url' links should be stripped during import.

BBCode 'quote' with bbcode_uid '177gyb6t'

eg. [quote:177gyb6t]

BBCode 'img' with image width '799' and height '628'

eg. [IMG:799:628]

URL truncated by phpBB's magic_url, which shortens URLs longer than 55 characters

eg. http://example.com/subdir/sub/wind … se-preview
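
The first two cases above could be handled with a pair of regular expressions. The following is a minimal sketch, not the code from the attached patches. It assumes the bbcode_uid is always eight letters or digits, as in the example, and the helper name strip_phpbb_uid() is hypothetical.

```php
<?php
// Hypothetical helper, not taken from the attached patches.
function strip_phpbb_uid( $text ) {
	// [quote:177gyb6t] -> [quote], [/quote:177gyb6t] -> [/quote].
	// An optional "=value" attribute before the uid is preserved.
	// Assumption: the uid is exactly eight letters or digits.
	$text = preg_replace(
		'/\[(\/?[a-z]+(?:=[^\]]*?)?):[a-z0-9]{8}\]/i',
		'[$1]',
		$text
	);

	// [IMG:799:628] -> [IMG]; the width and height suffix is dropped.
	$text = preg_replace( '/\[(\/?img):\d+:\d+\]/i', '[$1]', $text );

	return $text;
}

echo strip_phpbb_uid( '[quote:177gyb6t]test text[/quote:177gyb6t]' ), "\n";
```

Once the uid is gone, the remaining tags are plain BBCode that a generic parser can recognize.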

Further reading on these phpBB features:

http://wiki.phpbb.com/Tutorial.Parsing_text
http://wiki.phpbb.com/Strip_bbcode

Attachments (4)

1884.diff (5.7 KB) - added by netweb 5 months ago.
[URL] URL Re-writes now complete
1884.2.diff (12.4 KB) - added by netweb 5 months ago.
Update with fixed _bbp_forum_parent_id and inline docs
1884.3.diff (19.6 KB) - added by netweb 5 months ago.
Fix now strips custom phpBB codes with regex from $field first then the clean $field is parsed with parser.php to decode remaining HTML | 8 new phpBB user profile fields mapped to _usermeta | Whitespace & Inline Docs also updated for import tools consistency.
1884.4.diff (24.7 KB) - added by netweb 4 months ago.
(All previous diffs are included in this latest iteration 1884.4.diff)


Change History (22)

  • Type changed from defect to enhancement

I also have the need for this but with Invision. I imagine the callback should be built into each converter to remove all special code. Perhaps some standard functions like the [quote] shortcode can be converted to a format that bbpress quotes plugin can recognize.

For what it's worth, I've more or less implemented some BBCode clean-up code when I developed my migration plugin from e107 to WordPress+bbPress.

You'll find my code there: http://plugins.trac.wordpress.org/browser/e107-importer/trunk/e107-importer.php#L1289 (see the pre_cleanup_markup() and post_cleanup_markup() methods). This might inspire you for the feature you're requesting.

Last edited 10 months ago by Coolkevman

@Coolkevman: Thanks, will take a look.

Example raw data from phpBB

[quote=someone]test text[/quote]

This is parsed at http://bbpress.trac.wordpress.org/browser/branches/plugin/bbp-admin/bbp-parser.php#L831 and http://bbpress.trac.wordpress.org/browser/branches/plugin/bbp-admin/bbp-parser.php#L681

(bbp-parser.php is the NBBC BBCode Parser http://sourceforge.net/projects/nbbc/)

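// Excerpt from the NBBC quote handler. The wrapper function (hypothetical name)
// and the is_string() condition are assumed so that the fragment parses.
function quote_html( $default, $content ) {
	if ( ! is_string( $default ) )
		$title = "Quote:";
	else $title = htmlspecialchars(trim($default)) . " wrote:";
	return "\n<div class=\"bbcode_quote\">\n<div class=\"bbcode_quote_head\">"
	. $title . "</div>\n<div class=\"bbcode_quote_body\">"
	. $content . "</div>\n</div>\n";
}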

Resulting in bbPress

<div class="bbcode_quote">
<div class="bbcode_quote_head">someone wrote:</div>
<div class="bbcode_quote_body">test text</div>
</div>

The base bbcode_ classes can easily be styled with CSS. Any phpBB BBCode carrying a 'bbcode_uid' needs to be stripped during the import/conversion, otherwise bbp-parser.php leaves the BBCode intact, e.g. [quote:177gyb6t]

(I'll create a new ticket for bbcode_ CSS class names to be added to bbpress.css & bbpress-rtl.css)

  • Owner set to netweb

Note: 1884.diff is only a 'work in progress'
Hope to finish the horrid last regex in the morning and then upload the finished diff

Due to the vast array of URL formats being imported from phpBB, this ticket is on hold until #1932 is resolved; the URL 'preg_replace' rules in this ticket can then be fully tested.
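
As an illustration of what such a URL rule might look like, here is a minimal sketch. It assumes phpBB stores a magic URL as an anchor wrapped in `<!-- m -->` comment markers, with the full address in the href and the shortened text as the link label. The helper name restore_magic_urls() is hypothetical, and this is not the rule set from the patch.

```php
<?php
// Hypothetical helper. Assumed storage format for a magic URL:
// <!-- m --><a class="postlink" href="FULL_URL">SHORTENED TEXT</a><!-- m -->
function restore_magic_urls( $text ) {
	// Replace the whole wrapped anchor with the untruncated href value.
	return preg_replace(
		'/<!-- m --><a class="postlink" href="([^"]+)">.*?<\/a><!-- m -->/s',
		'$1',
		$text
	);
}

$stored = 'See <!-- m --><a class="postlink" href="http://example.com/a/very/long/path">'
	. 'http://example.com/a ... path</a><!-- m --> for details';

echo restore_magic_urls( $stored ), "\n";
```

Emitting the bare URL rather than a new anchor leaves link creation to whatever auto-linking the target platform applies. Other marker types and HTML entities inside the href would need their own rules.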

Will finish this off shortly and watch for any gotchas related to #WP23050

  • Milestone changed from 2.3 to 2.4

Moving to 2.4 so we can focus on importers then. If an urgent/compelling patch comes in, we can sneak this in to 2.3.

netweb 5 months ago

[URL] URL Re-writes now complete

  • Keywords has-patch added
  • Milestone changed from 2.4 to 2.3

This is now finished and ready for commit; moving back to 2.3.

netweb 5 months ago

Update with fixed _bbp_forum_parent_id and inline docs

@netweb: I tested your patch (1884.2.diff).
It seems that the $phpbb_uid variable is not used in the callback_html function. It should be set to $field, right?
However, if I do so, the conversion of the [quote="xyz"] tags does not work properly for me.

  • Keywords has-patch removed

Hugo,

Thanks for testing it, I will update the patch shortly.

netweb 5 months ago

Fix now strips custom phpBB codes with regex from $field first then the clean $field is parsed with parser.php to decode remaining HTML | 8 new phpBB user profile fields mapped to _usermeta | Whitespace & Inline Docs also updated for import tools consistency.

  • Keywords needs-testing added

An FAQ/Known Issues docs page is now on the codex http://codex.bbpress.org/import-forums/phpbb/

Works fine for me now. Thanks a lot!

  • Keywords has-patch added; needs-testing removed

Detailed breakdown of all changes contained in 1884.4.diff (inclusive of all diffs in this ticket)

  • Cleaned whitespace and updated inline docs and phpdoc for consistency across all import tools.
  • Fixed '_bbp_forum_parent_id'
  • Added Forum topic count -> '_bbp_topic_count'
  • Added Forum reply count -> '_bbp_reply_count'
  • Added Forum total topic count -> '_bbp_total_topic_count'
  • Added Forum status -> '_bbp_status'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_date'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_date_gmt'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_modified'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_modified_gmt'
  • Added Topic reply count -> '_bbp_reply_count'
  • Added Topic total reply count -> '_bbp_total_reply_count'
  • Added Topic date -> '_bbp_last_active_time'
  • Added Topic author ip -> '_bbp_author_ip'
  • Added phpbb user ICQ -> '_bbp_phpbb_user_icq'
  • Added phpbb user MSNM -> '_bbp_phpbb_user_msnm'
  • Added phpbb user Jabber -> 'jabber'
  • Added phpbb user Occupation -> '_bbp_phpbb_user_occ'
  • Added phpbb user Interests -> '_bbp_phpbb_user_interests'
  • Added phpbb user Signature -> '_bbp_phpbb_user_sig'
  • Added phpbb user Location -> '_bbp_phpbb_user_from'
  • Added phpbb user avatar filename -> '_bbp_phpbb_user_avatar'
  • Added function callback_topic_reply_count
  • Added function callback_html (Strips custom phpBB 'magic_url' and 'bbcode_uid' from $field first, before passing $field to parser.php)
  • Added bbPress @mentions to converted phpBB [quote] BBCodes
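
The last item, adding @mentions to converted quotes, could look roughly like the sketch below. It is an illustration under stated assumptions, not the committed callback_html code. The helper name quotes_to_mentions() is hypothetical, any bbcode_uid is assumed to have been stripped already, and the phpBB username is assumed to be usable as the mention name unchanged.

```php
<?php
// Hypothetical helper, not the committed implementation.
function quotes_to_mentions( $text ) {
	// [quote="name"] or [quote=name] -> "@name wrote:" followed by <blockquote>.
	// The quotes may be literal or stored as &quot; entities.
	$text = preg_replace_callback(
		'/\[quote=(?:"|&quot;)?([^\]]+?)(?:"|&quot;)?\]/i',
		function ( $m ) {
			return '<em>@' . trim( $m[1] ) . ' wrote:</em><blockquote>';
		},
		$text
	);

	// A quote without an author becomes a bare blockquote.
	$text = preg_replace( '/\[quote\]/i', '<blockquote>', $text );

	// Closing tags.
	return preg_replace( '/\[\/quote\]/i', '</blockquote>', $text );
}

echo quotes_to_mentions( '[quote="someone"]test text[/quote]' ), "\n";
```

A real importer would also need to map the phpBB username to the matching WordPress user, which this sketch leaves out.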

netweb 4 months ago

(All previous diffs are included in this latest iteration 1884.4.diff)

  • Resolution set to fixed
  • Status changed from new to closed

(In [4703]) Updates to phpBB converter:

  • Cleaned whitespace and updated inline docs and phpdoc for consistency across all import tools.
  • Fixed '_bbp_forum_parent_id'
  • Added Forum topic count -> '_bbp_topic_count'
  • Added Forum reply count -> '_bbp_reply_count'
  • Added Forum total topic count -> '_bbp_total_topic_count'
  • Added Forum status -> '_bbp_status'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_date'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_date_gmt'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_modified'
  • Fixed Forum dates -> to_type = 'forum' and to_fieldname = 'post_modified_gmt'
  • Added Topic reply count -> '_bbp_reply_count'
  • Added Topic total reply count -> '_bbp_total_reply_count'
  • Added Topic date -> '_bbp_last_active_time'
  • Added Topic author ip -> '_bbp_author_ip'
  • Added phpbb user ICQ -> '_bbp_phpbb_user_icq'
  • Added phpbb user MSNM -> '_bbp_phpbb_user_msnm'
  • Added phpbb user Jabber -> 'jabber'
  • Added phpbb user Occupation -> '_bbp_phpbb_user_occ'
  • Added phpbb user Interests -> '_bbp_phpbb_user_interests'
  • Added phpbb user Signature -> '_bbp_phpbb_user_sig'
  • Added phpbb user Location -> '_bbp_phpbb_user_from'
  • Added phpbb user avatar filename -> '_bbp_phpbb_user_avatar'
  • Added function callback_topic_reply_count
  • Added function callback_html (Strips custom phpBB 'magic_url' and 'bbcode_uid' from $field first, before passing $field to parser.php)
  • Added bbPress @mentions to converted phpBB [quote] BBCodes
  • Props netweb.
  • Fixes #1884.
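
The "source -> meta key" entries in the commit message describe a field map. The sketch below shows how such a map might be applied to one source row. The to_type and to_fieldname keys come from the list above. The from_fieldname key, the phpBB column names (forum_topics, forum_posts) and the map_row() helper are assumptions made for illustration.

```php
<?php
// Illustrative field map. The phpBB column names are assumptions.
$field_map = array(
	array( 'from_fieldname' => 'forum_topics', 'to_type' => 'forum', 'to_fieldname' => '_bbp_topic_count' ),
	array( 'from_fieldname' => 'forum_posts',  'to_type' => 'forum', 'to_fieldname' => '_bbp_reply_count' ),
);

// Hypothetical helper: copy one source row into the target keys for a given type.
function map_row( array $row, array $field_map, $to_type ) {
	$out = array();
	foreach ( $field_map as $map ) {
		// Only apply entries for the requested type whose source column exists.
		if ( $map['to_type'] === $to_type && isset( $row[ $map['from_fieldname'] ] ) ) {
			$out[ $map['to_fieldname'] ] = $row[ $map['from_fieldname'] ];
		}
	}
	return $out;
}

print_r( map_row( array( 'forum_topics' => 12, 'forum_posts' => 40 ), $field_map, 'forum' ) );
```

Keeping the mapping as data means a new field, such as one of the user profile fields listed above, is added as one more array entry rather than new conversion code.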