Automatically Converting MediaWiki Articles to LaTeX

I would like to convert our wiki articles into LaTeX documents automatically, for the occasions when we need nice-looking printouts. That would be especially useful for longer articles.

For this I found the MediaWiki extension Wiki2LaTeX, which already does quite a lot and is very flexible to extend. What I was still missing, though, was image handling: the extension does not process images.

So I sat down and wrote a small class that takes care of this. I have already published it on the extension's discussion page, but to be on the safe side I am posting it here as well 😉

Image Processing

I have implemented a simple solution for processing internal images. It searches for the file in the images/ directory and copies it to an image directory under the tmp/tmp-123... directory. At the moment it handles JPG files only. You can add my little helper class with the following steps:

First of all you need to add a few lines to function internalLinkHelper in w2lParser.php before the line // First, check for |:

    // Internal image links start with "Bild:" (German) or "Image:".
    if (preg_match('/^(Bild|Image):/i', $link)) {
        // Strip only the leading prefix and wrap the rest in a marker tag.
        $link = preg_replace('/^(Bild|Image):/i', '', $link);
        return "<IMAGE>" . $link . "</IMAGE>";
    }

Then you need to include my class file at the top of w2lExporter.php:

    require_once('w2lImages.php');

In w2lExporter.php you need to edit function w2l_unknown_action to process the images. I added the following line

    $parsed = w2lImages::processImages($parsed, $mytemp);

to the sections where $action is w2lpdf or w2ltex, right after these lines:

    $parsed = $parser->parse($to_parse);
    $mytemp = $helper->path;

And this is my little class file w2lImages.php:

    <?php
    // Subdirectory of the temporary export directory that receives the copied images.
    define('W2L_ImageDir', "Bilder");
    // Default caption prefix and label stem ("Abbildung" is German for "figure").
    define('W2L_ImageTitle', "Abbildung");

    class w2lImages {
        public static function processImages($parsed, $mytemp) {
            $matches = array();
            // Non-greedy, so that several images on one line are matched separately.
            if (!preg_match_all('/<IMAGE>(.*?)<\/IMAGE>/', $parsed, $matches)) {
                return $parsed;
            }
            $targetDir = $mytemp . "/" . W2L_ImageDir;
            foreach ($matches[1] as $cntr => $link) {
                $imgTag = $matches[0][$cntr];
                $number = $cntr + 1; // number the figures from 1
                // Link format: FileName.jpg|optional caption
                $links = explode("|", $link);
                $imgFileName = $links[0];
                if (stripos($imgFileName, '.jpg') === false) {
                    $parsed = str_replace($imgTag, "Image is not a JPG: " . $imgFileName, $parsed);
                    continue;
                }
                // Search below images/; escapeshellarg() keeps odd file names from breaking the command.
                $found = shell_exec("find ./images -name " . escapeshellarg($imgFileName));
                $imgFile = null;
                foreach (explode("\n", (string) $found) as $if) {
                    // Skip empty lines and thumbnail paths.
                    if (strlen($if) > 0 && strpos($if, "thumb") === false) {
                        $imgFile = $if;
                    }
                }
                if ($imgFile !== null && !file_exists($targetDir)) {
                    mkdir($targetDir, 0777);
                }
                if ($imgFile !== null && copy($imgFile, $targetDir . "/" . $imgFileName)) {
                    $imgCaption = isset($links[1]) ? $links[1] : W2L_ImageTitle . " " . $number;
                    $imgLatex  = '\\begin{figure}[htb]' . "\n";
                    $imgLatex .= '\\centering' . "\n";
                    $imgLatex .= '\\includegraphics[width=\\textwidth]{' . $imgFileName . '}' . "\n";
                    $imgLatex .= '\\caption{' . $imgCaption . '}' . "\n";
                    $imgLatex .= '\\label{fig:' . W2L_ImageTitle . $number . '}' . "\n";
                    $imgLatex .= '\\end{figure}' . "\n";
                    $parsed = str_replace($imgTag, $imgLatex, $parsed);
                } else {
                    // Also covers files that were not found, so no <IMAGE> tag is left in the LaTeX output.
                    $parsed = str_replace($imgTag, "Image could not be copied: " . $imgFileName, $parsed);
                }
            }
            return $parsed;
        }
    }

This will result in images included like this:

    \begin{figure}[htb]
    \centering
    \includegraphics[width=\textwidth]{ImageName.jpg}
    \caption{Abbildung 1}
    \label{fig:Abbildung1}
    \end{figure}

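
The lookup in w2lImages.php shells out to `find`, which needs a Unix-like host and permission to call `shell_exec`. As a rough sketch only, the same search could be done in plain PHP with the SPL directory iterators. The function name `w2lFindImage` is made up for this illustration and is not part of Wiki2LaTeX. The sketch also assumes that thumbnails live below a directory called `thumb`, which matches the "thumb" check in the class above.

```php
<?php
/**
 * Hypothetical helper (not part of Wiki2LaTeX): returns the path of the first
 * regular file named $fileName below $baseDir, ignoring everything inside a
 * "thumb" directory. Returns null if no such file exists.
 */
function w2lFindImage($baseDir, $fileName) {
    if (!is_dir($baseDir)) {
        return null;
    }
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($baseDir, FilesystemIterator::SKIP_DOTS)
    );
    $thumbPart = DIRECTORY_SEPARATOR . 'thumb' . DIRECTORY_SEPARATOR;
    foreach ($files as $file) {
        $path = $file->getPathname();
        // Only regular files with exactly this name, and nothing from a thumb directory.
        if ($file->isFile()
            && $file->getFilename() === $fileName
            && strpos($path, $thumbPart) === false) {
            return $path;
        }
    }
    return null;
}
```

Inside processImages, a call such as `$imgFile = w2lFindImage('./images', $imgFileName);` could then stand in for the `shell_exec` line and the loop that filters its output. I have not tried this against a live wiki, so treat it as a starting point.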
Any comments are welcome 😉

2 Comments

Dirk Hünniger
June 16, 2012 at 11:02:08

To export MediaWiki articles to LaTeX I use:

http://de.wikibooks.org/wiki/Benutzer:Dirk_Huenniger/wb2pdf

It processes images as well.

Unfortunately I had to mangle the link to get it past the spam filter.

Stefan
June 16, 2012 at 20:46:18

Hi Dirk. Thanks for the tip. I have fixed the link.
