package Text::ASCIIMathML;
# The Text::ASCIIMathML module is copyright (c) 2006 Mark Nodine,
# USA. All rights reserved.
# You may use and distribute them under the terms of either the GNU
# General Public License or the Artistic License, as specified in the
# Included in the MultiMarkdown package by Fletcher T. Penney
# $Id: ASCIIMathML.pm 499 2008-03-23 13:03:19Z fletcher $
# MultiMarkdown Version 2.0.b6
Text::ASCIIMathML - Perl extension for parsing ASCIIMathML text into MathML
use Text::ASCIIMathML;
$parser = Text::ASCIIMathML->new();
$parser->SetAttribute(ForMoz => 1);
$ASCIIMathML = "int_0^1 e^x dx";
$mathML = $parser->TextToMathML($ASCIIMathML);
$mathML = $parser->TextToMathML($ASCIIMathML, [title=>$ASCIIMathML]);
$mathML = $parser->TextToMathML($ASCIIMathML, undef, [displaystyle=>1]);
$mathMLTree = $parser->TextToMathMLTree($ASCIIMathML);
$mathMLTree = $parser->TextToMathMLTree($ASCIIMathML, [title=>$ASCIIMathML]);
$mathMLTree = $parser->TextToMathMLTree($ASCIIMathML,undef,[displaystyle=>1]);
$mathML = $mathMLTree->text();
$latex = $mathMLTree->latex();
Text::ASCIIMathML is a parser for ASCIIMathML text which produces
MathML XML markup strings that are suitable for rendering by any
MathML-compliant browser.
The parser uses the following attributes, which are settable through
the SetAttribute method:
Specifies that the fonts should be optimized for Netscape/Mozilla/Firefox.
The output of the TextToMathML method always follows the schema
<math><mstyle>...</mstyle></math>
The first argument of TextToMathML is the ASCIIMathML text to be
parsed into MathML. The second argument is a reference to an array of
attribute/value pairs to be attached to the <math> node and the third
argument is a reference to an array of attribute/value pairs for the
<mstyle> node. Common attributes for the <math> node are "title" and
"xmlns"=>"&mathml;". Common attributes for the <mstyle> node are
"mathcolor" (for text color), "displaystyle"=>"true" for using display
style instead of inline style, and "fontfamily".
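The way the two array references map onto the output schema can be sketched in plain Perl. This is only an illustration of the layout described above, not the module's own implementation: C<attr_string> and C<wrap_math> are hypothetical helper names, and the body is assumed to be an already-built MathML string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::ASCIIMathML): turn a reference
# to an array of attribute/value pairs into an XML attribute string.
sub attr_string {
    my ($pairs) = @_;
    return '' unless $pairs && @$pairs;
    my @rest = @$pairs;            # copy so the caller's array is untouched
    my $out  = '';
    while (my ($name, $value) = splice(@rest, 0, 2)) {
        $value = '' unless defined $value;
        $value =~ s/"/&quot;/g;    # keep the attribute value well-formed
        $out .= qq( $name="$value");
    }
    return $out;
}

# Hypothetical helper: wrap a body in the <math><mstyle>...</mstyle></math>
# schema, attaching the second argument to <math> and the third to <mstyle>.
sub wrap_math {
    my ($body, $math_attr, $mstyle_attr) = @_;
    return '<math' . attr_string($math_attr) . '>'
         . '<mstyle' . attr_string($mstyle_attr) . '>'
         . $body
         . '</mstyle></math>';
}

print wrap_math('<mi>x</mi>', [title => 'x'], [displaystyle => 'true']), "\n";
```

Passing C<undef> for either reference simply leaves that element without attributes, which mirrors how the optional arguments are described above.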
=head2 ASCIIMathML markup
The syntax is very permissive and does not generate syntax
errors. This allows mathematically incorrect expressions to be
displayed, which is important for teaching purposes. It also causes
less frustration when previewing formulas.
If you encode 'x^2' or 'a_(mn)' or 'a_{mn}' or '(x+1)/y' or 'sqrtx',
you pretty much get what you expect. The choice of grouping
parentheses is up to you (they don't have to match either). If the
displayed expression can be parsed uniquely without them, they are
omitted. Most LaTeX commands are also supported, so the last two
formulas above can also be written as '\frac{x+1}{y}' and '\sqrt{x}'.
The parser uses no operator precedence and only respects the grouping
brackets, subscripts, superscripts, fractions and (square) roots. This
is done for reasons of efficiency and generality. The resulting MathML
code can quite easily be processed further to ensure additional
syntactic requirements of any particular application.
Here is a definition of the grammar used to parse
ASCIIMathML expressions. In the Backus-Naur form given below, the
letter on the left of the C<::=> represents a category of symbols that
could be one of the possible sequences of symbols listed on the right.
The vertical bar C<|> separates the alternatives.
  c ::= [A-Za-z] | numbers | greek letters | other constant symbols
  u ::= 'sqrt' | 'text' | 'bb' | other unary symbols for font commands
  b ::= 'frac' | 'root' | 'stackrel' | 'newcommand' | 'newsymbol'
  l ::= ( | [ | { | (: | {:            left brackets
  r ::= ) | ] | } | :) | :}            right brackets
  S ::= c | lEr | uS | bSS | "any"     simple expression
  E ::= SE | S/S | S_S | S^S | S_S^S   expression (fraction, sub-,
                                       super-, subsuperscript)
=head3 The translation rules
Each terminal symbol is translated into a corresponding MathML
node. The constants are mostly converted to their respective Unicode
symbols. The other expressions are converted as follows:
  lSr             ->  <mrow>lSr</mrow>
                      (note that any pair of brackets can be used to
                      delimit subexpressions, they don't have to match)
  sqrt S          ->  <msqrt>S'</msqrt>
  text S          ->  <mtext>S'</mtext>
  "any"           ->  <mtext>any</mtext>
  frac S1 S2      ->  <mfrac>S1' S2'</mfrac>
  root S1 S2      ->  <mroot>S2' S1'</mroot>
  stackrel S1 S2  ->  <mover>S2' S1'</mover>
  S1/S2           ->  <mfrac>S1' S2'</mfrac>
  S1_S2           ->  <msub>S1 S2'</msub>
  S1^S2           ->  <msup>S1 S2'</msup>
  S1_S2^S3        ->  <msubsup>S1 S2' S3'</msubsup> or
                      <munderover>S1 S2' S3'</munderover> (in some cases)
In the rules above, the expression C<S'> is the same as C<S>, except that if
C<S> has an outer level of brackets, then C<S'> is the expression inside
A simple syntax for matrices is also recognized:
l(S11,...,S1n),(...),(Sm1,...,Smn)r
l[S11,...,S1n],[...],[Sm1,...,Smn]r.
Here C<l> and C<r> stand for any of the left and right
brackets (just like in the grammar they do not have to match). Both of
these expressions are translated to
<mrow>l<mtable><mtr><mtd>S11</mtd>...
<mtd>S1n</mtd></mtr>...
<mtr><mtd>Sm1</mtd>...
<mtd>Smn</mtd></mtr></mtable>r</mrow>.
Note that each row must have the same number of expressions, and there
should be at least two rows.
LaTeX matrix commands are not recognized.
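The shape of the matrix translation can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: C<matrix_markup> is a hypothetical helper, the cells are assumed to be already-translated MathML strings, and the brackets are emitted as plain characters exactly as in the schema above.

```perl
use strict;
use warnings;

# Hypothetical helper: build the <mtable> markup from rows of cells.
# $left and $right are the surrounding bracket characters; each row is
# a reference to an array of already-translated cell strings.
sub matrix_markup {
    my ($left, $right, @rows) = @_;
    my $width = @rows ? scalar @{ $rows[0] } : 0;

    # Mirror the rules stated above: at least two rows, all of equal length.
    return if @rows < 2 || grep { @$_ != $width } @rows;

    my $body = join '', map {
        '<mtr>' . join('', map { "<mtd>$_</mtd>" } @$_) . '</mtr>'
    } @rows;

    return "<mrow>$left<mtable>$body</mtable>$right</mrow>";
}

print matrix_markup('[', ']', ['a', 'b'], ['c', 'd']), "\n";
```

A single row, or rows of unequal length, make the sketch return nothing, which corresponds to the input not being treated as a matrix.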
The input formula is broken into tokens using a "longest matching
initial substring search". Suppose the input formula has been
processed from left to right up to a fixed position. The longest
string from the list of constants (given below) that matches the
initial part of the remainder of the formula is the next token. If
there is no matching string, then the first character of the remainder
is the next token. The symbol table at the top of the ASCIIMathML.js
script specifies whether a symbol is a math operator (surrounded by a
C<< <mo> >> tag) or a math identifier (surrounded by a C<< <mi> >>
tag). For single character tokens, letters are treated as math
identifiers, and non-alphanumeric characters are treated as math
operators. For digits, see "Numbers" below.
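A minimal sketch of the longest-matching-initial-substring search is given here. It assumes a tiny, hypothetical symbol list (C<@symbols>) instead of the full table of constants, and the names C<next_token> and C<tokenize> are illustrative only. Sorting the names by decreasing length before joining them into an alternation is what makes the regular expression prefer the longest constant.

```perl
use strict;
use warnings;

# Hypothetical subset of the constants; the real table is much larger.
my @symbols = ('sum', 'sin', 'sinh', 'in', '<=', '<=>', '->', 'alpha');

# Longest names first, so the alternation tries 'sinh' before 'sin'.
my $symbol_re = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @symbols;

# Returns (token, remainder), or an empty list when the input is exhausted.
sub next_token {
    my ($str) = @_;
    $str =~ s/^\s+//;                       # spaces only separate tokens
    return if $str eq '';
    if ($str =~ /^(\d+(?:\.\d+)?)/) {       # digits with optional fraction
        return ($1, substr($str, length $1));
    }
    if ($str =~ /^($symbol_re)/) {          # longest matching constant
        return ($1, substr($str, length $1));
    }
    return (substr($str, 0, 1), substr($str, 1));   # single character
}

sub tokenize {
    my ($str) = @_;
    my @tokens;
    while (my ($token, $rest) = next_token($str)) {
        push @tokens, $token;
        $str = $rest;
    }
    return @tokens;
}

print join(' | ', tokenize('sinhx <= 2.5')), "\n";
```

With this list, 'sin' is one token while 'si n' yields three single-character tokens, which illustrates why spaces are significant.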
Spaces are significant when they separate characters and thus prevent
a certain string of characters from matching one of the
constants. Multiple spaces and end-of-line characters are equivalent
A string of digits, optionally followed by a decimal point (a period)
and another string of digits, is parsed as a single token and
converted to a MathML number, i.e., enclosed with the C<< <mn> >>
=item Lowercase letters
C<alpha> C<beta> C<chi> C<delta> C<epsilon> C<eta> C<gamma> C<iota>
C<kappa> C<lambda> C<mu> C<nu> C<omega> C<phi> C<pi> C<psi> C<rho>
C<sigma> C<tau> C<theta> C<upsilon> C<xi> C<zeta>
=item Uppercase letters
C<Delta> C<Gamma> C<Lambda> C<Omega> C<Phi> C<Pi> C<Psi> C<Sigma>
C<varepsilon> C<varphi> C<vartheta>
=head3 Standard functions
sin cos tan csc sec cot sinh cosh tanh log ln det dim lim mod gcd lcm
=head3 Operation symbols
  Type  Description                                 Entity
  xx    Cross product                               ×
  -:    Divided by                                  ÷
  @     Compose functions                           ∘
  o+    Circle with plus                            ⊕
  ox    Circle with x                               ⊗
  o.    Circle with dot                             ⊙
  sum   Sum for sub- and superscript                ∑
  prod  Product for sub- and superscript            ∏
  ^^^   Logic "and" for sub- and superscript        ⋀
  vvv   Logic "or" for sub- and superscript         ⋁
  nn    Logic "intersect"                           ∩
  nnn   Logic "intersect" for sub- and superscript  ⋂
  uu    Logic "union"                               ∪
  uuu   Logic "union" for sub- and superscript      ⋃
=head3 Relation symbols
  Type  Description              Entity
  <=    Less than or equal       ≤
  >=    Greater than or equal    ≥
  -lt   Precedes                 ≺
  >-    Succeeds                 ≻
  !in   Not an element of        ∉
  sube  Subset or equal          ⊆
  supe  Superset or equal        ⊇
  -=    Equivalent               ≡
  ~=    Congruent to             ≅
  ~~    Asymptotically equal to  ≈
  prop  Proportional to          ∝
=head3 Logical symbols
  Type  Description            Entity
  iff   If and only if         ⇔
  EE    There exists           ∃
  _|_   Perpendicular, bottom  ⊥
  |--   Right tee              ⊢
  |==   Double right tee       ⊨
=head3 Grouping brackets
  Type  Description                       Entity
  (:    Left angle bracket                ⟨
  :)    Right angle bracket               ⟩
  {:    Invisible left grouping element
  :}    Invisible right grouping element
=head3 Miscellaneous symbols
  Type     Description          Entity
  oint     Contour integral     ∮
  del      Partial derivative   ∂
  grad     Gradient             ∇
  +-       Plus or minus        ±
  aleph    Hebrew letter aleph  ℵ
  :.       Therefore            ∴
  cdots    Three centered dots  ⋯
  \<sp>    Non-breaking space (<sp> means space)
  quad     Quad space
  diamond  Diamond              ⋄
  square   Square               □
  |__      Left floor           ⌊
  __|      Right floor          ⌋
  |~       Left ceiling         ⌈
  ~|       Right ceiling        ⌉
  CC       Complex numbers      ℂ
  NN       Natural numbers      ℕ
  QQ       Rational numbers     ℚ
  RR       Real numbers         ℝ
  Type  Description                 Entity
  darr  Down arrow                  ↓
  rarr  Right arrow                 →
  ->    Right arrow                 →
  larr  Left arrow                  ←
  harr  Horizontal (two-way) arrow  ↔
  rArr  Right double arrow          ⇒
  lArr  Left double arrow           ⇐
  hArr  Horizontal double arrow     ⇔
  Type    Description         Output
  hat x   Hat over x          <mover><mi>x</mi><mo>^</mo></mover>
  bar x   Bar over x          <mover><mi>x</mi><mo>¯</mo></mover>
  ul x    Underbar under x    <munder><mi>x</mi><mo>_</mo></munder>
  vec x   Right arrow over x  <mover><mi>x</mi><mo>→</mo></mover>
  dot x   Dot over x          <mover><mi>x</mi><mo>.</mo></mover>
  ddot x  Double dot over x   <mover><mi>x</mi><mo>..</mo></mover>
bbb A Double-struck A
cc A Calligraphic (script) A
tt A Teletype (monospace) A
=head3 Defining new commands and symbols
It is possible to define new commands and symbols using the
'newcommand' and 'newsymbol' binary operators. The former defines a
macro that gets expanded and reparsed as ASCIIMathML, and the latter
defines a constant that gets used as a math operator (C<< <mo> >>)
element. Both of the arguments must be text, optionally enclosed in
grouping operators. The 'newsymbol' operator also allows the
second argument to be a group of two text strings where the first is
the MathML operator and the second is the LaTeX code to be output.
For example, 'newcommand "DDX" "{:d/dx:}"' would define a new command
'DDX'. It could then be invoked like 'DDXf(x)', which would
expand to '{:d/dx:}f(x)'. The text 'newsymbol{"!le"}{"≰"}'
could be used to create a symbol you could invoke with '!le', as in 'a
=head2 Attributes for <math>
The title attribute for the element, if specified. In many browsers,
this string will appear if you hover over the MathML markup.
The id attribute for the element, if specified.
The class attribute for the element, if specified.
=head2 Attributes for <mstyle>
The displaystyle attribute for the element, if specified. One of the
values "true" or "false". If the displaystyle is false, then fractions
are represented with a smaller font size and the placement of
subscripts and superscripts of sums and integrals changes.
The mathvariant attribute for the element, if specified. One of the
values "normal", "bold", "italic", "bold-italic", "double-struck",
"bold-fraktur", "script", "bold-script", "fraktur", "sans-serif",
"bold-sans-serif", "sans-serif-italic", "sans-serif-bold-italic", or
The mathsize attribute for the element, if specified. Either "small",
"normal" or "big", or of the form "number v-unit".
A string representing the font family.
The mathcolor attribute for the element, if specified. It should be in
one of the forms "#rgb" or "#rrggbb", or should be an html-color-name.
=item C<mathbackground>
The mathbackground attribute for the element, if specified. It should
be in one of the forms "#rgb" or "#rrggbb", or an html-color-name, or
the keyword "transparent".
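The color forms listed for C<mathcolor> and C<mathbackground> can be checked before they are passed as attributes. The sketch below is only a convenience illustration: C<is_hex_color> and C<is_mathbackground> are hypothetical names, and treating any purely alphabetic word as a possible html-color-name is an assumption, since the list of valid names is not given here.

```perl
use strict;
use warnings;

# Hypothetical check for the "#rgb" and "#rrggbb" forms.
sub is_hex_color {
    my ($value) = @_;
    return $value =~ /\A#(?:[0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})\z/ ? 1 : 0;
}

# Hypothetical check for a mathbackground value: a hex color, the keyword
# "transparent", or (assumption) an alphabetic color name such as "red".
sub is_mathbackground {
    my ($value) = @_;
    return 1 if $value eq 'transparent';
    return 1 if is_hex_color($value);
    return $value =~ /\A[A-Za-z]+\z/ ? 1 : 0;
}

for my $value ('#fff', '#12ab9F', '#12345', 'transparent') {
    printf "%-12s %s\n", $value, is_mathbackground($value) ? 'ok' : 'rejected';
}
```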
=head1 BUGS AND SUGGESTIONS
If you find bugs, think of anything that could improve Text::ASCIIMathML
or have any questions related to it, feel free to contact the author.
Mark Nodine <mnodine@alum.mit.edu>
<http://www1.chapman.edu/~jipsen/mathml/asciimathsyntax.xml>
=head1 ACKNOWLEDGEMENTS
This Perl module has been created by modifying Peter Jipsen's
ASCIIMathML.js script. He deserves full credit for the original
implementation; any bugs have probably been introduced by me.
The Text::ASCIIMathML module is copyright (c) 2006 Mark Nodine,
USA. All rights reserved.
You may use and distribute them under the terms of either the GNU
General Public License or the Artistic License, as specified in the
# Creates a new Text::ASCIIMathML parser object
return bless {}, $class;
# Sets an attribute to a given value
# Arguments: Attribute name, attribute value
# Supported attributes:
# ForMoz Boolean to optimize for Netscape/Mozilla/Firefox
sub SetAttribute : method {
my ($self, $attr, $val) = @_;
$self->{attr}{$attr} = $val;
# Converts an AsciiMathML string to a MathML one
# Arguments: AsciiMathML string,
# optional ref to array of attribute/value pairs for math node,
# optional ref to array of attribute/value pairs for mstyle node
# Returns: MathML string
sub TextToMathML : method {
my $tree = TextToMathMLTree(@_);
return $tree ? $tree->text : '';
# Converts an AsciiMathML string to a tree of MathML nodes
# Arguments: AsciiMathML string,
# optional ref to array of attribute/value pairs for math node,
# optional ref to array of attribute/value pairs for mstyle node
# Returns: top Text::ASCIIMathML::Node object or undefined
sub TextToMathMLTree : method {
my ($self, $expr, $mathAttr, $mstyleAttr) = @_;
$expr = '' unless defined $expr;
my $mstyle = $self->_createElementMathML('mstyle');
$mstyle->setAttribute(@$mstyleAttr) if $mstyleAttr;
$self->{nestingDepth} = 0;
$mstyle->appendChild(($self->_parseExpr($expr, 0))[0]);
return unless $mstyle->childNodes > 0;
my $math = $self->_createMmlNode('math', $mstyle);
$math->setAttribute(@$mathAttr) if $mathAttr;
# Creates a Text::ASCIIMathML::Node object with no tag
# Returns: node object
sub _createDocumentFragment : method {
return Text::ASCIIMathML::Node->new($self);
# Creates a Text::ASCIIMathML::Node object
# Returns: node object
sub _createElementMathML : method {
return Text::ASCIIMathML::Node->new($self, $t);
# Creates a Text::ASCIIMathML::Node object and appends a node as a child
# Arguments: tag, node
# Returns: node object
sub _createMmlNode : method {
my ($self, $t, $obj) = @_;
my $node = Text::ASCIIMathML::Node->new($self, $t);
$node->appendChild($obj);
# Creates a Text::ASCIIMathML::Node text object with the given text
# Returns: node object
sub _createTextNode : method {
my ($self, $text) = @_;
return Text::ASCIIMathML::Node->newText ($self, $text);
# Finds maximal initial substring of str that appears in names
# return null if there is none
# Returns: matched input, entry from AMSymbol (if any)
sub _getSymbol : method {
my ($input, $symbol) = $self->_getSymbol_(@_);
$self->{previousSymbol} = $symbol->{ttype} if $symbol;
return $input, $symbol;
# character lists for Mozilla/Netscape fonts
my $AMcal = [0xEF35,0x212C,0xEF36,0xEF37,0x2130,0x2131,0xEF38,0x210B,0x2110,0xEF39,0xEF3A,0x2112,0x2133,0xEF3B,0xEF3C,0xEF3D,0xEF3E,0x211B,0xEF3F,0xEF40,0xEF41,0xEF42,0xEF43,0xEF44,0xEF45,0xEF46];
my $AMfrk = [0xEF5D,0xEF5E,0x212D,0xEF5F,0xEF60,0xEF61,0xEF62,0x210C,0x2111,0xEF63,0xEF64,0xEF65,0xEF66,0xEF67,0xEF68,0xEF69,0xEF6A,0x211C,0xEF6B,0xEF6C,0xEF6D,0xEF6E,0xEF6F,0xEF70,0xEF71,0x2128];
my $AMbbb = [0xEF8C,0xEF8D,0x2102,0xEF8E,0xEF8F,0xEF90,0xEF91,0x210D,0xEF92,0xEF93,0xEF94,0xEF95,0xEF96,0x2115,0xEF97,0x2119,0x211A,0x211D,0xEF98,0xEF99,0xEF9A,0xEF9B,0xEF9C,0xEF9D,0xEF9E,0x2124];
# Create closure for static variables
"sqrt" => { tag=>"msqrt", output=>"sqrt", tex=>'', ttype=>"UNARY" },
"root" => { tag=>"mroot", output=>"root", tex=>'', ttype=>"BINARY" },
"frac" => { tag=>"mfrac", output=>"/", tex=>'', ttype=>"BINARY" },
"/" => { tag=>"mfrac", output=>"/", tex=>'', ttype=>"INFIX" },
"stackrel" => { tag=>"mover", output=>"stackrel", tex=>'', ttype=>"BINARY" },
"_" => { tag=>"msub", output=>"_", tex=>'', ttype=>"INFIX" },
"^" => { tag=>"msup", output=>"^", tex=>'', ttype=>"INFIX" },
"text" => { tag=>"mtext", output=>"text", tex=>'', ttype=>"TEXT" },
"mbox" => { tag=>"mtext", output=>"mbox", tex=>'', ttype=>"TEXT" },
"\"" => { tag=>"mtext", output=>"mbox", tex=>'', ttype=>"TEXT" },
"newcommand" => { ttype=>"BINARY"},
"newsymbol" => { ttype=>"BINARY" },
"alpha" => { tag=>"mi", output=>"α", tex=>'', ttype=>"CONST" },
"beta" => { tag=>"mi", output=>"β", tex=>'', ttype=>"CONST" },
"chi" => { tag=>"mi", output=>"χ", tex=>'', ttype=>"CONST" },
"delta" => { tag=>"mi", output=>"δ", tex=>'', ttype=>"CONST" },
"Delta" => { tag=>"mo", output=>"Δ", tex=>'', ttype=>"CONST" },
"epsi" => { tag=>"mi", output=>"ε", tex=>"epsilon", ttype=>"CONST" },
"varepsilon" => { tag=>"mi", output=>"ɛ", tex=>'', ttype=>"CONST" },
"eta" => { tag=>"mi", output=>"η", tex=>'', ttype=>"CONST" },
"gamma" => { tag=>"mi", output=>"γ", tex=>'', ttype=>"CONST" },
"Gamma" => { tag=>"mo", output=>"Γ", tex=>'', ttype=>"CONST" },
"iota" => { tag=>"mi", output=>"ι", tex=>'', ttype=>"CONST" },
"kappa" => { tag=>"mi", output=>"κ", tex=>'', ttype=>"CONST" },
"lambda" => { tag=>"mi", output=>"λ", tex=>'', ttype=>"CONST" },
"Lambda" => { tag=>"mo", output=>"Λ", tex=>'', ttype=>"CONST" },
"mu" => { tag=>"mi", output=>"μ", tex=>'', ttype=>"CONST" },
"nu" => { tag=>"mi", output=>"ν", tex=>'', ttype=>"CONST" },
"omega" => { tag=>"mi", output=>"ω", tex=>'', ttype=>"CONST" },
"Omega" => { tag=>"mo", output=>"Ω", tex=>'', ttype=>"CONST" },
"phi" => { tag=>"mi", output=>"ϕ", tex=>'', ttype=>"CONST" },
"varphi" => { tag=>"mi", output=>"φ", tex=>'', ttype=>"CONST" },
"Phi" => { tag=>"mo", output=>"Φ", tex=>'', ttype=>"CONST" },
"pi" => { tag=>"mi", output=>"π", tex=>'', ttype=>"CONST" },
"Pi" => { tag=>"mo", output=>"Π", tex=>'', ttype=>"CONST" },
"psi" => { tag=>"mi", output=>"ψ", tex=>'', ttype=>"CONST" },
"Psi" => { tag=>"mi", output=>"Ψ", tex=>'', ttype=>"CONST" },
"rho" => { tag=>"mi", output=>"ρ", tex=>'', ttype=>"CONST" },
"sigma" => { tag=>"mi", output=>"σ", tex=>'', ttype=>"CONST" },
"Sigma" => { tag=>"mo", output=>"Σ", tex=>'', ttype=>"CONST" },
"tau" => { tag=>"mi", output=>"τ", tex=>'', ttype=>"CONST" },
"theta" => { tag=>"mi", output=>"θ", tex=>'', ttype=>"CONST" },
"vartheta" => { tag=>"mi", output=>"ϑ", tex=>'', ttype=>"CONST" },
"Theta" => { tag=>"mo", output=>"Θ", tex=>'', ttype=>"CONST" },
"upsilon" => { tag=>"mi", output=>"υ", tex=>'', ttype=>"CONST" },
"xi" => { tag=>"mi", output=>"ξ", tex=>'', ttype=>"CONST" },
"Xi" => { tag=>"mo", output=>"Ξ", tex=>'', ttype=>"CONST" },
"zeta" => { tag=>"mi", output=>"ζ", tex=>'', ttype=>"CONST" },
# binary operation symbols
"*" => { tag=>"mo", output=>"⋅", tex=>"cdot", ttype=>"CONST" },
"**" => { tag=>"mo", output=>"⋆", tex=>"star", ttype=>"CONST" },
"//" => { tag=>"mo", output=>"/", tex=>'', ttype=>"CONST" },
"\\\\" => { tag=>"mo", output=>"\\", tex=>"backslash", ttype=>"CONST" },
"setminus" => { tag=>"mo", output=>"\\", tex=>'', ttype=>"CONST" },
"xx" => { tag=>"mo", output=>"×", tex=>"times", ttype=>"CONST" },
"-:" => { tag=>"mo", output=>"÷", tex=>"div", ttype=>"CONST" },
"@" => { tag=>"mo", output=>"∘", tex=>"circ", ttype=>"CONST" },
"o+" => { tag=>"mo", output=>"⊕", tex=>"oplus", ttype=>"CONST" },
"ox" => { tag=>"mo", output=>"⊗", tex=>"otimes", ttype=>"CONST" },
"o." => { tag=>"mo", output=>"⊙", tex=>"odot", ttype=>"CONST" },
"sum" => { tag=>"mo", output=>"∑", tex=>'', ttype=>"UNDEROVER" },
"prod" => { tag=>"mo", output=>"∏", tex=>'', ttype=>"UNDEROVER" },
"^^" => { tag=>"mo", output=>"∧", tex=>"wedge", ttype=>"CONST" },
"^^^" => { tag=>"mo", output=>"⋀", tex=>"bigwedge", ttype=>"UNDEROVER" },
"vv" => { tag=>"mo", output=>"∨", tex=>"vee", ttype=>"CONST" },
"vvv" => { tag=>"mo", output=>"⋁", tex=>"bigvee", ttype=>"UNDEROVER" },
"nn" => { tag=>"mo", output=>"∩", tex=>"cap", ttype=>"CONST" },
"nnn" => { tag=>"mo", output=>"⋂", tex=>"bigcap", ttype=>"UNDEROVER" },
"uu" => { tag=>"mo", output=>"∪", tex=>"cup", ttype=>"CONST" },
"uuu" => { tag=>"mo", output=>"⋃", tex=>"bigcup", ttype=>"UNDEROVER" },
# binary relation symbols
"!=" => { tag=>"mo", output=>"≠", tex=>"ne", ttype=>"CONST" },
":=" => { tag=>"mo", output=>":=", tex=>'', ttype=>"CONST" },
#"lt" => { tag=>"mo", output=>"<", tex=>'', ttype=>"CONST" },
"lt" => { tag=>"mo", output=>"<", tex=>'', ttype=>"CONST" },
"<=" => { tag=>"mo", output=>"≤", tex=>"le", ttype=>"CONST" },
"lt=" => { tag=>"mo", output=>"≤", tex=>"leq", ttype=>"CONST", latex=>1 },
">=" => { tag=>"mo", output=>"≥", tex=>"ge", ttype=>"CONST" },
"geq" => { tag=>"mo", output=>"≥", tex=>'', ttype=>"CONST", latex=>1 },
"-<" => { tag=>"mo", output=>"≺", tex=>"prec", ttype=>"CONST", latex=>1 },
"-lt" => { tag=>"mo", output=>"≺", tex=>'', ttype=>"CONST" },
">-" => { tag=>"mo", output=>"≻", tex=>"succ", ttype=>"CONST" },
"in" => { tag=>"mo", output=>"∈", tex=>'', ttype=>"CONST" },
"!in" => { tag=>"mo", output=>"∉", tex=>"notin", ttype=>"CONST" },
"sub" => { tag=>"mo", output=>"⊂", tex=>"subset", ttype=>"CONST" },
"sup" => { tag=>"mo", output=>"⊃", tex=>"supset", ttype=>"CONST" },
"sube" => { tag=>"mo", output=>"⊆", tex=>"subseteq", ttype=>"CONST" },
"supe" => { tag=>"mo", output=>"⊇", tex=>"supseteq", ttype=>"CONST" },
"-=" => { tag=>"mo", output=>"≡", tex=>"equiv", ttype=>"CONST" },
"~=" => { tag=>"mo", output=>"≅", tex=>"cong", ttype=>"CONST" },
"~~" => { tag=>"mo", output=>"≈", tex=>"approx", ttype=>"CONST" },
"prop" => { tag=>"mo", output=>"∝", tex=>"propto", ttype=>"CONST" },
"<" => { tag=>"mo", output=>"<", tex=>'', ttype=>"CONST" },
"gt" => { tag=>"mo", output=>">", tex=>'', ttype=>"CONST" },
">" => { tag=>"mo", output=>">", tex=>'', ttype=>"CONST" },
"and" => { tag=>"mtext", output=>"and", tex=>'', ttype=>"SPACE" },
"or" => { tag=>"mtext", output=>"or", tex=>'', ttype=>"SPACE" },
"not" => { tag=>"mo", output=>"¬", tex=>"neg", ttype=>"CONST" },
"=>" => { tag=>"mo", output=>"⇒", tex=>"implies", ttype=>"CONST" },
"if" => { tag=>"mo", output=>"if", tex=>'if', ttype=>"SPACE" },
"<=>" => { tag=>"mo", output=>"⇔", tex=>"iff", ttype=>"CONST" },
"AA" => { tag=>"mo", output=>"∀", tex=>"forall", ttype=>"CONST" },
"EE" => { tag=>"mo", output=>"∃", tex=>"exists", ttype=>"CONST" },
"_|_" => { tag=>"mo", output=>"⊥", tex=>"bot", ttype=>"CONST" },
"TT" => { tag=>"mo", output=>"⊤", tex=>"top", ttype=>"CONST" },
"|--" => { tag=>"mo", output=>"⊢", tex=>"vdash", ttype=>"CONST" },
"|==" => { tag=>"mo", output=>"⊨", tex=>"models", ttype=>"CONST" },
"(" => { tag=>"mo", output=>"(", tex=>'', ttype=>"LEFTBRACKET" },
")" => { tag=>"mo", output=>")", tex=>'', ttype=>"RIGHTBRACKET" },
"[" => { tag=>"mo", output=>"[", tex=>'', ttype=>"LEFTBRACKET" },
"]" => { tag=>"mo", output=>"]", tex=>'', ttype=>"RIGHTBRACKET" },
"{" => { tag=>"mo", output=>"{", tex=>'', ttype=>"LEFTBRACKET" },
"}" => { tag=>"mo", output=>"}", tex=>'', ttype=>"RIGHTBRACKET" },
"|" => { tag=>"mo", output=>"|", tex=>'', ttype=>"LEFTRIGHT" },
# {input:"||", tag:"mo", output:"||", tex:null, ttype:LEFTRIGHT},
"(:" => { tag=>"mo", output=>"〈", tex=>"langle", ttype=>"LEFTBRACKET" },
":)" => { tag=>"mo", output=>"〉", tex=>"rangle", ttype=>"RIGHTBRACKET" },
"<<" => { tag=>"mo", output=>"〈", tex=>'langle', ttype=>"LEFTBRACKET" },
">>" => { tag=>"mo", output=>"〉", tex=>'rangle', ttype=>"RIGHTBRACKET" },
"{:" => { tag=>"mo", output=>"{:", tex=>'', ttype=>"LEFTBRACKET", invisible=>"true" },
":}" => { tag=>"mo", output=>":}", tex=>'', ttype=>"RIGHTBRACKET", invisible=>"true" },
# miscellaneous symbols
"int" => { tag=>"mo", output=>"∫", tex=>'', ttype=>"CONST" },
"dx" => { tag=>"mi", output=>"{:d x:}", tex=>'', ttype=>"DEFINITION" },
"dy" => { tag=>"mi", output=>"{:d y:}", tex=>'', ttype=>"DEFINITION" },
"dz" => { tag=>"mi", output=>"{:d z:}", tex=>'', ttype=>"DEFINITION" },
"dt" => { tag=>"mi", output=>"{:d t:}", tex=>'', ttype=>"DEFINITION" },
"oint" => { tag=>"mo", output=>"∮", tex=>'', ttype=>"CONST" },
"del" => { tag=>"mo", output=>"∂", tex=>"partial", ttype=>"CONST" },
"grad" => { tag=>"mo", output=>"∇", tex=>"nabla", ttype=>"CONST" },
"+-" => { tag=>"mo", output=>"±", tex=>"pm", ttype=>"CONST" },
"O/" => { tag=>"mo", output=>"∅", tex=>"emptyset", ttype=>"CONST" },
"oo" => { tag=>"mo", output=>"∞", tex=>"infty", ttype=>"CONST" },
"aleph" => { tag=>"mo", output=>"ℵ", tex=>'', ttype=>"CONST" },
"..." => { tag=>"mo", output=>"...", tex=>"ldots", ttype=>"CONST" },
":." => { tag=>"mo", output=>"∴", tex=>"therefore", ttype=>"CONST" },
"/_" => { tag=>"mo", output=>"∠", tex=>"angle", ttype=>"CONST" },
"\\ " => { tag=>"mo", output=>" ", tex=>'\,', ttype=>"CONST" },
"quad" => { tag=>"mo", output=>"  ", tex=>'', ttype=>"CONST" },
"qquad" => { tag=>"mo", output=>"    ", tex=>'', ttype=>"CONST" },
"cdots" => { tag=>"mo", output=>"⋯", tex=>'', ttype=>"CONST" },
"vdots" => { tag=>"mo", output=>"⋮", tex=>'', ttype=>"CONST" },
"ddots" => { tag=>"mo", output=>"⋱", tex=>'', ttype=>"CONST" },
"diamond" => { tag=>"mo", output=>"⋄", tex=>'', ttype=>"CONST" },
"square" => { tag=>"mo", output=>"□", tex=>'', ttype=>"CONST" },
"|__" => { tag=>"mo", output=>"⌊", tex=>"lfloor", ttype=>"CONST" },
"__|" => { tag=>"mo", output=>"⌋", tex=>"rfloor", ttype=>"CONST" },
"|~" => { tag=>"mo", output=>"⌈", tex=>"lceil", ttype=>"CONST" },
"~|" => { tag=>"mo", output=>"⌉", tex=>"rceil", ttype=>"CONST" },
"CC" => { tag=>"mo", output=>"ℂ", tex=>'', ttype=>"CONST" },
"NN" => { tag=>"mo", output=>"ℕ", tex=>'', ttype=>"CONST" },
"QQ" => { tag=>"mo", output=>"ℚ", tex=>'', ttype=>"CONST" },
"RR" => { tag=>"mo", output=>"ℝ", tex=>'', ttype=>"CONST" },
"ZZ" => { tag=>"mo", output=>"ℤ", tex=>'', ttype=>"CONST" },
"f" => { tag=>"mi", output=>"f", tex=>'', ttype=>"UNARY", func=>"true" },
"g" => { tag=>"mi", output=>"g", tex=>'', ttype=>"UNARY", func=>"true" },
"lim" => { tag=>"mo", output=>"lim", tex=>'', ttype=>"UNDEROVER" },
"Lim" => { tag=>"mo", output=>"Lim", tex=>'', ttype=>"UNDEROVER" },
"sin" => { tag=>"mo", output=>"sin", tex=>'', ttype=>"UNARY", func=>"true" },
"cos" => { tag=>"mo", output=>"cos", tex=>'', ttype=>"UNARY", func=>"true" },
"tan" => { tag=>"mo", output=>"tan", tex=>'', ttype=>"UNARY", func=>"true" },
"sinh" => { tag=>"mo", output=>"sinh", tex=>'', ttype=>"UNARY", func=>"true" },
"cosh" => { tag=>"mo", output=>"cosh", tex=>'', ttype=>"UNARY", func=>"true" },
"tanh" => { tag=>"mo", output=>"tanh", tex=>'', ttype=>"UNARY", func=>"true" },
"cot" => { tag=>"mo", output=>"cot", tex=>'', ttype=>"UNARY", func=>"true" },
"sec" => { tag=>"mo", output=>"sec", tex=>'', ttype=>"UNARY", func=>"true" },
"csc" => { tag=>"mo", output=>"csc", tex=>'', ttype=>"UNARY", func=>"true" },
"log" => { tag=>"mo", output=>"log", tex=>'', ttype=>"UNARY", func=>"true" },
"ln" => { tag=>"mo", output=>"ln", tex=>'', ttype=>"UNARY", func=>"true" },
"det" => { tag=>"mo", output=>"det", tex=>'', ttype=>"UNARY", func=>"true" },
"dim" => { tag=>"mo", output=>"dim", tex=>'', ttype=>"CONST" },
"mod" => { tag=>"mo", output=>"mod", tex=>'', ttype=>"CONST" },
"gcd" => { tag=>"mo", output=>"gcd", tex=>'', ttype=>"UNARY", func=>"true" },
"lcm" => { tag=>"mo", output=>"lcm", tex=>'', ttype=>"UNARY", func=>"true" },
"lub" => { tag=>"mo", output=>"lub", tex=>'', ttype=>"CONST" },
"glb" => { tag=>"mo", output=>"glb", tex=>'', ttype=>"CONST" },
"min" => { tag=>"mo", output=>"min", tex=>'', ttype=>"UNDEROVER" },
"max" => { tag=>"mo", output=>"max", tex=>'', ttype=>"UNDEROVER" },
"uarr" => { tag=>"mo", output=>"↑", tex=>"uparrow", ttype=>"CONST" },
"darr" => { tag=>"mo", output=>"↓", tex=>"downarrow", ttype=>"CONST" },
"rarr" => { tag=>"mo", output=>"→", tex=>"rightarrow", ttype=>"CONST" },
"->" => { tag=>"mo", output=>"→", tex=>"to", ttype=>"CONST", latex=>1 },
"|->" => { tag=>"mo", output=>"↦", tex=>"mapsto", ttype=>"CONST" },
"larr" => { tag=>"mo", output=>"←", tex=>"leftarrow", ttype=>"CONST" },
"harr" => { tag=>"mo", output=>"↔", tex=>"leftrightarrow", ttype=>"CONST" },
"rArr" => { tag=>"mo", output=>"⇒", tex=>"Rightarrow", ttype=>"CONST", latex=>1 },
"lArr" => { tag=>"mo", output=>"⇐", tex=>"Leftarrow", ttype=>"CONST" },
"hArr" => { tag=>"mo", output=>"⇔", tex=>"Leftrightarrow", ttype=>"CONST", latex=>1 },
# commands with argument
"hat" => { tag=>"mover", output=>"^", tex=>'', ttype=>"UNARY", acc=>"true" },
"bar" => { tag=>"mover", output=>"¯", tex=>"overline", ttype=>"UNARY", acc=>"true" },
"vec" => { tag=>"mover", output=>"→", tex=>'', ttype=>"UNARY", acc=>"true" },
"dot" => { tag=>"mover", output=>".", tex=>'', ttype=>"UNARY", acc=>"true" },
"ddot" => { tag=>"mover", output=>"..", tex=>'', ttype=>"UNARY", acc=>"true" },
"ul" => { tag=>"munder", output=>"̲", tex=>"underline", ttype=>"UNARY", acc=>"true" },
#======================================================================
# vismor -- Original implementation of bold didn't work for me.
# I haven't tested -- used -- the other math font variants
# so I don't know if they work.
#======================================================================
#"bb" => { tag=>"mstyle", atname=>"fontweight", atval=>"bold", output=>"bb", tex=>'', ttype=>"UNARY" },
#"mathbf" => { tag=>"mstyle", atname=>"fontweight", atval=>"bold", output=>"mathbf", tex=>'', ttype=>"UNARY" },
"bb" => { tag=>"mstyle", atname=>"mathvariant", atval=>"bold", output=>"bb", tex=>'', ttype=>"UNARY" },
"mathbf" => { tag=>"mstyle", atname=>"mathvariant", atval=>"bold", output=>"mathbf", tex=>'', ttype=>"UNARY" },
#======================================================================
"sf" => { tag=>"mstyle", atname=>"fontfamily", atval=>"sans-serif", output=>"sf", tex=>'', ttype=>"UNARY" },
"mathsf" => { tag=>"mstyle", atname=>"fontfamily", atval=>"sans-serif", output=>"mathsf", tex=>'', ttype=>"UNARY" },
"bbb" => { tag=>"mstyle", atname=>"mathvariant", atval=>"double-struck", output=>"bbb", tex=>'', ttype=>"UNARY", codes=>$AMbbb },
"mathbb" => { tag=>"mstyle", atname=>"mathvariant", atval=>"double-struck", output=>"mathbb", tex=>'', ttype=>"UNARY", codes=>$AMbbb },
"cc" => { tag=>"mstyle", atname=>"mathvariant", atval=>"script", output=>"cc", tex=>'', ttype=>"UNARY", codes=>$AMcal },
"mathcal" => { tag=>"mstyle", atname=>"mathvariant", atval=>"script", output=>"mathcal", tex=>'', ttype=>"UNARY", codes=>$AMcal },
"tt" => { tag=>"mstyle", atname=>"fontfamily", atval=>"monospace", output=>"tt", tex=>'', ttype=>"UNARY" },
"mathtt" => { tag=>"mstyle", atname=>"fontfamily", atval=>"monospace", output=>"mathtt", tex=>'', ttype=>"UNARY" },
"fr" => { tag=>"mstyle", atname=>"mathvariant", atval=>"fraktur", output=>"fr", tex=>'', ttype=>"UNARY", codes=>$AMfrk },
"mathfrak" => { tag=>"mstyle", atname=>"mathvariant", atval=>"fraktur", output=>"mathfrak", tex=>'', ttype=>"UNARY", codes=>$AMfrk },

# Preprocess AMSymbol for lexer regular expression
# Preprocess AMSymbol for tex input
my %AMTexSym = map(($AMSymbol{$_}{tex} || $_, $_),
grep($AMSymbol{$_}{tex}, keys %AMSymbol));
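
=begin markdown

The reverse TeX lookup built above can be illustrated in isolation. This is a minimal sketch, assuming a tiny three-entry table: `%symbol`, `%tex_sym` and `resolve` are hypothetical stand-ins for `%AMSymbol`, `%AMTexSym` and the lookup performed in `_getSymbol_`, and the sample entries are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical miniature symbol table: key => { output, tex }.
my %symbol = (
    alpha => { output => '&#x03B1;', tex => ''      },
    '!='  => { output => '&#x2260;', tex => 'ne'    },
    xx    => { output => '&#x00D7;', tex => 'times' },
);

# Map each non-empty TeX name back to its key in the table.
my %tex_sym = map(($symbol{$_}{tex} || $_, $_),
                  grep($symbol{$_}{tex}, keys %symbol));

# Resolve a token: prefer the TeX alias, else use the token itself as the key.
sub resolve {
    my ($name) = @_;
    return $tex_sym{$name} ? $symbol{$tex_sym{$name}} : $symbol{$name};
}

print "$_ => $tex_sym{$_}\n" for sort keys %tex_sym;
print resolve('times')->{output}, "\n";
print resolve('alpha')->{output}, "\n";
```

Entries with an empty `tex` field are filtered out by the `grep`, so they are only reachable through their own key.

=end markdown

=cut
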
my $Ident_RE = join '|', map("\Q$_\E",
sort {length($b) - length($a)} (keys %AMSymbol,

sub _getSymbol_ : method {
my ($self, $str) = @_;
/^(\d+(\.\d+)?)/ || /^(\.\d+)/
and return $1, {tag=>'mn', output=>$1, ttype=>'CONST'};
return $1,$AMTexSym{$1} ? $AMSymbol{$AMTexSym{$1}} : $AMSymbol{$1};
$self->{Definition_RE} && /^($self->{Definition_RE})/ and
return $1, $self->{Definitions}{$1};
return $1, {tag=>'mi', output=>$1, ttype=>'CONST'};
return $1 eq '-' && defined $self->{previousSymbol} &&
$self->{previousSymbol} eq 'INFIX' ?
($1, {tag=>'mo', output=>$1, ttype=>'UNARY', func=>"true"} ) :
($1, {tag=>'mo', output=>$1, ttype=>'CONST'});

# Used so that Text::ASCIIMathML::Node can get access to the symbol table

# Parses an E expression
# Arguments: string to parse, whether to look for a right bracket
# Returns: parsed node (if successful), remaining unparsed string
sub _parseExpr : method {
my ($self, $str, $rightbracket) = @_;
my $newFrag = $self->_createDocumentFragment();
my ($node, $input, $symbol);
$str = _removeCharsAndBlanks($str, 0);
($node, $str) = $self->_parseIexpr($str);
($input, $symbol) = $self->_getSymbol($str);
if (defined $symbol && $symbol->{ttype} eq 'INFIX' && $input eq '/') {
$str = _removeCharsAndBlanks($str, length $input);
my @result = $self->_parseIexpr($str);
_removeBrackets($result[0]);
else { # show box in place of missing argument
$result[0] = $self->_createMmlNode
('mo', $self->_createTextNode('&#x25A1;'));
_removeBrackets($node);
$node = $self->_createMmlNode($symbol->{tag}, $node);
$node->appendChild($result[0]);
$newFrag->appendChild($node);
($input, $symbol) = $self->_getSymbol($str);
elsif (defined $node) {
$newFrag->appendChild($node);
} while (defined $symbol && ($symbol->{ttype} ne 'RIGHTBRACKET' &&
($symbol->{ttype} ne 'LEFTRIGHT' ||
|| $self->{nestingDepth} == 0) &&
$symbol->{output} ne '');
if (defined $symbol && $symbol->{ttype} =~ /RIGHTBRACKET|LEFTRIGHT/) {
my @childNodes = $newFrag->childNodes;
if (@childNodes > 1 &&
$childNodes[-1]->nodeName eq 'mrow' &&
$childNodes[-2]->nodeName eq 'mo' &&
$childNodes[-2]->firstChild->nodeValue eq ',') { # matrix
my $right = $childNodes[-1]->lastChild->firstChild->nodeValue;
if ($right =~ /[\)\]]/) {
my $left = $childNodes[-1]->firstChild->firstChild->nodeValue;
if ("$left$right" =~ /^\(\)$/ && $symbol->{output} ne '}' ||
"$left$right" =~ /^\[\]$/) {
my @pos; # positions of commas
for (my $i=0; $matrix && $i < $m; $i += 2) {
$node = $childNodes[$i];
$node->nodeName eq 'mrow' &&
$node->nextSibling->nodeName eq 'mo' &&
$node->nextSibling->firstChild->nodeValue eq ',')&&
$node->firstChild->firstChild->nodeValue eq $left&&
$node->lastChild->firstChild->nodeValue eq $right
for (my $j=0; $j<($node->childNodes); $j++) {
if (($node->childNodes)[$j]->firstChild->
push @{$pos[$i]}, $j;
if ($matrix && $i > 1) {
$matrix = @{$pos[$i]} == @{$pos[$i-2]};
my $table = $self->_createDocumentFragment();
for (my $i=0; $i<$m; $i += 2) {
my $row = $self->_createDocumentFragment();
my $frag = $self->_createDocumentFragment();
# <mrow>(-,-,...,-,-)</mrow>
$node = $newFrag->firstChild;
my $n = $node->childNodes;
$node->removeChild($node->firstChild); # remove (
for (my $j=1; $j<$n-1; $j++) {
if ($k < @{$pos[$i]} && $j == $pos[$i][$k]) {
($self->_createMmlNode('mtd', $frag));
$frag = $self->_createDocumentFragment();
$frag->appendChild($node->firstChild);
$node->removeChild($node->firstChild);
($self->_createMmlNode('mtd', $frag));
if ($newFrag->childNodes > 2) {
# remove <mrow>)</mrow>
$newFrag->removeChild($newFrag->firstChild);
$newFrag->removeChild($newFrag->firstChild);
($self->_createMmlNode('mtr', $row));
$node = $self->_createMmlNode('mtable', $table);
$node->setAttribute('columnalign', 'left')
if $symbol->{invisible};
$newFrag->replaceChild($node, $newFrag->firstChild);
$str = _removeCharsAndBlanks($str, length $input);
if (! $symbol->{invisible}) {
$node = $self->_createMmlNode
('mo', $self->_createTextNode($symbol->{output}));
$newFrag->appendChild($node);
return $newFrag, $str;

# Parses an I expression
# Arguments: string to parse
# Returns: parsed node (if successful), remaining unparsed string
sub _parseIexpr : method {
my ($self, $str) = @_;
$str = _removeCharsAndBlanks($str, 0);
my ($in1, $sym1) = $self->_getSymbol($str);
($node, $str) = $self->_parseSexpr($str);
my ($input, $symbol) = $self->_getSymbol($str);
if (defined $symbol && $symbol->{ttype} eq 'INFIX' && $input ne '/') {
# if (symbol.input == "/") result = AMparseIexpr(str); else ...
$str = _removeCharsAndBlanks($str, length $input);
my @result = $self->_parseSexpr($str);
_removeBrackets($result[0]);
else { # show box in place of missing argument
$result[0] = $self->_createMmlNode
('mo', $self->_createTextNode("&#x25A1;"));
my ($in2, $sym2) = $self->_getSymbol($str);
my $underover = $sym1->{ttype} eq 'UNDEROVER';
$str = _removeCharsAndBlanks($str, length $in2);
my @res2 = $self->_parseSexpr($str);
_removeBrackets($res2[0]);
$node = $self->_createMmlNode
($underover ? 'munderover' : 'msubsup', $node);
$node->appendChild($result[0]);
$node->appendChild($res2[0]);
$node = $self->_createMmlNode('mrow',$node); # so sum does not stretch
$node = $self->_createMmlNode
($underover ? 'munder' : 'msub', $node);
$node->appendChild($result[0]);
$node = $self->_createMmlNode($symbol->{tag}, $node);
$node->appendChild($result[0]);

# Parses an S expression
# Arguments: string to parse
# Returns: parsed node (if successful), remaining unparsed string
sub _parseSexpr : method {
my ($self, $str) = @_;
my $newFrag = $self->_createDocumentFragment();
$str = _removeCharsAndBlanks($str, 0);
my ($input, $symbol) = $self->_getSymbol($str);
return (undef, $str)
if ! defined $symbol ||
$symbol->{ttype} eq 'RIGHTBRACKET' && $self->{nestingDepth} > 0;
if ($symbol->{ttype} eq 'DEFINITION') {
$str = $symbol->{output} . _removeCharsAndBlanks($str, length $input);
($input, $symbol) = $self->_getSymbol($str);
my $ttype = $symbol->{ttype};
if ($ttype =~ /UNDEROVER|CONST/) {
$str = _removeCharsAndBlanks($str, length $input);
$self->_createMmlNode($symbol->{tag},
$self->_createTextNode($symbol->{output})),
if ($ttype eq 'LEFTBRACKET') {
$self->{nestingDepth}++;
$str = _removeCharsAndBlanks($str, length $input);
my @result = $self->_parseExpr($str, 1);
$self->{nestingDepth}--;
if ($symbol->{invisible}) {
$node = $self->_createMmlNode('mrow', $result[0]);
$node = $self->_createMmlNode
('mo', $self->_createTextNode($symbol->{output}));
$node = $self->_createMmlNode('mrow', $node);
$node->appendChild($result[0]);
return $node, $result[1];
if ($ttype eq 'TEXT') {
$str = _removeCharsAndBlanks($str, length $input) unless $input eq '"';
($input, $st) = ($1, $2)
if $str =~ /^(\"()\")/ || $str =~ /^(\"((?:\\\\|\\\"|.)+?)\")/;
($input, $st) = ($1, $2)
if ($str =~ /^(\((.*?)\))/ ||
$str =~ /^(\[(.*?)\])/ ||
$str =~ /^(\{(.*?)\})/);
($input, $st) = ($str) x 2 unless defined $st;
if (substr($st, 0, 1) eq ' ') {
my $node = $self->_createElementMathML('mspace');
$node->setAttribute(width=>'1ex');
$newFrag->appendChild($node);
$newFrag->appendChild
($self->_createMmlNode($symbol->{tag},
$self->_createTextNode($st)));
if (substr($st, -1) eq ' ') {
my $node = $self->_createElementMathML('mspace');
$node->setAttribute(width=>'1ex');
$newFrag->appendChild($node);
$str = _removeCharsAndBlanks($str, length $input);
return $self->_createMmlNode('mrow', $newFrag), $str;
if ($ttype eq 'UNARY') {
$str = _removeCharsAndBlanks($str, length $input);
my @result = $self->_parseSexpr($str);
return ($self->_createMmlNode
$self->_createTextNode($symbol->{output})), $str)
if ! defined $result[0];
if ($symbol->{func}) {
return ($self->_createMmlNode
$self->_createTextNode($symbol->{output})), $str)
if $str =~ m!^[\^_/|]!;
my $node = $self->_createMmlNode
('mrow', $self->_createMmlNode
($symbol->{tag}, $self->_createTextNode($symbol->{output})));
$node->appendChild($result[0]);
return $node, $result[1];
_removeBrackets($result[0]);
if ($symbol->{acc}) { # accent
my $node = $self->_createMmlNode($symbol->{tag}, $result[0]);
($self->_createMmlNode
('mo', $self->_createTextNode($symbol->{output})));
return $node, $result[1];
if ($symbol->{atname}) { # font change command
if ($self->{attr}{ForMoz} && $symbol->{codes}) {
my @childNodes = $result[0]->childNodes;
my $nodeName = $result[0]->nodeName;
for (my $i=0; $i<@childNodes; $i++) {
if ($childNodes[$i]->nodeName eq 'mi'||$nodeName eq 'mi') {
my $st = $nodeName eq 'mi' ?
$result[0] ->firstChild->nodeValue :
$childNodes[$i]->firstChild->nodeValue;
$st =~ s/([A-Z])/sprintf "&#x%X;",$symbol->{codes}[ord($1)-65]/ge;
if ($nodeName eq 'mi') {
$result[0] = $self->_createTextNode($st);
$result[0]->replaceChild
($self->_createTextNode($st), $childNodes[$i]);
my $node = $self->_createMmlNode($symbol->{tag}, $result[0]);
$node->setAttribute($symbol->{atname}=>$symbol->{atval});
return $node, $result[1];
return $self->_createMmlNode($symbol->{tag}, $result[0]), $result[1];
if ($ttype eq 'BINARY') {
$str = _removeCharsAndBlanks($str, length $input);
my @result = $self->_parseSexpr($str);
return ($self->_createMmlNode
('mo', $self->_createTextNode($input)), $str)
if ! defined $result[0];
_removeBrackets($result[0]);
my @result2 = $self->_parseSexpr($result[1]);
return ($self->_createMmlNode
('mo', $self->_createTextNode($input)), $str)
if ! defined $result2[0];
_removeBrackets($result2[0]);
if ($input =~ /new(command|symbol)/) {
# Look for text in both arguments
my $text1 = $result[0];
my $haveTextArgs = 0;
$text1 = $text1->firstChild while $text1->nodeName eq 'mrow';
if ($text1->nodeName eq 'mtext') {
my $text2 = $result2[0];
$text2 = $text2->firstChild while $text2->nodeName eq 'mrow';
if ($result2[0]->childNodes > 1 && $input eq 'newsymbol') {
# Process the latex string for a newsymbol
my $latexdef = $result2[0]->child(1);
$latexdef = $latexdef->firstChild
while $latexdef->nodeName eq 'mrow';
$latex = $latexdef->firstChild->nodeValue;
if ($text2->nodeName eq 'mtext') {
$self->{Definitions}{$text1->firstChild->nodeValue} = {
output=>$text2->firstChild->nodeValue,
ttype =>$what eq 'symbol' ? 'CONST' : 'DEFINITION',
$self->{Definition_RE} = join '|',
map("\Q$_\E", sort {length($b) - length($a)}
keys %{$self->{Definitions}});
$self->{Latex}{$text2->firstChild->nodeValue} = $latex
if (! $haveTextArgs) {
$newFrag->appendChild($self->_createMmlNode
('mo', $self->_createTextNode($input)),
$result[0], $result2[0]);
return $self->_createMmlNode('mrow', $newFrag), $result2[1];
return undef, $result2[1];
if ($input =~ /root|stackrel/) {
$newFrag->appendChild($result2[0]);
$newFrag->appendChild($result[0]);
if ($input eq 'frac') {
$newFrag->appendChild($result2[0]);
return $self->_createMmlNode($symbol->{tag}, $newFrag), $result2[1];
if ($ttype eq 'INFIX') {
$str = _removeCharsAndBlanks($str, length $input);
return $self->_createMmlNode
('mo', $self->_createTextNode($symbol->{output})), $str;
if ($ttype eq 'SPACE') {
$str = _removeCharsAndBlanks($str, length $input);
my $node = $self->_createElementMathML('mspace');
$node->setAttribute('width', '1ex');
$newFrag->appendChild($node);
$newFrag->appendChild
($self->_createMmlNode($symbol->{tag},
$self->_createTextNode($symbol->{output})));
$node = $self->_createElementMathML('mspace');
$node->setAttribute('width', '1ex');
$newFrag->appendChild($node);
return $self->_createMmlNode('mrow', $newFrag), $str;
if ($ttype eq 'LEFTRIGHT') {
$self->{nestingDepth}++;
$str = _removeCharsAndBlanks($str, length $input);
my @result = $self->_parseExpr($str, 0);
$self->{nestingDepth}--;
my $st = $result[0]->lastChild ?
$result[0]->lastChild->firstChild->nodeValue : '';
my $node = $self->_createMmlNode
('mo',$self->_createTextNode($symbol->{output}));
$node = $self->_createMmlNode('mrow', $node);
if ($st eq '|') { # it's an absolute value subterm
$node->appendChild($result[0]);
return $node, $result[1];
$str = _removeCharsAndBlanks($str, length $input);
return $self->_createMmlNode
($symbol->{tag}, # it's a constant
$self->_createTextNode($symbol->{output})), $str;

# Removes brackets at the beginning or end of an mrow node
# Arguments: node object
# Side-effects: may change children of node object
sub _removeBrackets {
if ($node->nodeName eq 'mrow') {
my $st = $node->firstChild->firstChild->nodeValue;
$node->removeChild($node->firstChild) if $st =~ /^[\(\[\{]$/;
$st = $node->lastChild->firstChild->nodeValue;
$node->removeChild($node->lastChild) if $st =~ /^[\)\]\}]$/;

# Removes the first n characters and any following blanks
# Arguments: string, n
# Returns: resultant string
sub _removeCharsAndBlanks {
my $st = substr($str,
substr($str, $n) =~ /^\\[^\\ ,]/ ? $n+1 : $n);
$st =~ s/^[\x00-\x20]+//;
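
=begin markdown

The helper above can be exercised on its own. This is a minimal sketch, assuming the lines not shown here unpack `($str, $n)` from `@_` and return `$st`; the name `remove_chars_and_blanks` is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical standalone copy of the helper, for illustration only.
sub remove_chars_and_blanks {
    my ($str, $n) = @_;
    # Drop the first $n characters. If the remainder starts with a backslash
    # followed by anything other than a backslash, space or comma, drop the
    # backslash as well.
    my $st = substr($str,
                    substr($str, $n) =~ /^\\[^\\ ,]/ ? $n + 1 : $n);
    # Strip leading blanks and control characters.
    $st =~ s/^[\x00-\x20]+//;
    return $st;
}

print remove_chars_and_blanks('sin  x', 3), "\n";      # x
print remove_chars_and_blanks('\alpha + 1', 0), "\n";  # alpha + 1
```

Calling it with `$n` set to 0 only trims leading blanks (and a leading escape backslash), which is how the parser routines use it before reading the next symbol.

=end markdown

=cut
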

# Removes outermost parentheses
# Returns: string with parentheses removed
$s =~ s!^(<mrow>)<mo>[\(\[\{]</mo>!$1!;
$s =~ s!<mo>[\)\]\}]</mo>(</mrow>)$!$1!;
my %Conversion = ('<'=>'lt', '>'=>'gt', '"'=>'quot', '&'=>'amp');
# Encodes special XML characters
# Returns: encoded string
$s =~ s/([<>\"&])/&$Conversion{$1};/g;
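
=begin markdown

The entity encoding above can be tried in isolation. This is a standalone sketch, assuming the lines not shown here take a single string argument and return the modified copy; `xml_encode` and `%conversion` are hypothetical names that mirror `_xml_encode` and `%Conversion`.

```perl
use strict;
use warnings;

# Character => entity name, as in the table above.
my %conversion = ('<' => 'lt', '>' => 'gt', '"' => 'quot', '&' => 'amp');

# Hypothetical standalone copy of the encoder, for illustration only.
sub xml_encode {
    my ($s) = @_;
    # One pass over the string, so an inserted '&' is never re-encoded.
    $s =~ s/([<>"&])/&$conversion{$1};/g;
    return $s;
}

print xml_encode('a < b & "c"'), "\n";   # a &lt; b &amp; &quot;c&quot;
```

=end markdown

=cut
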

package Text::ASCIIMathML::Node;
# Create a closure for the following attributes
# Creates a new Text::ASCIIMathML::Node object
# Arguments: Text::ASCIIMathML object, optional tag
# Returns: new object
my ($class, $parser, $tag) = @_;
my $obj = bless { children=>[] }, $class;
if (defined $tag) { $obj->{tag} = $tag }
else { $obj->{frag} = 1 }
$parser_of{$obj} = $parser;
# Creates a new Text::ASCIIMathML::Node text object
# Arguments: Text::ASCIIMathML object, text
# Returns: new object
my ($class, $parser, $text) = @_;
$text =~ s/^\s*(.*?)\s*$/$1/; # Delete leading/trailing spaces
my $obj = bless { text=>$text }, $class;
$parser_of{$obj} = $parser;
$Null = new Text::ASCIIMathML::Node;

# Appends one or more node objects to the children of an object
# Arguments: list of objects to append
sub appendChild : method {
my @new = map $_->{frag} ? @{$_->{children}} : $_, @_;
push @{$self->{children}}, @new;
map do {$Parent{$_} = $self}, @new;

# Returns the value of an attribute of a node object
# Arguments: Attribute name
# Returns: Value for the attribute
my ($self, $attr) = @_;
return $self->{attr}{$attr};

# Returns a list of the attributes of a node object
# Returns: Array of attribute names
return $self->{attrlist} ? @{$self->{attrlist}} : ();

# Returns a child with a given index in the array of children of a node
# Returns: node object, or the null node if there is no child at that index
my ($self, $index) = @_;
return $self->{children} && @{$self->{children}} > $index ?
$self->{children}[$index] : $Null;

# Returns an array of children of a node
# Returns: Array of node objects
return $self->{children} ? @{$self->{children}} : ();

# Returns the first child of a node; ignores any fragments
# Returns: node object, or the null node if there are no children
return $self->{children} && @{$self->{children}} ?
$self->{children}[0] : $Null;

# Returns true if the object is a fragment
return $_[0]->{frag};

# Returns true if the object is a named node
return $_[0]->{tag};

# Returns true if the object is a text node
return defined $_[0]->{text};

# Returns the last child of a node
# Returns: node object, or the null node if there are no children
return $self->{children} && @{$self->{children}} ?
$self->{children}[-1] : $Null;

# Creates closure for following "static" variables
my (%LatexSym, %LatexMover, %LatexFont, %LatexOp);
# Returns a latex representation of a node object
# Returns: Text string
my $parser = $parser_of{$self};
# Build the entity to latex symbol translator
my $amsymbol = Text::ASCIIMathML::_get_amsymbol_();
foreach my $sym (keys %$amsymbol) {
next unless (defined $amsymbol->{$sym}{output} &&
$amsymbol->{$sym}{output} =~ /&\#x.*;/);
my ($output, $tex) = map $amsymbol->{$sym}{$_}, qw(output tex);
next if defined $LatexSym{$output} && ! $amsymbol->{$sym}{latex};
$tex = $sym if $tex eq '';
$LatexSym{$output} = "\\$tex";
my %math_font = (bbb => 'mathds',
mathfrak => 'mathfrak',
# Add character codes
foreach my $coded (grep $amsymbol->{$_}{codes}, keys %$amsymbol) {
@LatexSym{map(sprintf("&#x%X;", $_),
@{$amsymbol->{$coded}{codes}})} =
map("\\$math_font{$coded}\{$_}", ('A' .. 'Z'));
# Post-process protected symbols
$LatexSym{$_} =~ s/^\\\\/\\/ foreach keys %LatexSym;
%LatexMover = ('^' => '\hat',
'\overline' => '\overline',
'\rightarrow' => '\vec',
%LatexFont = (bold => '\bf',
'double-struck' => '\mathds',
fraktur => '\mathfrak',
'sans-serif' => '\sf',
%LatexOp = (if => '\mbox{if }',
lcm => '\mbox{lcm}',
newcommand => '\mbox{newcommand}',
"\\" => '\backslash',
if (defined $self->{text}) {
my $text = $self->{text};
$text =~ s/([{}])/\\$1/;
$text =~ s/(&\#x.*?;)/
defined $parser->{Latex}{$1} ? $parser->{Latex}{$1} :
defined $LatexSym{$1} ? $LatexSym{$1} : $1/eg;
my $tag = $self->{tag};
if (@{$self->{children}}) {
foreach (@{$self->{children}}) {
push @child_str, $_->latex($parser);
# Need to distinguish bmod from pmod
my $parent = $self->parent;
return $self eq $parent->child(1) &&
$parent->firstChild->firstChild->{text} eq '('
if $child_str[0] eq 'mod';
return $LatexOp{$child_str[0]} if $LatexOp{$child_str[0]};
return $child_str[0] =~ /^\w+$/ ? "\\$child_str[0]" : $child_str[0];
if ($tag eq 'mrow') {
@child_str = grep $_ ne '', @child_str;
# Check for pmod function
if (@child_str > 1 && $child_str[1] eq '\pmod') {
pop @child_str if $child_str[-1] eq ')';
splice @child_str, 0, 2;
return "\\pmod{@child_str}";
# Check if we need \left ... \right
my $is_tall = grep(/[_^]|\\(begin\{array\}|frac|sqrt|stackrel)/,
if ($is_tall && @child_str > 1 &&
($child_str[0] =~ /^([\(\[|]|\\\{)$/ ||
$child_str[-1] =~ /^([\)\]|]|\\\})$/)) {
if ($child_str[0] =~ /^([\(\[|]|\\\{)$/) {
$child_str[0] = "\\left$child_str[0]";
unshift @child_str, "\\left.";
if ($child_str[-1] =~ /^([\)\]|]|\\\})$/) {
$child_str[-1] = "\\right$child_str[-1]";
push @child_str, "\\right.";
return "@child_str";
if ($tag =~ /^m([in]|ath|row|td)$/) {
@child_str = grep $_ ne '', @child_str;
return "@child_str";
if ($tag =~ /^(msu[bp](sup)?|munderover)$/) {
my $base = shift @child_str;
$base = '\mbox{}' if $base eq '';
# Put {} around arguments with more than one character
@child_str = map length($_) > 1 ? "{$_}" : $_, @child_str;
return ($tag eq 'msub' ? "${base}_$child_str[0]" :
$tag eq 'msup' ? "${base}^$child_str[0]" :
"${base}_$child_str[0]^$child_str[1]");
if ($tag eq 'mover') {
# Need to special-case math mode accents
($child_str[1] eq '\overline' && length($child_str[0]) == 1 ?
"\\bar{$child_str[0]}" :
$LatexMover{$child_str[1]} ?
"$LatexMover{$child_str[1]}\{$child_str[0]\}" :
"\\stackrel{$child_str[1]}{$child_str[0]}");
if ($tag eq 'munder') {
return $child_str[1] eq '\underline' ? "$child_str[1]\{$child_str[0]}"
: "$child_str[0]_\{$child_str[1]\}";
if ($tag eq 'mfrac') {
return "\\frac{$child_str[0]}{$child_str[1]}";
if ($tag eq 'msqrt') {
return "\\sqrt{$child_str[0]}";
if ($tag eq 'mroot') {
return "\\sqrt[$child_str[1]]{$child_str[0]}";
if ($tag eq 'mtext') {
my $text = $child_str[0];
my $next = $self->nextSibling;
my $prev = $self->previousSibling;
if (defined $next->{tag} && $next->{tag} eq 'mspace') {
if (defined $prev->{tag} && $prev->{tag} eq 'mspace') {
$text = ' ' if $text eq ' ';
return "\\mbox{$text}";
if ($tag eq 'mspace') {
if ($tag eq 'mtable') {
my $cols = ($child_str[0] =~ tr/&//) + 1;
my $colspec = ($self->{attr}{columnalign} || '') eq 'left' ? 'l' : 'c';
my $colspecs = $colspec x $cols;
return ("\\begin{array}{$colspecs}\n" .
join('', map(" $_ \\\\\n", @child_str)) .
if ($tag eq 'mtr') {
return join ' & ', @child_str;
if ($tag eq 'mstyle') {
@child_str = grep $_ ne '', @child_str;
if ($self->parent->{tag} eq 'math') {
push @child_str, ' ' unless @child_str;
# The top-level mstyle
return (defined $self->{attr}{displaystyle} &&
$self->{attr}{displaystyle} eq 'true') ?
"\$\$@child_str\$\$" : "\$@child_str\$";
# It better be a font changing command
return $child_str[0] if $self->{attr}{mathvariant};
my ($attr) = map($self->{attr}{$_},
grep $self->{attr}{$_},
qw(fontweight fontfamily));
return $attr && $LatexFont{$attr} ?
"$LatexFont{$attr}\{$child_str[0]}" : $child_str[0];

# Returns the next sibling of a node
# Returns: node object or undef
my $parent = $self->parent;
for (my $i=0; $i<@{$parent->{children}}; $i++) {
return $parent->{children}[$i+1] if $self eq $parent->{children}[$i];

# Returns the tag of a node
sub nodeName : method {
return $_[0]{tag} || '';

# Returns the text of a text node
sub nodeValue : method {
return $_[0]{text} || '';

# Returns the parent of a node
# Returns: parent node object, or the null node if there is none
sub parent : method {
return $Parent{$_[0]} || $Null;

# Returns the previous sibling of a node
# Returns: node object or undef
sub previousSibling {
my $parent = $self->parent;
for (my $i=1; $i<@{$parent->{children}}; $i++) {
return $parent->{children}[$i-1] if $self eq $parent->{children}[$i];

# Removes a given child node from a node
# Arguments: child node
# Side-effects: May affect children of the node
sub removeChild : method {
my ($self, $child) = @_;
@{$self->{children}} = grep $_ ne $child, @{$self->{children}}
if $self->{children};
delete $Parent{$child};

# Replaces one child node object with another
# Arguments: new child node object, old child node object
sub replaceChild : method {
my ($self, $new, $old) = @_;
@{$self->{children}} = map $_ eq $old ? $new : $_, @{$self->{children}};
delete $Parent{$old};
$Parent{$new} = $self;

# Sets one or more attributes on a node object
# Arguments: set of attribute/value pairs
sub setAttribute : method {
$self->{attr} = {} unless $self->{attr};
$self->{attrlist} = [] unless $self->{attrlist};
while (my($aname, $aval) = splice(@_, 0, 2)) {
push @{$self->{attrlist}}, $aname unless defined $self->{attr}{$aname};
$self->{attr}{$aname} = $aval;

# Returns the MathML (XML) text representation of a node object
# Returns: Text string
return $self->{text} if defined $self->{text};
my $tag = $self->{tag};
my $attr = join '', map(" $_=\"" .
($_ eq 'xmlns' ? $self->{attr}{$_} :
Text::ASCIIMathML::_xml_encode($self->{attr}{$_})) .
"\"", @{$self->{attrlist}})
if (@{$self->{children}}) {
foreach (@{$self->{children}}) {
$child_str .= $_->text;
return $tag ? "<$tag$attr>$child_str</$tag>" : $child_str;
return $tag ? "<$tag$attr/>" : '';