Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing in 4.1 goes into an infinite loop, runs in 4.0 #374

Closed
pgallanis opened this issue Dec 19, 2013 · 5 comments
Closed

Parsing in 4.1 goes into an infinite loop, runs in 4.0 #374

pgallanis opened this issue Dec 19, 2013 · 5 comments

Comments

@pgallanis
Copy link

When parsing a complex equation (500+ elements) with a default order of precedence, the 4.1 parse() method goes into an infinite loop. If the precedence is forced using parenthesis, the parsing completes very quickly. When parsed with 4.0, it does complete (albeit incredibly much longer than with parenthesis, 600+ times slower - 52 seconds versus 0.080 seconds).

When running with 4.1, the trace output never proceeds beyond:

consume [@0,0:4='coeff',<9>,1:0] rule stat
consume [@1,6:6='=',<7>,1:6] rule stat
enter   expr, LT(1)=-
consume [@2,10:10='-',<3>,2:0] rule expr
enter   expr, LT(1)=2.515823692
consume [@3,13:23='2.515823692',<11>,2:3] rule expr
exit    expr, LT(1)=+
exit    expr, LT(1)=+
enter   expr, LT(1)=+
consume [@4,26:26='+',<2>,3:0] rule expr
enter   expr, LT(1)=var1
consume [@5,28:31='var1',<9>,3:2] rule expr

Grammar

grammar Expr;
program:   stat+ ;
stat:  ID '=' expr ';'
    ;
expr:   '-' expr
    |   expr ('*'|'/') expr
    |   expr ('+'|'-') expr
    |   INT
    |   DEC
    |   ID
    |   '(' expr ')'
    ;
ID  :   [a-zA-Z_][a-zA-Z0-9_]* ;
INT :   [0-9]+ ;
DEC :   [0-9]* '.' [0-9]* ;
WS  : [ \r\t\n]+ -> skip;

Input test file

coeff = 
-  2.515823692
+ var1 * -0.232747675
+ var2 * 0.2946511624
+ var3 * 0.6232972863
+ var4 * 0.2073482413
+ var5 * -0.382347791
+ var6 * 0.0952594161
+ var7 * -0.130794859
+ var8 * 0.3254107818
+ var9 * 0.3622159309
+ var10 * 0.1592540607
+ var11 * 0.8240980603
+ var12 * -0.0776068
+ var13 * 0.5063534222
+ var14 * 0.2838570567
+ var15 * 0.4088988648
+ var16 * -1.731424232
+ var17 * -0.347520669
+ var18 * 0.0731910653
+ var19 * 0.0530748982
+ var20 * 0.2912948172
+ var21 * -4.559087231
+ var22 * 0.1915516217
+ var23 * 0.2196965609
+ var24 * 0.2978991597
+ var25 * -0.84134267
+ var26 * 0.7226981225
+ var27 * 0.403039704
+ var28 * 0.0505745514
+ var29 * 0.2594340419
+ var30 * -0.5493138
+ var31 * 0.1989392093
+ var32 * 0.2110652237
+ var33 * 0.2699849661
+ var34 * -0.090345502
+ var35 * -0.287946071
+ var36 * 1.1503148938
+ var37 * -0.998941029
+ var38 * -1.076712487
+ var39 * 0.3086655061
+ var40 * -1.046971829
+ var41 * 0.2896343243
+ var42 * 0.3330737356
+ var43 * 0.8742240635
+ var44 * 1.2853826581
+ var45 * 0.9857931859
+ var46 * -1.099300731
+ var47 * -0.977344203
+ var48 * 0.433231086
+ var49 * 0.8564903506
+ var50 * 0.4752015883
+ var51 * 0.7329066215
+ var52 * 0.7691652851
+ var53 * 0.5385432829
+ var54 * -1.15380243
+ var55 * 0.2943110344
+ var56 * 1.1435120115
+ var57 * 0.4907487525
+ var58 * -1.080080945
+ var59 * 0.482371177
+ var60 * 1.012420398
+ var61 * -0.060520712
+ var62 * 0.4894160054
+ var63 * -0.749024355
+ var64 * -0.860572068
+ var65 * 0.0679394169
+ var66 * -0.105806287
+ var67 * 0.1154505765
+ var68 * 0.3261994522
+ var69 * 0.5693378118
+ var70 * -0.465885215
+ var71 * 0.0552283254
+ var72 * 0.9536306042
+ var73 * 0.1681299618
+ var74 * 0.8187608134
+ var75 * 0.9500678685
+ var76 * 0.3761731429
+ var77 * 0.0410066278
+ var78 * 0.4198556054
+ var79 * -0.332749278
+ var80 * -2.544867767
+ var81 * 0.5755770317
+ var82 * 0.2019082679
+ var83 * 0.283288426
+ var84 * 0.2246953905
+ var85 * -1.090979946
+ var86 * -0.19269156
+ var87 * 0.398820299
+ var88 * 0.3224362388
+ var89 * -2.773585289
+ var90 * 0.1397794997
+ var91 * 0.2588530889
+ var92 * 0.0932297132
+ var93 * 0.4386894523
+ var94 * 0.1670284033
+ var95 * 0.233484116
+ var96 * 0.6121601354
+ var97 * 0.9327377194
+ var98 * 0.6201206533
+ var99 * 0.619424925
+ var100 * -0.071070527
+ var101 * 0.68549894
+ var102 * 0.8476876024
+ var103 * -0.730125394
+ var104 * 1.2103481057
+ var105 * -0.42645119
+ var106 * -2.370928198
+ var107 * -0.296670562
+ var108 * 0.5870135842
+ var109 * 0.3861102827
+ var110 * 0.6974709044
+ var111 * 0.3471907816
+ var112 * -0.099645815
+ var113 * 0.2164140756
+ var114 * 1.1683278831
+ var115 * -0.301309027
+ var116 * -0.084173047
+ var117 * 0.4013581115
+ var118 * 1.0663365025
+ var119 * -1.526026901
+ var120 * -0.157756999
+ var121 * -0.244828519
+ var122 * 0.2599207444
+ var123 * 0.5076196038
+ var124 * 0.5929840068
+ var125 * 0.4216725381
+ var126 * 0.8315142266
+ var127 * 0.7201583514
+ var128 * 0.3293858386
+ var129 * 0.1221397093
+ var130 * 0.4483011987
+ var131 * -0.439755374
+ var132 * 0.639814015
+ var133 * -0.073007433
+ var134 * 0.0616348109
+ var135 * 0.2625391928
+ var136 * 0.1230717521
+ var137 * -1.828233455
+ var138 * -0.437474753
+ var139 * 1.8059899595
+ var140 * 0.2023705649
+ var141 * -1.354790545
+ var142 * -0.337522505
+ var143 * 0.5901831574
+ var144 * -0.215344157
+ var145 * 0.9868984384
+ var146 * 0.2084540017
+ var147 * 0.1643844479
+ var148 * -0.420780098
+ var149 * 0.2660349064
+ var150 * 0.1079328293
+ var151 * 0.5844402249
+ var152 * -1.241821793
+ var153 * 0.3330418755
+ var154 * 0.1458891142
+ var155 * 0.1787351799
+ var156 * -0.305576826
+ var157 * 0.3087652274
+ var158 * 0.4405252736
+ var159 * 0.1103194531
+ var160 * 0.3963850489
+ var161 * 0.4910197631
+ var162 * 2.2858405665
+ var163 * -2.617868654
+ var164 * -1.929933238
+ var165 * -0.158468404
+ var166 * -0.36060211
+ var167 * 0.4511053609
+ var168 * 0.2992043833
+ var169 * 0.3228502329
+ var170 * -1.077310621
+ var171 * 0.4075590796
+ var172 * 0.7892553583
+ var173 * 0.2360589262
+ var174 * 0.9925501094
+ var175 * -1.29917423
+ var176 * -0.086975037
+ var177 * -0.104484889
+ var178 * -0.365427862
+ var179 * 0.5974523758
+ var180 * 0.1121500744
+ var181 * 0.8645579395
+ var182 * -0.110779191
+ var183 * 0.5180979727
+ var184 * 0.9355744537
+ var185 * -0.170993786
+ var186 * 0.2029191182
+ var187 * 1.5669823619
+ var188 * -8.669133685
+ var189 * -0.238906234
+ var190 * -0.314622587
+ var191 * 0.7910361183
+ var192 * -0.175541614
+ var193 * -0.179579252
+ var194 * 0.1837192919
+ var195 * -0.112670204
+ var196 * 0.0453826824
+ var197 * -0.148725413
+ var198 * -0.039623612
+ var199 * -0.046049659
+ var200 * 0.019658765
+ var201 * 0.03166857
+ var202 * -0.023320667
+ var203 * -0.019580347
+ var204 * -0.032727602
+ var205 * -0.059805105
+ var206 * 0.1090918211
+ var207 * 0.061617326
+ var208 * -0.13971687
+ var209 * -0.117588692
+ var210 * -0.050364531
+ var211 * 0.0776756037
+ var212 * -0.294187755
+ var213 * -0.138856005
+ var214 * -0.054134036
+ var215 * 0.1383161122
+ var216 * -0.112906909
+ var217 * 0.0138987667
+ var218 * 0.1338790395
+ var219 * 0.0344609132
+ var220 * 0.6008249009
+ var221 * 0.4016835667
+ var222 * -0.145143948
+ var223 * 0.3906052039
+ var224 * 0.2097089326
+ var225 * 0.4751084401
+ var226 * 0.3407144671
+ var227 * 0.4892700958
+ var228 * 0.4832864402
+ var229 * -2.639522679
+ var230 * 0.2680436419
+ var231 * 0.3860049706
+ var232 * 0.0226116122
+ var233 * -0.023924705
+ var234 * -0.029860608
+ var235 * 0.046056514
+ var236 * 0.089528131
+ var237 * 0.000262674
+ var238 * 0.0420109788
+ var239 * 0.5811034613
+ var240 * -0.000682856
+ var241 * 0.0824681447
+ var242 * 0.064993955
+ var243 * -0.055521841
+ var244 * 0.1394408085
+ var245 * 0.0193242832
+ var246 * 0.0225163227
+ var247 * 0.0619347864
+ var248 * 0.0722874714
+ var249 * -0.047461861
+ var250 * 0.2557409512
+ var251 * -0.016051685
+ var252 * -0.03165986
+ var253 * 0.0162201557
+ var254 * -0.041419146
+ var255 * 0.0562385993
+ var256 * -0.043890115
+ var257 * 0.05836772
+ var258 * var259 * 0.0825015553
;
@parrt
Copy link
Member

parrt commented Dec 19, 2013

Hi. Confirmed that it's slow. It uses full context parsing and the preds slow it down. Standard two-stage parse finishing right away.

public static void main(String[] args) throws Exception {
    CharStream input = new ANTLRFileStream(args[0]);
    ExprLexer lexer = new ExprLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    ExprParser parser = new ExprParser(tokens);
    parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
    try {
        parser.stat();  // STAGE 1
    }
    catch (Exception ex) {
        tokens.reset(); // rewind input stream
        parser.reset();
        parser.getInterpreter().setPredictionMode(PredictionMode.LL);
        parser.stat();  // STAGE 2
        // if we parse ok, it's LL not SLL
    }
    }

I'll close but will keep in mind this known issue.

@parrt parrt closed this as completed Dec 19, 2013
@pgallanis
Copy link
Author

Thanks for the reply, especially the code for doing the two stage parsing.
Does the infinite loop go away with the two stage approach?

Hi. Confirmed that it's slow. It uses full context parsing and the preds
slow it down. Standard two-stage parse finishing right away.

public static void main(String[] args) throws Exception {
CharStream input = new ANTLRFileStream(args[0]);
ExprLexer lexer = new ExprLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
ExprParser parser = new ExprParser(tokens);
parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
try {
parser.stat();
}
catch (Exception ex) {
if ( ex.getCause() instanceof ParseCancellationException) {
tokens.reset(); // rewind input stream
parser.reset();
// back to standard listeners/handlers
parser.addErrorListener(ConsoleErrorListener.INSTANCE);
parser.setErrorHandler(new DefaultErrorStrategy());
parser.getInterpreter().setPredictionMode(PredictionMode.LL);

        parser.stat();
        // if we parse ok, it's LL not SLL
        ex.printStackTrace();
    }
}
}

I'll close but will keep in mind this known issue.


Reply to this email directly or view it on
GitHubhttps://github.com//issues/374#issuecomment-30952357
.

@ghost
Copy link

ghost commented Jun 5, 2014

Could someone explain this pattern?

Starting a "stage two" of something in a catch block seem arcane. What's going on?

@parrt
Copy link
Member

parrt commented Jun 5, 2014

Hi. The need for the two-stage optimization is explained in detail here: http://www.antlr.org/papers/allstar-techreport.pdf

@sharwell
Copy link
Member

sharwell commented Jun 8, 2014

Please see the release notes for ANTLR 4.2. One-stage parsing of grammars like yours was specifically addressed.

dan2097 pushed a commit to dan2097/chemicaltagger1.3.1 that referenced this issue Apr 13, 2019
…. Note, now ChemistrySentenceParserTest is down to 2.065 from 10.028 sec. Thanks Dan for the suggestion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants