Laurent, Nicolas
[UCL]
Mens, Kim
[UCL]
Historically, true context-sensitive parsing has seldom been applied to programming languages, due to its inherent complexity. However, many mainstream programming and markup languages (C, Haskell, Python, XML, and more) possess context-sensitive features. These features are traditionally handled with ad-hoc code (e.g., custom lexers), outside of the scope of parsing theory. Current grammar formalisms struggle to express context-sensitive features. Most solutions lack context transparency: they make grammars hard to write, maintain and compose by hardwiring context through the entire grammar. Instead, we approach context-sensitive parsing through the idea that parsers may recall previously matched input (or data derived therefrom) in order to make parsing decisions. We make use of mutable parse state to enable this form of recall. We introduce principled stateful parsing as a new transactional discipline that makes state changes transparent to parsing mechanisms such as backtracking and memoization. To enforce this discipline, users specify parsers using formally specified primitive state manipulation operations. Our solution is available as a parsing library named Autumn. We illustrate our solution by implementing some practical context-sensitive grammar features such as significant whitespace handling and namespace classification.
- V AN W YK, E., AND S CHWERDFEGER, A. Context-aware scanning for parsing extensible languages. In International Conference on Generative Programming and Component Engineering, GPCE 2007 (October 2007), ACM.
- Thurston Adrian D., Cordy James R., A backtracking LR algorithm for parsing ambiguous context-dependent languages, 10.1145/1188966.1188972
- T HE F REE S OFTWARE F OUNDATION. The GNU Bison homepage, 2014. http://www.gnu.org/software/bison/.
- Steindorfer Michael J., Vinju Jurgen J., Optimizing hash-array mapped tries for fast and lean immutable JVM collections, 10.1145/2814270.2814312
- S PIVEY, J. M., AND A BRIAL, J. The Z notation. Prentice Hall, 1992.
- P EREIRA, F., AND W ARREN, D. Definite clause grammars for language analysis. In Readings in Natural Language Processing, B. J. Grosz, K. Sparck-Jones, and B. L. Webber, Eds. Morgan Kaufmann, 1986, pp. 101–124.
- P ARR, T., H ARWELL, S., AND F ISHER, K. Adaptive LL(*) parsing: The power of dynamic analysis. In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications (2014), OOPSLA ’14, ACM, pp. 579–598.
- O KASAKI, C. Purely Functional Data Structures. Cambridge University Press, New York, NY, USA, 1998.
- M OORS, A., P IESSENS, F., AND O DERSKY, M. Parser combinators in Scala. Tech. Rep. CW491, Department of Computer Science, KU Leuven, Feb. 2008.
- L EIJEN, D., AND M EIJER, E. Parsec: Direct style monadic parser combinators for the real world. Tech. Rep. UU-CS- 2001-35, Department of Information and Computing Sciences, Utrecht University, 2001.
- https:// github.com/ncellar/sle2016.
- L AURENT, N., AND M ENS, K. Taming context-sensitive languages with principled stateful parsing: Artifacts. Software Language Engineering: Artifacts Track (2016).
- L AURENT, N., AND M ENS, K. Parsing expression grammars made practical. In Proceedings of the ACM SIGPLAN International Conference on Software Language Engineering (2015), SLE 2015, ACM, pp. 167–172.
- Knuth Donald E., Semantics of context-free languages, 10.1007/bf01692511
- Klint Paul, Lämmel Ralf, Verhoef Chris, Toward an engineering discipline for grammarware, 10.1145/1072997.1073000
- K ALLMEYER, L. Parsing Beyond Context-Free Grammars. Springer, 2010.
- Jim Trevor, Mandelbaum Yitzhak, Walker David, Semantics and algorithms for data-dependent grammars, 10.1145/1707801.1706347
- J IM, T., AND M ANDELBAUM, Y. A new method for dependent parsing. In Proceedings of the 20th European Conference on Programming Languages and Systems (2011), ESOP 2011, Springer, pp. 378–397.
- Ierusalimschy Roberto, A text pattern-matching tool based on Parsing Expression Grammars, 10.1002/spe.892
- HUTTON GRAHAM, MEIJER ERIK, Monadic parsing in Haskell, 10.1017/s0956796898003050
- H EDIN, G. Reference attributed grammars. Informatica (Slovenia) 24, 3 (2000).
- G RUNE, D., AND J ACOBS, C. J. Parsing Techniques: A Practical Guide, p. 21–23, 2nd ed. Springer, 2008.
- Grimm Robert, Better extensibility through modular syntax, 10.1145/1133255.1133987
- Ford Bryan, Parsing expression grammars : a recognition-based syntactic foundation, 10.1145/982962.964011
- https: //github.com/sirthias/parboiled.
- D OENITZ, M. The Parboiled homepage, 2015.
- C HOMSKY, N. Formal properties of grammar. In Handbook of Mathematical Psychology. Wiley, 1963, ch. 12, pp. 360–363 and 367.
- Chomsky N., Three models for the description of language, 10.1109/tit.1956.1056813
- B AGWELL, P. Ideal hash trees. Tech. Rep. LAMP-REPORT- 2001-001, Ecole polytechnique fédérale de Lausanne, 2001.
- A YCOCK, J., AND H ORSPOOL, R. N. Schrödinger’s token. Software: Practice and Experience 31, 8 (2001), 803–814.
- A TKEY, R. The semantics of parsing with semantic actions. In Proceedings of the 27th Annual IEEE/ACM Symposium on Logic in Computer Science (2012), LICS 2015, IEEE Computer Society, pp. 75–84.
- A FROOZEH, A., AND I ZMAYLOVA, A. One parser to rule them all. In ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (2015), Onward! 2015, ACM, pp. 151–170.
Bibliographic reference |
Laurent, Nicolas ; Mens, Kim. Taming Context-Sensitive Languages with Principled Stateful Parsing.Software Language Engineering (SLE 2016) (Amsterdam, du 31/10/2016 au 01/11/2016). In: ACM/SIGPLAN Notices, p. 15-27 |
Permanent URL |
http://hdl.handle.net/2078.1/176784 |