Sistemy i Sredstva Informatiki [Systems and Means of Informatics]

Sistemy i Sredstva Inform., 2016 Volume 26, Issue 2, Pages 43–62 (Mi ssi461)

Lexical analysis of dynamically generated string expressions

M. I. Polubelova, S. V. Grigorev

Saint Petersburg State University, 7/9 Universitetskaya Nab., St. Petersburg 199034, Russian Federation

Abstract: Some applications embed one language into another as strings: a host program generates string representations of clauses in an external language and passes them to a dedicated runtime component for analysis and execution. Although this technique offers greater expressiveness and flexibility, it makes the behavior of the system less predictable, complicates maintenance, and is a source of vulnerabilities such as SQL injection and cross-site scripting. Static analysis of strings aims to reduce these drawbacks by checking, at compile time, the well-formedness of the set of all dynamically generated clauses. Lexical analysis, or tokenization, is an important step of such analysis. The paper presents an automated approach to constructing lexical analyzers that simplifies the implementation of static analyzers for dynamically generated code.
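
As an illustration of the setting described in the abstract, the following minimal Python sketch tokenizes every string that a small, hypothetical piece of host code could generate. It is not the approach of the paper: the token classes, the modelled host code, and the names `tokenize`, `possible_queries`, and `check_all` are all assumptions made for this example, and the sketch assumes the host code has no loops, so the set of generated strings is finite and can simply be enumerated.

```python
import re
from itertools import product

# Hypothetical token classes for a tiny SQL-like fragment (an assumption).
TOKEN_SPEC = [
    ("KEYWORD", r"(?:SELECT|FROM|WHERE)\b"),
    ("IDENT", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("NUMBER", r"\d+"),
    ("OP", r"[=*,]"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, lexeme) pairs; raise ValueError on an unrecognized character."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"lexical error at offset {pos} in {text!r}")
        if m.lastgroup != "WS":  # whitespace separates tokens but is not kept
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def possible_queries():
    """Enumerate every string the hypothetical host code below could build.

    Host code being modelled (two independent branches, no loops):
        q = "SELECT " + ("*" if all_cols else "id, name")
        q += " FROM users"
        if filtered: q += " WHERE id = 1"
    """
    for cols, tail in product(["*", "id, name"], ["", " WHERE id = 1"]):
        yield "SELECT " + cols + " FROM users" + tail


def check_all():
    """Tokenize every possible query; map each query to its token list."""
    return {q: tokenize(q) for q in possible_queries()}
```

Calling `check_all()` either returns a token list for each of the four possible queries or raises `ValueError` for the first one that contains a lexical error. Enumeration stops being an option as soon as the host code builds strings in a loop, because the set of possible values is then infinite; the sketch deliberately sidesteps that case.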

Keywords: string analysis; lexing; string-embedded language; lexer generator.

Received: 26.02.2016




