
Keldysh Institute preprints, 2019, 020, 31 pp. (Mi ipmp2658)

This article is cited in 1 paper

Scraping on the fly of external web resources, driven by HTML page markup

E. L. Kitaev, R. Yu. Skornyakova


Abstract: The paper presents an approach to displaying data from cross-origin resources on web pages by means of a REST API, and describes a tool based on this approach. The tool extracts and displays on a web page the metadata of HTML documents, PDF files, and Word documents posted on the Internet, as well as microdata and data in JSON-LD format. It consists of a REST API hosted on an IIS web server and a set of JavaScript scripts. Examples of using the tool are given. The REST API enables cross-origin resource sharing (CORS) and can be called from web pages of any origin.
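
As an illustration of the kind of extraction the abstract describes, the following is a minimal sketch in Python, not the authors' implementation (their tool is a REST API on IIS with JavaScript clients). It collects JSON-LD objects from an HTML string and pairs them with a permissive CORS header. The names `JsonLdExtractor`, `extract_jsonld`, `build_response`, and the sample markup are hypothetical and chosen only for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed objects from <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # The type attribute may be missing or valueless, hence the "or".
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so it is buffered.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the request


def extract_jsonld(html):
    """Return the list of JSON-LD objects embedded in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


def build_response(html):
    """Hypothetical response shape: (headers, JSON body).

    "Access-Control-Allow-Origin: *" lets pages of any origin read the reply.
    """
    headers = {
        "Content-Type": "application/json",
        "Access-Control-Allow-Origin": "*",
    }
    return headers, json.dumps(extract_jsonld(html))


# Sample markup invented for this example.
SAMPLE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "ScholarlyArticle", "name": "Example"}
</script>
</head><body><p>No structured data here.</p></body></html>"""

if __name__ == "__main__":
    print(build_response(SAMPLE))
```

A real service would first fetch the external page on the server side; that step is omitted here so the sketch stays self-contained.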

Keywords: web scraping, semantic markup, microdata, JSON-LD, REST API, CORS.

DOI: 10.20948/prepr-2019-20




