TXR: data extraction language

TXR is a pragmatic, convenient tool for your daily hacking challenges, with a dual personality: a whole-document pattern matching and extraction language for scraping information from arbitrary text sources, and a powerful data-processing language that slices through problems like a hot knife through butter. Many tasks can be accomplished with TXR “one-liners” directly from your system prompt. TXR is relatively new: the project started in 2009.

It is difficult to give a brief introduction to TXR because it is no longer a small language. The PDF rendition of the reference manual, which takes the form of a large Unix man page, is over 600 pages long and has no index or table of contents. There are many ways to solve a given data processing problem with TXR.

TXR is a fusion of many different ideas, a few of which are original, and it is influenced by many languages, such as Common Lisp, Scheme, Awk, M4, POSIX Shell, Prolog, Ruby, Python, Arc, Clojure, S-Lang and others.
