ANTLR is a code generation tool for building language
parsers. Given a grammar file, ANTLR generates code that can read,
interpret, and execute programs written in your very own language.
In today’s article I’ll walk through the basic setup to create a Calculator
language that can execute simple equations in a Go
project of our own.
Before we start, there are some software prerequisites. You will need to
install ANTLR. It ships as a single JAR file
that we can invoke locally with Java.
$ wget http://www.antlr.org/download/antlr-4.7-complete.jar
$ alias antlr='java -jar $PWD/antlr-4.7-complete.jar'
Code generation
Now that we’ve got ANTLR installed, it’s time to generate some code. We do this
using a grammar file. A very comprehensive calculator can be found in the examples
of the ANTLR grammars repository.
For today’s example, we’ll just focus on addition, subtraction, multiplication, and division
with the following grammar file:
Even without fully understanding the grammar language, you can see that there are
some basic token definitions, rules, and expression definitions.
MUL, DIV, ADD, SUB, NUMBER, and WHITESPACE are all significant to
the language that we’re defining.
The expression definition not only defines operations for us, but is also
key in defining operator precedence: the MulDiv alternative is listed before the AddSub
alternative, so it binds more tightly, and Number comes last.
We can turn this grammar file into some Go code with the following invocation:
$ antlr -Dlanguage=Go -o parser Calc.g4
This creates a parser folder for us containing a few different pieces of Go code.
Parsers, Lexers, and Listener
If you look in the parser folder at the code that was created, you should see something
similar to this:
The Lexer’s job is to perform lexical analysis on
arbitrary pieces of text, tokenizing that text into a set of symbols. For example, the input
of 1 + 2 might get tokenized to NUMBER 1, ADD, NUMBER 2. These tokens are then
fed into the parser.
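To make the tokenizing step concrete, here is a minimal hand-written sketch of what a lexer does with this input. It is not the code ANTLR generates: the token type and the tokenize and describe functions are hypothetical names made up for illustration, and the real generated lexer is considerably more capable.

```go
package main

import (
    "fmt"
    "strings"
    "unicode"
)

// token is a hypothetical type for this sketch; ANTLR's generated lexer
// uses its own token representation.
type token struct {
    kind string // NUMBER, ADD, SUB, MUL or DIV
    text string
}

// tokenize splits an arithmetic expression into tokens, skipping whitespace.
func tokenize(input string) ([]token, error) {
    ops := map[rune]string{'+': "ADD", '-': "SUB", '*': "MUL", '/': "DIV"}
    var tokens []token
    runes := []rune(input)
    for i := 0; i < len(runes); {
        r := runes[i]
        switch {
        case unicode.IsSpace(r):
            i++ // whitespace carries no meaning, so it is dropped
        case r >= '0' && r <= '9':
            // consume a run of digits as a single NUMBER token
            start := i
            for i < len(runes) && runes[i] >= '0' && runes[i] <= '9' {
                i++
            }
            tokens = append(tokens, token{"NUMBER", string(runes[start:i])})
        default:
            kind, ok := ops[r]
            if !ok {
                return nil, fmt.Errorf("unexpected character %q at offset %d", r, i)
            }
            tokens = append(tokens, token{kind, string(r)})
            i++
        }
    }
    return tokens, nil
}

// describe renders tokens in the "NUMBER 1, ADD, NUMBER 2" style.
func describe(tokens []token) string {
    parts := make([]string, 0, len(tokens))
    for _, t := range tokens {
        if t.kind == "NUMBER" {
            parts = append(parts, t.kind+" "+t.text)
        } else {
            parts = append(parts, t.kind)
        }
    }
    return strings.Join(parts, ", ")
}

func main() {
    tokens, err := tokenize("1 + 2")
    if err != nil {
        panic(err)
    }
    fmt.Println(describe(tokens)) // NUMBER 1, ADD, NUMBER 2
}
```

Note that the sketch only classifies characters. Deciding whether a sequence of tokens is a valid expression is left to the parser.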
The Parser’s job is to take these
tokens, and make sure they conform to the rules of the language. You can imagine that
a LISP-style language would expect ADD, NUMBER 1, NUMBER 2 rather than a C-style
language that would expect the operator in between the number tokens.
After the string has passed through the lexer and the parser, the resulting parse tree is walked
with a listener attached, where we can write some code to respond to these symbols in order.
Implementation
The calculator is implemented as a stack-based machine. This gets
represented as a struct:
The internal state of the calculator is a stack of int values. As operations
execute, the program pops the top of the stack (TOS) and the second-from-top (STOS),
performs the arithmetic, and pushes the result back as the new top of the stack.
The BaseCalcListener type that was generated for us has all of the hooks we need
to complete the implementation. The Number, AddSub, and MulDiv alternatives
each get their own listener methods for us to respond to.
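The stack behaviour described here can be sketched without ANTLR at all. The following is an illustrative stand-in rather than the article's listener: intStack and evaluate are hypothetical names, and the operands and operators are fed in by hand in the order a tree walker would report them.

```go
package main

import (
    "fmt"
    "strconv"
)

// intStack is a hypothetical stand-in for the calculator's internal state.
type intStack struct {
    values []int
}

// push places v on the top of the stack.
func (s *intStack) push(v int) {
    s.values = append(s.values, v)
}

// pop removes and returns the top of the stack.
func (s *intStack) pop() int {
    if len(s.values) == 0 {
        panic("stack is empty, unable to pop")
    }
    top := s.values[len(s.values)-1]
    s.values = s.values[:len(s.values)-1]
    return top
}

// evaluate processes steps in the order a tree walker would report them:
// both operands first, then the operator that combines them.
func evaluate(steps ...string) int {
    s := &intStack{}
    for _, step := range steps {
        if n, err := strconv.Atoi(step); err == nil {
            s.push(n)
            continue
        }
        rhs, lhs := s.pop(), s.pop() // TOS is the right-hand operand
        switch step {
        case "+":
            s.push(lhs + rhs)
        case "-":
            s.push(lhs - rhs)
        case "*":
            s.push(lhs * rhs)
        case "/":
            s.push(lhs / rhs)
        default:
            panic("not yet implemented: " + step)
        }
    }
    return s.pop()
}

func main() {
    // 1 + 5 - 2 * 20 is visited as: 1 5 + 2 20 * -
    fmt.Println(evaluate("1", "5", "+", "2", "20", "*", "-")) // -34
}
```

Popping the right-hand operand first matters for subtraction and division, where the order of the operands changes the result.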
func (l *calculatorListener) ExitMulDiv(c *parser.MulDivContext) {
    // get TOS and STOS
    rhs, lhs := l.pop(), l.pop()

    // perform the required operation, pushing the result back
    // up as the new TOS
    switch c.GetOp().GetTokenType() {
    case parser.CalcParserMUL:
        l.push(lhs * rhs)
    case parser.CalcParserDIV:
        l.push(lhs / rhs)
    default:
        panic(fmt.Sprintf("not yet implemented: %s", c.GetOp().GetText()))
    }
}

func (l *calculatorListener) ExitAddSub(c *parser.AddSubContext) {
    // get TOS and STOS
    rhs, lhs := l.pop(), l.pop()

    // perform the required operation, pushing the result back
    // up as the new TOS
    switch c.GetOp().GetTokenType() {
    case parser.CalcParserADD:
        l.push(lhs + rhs)
    case parser.CalcParserSUB:
        l.push(lhs - rhs)
    default:
        panic(fmt.Sprintf("not yet implemented: %s", c.GetOp().GetText()))
    }
}

func (l *calculatorListener) ExitNumber(c *parser.NumberContext) {
    // coerce the string into an integer
    i, err := strconv.Atoi(c.GetText())
    if err != nil {
        panic(err.Error())
    }

    // push onto the stack
    l.push(i)
}
Execution
Now we go from text input to execution. In the following snippet, the
input stream feeds the text into the lexer. The lexer is then set up
as a token stream, ready to tokenize our input.
Finally, all of those tokens get parsed to make sure they represent
valid expressions for our language.
We can now walk the parse tree with a listener attached. The walker
will fire off the hooks that we defined earlier, and our stack-based calculator
should leave us with the result at the top of the stack (TOS).
We should be left with something like this on screen:
1 + 5 - 2 * 20 = -34
Conclusion
As you can see, ANTLR is a very powerful tool for writing all of the pieces
of a compiler (or in this case, an interpreter), and it gets you started very
quickly.
You’d almost be insane to ever do this stuff yourself!
Sometimes you can be just as productive using your shell as you are in any
programming environment. You just need to know a couple of tricks. In this
article, I’ll walk through some basic tips that I’ve come across.
Reading input
You can make your scripts immediately interactive by using the read
instruction.
#!/bin/bash
echo -n "What is your name? "
read NAME
echo "Hi there ${NAME}!"
String length
You can get the length of any string that you’ve stored in a variable by
prefixing the variable name with # inside the braces, as in ${#NAME}.
#!/bin/bash
echo -n "What is your name? "
read NAME
echo "Your name has ${#NAME} characters in it"
Quick arithmetic
You can perform some basic arithmetic within your scripts as well. The value
emitted with the # character is an integer that we can perform tests
against.
#!/bin/bash
echo -n "What is your name? "
read NAME
if (( ${#NAME} > 10 )); then
  echo "You have a very long name, ${NAME}"
fi
Substrings
Parameter expansion will also allow you to take a substring directly. The
format takes the form of ${VAR:offset:length}.
Passing positive integers for offset and length will make the substring
operate from the leftmost side of the string. A negative offset provides a
reverse index, counting from the right. Note the space before the minus sign, which
keeps the shell from reading it as a different kind of expansion.
STR="Scripting for the win"
echo ${STR:10:3}   # for
echo ${STR: -3}    # win
echo ${STR: -7: 3} # the
Replacement
It’s commonplace to use regular expressions to make substitutions
where needed. The shell offers something similar through parameter expansion,
although the pattern it matches is a glob-style pattern rather than a full regular expression.
#!/bin/bash
STR="Scripting for the win"
echo ${STR/win/WIN} # Scripting for the WIN
Finishing up
There’s lots more that you can do just from the shell, without needing to
reach for other tools. These are only a few tips and tricks.
TLS has long played a very large part in securing internet communications. Secure Sockets Layer (SSL) filled this space prior to TLS coming to the fore.
In today’s article, I’m going to walk through an exercise in mutual TLS (mTLS), an extension of TLS in which the client also presents a certificate so that both sides authenticate each other.
CA
First of all, we need a certificate authority (CA) that both the client and the server will trust. We generate these using openssl.
This now puts a private key in ca.key and a certificate in ca.crt on our filesystem. We can inspect these a little further with the following.
openssl x509 -in ca.crt -text -noout
Looking at the output, we see some interesting things about our CA certificate. Most importantly, the X509v3 Basic Constraints value is set to CA:TRUE, telling us that this certificate can be used to sign other certificates (as CA certificates can).
Server
The server now needs a key and certificate. Key generation is simple, as usual:
openssl genrsa -out server.key 2048
We need to create a certificate that has been signed by our CA. This means we need to generate a certificate signing request, which is then used to produce the signed certificate.
This gives us a signing request for the domain of localhost as mentioned in the -subj parameter. This signing request now gets used by the CA to generate the certificate.
Inspecting the server certificate, you can see that it’s quite a bit simpler than the CA certificate. We’re only able to use this certificate for the subject that we nominated: localhost.
Client
The generation of the client certificates is very much the same as the server.
# create a key
openssl genrsa -out client.key 2048
# generate a certificate signing request
openssl req -new -key client.key -subj '/CN=my-client' -out client.csr
# create a certificate signed by the CA
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 365 -out client.crt
The subject in this case is my-client.
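The -CAcreateserial flag tells openssl to create the CA’s serial number file if it doesn’t already exist. Each certificate that the CA signs is given its own serial number, so the server and client certificates end up with unique serial numbers. Again, this can be verified when you inspect the certificates.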
# Server serial number
Serial Number:
5c:2c:47:44:2c:13:3b:c9:56:56:99:37:3f:c9:1e:62:c4:c7:df:20
# Client serial number
Serial Number:
5c:2c:47:44:2c:13:3b:c9:56:56:99:37:3f:c9:1e:62:c4:c7:df:21
Only the last segment was incremented here. You get the idea though. Unique.
Application
Now, we set up a basic Node.js server that requires mTLS.
const https = require('https');
const fs = require('fs');

const hostname = 'localhost';
const port = 3000;

const options = {
  ca: fs.readFileSync('ca.crt'),
  cert: fs.readFileSync('server.crt'),
  key: fs.readFileSync('server.key'),
  rejectUnauthorized: true,
  requestCert: true,
};

const server = https.createServer(options, (req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello World');
});

server.listen(port, hostname, () => {
  console.log(`Server running at https://${hostname}:${port}/`);
});
Most important here is that the server’s options specify rejectUnauthorized as well as requestCert. With requestCert the server asks the client for a certificate, and with rejectUnauthorized it refuses any connection whose certificate was not signed by the CA we supplied. Together, these force the mutual half of mTLS back onto the client.
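The same two-sided check can be sketched in Go, which is handy for experimenting because the certificates can be created in memory. This is only an illustrative sketch under a few assumptions: the issue, get and roundTrip helpers are hypothetical, the CA and client certificate are generated on the fly rather than read from ca.crt and client.crt, and the server side uses the throwaway certificate that Go’s httptest package ships with. In Go, tls.RequireAndVerifyClientCert plays the role that requestCert together with rejectUnauthorized plays in Node.js.

```go
package main

import (
    "crypto/rand"
    "crypto/rsa"
    "crypto/tls"
    "crypto/x509"
    "crypto/x509/pkix"
    "fmt"
    "io"
    "math/big"
    "net/http"
    "net/http/httptest"
    "time"
)

// issue creates a key and certificate for cn. With a nil parent the result is
// a self-signed CA; otherwise it is a client certificate signed by the parent.
func issue(cn string, serial int64, parent *x509.Certificate, parentKey *rsa.PrivateKey) (*x509.Certificate, *rsa.PrivateKey, error) {
    key, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        return nil, nil, err
    }
    tmpl := &x509.Certificate{
        SerialNumber:          big.NewInt(serial),
        Subject:               pkix.Name{CommonName: cn},
        NotBefore:             time.Now().Add(-time.Minute),
        NotAfter:              time.Now().Add(24 * time.Hour),
        BasicConstraintsValid: true,
    }
    if parent == nil {
        tmpl.IsCA = true // shows up as "CA:TRUE" under Basic Constraints
        tmpl.KeyUsage = x509.KeyUsageCertSign
        parent, parentKey = tmpl, key
    } else {
        tmpl.KeyUsage = x509.KeyUsageDigitalSignature
        tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth}
    }
    der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
    if err != nil {
        return nil, nil, err
    }
    cert, err := x509.ParseCertificate(der)
    return cert, key, err
}

// get performs one HTTPS request, optionally presenting client certificates.
func get(url string, roots *x509.CertPool, certs ...tls.Certificate) (string, error) {
    client := &http.Client{Transport: &http.Transport{
        TLSClientConfig:   &tls.Config{RootCAs: roots, Certificates: certs},
        DisableKeepAlives: true,
    }}
    resp, err := client.Get(url)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    body, err := io.ReadAll(resp.Body)
    return string(body), err
}

// roundTrip starts a server that demands client certificates, then calls it
// once without a certificate and once with one signed by the trusted CA.
// It returns the successful body, the error from the bare request, and any
// setup error.
func roundTrip() (string, error, error) {
    ca, caKey, err := issue("my-ca", 1, nil, nil)
    if err != nil {
        return "", nil, err
    }
    clientCert, clientKey, err := issue("my-client", 2, ca, caKey)
    if err != nil {
        return "", nil, err
    }
    clientCAs := x509.NewCertPool()
    clientCAs.AddCert(ca)

    srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "Hello %s", r.TLS.PeerCertificates[0].Subject.CommonName)
    }))
    srv.TLS = &tls.Config{
        ClientCAs:  clientCAs,                      // who may sign client certificates
        ClientAuth: tls.RequireAndVerifyClientCert, // ask for one and reject failures
    }
    srv.StartTLS() // httptest supplies its own server certificate
    defer srv.Close()

    roots := x509.NewCertPool()
    roots.AddCert(srv.Certificate())

    _, noCertErr := get(srv.URL, roots)
    body, err := get(srv.URL, roots, tls.Certificate{
        Certificate: [][]byte{clientCert.Raw},
        PrivateKey:  clientKey,
    })
    return body, noCertErr, err
}

func main() {
    body, noCertErr, err := roundTrip()
    if err != nil {
        panic(err)
    }
    fmt.Println("request without a client certificate failed:", noCertErr != nil)
    fmt.Println(body) // Hello my-client
}
```

The request made without a client certificate fails during the TLS exchange, before the handler ever runs. The test server also logs that rejected handshake to standard error.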
A curl request now verifies that the solution is secured by this system of certificates.
The client’s key, certificate, and the CA certificate accompany a successful request. A request in any other form simply fails, as the authentication requirements have not been met.
In networking, a port is a logical endpoint that a socket is established on. These sockets are owned by processes in your operating system. From time to time, it can be unclear which process owns which socket (or who is hogging which port).
In today’s article, I’ll take you through a few techniques on finding out who is hanging onto particular ports.
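Before reaching for the tools, it can help to see what “hanging onto a port” looks like from a program’s point of view. The following is a small sketch, with listenAny and portInUse as hypothetical helper names: one listener takes a free loopback port, and a second attempt to bind the same port fails for as long as the first one holds it. It only tells you that a port is taken, not who has taken it.

```go
package main

import (
    "fmt"
    "net"
)

// listenAny binds a TCP listener to a free loopback port chosen by the
// operating system and reports which port was assigned.
func listenAny() (net.Listener, int, error) {
    ln, err := net.Listen("tcp", "127.0.0.1:0")
    if err != nil {
        return nil, 0, err
    }
    return ln, ln.Addr().(*net.TCPAddr).Port, nil
}

// portInUse reports whether binding the loopback port fails, which this
// sketch treats as "some process already owns it".
func portInUse(port int) bool {
    ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
    if err != nil {
        return true
    }
    ln.Close()
    return false
}

func main() {
    ln, port, err := listenAny()
    if err != nil {
        panic(err)
    }
    fmt.Println("in use while held:", portInUse(port)) // in use while held: true

    ln.Close()
    fmt.Println("in use after close:", portInUse(port)) // in use after close: false
}
```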
netstat
netstat is a general purpose network utility that will tell you about activity within your network interfaces.
If you cannot find netstat installed on your system, you can normally get it from the net-tools package.
The following command will give you a breakdown of processes listening on port 8080, as an example:
➜ netstat -ltnp | grep -w ':8080'
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::8080 :::* LISTEN -
An important message appears here: “Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.” Some processes will be invisible to you unless you run this command as root.
Breaking down the netstat invocation:
-l will only show listening sockets
-t will only show TCP connections
-n will show numerical addresses
-p will show the PID and name of the owning program
As you can see above, no process is shown. Re-running this command as root should fill in the owning process.
lsof
lsof will give you a list of open files on the system. Remember, sockets are just files. By using -i we can filter the list down to those that match on an internet address.
➜ sudo lsof -i :8080
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
docker-pr 2765 root 4u IPv6 36404 0t0 TCP *:http-alt (LISTEN)
fuser
fuser is a program that has overlapping responsibilities with the likes of lsof.
fuser — list process IDs of all processes that have one or more files open
You can filter the list down directly with the command:
➜ sudo fuser 8080/tcp
8080/tcp: 2765
This gives us a PID to work with. Again, note this is run as root. Now all we need to do is to transform this PID into a process name. We can use ps to finish the job.
PostgreSQL is a very popular relational database which has quite a few different data access libraries available for the Haskell programming language.
Today’s article aims to get you up and running, executing queries against PostgreSQL from your Haskell environment with the least amount of hassle.
postgresql-simple
The first library that we’ll go through is postgresql-simple. This library has a very basic interface, and is really simple to get up and running.
A mid-level client library for the PostgreSQL database, aimed at ease of use and high performance.
Prerequisites
Before you get started though, you’ll need libpq installed.
pacman -S postgresql-libs
Now you’re ready to develop.
You’ll need to add a dependency on the postgresql-simple library to your application. The following code will then allow you to connect to your PostgreSQL database, and run a simple command.
Hello, Postgres!
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Database.PostgreSQL.Simple

localPG :: ConnectInfo
localPG = defaultConnectInfo
  { connectHost = "172.17.0.1"
  , connectDatabase = "clients"
  , connectUser = "app_user"
  , connectPassword = "app_password"
  }

main :: IO ()
main = do
  conn <- connect localPG
  mapM_ print =<< (query_ conn "SELECT 1 + 1" :: IO [Only Int])
When your application successfully builds and executes, you should be met with the following output:
Only {fromOnly = 2}
Walking through this code quickly, we first enable OverloadedStrings so that we can specify our Query values as literal strings.
In order to connect to Postgres, we use a ConnectInfo value which is filled out for us via defaultConnectInfo. We just override those values for our examples. I’m running PostgreSQL in a docker container, so the host is my docker network address.
conn <- connect localPG
The localPG value is now used to connect to the Postgres database. After a successful connection, conn is the handle that we send instructions through.
Finally, we run our query SELECT 1 + 1 using the query_ function. conn is passed to refer to the connection to execute this query on.
With this basic code, we can start to build on some examples.
Retrieve a specific record
In the Hello, Postgres! example above, we added two static values to return another value. As examples get more complex, we need to give the library more information about the data that we’re working with. Int is already well known to the library, which has mechanisms to deal with it (along with other basic data types).
In the client database table we have a list of names and ids. We can create a function to retrieve the name of a client, given an id:
retrieveClient :: Connection -> Int -> IO [Only String]
retrieveClient conn cid = query conn "SELECT name FROM client WHERE id = ?" $ (Only cid)
The Query template passed in makes use of the ? character to specify where substitutions will be put. Note the use of query rather than query_. In this case, query also accepts a Tuple containing all of the values for substitution.
Using the FromRow type class, our code can define a much stronger API. We can actually retrieve client rows from the database and convert them into Client values.
We need FromRow first:
import Database.PostgreSQL.Simple.FromRow
The Client data type needs definition now. It’s how we’ll refer to a client within our Haskell program:
We then give a fromRow definition that reads the fields in the order in which they are declared. The retrieveClient function only changes to broaden its query and change its return type!
retrieveClient :: Connection -> Int -> IO [Client]
retrieveClient conn cid = query conn "SELECT id, name FROM client WHERE id = ?" $ (Only cid)
Create a new record
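When creating data, you can use the execute function. The execute function is all about running a statement that returns no rows; what it gives back is the number of rows affected. With OverloadedStrings enabled, the literal needs a type annotation so that the compiler knows which string type to use.
execute conn "INSERT INTO client (name) VALUES (?)" (Only ("Sam" :: String))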
Extending our API, we can make a createClient function; but with a twist. We’ll also return the generated identifier (because of the id field).
createClient :: Connection -> String -> IO [Only Int64]
createClient conn name = query conn "INSERT INTO client (name) VALUES (?) RETURNING id" $ (Only name)
We need a definition for Int64. This is the type that the underlying SERIAL column in PostgreSQL will be read as inside of your Haskell application.
import Data.Int
We can now use createClient to set up an interface of sorts for users to enter information.
main :: IO ()
main = do
  conn <- connect localPG
  putStrLn "Name of your client? "
  clientName <- getLine
  cid <- createClient conn clientName
  putStrLn $ "New Client: " ++ (show cid)
We’ve created a data creation interface now.
Name of your client?
Ringo
New Client: [Only {fromOnly = 4}]
Update an existing record
When it comes to updating data, we don’t expect much back in return aside from the number of records affected by the instruction. The execute function does exactly this. By measuring the return, we can convert the row count into a success/fail style message. I’ve simply encoded this as a boolean here.
updateClient :: Connection -> Int -> String -> IO Bool
updateClient conn cid name = do
  n <- execute conn "UPDATE client SET name = ? WHERE id = ?" (name, cid)
  return $ n > 0
Destroying records
Finally, destroying information out of the database will look a lot like the update.
deleteClient :: Connection -> Int -> IO Bool
deleteClient conn cid = do
  n <- execute conn "DELETE FROM client WHERE id = ?" $ (Only cid)
  return $ n > 0
execute providing the affected count allows us to perform the post-execution validation again.
Summary
These are some basic operations to get up and running using postgresql-simple. It really looks like you can take software all the way from prototype through to a fully blown application with it.