Cogs and Levers A blog full of technical stuff

Scala type construction

Scala gives the developer flexibility when reasoning about and designing type systems for applications. By using classes and traits, a developer can quickly build a complex hierarchy that can assist in describing constraint and relationship information.

In today’s post, I’m going to walk through a useless but demonstrative example of a type hierarchy and some of the constraint features available to the developer.

Vehicles

We’re going to model some different vehicles. Cars, planes, trucks, skateboards, whatever.

abstract class Vehicle

We could start case-classing this base out or directly adding derivatives that specialise down to the exact vehicle types that we want, but we’re going to reason about some attributes that these vehicles might have. Wheels and Jets.

trait HasWheels extends Vehicle {
  def numberOfWheels: Int
}

trait HasJets extends Vehicle {
  def numberOfJets: Int
}

When a vehicle HasWheels, the type is going to require us to specify numberOfWheels. Likewise numberOfJets for HasJets. These traits are extending our abstract Vehicle class.

When we have wheels, we should be able to set how fast they’re spinning.

trait WheelPropulsion {
  this: HasWheels =>

  def setWheelSpin(velocity: Double) {
    println("Wheel spin: " + velocity)
  }
}

Our WheelPropulsion trait says that this needs to be HasWheels. Makes sense. We can’t spin wheels if we don’t have wheels.

Likewise, we’d want to set the turbine intensity if we have jets.

trait JetPropoulsion {
  this: HasJets =>

  def setTurbine(strength: Double) {
    println("Setting turbine strength to " + strength)
  }
}

Even with this very basic level of type description we can start to make some basic vehicles.

class Motorcycle extends Vehicle with HasWheels
                                 with WheelPropulsion {
  def numberOfWheels = 2                                  
}

class MotorVehicle extends Vehicle with HasWheels
                                   with WheelPropulsion {
  def numberOfWheels = 4
}

class TwinJetPlane extends Vehicle with HasJets
                                   with JetPropoulsion {
  def numberOfJets = 2                                  
}

Mixing in

When you’re assembling a variable of your own, there’s no reason you can’t mix in when creating your own types:

val toyota = new Vehicle() with HasWheels 
                           with WheelPropulsion { 
  def numberOfWheels = 4
}

Of course, we could have just constructed toyota as a MotorVehicle for the same effect. This just demonstrates the instance construction flexibility.

Constraints

Finally, when you’re writing functions that work with your types you can specify rich constraint rules so that you can target functionality with as much precision as you require:

// everything can be painted
def paint(v: Vehicle) = { }

// only a vehicle with wheels can burnout
def doBurnout(v: Vehicle with HasWheels) = { }

As you can see, you not only use the with keyword to define your types; this keyword is also used for variable construction and function signature definition.

Covariance and contravariance in your types

When creating parameterised types, you have control on how those types can be passed. These nuances are referred to as variance and scala allows you to explicitly nominate how this works in your own classes.

An excellent explanation on these terms can be found here. I’ve reproduced the three main points for this article though:

That is, if A and B are types, f is a type transformation, and the subtype relation (i.e. A ≤ B means that A is a subtype of B), we have:

  • f is covariant if A ≤ B implies that f(A) ≤ f(B)
  • f is contravariant if A ≤ B implies that f(B) ≤ f(A)
  • f is invariant if neither of the above holds

Invariant

Invariant parameter types are what ensures that you can only pass MyContainer[Int] to def fn(x: MyContainer[Int]). The guarantee is that the type that you’re containing (when it’s being accessed) is done so as the correct type.

class MyInvariant[T](var value: T)

This guarantees the type of T when we go to work on it.

def double(a: MyInvariant[Int]) = { 
  a.value *= 2
}

You can see here that a good case for invariant is for mutable data.

To show the error case here, we define a show function specialising to MyInvariant[Any]

def show(a: MyInvariant[Any]) = { 
  println("Here is: " + a.value) 
}

Trying to use this function:

scala> show(new MyInvariant[Int](5))
<console>:13: error: type mismatch;
 found   : MyInvariant[Int]
 required: MyInvariant[Any]
Note: Int <: Any, but class MyInvariant is invariant in type T.
You may wish to define T as +T instead. (SLS 4.5)
       show(new MyInvariant[Int](5))
            ^

Covariant

Covariant parameter type is specific. You pass these sorts of types to functions that generalise their inner type access. You need to decorate the type parameter with a +.

class CovariantContainer[+T](var value: T)

Then your function to generalise over this type:

def show(a: CovariantContainer[Any]) = { 
  println("The value is " + a.value)
}

Covariance is a good case for read-only scenarios.

Contravariant

Contravariance is defined by decorating the type parameter with a -. It’s useful in write-only situations.

class ContravariantContainer[-T](var value: T)

We write specialised functions for the type, but that are write-only cases:

def write(a: ContravariantContainer[String]) = {
  println("Writing " + a)
}

Rules

When designing types, the following rules are very important when dealing with parameterization of types.

  • Mutable containers should be invariant
  • Immutable containers should be covariant
  • Transformation inputs should be contravariant
  • Transformation outputs should be covariant

Modeling a function call

Armed with this information, we can generalise function execution into the following type:

trait Fn[-In, +Out] {
  def apply(i: In): Out
}

Defining this trait, allows us to generalise the computation of an input to an output like the following:

val anyToInt = new Fn[Any, Int] {
  def apply(i: Any) = i.toString.toInt
}

DROP DATABASE and other users

From time to time, when I’ve gone and issued DROP DATABASE on my postgres server, I’m returned with the following error:

ERROR:  database "xyz" is being accessed by other users
DETAIL:  There is 1 other session using the database.

All this is telling us is that we need to be the only user/connection on the database before performing such an operation.

First of all, we need to prevent other users from making a connection to the database from this point. To do this, we’ll use REVOKE CONNECT like so:

REVOKE CONNECT ON DATABASE "xyz" FROM public;

Next, we kill off every connection:

SELECT pg_stat_activity.pid, pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'xyz';

Now you can issue your DROP DATABASE.

Detailed GC logs

From time to time, it makes sense to perform some GC tuning on your Java Virtual Machines. Whilst there are a lot of tools that can visually help your debugging process, in today’s post I’ll talk you through the GC log that you can optionally turn on in your virtual machine arguments.

Enabling the log

To boost up the logging of your application, you’ll need to tune the execution runtime using command line parameters. The following parameters will get the JVM to log out information that it’s holding on garbage collection events.

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/gc.log

To explain these a little:

-verbose:gc will ramp the logging level of GC events up to a verbose level, -XX:+PrintGCDetails and -XX:+PrintGCTimeStamps define some features of the log that’s written. Finally -Xloggc:/tmp/gc.log defines the file endpoint on disk that the GC log will be written to.

Reading the log

After you’ve run your program with these parameters engaged, you should find the /tmp/gc.log file sitting on your hard drive waiting to be read. I won’t dump the full log for the test program that I’ve run here; rather I’ll go through it piece by piece.

The header of the file defines what your software versions, memory statistics and virtual machine arguments are.

OpenJDK 64-Bit Server VM (25.66-b01) for linux-amd64 JRE (1.8.0_66-internal-b01), built on Aug  5 2015 09:09:16 by "pbuilder" with gcc 4.9.2
Memory: 4k page, physical 8055396k(6008468k free), swap 8267772k(8267772k free)
CommandLine flags: -XX:InitialHeapSize=1073741824 -XX:MaxHeapSize=1073741824 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC 

After these initial lines, you’ll start to see some of the memory allocation events appear along with the timestamps (remember, we asked for timestamps above).

0.320: [GC (Allocation Failure) [PSYoungGen: 262144K->43488K(305664K)] 262144K->142896K(1005056K), 0.1552445 secs] [Times: user=0.40 sys=0.20, real=0.16 secs] 
0.649: [GC (Allocation Failure) [PSYoungGen: 305632K->43504K(305664K)] 405040K->275088K(1005056K), 0.2105517 secs] [Times: user=0.58 sys=0.26, real=0.21 secs] 
0.986: [GC (System.gc()) [PSYoungGen: 219445K->43520K(305664K)] 451029K->369480K(1005056K), 0.1570988 secs] [Times: user=0.47 sys=0.14, real=0.15 secs] 
1.143: [Full GC (System.gc()) [PSYoungGen: 43520K->0K(305664K)] [ParOldGen: 325960K->368132K(699392K)] 369480K->368132K(1005056K), [Metaspace: 2530K->2530K(1056768K)], 2.5983336 secs] [Times: user=9.55 sys=0.03, real=2.59 secs] 
3.984: [GC (Allocation Failure) [PSYoungGen: 262144K->32K(305664K)] 630276K->368164K(1005056K), 0.0049817 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
4.070: [GC (System.gc()) [PSYoungGen: 108791K->32K(305664K)] 476924K->368164K(1005056K), 0.0041558 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
4.074: [Full GC (System.gc()) [PSYoungGen: 32K->0K(305664K)] [ParOldGen: 368132K->133835K(699392K)] 368164K->133835K(1005056K), [Metaspace: 2539K->2539K(1056768K)], 0.4427402 secs] [Times: user=1.59 sys=0.01, real=0.45 secs] 

Pulling one of these lines apart:

0.320: [GC (Allocation Failure) [PSYoungGen: 262144K->43488K(305664K)] 262144K->142896K(1005056K), 0.1552445 secs] [Times: user=0.40 sys=0.20, real=0.16 secs] 

This event was generated 0.320 seconds into the program. This item is a GC (Allocation Failure) event and it’s being reported on the PSYoungGen collection. Prior to the event, the space allocated before was 262144K and after was 43488K. The capacity value is in braces 305664K.

The Full GC events will give you statistics for all of the memory collections:

1.143: [Full GC (System.gc()) [PSYoungGen: 43520K->0K(305664K)] [ParOldGen: 325960K->368132K(699392K)] 369480K->368132K(1005056K), [Metaspace: 2530K->2530K(1056768K)], 2.5983336 secs] [Times: user=9.55 sys=0.03, real=2.59 secs] 

Each of the collections is displayed as [CollectionName: SpaceBefore->SpaceAfter(Capacity)].

Finally, we have a heap analysis of the program as it breaks down amongst the different memory classes: Young Gen, Old Gen and (new for 1.8) Metaspace. Metaspace would have previously been Perm Gen.

Heap
 PSYoungGen      total 305664K, used 5243K [0x00000000eab00000, 0x0000000100000000, 0x0000000100000000)
  eden space 262144K, 2% used [0x00000000eab00000,0x00000000eb01ecf8,0x00000000fab00000)
  from space 43520K, 0% used [0x00000000fab00000,0x00000000fab00000,0x00000000fd580000)
  to   space 43520K, 0% used [0x00000000fd580000,0x00000000fd580000,0x0000000100000000)
 ParOldGen       total 699392K, used 133835K [0x00000000c0000000, 0x00000000eab00000, 0x00000000eab00000)
  object space 699392K, 19% used [0x00000000c0000000,0x00000000c82b2c88,0x00000000eab00000)
 Metaspace       used 2546K, capacity 4486K, committed 4864K, reserved 1056768K
  class space    used 268K, capacity 386K, committed 512K, reserved 1048576K

Standard object protocols in Python

To give your objects a more baked-in feel, you can use python’s standard object protocol functions so that native operators start to operate on your object.

By implementing the following items on your custom objects, infix operators start to work executing your custom code as per defined.

General

By overriding __bool__ in your objects, you can define how your object will respond in conditional scenarios. __bool__ effectively allows you to use your object as a condition in an if or while statement.

class Boolable:
  def __init__(self, n):
    self.num = n

  def __bool__(self):
    return self.num % 2 == 0    

The method __call__ will allow your object to openly accept function calls:

class Callable:
  def __call__(self, *args, *kwargs):
    # implementation here

c = Callable()
c()

Array

The following overrides allow you to make your objects appear like containers (arrays, etc.):

Operation Method Description
Length __len__ Allows the len function to operate on the object
Iterator __iter__ Gets an object to start the iteration process
Next __next__ Gets an object to continue the iteration process
Reverse __reversed__ Reverses the internal sequence

Dictionary

The following overrides allow you to make your object respond like a dict:

Operation Method Description
Set item __setitem__ Sets an item in the dictionary
Get item __getitem__ Retrieves an item from the dictionary
Delete item __delitem__ Removes an item from the dictionary

Mathematic

The following table lists out all of the methods that you can override on a class that will give you access to arithmetic operators.

Operation Operator LHS RHS Inline
Addition + __add__ __radd__ __iadd__
Subtraction - __sub__ __rsub__ __isub__
Multiplication * __mul__ __rmul__ __imul__
True Division / __truediv__ __rtruediv__ __itruediv__
Floor Division // __floordiv__ __rfloordiv__ __ifloordiv__
Modulo % __mod__ __rmod__ __imod__
Division and Modulo divmod __divmod__ __rdivmod__  
Exponentiation ** __pow__ __rpow__ __ipow__
Shift left << __lshift__ __rlshift__ __ilshift__
Shift right >> __rshift__ __rrshift__ __irshift__
Bitwise AND & __and__ __rand__ __iand__
Bitwise OR | __or__ __ror__ __ior__
Bitwise XOR ^ __xor__ __rxor__ __ixor__
Bitwise NOT ~ __invert__    
Function Method
Floor __floor__
Ceil __ceil__
Round __round__
Negate __neg__
Positive __pos__
Absolute __abs__

Comparison

The following table lists all of the comparison operators

Operation Operator Method
Equals == __eq__
Not Equal != __ne__
Greater than, equal to >= __gte__
Greater than > __gt__
Lesser than, equal to <= __lte__
Lesser than < __lt__

Type conversions

Type Method Description
int __int__  
float __float__  
complex __complex__  
index __index__ Allows python to use your object as an array index

Context

The following override allow your objects to measure contexts:

Operation Method Description
Enter __enter__ Measures when a context enters
Exit __exit__ Measures when a context exits

These functions are useful when your object is supplied to a with statement.

class ContextMeasurement:
  def __enter__(self):
    print("Entering context")
      
  def __exit__(self, exc_class, exc_instance, traceback):
    print("Exiting context")
        
with ContextMeasurement():
  print("Inside the context right now")