Cogs and Levers A blog full of technical stuff

MongoDB console cheatsheet

Basic retrieves

Find all items in a collection

	db.collection.find();

Count all items in a collection (newer shells deprecate count() in favour of countDocuments({}))

	db.collection.count();

Find all items in a collection with criteria (Mongo operators reference)

	db.collection.find({"field": value});

Reduce the field-set returned by a query

	db.collection.find({}, {"field": 1});

Sorting by a field

	db.collection.find().sort({"field": 1});

Limit the number of records returned by a query

	db.collection.find().limit(3);

Skip over a set of documents

	db.collection.find().skip(3);

List processing

Map over a cursor of results

	db.collection.find().map(function(item) {
	    return item;
	});

Iterate over a cursor of results

	db.collection.find().forEach(function(item) {
	    printjson(item);
	});

Updates

Update over an enumerated list

	db.collection.find().forEach(function(item) {
	    // make changes to item
	    db.collection.save(item);
	});

Update from a query (Mongo update reference)

	db.collection.update({"_id": ObjectId("xxxx")}, 
	                     { $set: { "field": value },
	                       $push: { "subdocument": subdocument } 
	                     });

Deleting

Destroy an entire collection

	db.collection.drop();

Delete a document selectively

	db.collection.remove({"_id": ObjectId("xxxx")});

Utilities

Set the record limit returned to the console

	DBQuery.shellBatchSize = 100

Shebang, Ruby and RVM

I needed to distribute one of my Ruby programs to be set up as a job, so I needed to add a shebang to the top of the file. I run my development environment using RVM, so I had no idea how to address ruby in the shebang. Then it hit me:

#!/usr/bin/env ruby
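
This works because env looks ruby up on the PATH rather than at a fixed location. The following is a minimal sketch of that lookup, not anything RVM itself provides: the stand-in ruby script and the demo_dir variable are hypothetical, made up purely for illustration, and it assumes env lives at /usr/bin/env.

```shell
#!/bin/sh
# Build a throwaway directory holding a stand-in "ruby" executable.
# (Hypothetical: it is a two-line shell script, not a real Ruby.)
demo_dir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo "stand-in ruby"' > "$demo_dir/ruby"
chmod +x "$demo_dir/ruby"

# env searches PATH from left to right, so putting demo_dir first
# makes "env ruby" run the stand-in instead of any real ruby.
found=$(PATH="$demo_dir:$PATH" /usr/bin/env ruby)
echo "env ran: $found"

# Clean up the throwaway directory.
rm -rf "$demo_dir"
```

Whichever directory appears first on the PATH wins, which is why the environment a script is launched from matters.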

Interestingly, things are a little different when trying to run a script with one of your RVM-installed rubies from a cron job. The RVM website has a whole section on the topic of cron integration. The command that I used to successfully execute these jobs needed to address ruby by its absolute path:

bash -c "/home/user/.rvm/bin/ruby /path/to/script.rb"

Easy.

Making Cleaner NASM Code with Macros

Introduction

Cleaner, clearer code is better. It’s easier to debug, it’s easier to read; it’s just plain better. Assembly code isn’t known for letting developers make their intentions clear in the source, but we can get closer with some carefully crafted macros. Macros are symbols that you use in your code to represent a block of code. They can take parameters, which makes them an extremely flexible and valuable tool in your arsenal. At assembly time, nasm swaps each of these symbols out for the block of code that it represents. If you want to go further in depth into the nasm pre-processor and macros, check it out in the manual here.

Today’s post will be focused on cleaning up the code that we’d written in this previous article to look a little more human.

Revisiting write and strlen

In the previous article “strlen() implementation in NASM”, we’d put together a couple of ways to take the length of a string. This article will assume that we’re already using this code. With this in mind, we can put together a general purpose print function that will display a zero terminated string with the following.

; _print
;
; input
; rdi points to the zero terminated string that 
;     we're printing
;
; output
; none

_print:

  push  rcx       ; start off by preserving the registers
  push  rdx       ; that we know that we'll trash in this
  push  rax       ; proc
  push  rbx

  mov   rcx, rdi  ; rcx = string memory location
  call  _strlen   ; calculate the string's length (moves rdi)
  mov   rdx, rax  ; rdx = string length
  mov   rax, 4    ; write() is syscall 4
  mov   rbx, 1    ; we're writing to stdout
  int   0x80      ; execute the call

  pop   rbx       ; restore all of the registers
  pop   rax
  pop   rdx
  pop   rcx

  ret             ; get out

OK, that’s a nice and neat little bundle. Now, every time that we want to call this function, we need to write code that looks like the following.

mov  rdi, message    ; load our string into rdi
call _print          ; write the message

Which isn’t too bad, I guess. We can make it look better though. Consider the following code that wraps this call sequence into a macro.

; print - prints a null terminated string
%macro print 1
  push  rdi         ; save off rdi

  mov   rdi, %1     ; load the address of the string
  call  _print      ; print the string

  pop   rdi         ; restore rdi
%endmacro

The syntax here may look a little alien to begin with, but it’ll all make sense in a minute. So, we start the macro block off with a %macro directive. What follows is the name of the macro, in this case print and after that is the number of parameters that this macro will expect (we want one parameter, being the string to print). We have the print code between the directives. You’ll see that rdi gets loaded with %1 which just means “replace %1 with the first thing passed to this macro”. To finish up your macro, you have %endmacro. With that macro defined, you can now print a message to screen by doing this.

print message

This is starting to look a little higher-level now. A bit more “human” on the eyes. Another nifty trick that I’d picked up a while ago was a macro for defining strings. In all of the examples we’ve seen so far, you’d declare strings in the data segment with the following syntax.

section .data

  message db "Hello, world", 0

This is perfectly fine; however, we can wrap this string declaration up into a macro of its own as well, allowing us to define strings wherever we are. We need to be careful though. Defining a string in the code segment without the appropriate jumps is dangerous, as we run the risk of executing the string data. The following macro does this safely.

; sz - defines a zero terminated string
%macro sz 2
  jmp %1_after_def    ; jump over the string that we define
  %1 db %2, 0         ; declare the string
  %1_after_def:       ; continue on
%endmacro

You can see that we’ve declared a macro that expects two parameters. The first parameter is the name of the variable that we declare. This name is also used to formulate the labels that we jump to so that they are unique between string definitions. The second parameter is the actual string data itself. Now that we have both of these macros defined, the following code is perfectly legal and works a treat.

sz message, "This is much more interesting than Hello, World!"
print message

Well, this is only the start of what you can accomplish with macros. An exercise for the reader would be to implement your own version of print that prints a new line after it prints the string. You never know, you might even want to call it “println”!

Enjoy.

A Light cron Tutorial

Introduction

cron is the time-based task scheduler for Unix. I think the Wikipedia article sums up its description best, so I won’t try to reproduce it:

Cron is the time-based job scheduler in Unix-like computer operating systems. Cron enables users to schedule jobs (commands or shell scripts) to run periodically at certain times or dates.

Today’s post will be a light tutorial in setting up jobs using cron.

Job types

Within the cron system there are two flavors of tasks. The first is at the system level; the other is at the user level. The main difference is that system level tasks (controlled by administrators) are able to run as any particular user. User jobs are set up by the user and installed for the user.

Job installation and modification

Start an editing session of the cron table (crontab) by issuing the following command.

$ crontab -e

You’ll now be looking at the job definitions that are set up. To add a job to the list, you need to add it in the following format.

Minute Hours Day Month DayOfWeek Command [args]
  • Minute is specified as (0 - 59)
  • Hours is specified as (0 - 23)
  • Day is specified as (1 - 31)
  • Month is specified as (1 - 12 where 12 is December)
  • DayOfWeek is specified as (0 - 7 where 7 or 0 are Sunday)
  • Command is the shell command you want to execute

For a system level task, a new field to specify the username is added into this format.

Minute Hours Day Month DayOfWeek Username Command [args]

This will only apply to system level tasks that are added. Operators can be used in conjunction with literal values to short-cut some of the more common tasks.

  • Use an asterisk * to define all values for a field
  • Use a comma , to separate multiple values for a field
  • Use a dash - to define a range

To make some more sense out of the time fields, here are a few examples and when they’d execute.

Crontab entry               Interval
0 1 * * * script.sh         Run at 1 in the morning every day
0 6 1 * * script.sh         Run at 6am on the first of every month
0 11 * * 1-5 script.sh      Run at 11am every weekday
0 17 5 5 * script.sh        Run at 5 in the afternoon on the 5th of May
0 7-19/2 * * * script.sh    Run every 2 hours from 7 in the morning until 7 at night
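
To see how the operators combine, here is a rough sketch that expands a single time field into the values it matches. It is not how cron itself is implemented. The expand_field function and its min/max arguments are hypothetical names chosen for this example, and it only understands numbers, so names like "mon" or "jan" are out of scope.

```shell
#!/bin/sh
# expand_field FIELD MIN MAX
# Prints the values matched by one cron time field, space separated.
# Understands *, lists (a,b), ranges (a-b) and steps (/n).
# Plain sh has no local variables, so these names are global.
expand_field() {
  rest=$1 min=$2 max=$3
  out=""
  while [ -n "$rest" ]; do
    # Peel the next comma-separated part off the front of the field.
    part=${rest%%,*}
    case $rest in
      *,*) rest=${rest#*,} ;;
      *)   rest="" ;;
    esac

    # Split an optional /step suffix from the range.
    range=${part%%/*}
    step=1
    case $part in
      */*) step=${part##*/} ;;
    esac

    # Work out the low and high ends of the range.
    case $range in
      '*') lo=$min; hi=$max ;;
      *-*) lo=${range%-*}; hi=${range#*-} ;;
      *)   lo=$range; hi=$range ;;
    esac

    # Walk the range, collecting every step-th value.
    v=$lo
    while [ "$v" -le "$hi" ]; do
      out="$out${out:+ }$v"
      v=$((v + step))
    done
  done
  printf '%s\n' "$out"
}

expand_field '7-19/2' 0 23   # hours field from the last table row
expand_field '1-5' 0 7       # day-of-week field for weekdays
```

The first call prints the odd hours from 7 through 19, which matches the "every 2 hours from 7 in the morning until 7 at night" description.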

Out of the box, cron will email the output of a job run to your local Unix account. If you don’t want to receive this email, just redirect the output of your cron command to /dev/null, like so.

0 7 * * * test.sh >/dev/null 2>&1
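
The order of the two redirections matters. As a small sketch (the noisy function is a hypothetical stand-in for a cron command, and the quiet and leaky variables exist only for the demonstration), compare what escapes in each ordering:

```shell
#!/bin/sh
# noisy stands in for a cron command: one line on stdout, one on stderr.
noisy() {
  echo "normal output"
  echo "error output" >&2
}

# stdout goes to /dev/null first, then stderr is pointed at wherever
# stdout now goes, so both streams are discarded.
quiet=$( { noisy >/dev/null 2>&1; } 2>&1 )

# Swapped order: stderr is pointed at the original stdout before stdout
# is redirected, so the error line still escapes.
leaky=$( { noisy 2>&1 >/dev/null; } 2>&1 )

printf 'quiet=[%s]\n' "$quiet"
printf 'leaky=[%s]\n' "$leaky"
```

With the swapped order, anything written to stderr would still reach cron and end up in your mailbox.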

Some short-cut “special variables” that you can use in conjunction with the times that these jobs run look like this (these really clean up the way a crontab reads).

Variable               Meaning                 Cron equiv.
@reboot                Run once, at startup    (none)
@yearly / @annually    Run once per year       0 0 1 1 *
@monthly               Run once per month      0 0 1 * *
@weekly                Run once per week       0 0 * * 0
@daily / @midnight     Run once per day        0 0 * * *
@hourly                Run once per hour       0 * * * *

Other maintenance

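You can list the cron table with the following commands. The first lists your own table; the second lists another user’s table, which requires the appropriate privileges.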

$ crontab -l
$ crontab -u user -l

You can remove all entries from the cron table with the following commands. Again, the second form acts on another user’s table.

$ crontab -r
$ crontab -u user -r

That’s it! A nice light tutorial.

strlen() implementation in NASM

Introduction

Seeing so many “Hello, world” examples for getting up and running in Assembly has annoyed me a little bit. I see people using the $ - msg expression to calculate the length of their string at assemble time. In today’s post, I’ll show you how to measure the length of your string at runtime so you’ll be able to provide the write syscall’s third parameter a little more flexibly.

The logic

The logic behind this procedure is dead simple. Test the current byte for being null. If it is, get out now; if it’s not, keep counting! Here’s how the code looks.

_strlen:

  push  rcx            ; save and clear out counter
  xor   rcx, rcx

_strlen_next:

  cmp   [rdi], byte 0  ; null byte yet?
  jz    _strlen_null   ; yes, get out

  inc   rcx            ; char is ok, count it
  inc   rdi            ; move to next char
  jmp   _strlen_next   ; process again

_strlen_null:

  mov   rax, rcx       ; rcx = the length (put in rax)

  pop   rcx            ; restore rcx
  ret                  ; get out

This is just straightforward memory testing, no great advancements in computer science here! The function expects the address of the string to be in the rdi register. To actually use this function in your application though, you’ll need to move the result (which sits in rax by the time the function has completed execution) into the register in which write expects its length parameter. Here’s how you use your new strlen function (in the Debian scenario).

; strlen(hello)
mov   rdi, hello    ; rdi is the string we want to 
                    ; get the length of

call  _strlen       ; get the length!

mov   rdx, rax      ; rdx now holds the string length
                    ; ready for our write syscall

; write(fd, buf, len)
mov   rax, 4        ; syscall 4 == write
mov   rbx, 1        ; fd = 1 == stdout
mov   rcx, hello    ; the string to write
int   0x80          ; print the string

So you can see that this is quite straightforward. We set up rdx before we set up the rest of the registers. We could have done this the other way around, but to be on the safe side I’ve done it this way, as you never know what registers get mowed over in people’s functions. I tried to help with this in the _strlen implementation by saving the only work register that I use, rcx. Be aware that rdi itself is advanced to the terminator and not restored, which is why hello is loaded into rcx afresh. Anyway, that’s how you measure your string.

A more optimal way?

After completing this article, I’d thought about the “brute-forcish” way that I’d crunched out the numbers to derive a string’s length and thought to myself: what if I could just scan the string of bytes, find the null character, and subtract the original starting point from the address where it was found? That gives the distance in bytes between the start of the string and the null character, ergo the string length. So, I’ve written a new string length implementation that does just this, and here it is.

_strlen2:

  push  rbx                 ; save any registers that 
  push  rcx                 ; we will trash in here

  mov   rbx, rdi            ; rbx = rdi

  xor   al, al              ; the byte that the scan will
                            ; compare to is zero

  mov   rcx, 0xffffffff     ; the maximum number of bytes
                            ; i'm assuming any string will
                            ; have is 4gb

  repne scasb               ; while [rdi] != al, keep scanning

  sub   rdi, rbx            ; rdi stopped one past the null, so
  dec   rdi                 ; length = (rdi - start) - 1
  mov   rax, rdi            ; rax now holds our length

  pop   rcx                 ; restore the saved registers
  pop   rbx

  ret                       ; all done!

It may look longer than the first implementation; however, this second implementation uses SCASB, which I’d expect to be faster than my hand-rolled loop, although that’s worth measuring on your own hardware.

Enjoy.