October 24, 2016

Data

Exploring the use of Hash Trees for Data Synchronization - Part 1

In this post we'll explore a relational database replication strategy that you can use when standard database replication is not an option – so no replication feature, no log shipping, no mirroring, etc. The approaches outlined below will only work with a master-slave model where all writes go to the master. Conflict resolution is not addressed in this article.

We’ll cover phase one of a two-phase approach:
1. Generate and compare hash trees to identify blocks of rows that have discrepancies
2. For each block with a different hash value, identify and import the individual changes (insert, update, delete)
This post is really about exploring the approach rather than looking at the implementation details and detailed performance metrics. I may share some code and metrics in a later post if people are interested.
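To make phase one concrete, here is a minimal Python sketch, not the author's implementation. It assumes rows can be read as (primary key, payload) pairs and grouped into fixed-size key ranges; the names `block_hashes`, `root_hash` and `differing_blocks` and the block size are hypothetical. It builds only a two-level tree (one root over the block hashes), whereas a real hash tree could have more levels.

```python
import hashlib

def block_hashes(rows, block_size):
    """Hash rows grouped by primary-key range; returns {block index: hex digest}."""
    blocks = {}
    for key, payload in sorted(rows):
        # Assumption: block index is the integer key divided by the block size.
        h = blocks.setdefault(key // block_size, hashlib.sha256())
        h.update(f"{key}:{payload}\n".encode("utf-8"))
    return {index: h.hexdigest() for index, h in blocks.items()}

def root_hash(hashes):
    """Combine all block hashes into a single root digest."""
    h = hashlib.sha256()
    for index in sorted(hashes):
        h.update(f"{index}:{hashes[index]}\n".encode("utf-8"))
    return h.hexdigest()

def differing_blocks(master, replica):
    """Return the block indexes whose hashes differ (phase one only)."""
    if root_hash(master) == root_hash(replica):
        return []  # roots match, so no block needs a closer look
    return sorted(
        index
        for index in master.keys() | replica.keys()
        if master.get(index) != replica.get(index)
    )

# Hypothetical sample data: key 2 was updated and key 25 is missing on the replica.
master_rows = [(1, "a"), (2, "b"), (11, "c"), (25, "d")]
replica_rows = [(1, "a"), (2, "B"), (11, "c")]
print(differing_blocks(block_hashes(master_rows, 10), block_hashes(replica_rows, 10)))
# prints [0, 2]
```

Only the blocks reported here would need the row-by-row comparison of phase two.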

Jack Vanlightly

July 3, 2016

Non-Tech

5 Tricks For People Who Are Hyper Sensitive to Computer Screens

I have a rare neurological condition that means that when my eyes do work I feel pain, normally a mild pain but one that can grow out of control if I am not careful. Naturally, being a programmer who stares at a computer screen all day may not be the best job for someone who feels pain upon focusing their eyes. But seriously, what job doesn't involve eye focusing? And more importantly, I am a self-confessed programming obsessive; what would I do if I wasn't programming? I think my mind would shrivel up and die, so rather than accept defeat I have found ways around it.

This is my list of tips and tricks that allow me to be a programmer.

Jack Vanlightly

February 25, 2016

Programming

Optimizing Regex performance with RegexOptions.RightToLeft

Regex is fast when it is scanning text that doesn't match its pattern at all. However, when finding text that almost matches, things can start to slow down. You really want the Regex to either match a text or discard it as soon as possible. Building up large potential matches and finding that they don't fit towards the end can end up being very costly.

For patterns that are string literals, the Regex will run in linear time. So if one Regex produces lots of partial matches that get discarded after 20 characters and another discards potential matches after 2 characters, then the first will be on the order of ten times slower.

Jack Vanlightly

February 24, 2016

Programming

A More Efficient Regex Tokenizer

As part of a DSL parsing series, I wrote a post about a super simple yet memory-inefficient way of tokenizing some input text. The benefit was that the tokenizer was extremely simple, but the downside was that it wouldn't be suitable for large texts or if the tokenizer was called excessively.

In this post we'll look at a similar Regex-based tokenizer and trade off a little simplicity for a lot of performance gain.
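As a rough illustration of the general idea, and not the tokenizer from the post, the Python sketch below combines all token patterns into one compiled Regex and yields tokens lazily, so the input is scanned once and no intermediate copies of the text are built. The token names, patterns and sample input are all hypothetical.

```python
import re

# Hypothetical token definitions; order matters because the first alternative wins.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("WORD", r"[A-Za-z_]\w*"),
    ("EQUALS", r"="),
    ("STRING", r"'[^']*'"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]

# One combined pattern with a named group per token type, compiled once.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Yield (token type, token text) pairs in a single pass over the input."""
    for match in MASTER.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "MISMATCH":
            raise ValueError(f"Unexpected character {match.group()!r} at {match.start()}")
        yield kind, match.group()

print(list(tokenize("level = 'error' limit 10")))
# prints [('WORD', 'level'), ('EQUALS', '='), ('STRING', "'error'"), ('WORD', 'limit'), ('NUMBER', '10')]
```

Because `tokenize` is a generator, a caller that only needs the first few tokens does not pay for tokenizing the rest of the text.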

Jack Vanlightly

February 11, 2016

Programming

Implementing a DSL Parser

We previously created a tokenizer that breaks up a sequence of characters into a sequence of tokens (enum TokenType) and an LL(2) production-notation grammar that acts as a template for the code of the parser.

The input of this parser will be the sequence of tokens and the output will be an Intermediate Representation (IR) which is a data structure that represents the DSL text in a structured manner. The next step, after parsing, will be translating this IR into SQL.

Jack Vanlightly

February 4, 2016

Programming

Understanding Grammars

When you use a parser generator like ANTLR, the grammar is fed in and the code of a parser is the result. However, we are hand-crafting a parser, so for us the grammar will act as a reference and will ultimately map neatly onto our code.

There are different styles of grammars and we'll look at two different ways of expressing a grammar.

Jack Vanlightly

February 4, 2016

Programming

Creating a Simple Tokenizer (Lexer) in C#

What is a Tokenizer?

Before you read on, there is a more efficient version of this tokenizer, with significantly better performance, here: http://jack-vanlightly.com/blog/2016/2/24/a-more-efficient-regex-tokenizer

So a tokenizer or lexer takes a sequence of characters and outputs a sequence of tokens. Let's dive straight into an example to illustrate this.

Meet a simplified version of Logging Query Language (LQL)

Jack Vanlightly

February 4, 2016

Programming

How to Create a Query Language DSL with C#

Creating your own DSL is fun. It involves multiple complex steps that can be challenging and very rewarding to figure out. However, if you don't know what you're doing, your code can end up as one big hack which, while it works, is complicated, hard to change and hard to read. Although pleased that you have a working DSL, you know that underneath it's no looker. In this series I will go through, step by step, creating a simple query language that gets translated into SQL and executed against SQL Server.

Jack Vanlightly

January 26, 2016

People and Practice

Detective Optimizers

Having worked in a centralised application support team, I have had the opportunity to study which personality traits make a good fit for support and which don’t. When I say support, I am talking about supporting the operation of an application in production, whether it is in a centralised team or in a DevOps team that supports what it builds.

Jack Vanlightly

January 19, 2016

People and Practice

The Freedom Quadrant

What is it about programming that most programmers love? Creativity, problem-solving, analysis and critical thinking are just a few. However, when a senior developer explains to a junior, to the letter, how to implement something, we are taking away those great things about programming from the junior and we are taking away the opportunity for learning. But that doesn’t mean that we should give free rein to anyone in the team to implement solutions as they wish. It all depends on the level of the developer and the complexity of the problem/solution space.