November 9, 2016

Programming

Generating SQL from a Data Structure

We will continue with the same intermediate representation of our DSL and generate the artefacts required to perform a query: the query text and a collection of parameters. The generated query is a ranking query that lists the top X errors between two dates.
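
As a rough illustration of the idea, here is a minimal Python sketch that turns a small data structure into query text plus a collection of parameters. The `RankingQuery` class, the `errors` table, and its `message` and `logged_at` columns are all hypothetical names chosen for this sketch; the post's own intermediate representation and schema are not shown in this excerpt.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RankingQuery:
    """Hypothetical intermediate representation of a 'top X errors' query."""
    limit: int
    date_from: date
    date_to: date


def generate_sql(model):
    """Return (query_text, parameters) for the ranking query.

    Table and column names (errors, message, logged_at) are assumptions.
    Values travel as named parameters and are never concatenated into the text.
    """
    sql = (
        "SELECT message, COUNT(*) AS occurrences\n"
        "FROM errors\n"
        "WHERE logged_at >= :date_from AND logged_at < :date_to\n"
        "GROUP BY message\n"
        "ORDER BY occurrences DESC\n"
        "LIMIT :limit"
    )
    params = {
        "date_from": model.date_from.isoformat(),
        "date_to": model.date_to.isoformat(),
        "limit": model.limit,
    }
    return sql, params
```

Keeping the values in a separate parameter collection means the generated text stays the same for every date range and limit, and the database driver handles the quoting.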

Jack Vanlightly

November 8, 2016

Programming

How to Kill a Keep Alive with a Weak Reference (C#)

Taskling.NET uses a keep alive, or heartbeat, to signal that it is still running. This is useful because when running batch jobs in unstable hosts like IIS, the process can be killed off with a ThreadAbortException and the job isn't always able to log its demise. With a keep alive we know that the job really died if a few minutes pass without a keep alive while the status of the job is still "In Progress".

But one problem remains: how do you reliably kill a keep alive?

Jack Vanlightly

November 7, 2016

Programming

How Row Locking Makes Taskling Concurrency Controls Possible

Your Taskling jobs can be configured with concurrency limits and those jobs will never have more than the configured number of executions of that job running at any time.

Some batch and micro-batch jobs need to be singletons, with only one execution running at any point in time. This may be to avoid data consistency issues when persisting results, or because only a single session can be opened to a third-party service. Other batch processes need more than one execution running at the same time to cope with the data volume, but have a concurrency limit so as not to overwhelm downstream systems or third-party services.

Jack Vanlightly

November 6, 2016

Programming

Announcing Taskling.NET, a C# Batch Job API

Taskling.NET is a C# batch processing library that enables you to avoid rewriting the same code over and over again for batch and micro-batch jobs.

Overview

Partitioning of batches into blocks of work with guaranteed isolation between blocks across batches
Recover from failures with automatic reprocessing/retries of blocks
Limiting the number of concurrent task executions (across servers)
Critical sections across servers
Standardised activity logging and alerting
Thread-safe, enabling parallel processing of blocks and list block items

Jack Vanlightly

October 30, 2016

Data

Too Busy to Create Your Own Visualizations? Just Leverage the Neo4J Console

At Vueling we have almost 700 applications using hundreds of databases, queues, FTP sites, web services, remote file shares etc. Understanding how everything fits together is a losing battle without visualizations, so we model our entire infrastructure in Neo4J.

Jack Vanlightly

October 27, 2016

Programming

Why You Should Understand Databases

In recent years I have seen developers distance themselves from databases more and more, for various reasons. The two most common reasons seem to be:

Business logic should not be split between the database and the application; it should all be stored in the application code. So stored procedures and functions are now an anti-pattern.
The ORM (like Entity Framework, Hibernate, ActiveRecord etc) handles the details of SQL and also data migrations. ORMs provide better developer productivity, so writing SQL by hand is an anti-pattern.

Jack Vanlightly

October 24, 2016

Data

Exploring the use of Hash Trees for Data Synchronization - Part 1

In this post we'll explore a relational database replication strategy that you can use when standard database replication is not an option – so no replication feature, no log shipping, no mirroring etc. The approaches outlined below will only work with a master-slave model where all writes go to the master. Conflict resolution is not addressed in this article.

We’ll cover phase one of a two-phase approach:
1. Generate and compare hash trees to identify blocks of rows that have discrepancies
2. For each block with a different hash value, identify and import the individual changes (insert, update, delete)

This post is about exploring the approach rather than looking at implementation details and detailed performance metrics. I might share some code and metrics in a later post if people are interested.
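
To make phase one concrete, here is a minimal Python sketch of building and comparing hash trees over blocks of rows. It is an illustration under stated assumptions, not the implementation discussed in the post: rows are assumed to be `(key, value)` tuples with non-negative integer keys, blocks are fixed key ranges of `block_size` keys, both sides agree on `max_key`, and the function names (`leaf_hashes`, `build_tree`, `differing_blocks`) are hypothetical.

```python
import hashlib


def leaf_hashes(rows, block_size, max_key):
    """Hash each key-range block of rows; block i covers keys [i*block_size, (i+1)*block_size)."""
    num_blocks = max_key // block_size + 1
    hashers = [hashlib.sha256() for _ in range(num_blocks)]
    for key, value in sorted(rows):
        hashers[key // block_size].update(repr((key, value)).encode("utf-8"))
    return [h.hexdigest() for h in hashers]


def build_tree(leaves):
    """Return the tree as a list of levels: leaves first, single root hash last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        parents = [
            hashlib.sha256("".join(prev[i:i + 2]).encode("utf-8")).hexdigest()
            for i in range(0, len(prev), 2)
        ]
        levels.append(parents)
    return levels


def differing_blocks(tree_a, tree_b):
    """Walk both trees from the root down and return indexes of leaf blocks that differ."""
    top = len(tree_a) - 1
    candidates = [0] if tree_a[top][0] != tree_b[top][0] else []
    for level in range(top - 1, -1, -1):
        next_candidates = []
        for parent in candidates:
            # Only descend into children whose hashes disagree.
            for child in (2 * parent, 2 * parent + 1):
                if child < len(tree_a[level]) and tree_a[level][child] != tree_b[level][child]:
                    next_candidates.append(child)
        candidates = next_candidates
    return candidates
```

With 16 rows split into blocks of 4, changing the row with key 5 and deleting the row with key 13 on one side makes `differing_blocks` return `[1, 3]`, so only those two blocks need a row-by-row comparison in phase two. Blocking by key range rather than by row position keeps an insert or delete from shifting every later block.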

Jack Vanlightly

July 3, 2016

Non-Tech

5 Tricks For People Who Are Hyper Sensitive to Computer Screens

I have a rare neurological condition that means that when my eyes do work I feel pain: normally a mild pain, but one that can grow out of control if I am not careful. Naturally, being a programmer who stares at a computer screen all day may not be the best job for someone who feels pain upon focusing their eyes. But seriously, what job doesn't involve eye focusing? And more importantly, I am a self-confessed programming obsessive. What would I do if I wasn't programming? I think my mind would shrivel up and die, so rather than accept defeat I have found ways around it.

This is my list of tips and tricks that allow me to be a programmer.

Jack Vanlightly

February 25, 2016

Programming

Optimizing Regex performance with RegexOptions.RightToLeft

Regex is fast when it is scanning text that doesn't match its pattern at all. However, when finding text that almost matches, things can start to slow down. You really want the Regex to either match a text or discard it as soon as possible. Building up large potential matches and finding that they don't fit towards the end can end up being very costly.

For patterns that are string literals, the Regex will run in linear time. So if one Regex produces lots of partial matches that get discarded after 20 characters, and another Regex discards potential matches after 2 characters, then the first will be on the order of ten times slower.

Jack Vanlightly

February 24, 2016

Programming

A More Efficient Regex Tokenizer

As part of a DSL parsing series, I wrote a post about a super simple yet memory-inefficient way of tokenizing some input text. The benefit was that the tokenizer was extremely simple, but the downside was that it wouldn't be suitable for large texts or for cases where the tokenizer is called very frequently.

In this post we'll look at a similar Regex-based tokenizer and trade off a little simplicity for a lot of performance gain.
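
For readers unfamiliar with the general technique, here is a minimal Python sketch of one common way to build a regex tokenizer that scans the input once and yields tokens lazily. It is not necessarily the approach taken in the post, which targets .NET; the token names and patterns in `TOKEN_SPEC` are assumptions made up for this example.

```python
import re

# Hypothetical token definitions; order matters when patterns overlap.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("WORD", r"[A-Za-z_]\w*"),
    ("STRING", r"'[^']*'"),
    ("SYMBOL", r"[=<>(),]"),
    ("SKIP", r"\s+"),
]

# One combined pattern with a named group per token type.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Yield (token_type, token_text) pairs, scanning the text once from left to right."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"Unexpected character {text[pos]!r} at position {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":  # drop whitespace
            yield match.lastgroup, match.group()
```

Because `tokenize` is a generator and matches in place at an offset, it never copies the remaining input and only holds one token at a time, which is where most of the memory saving over a naive substring-based tokenizer comes from.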