May 26, 2018

Programming

Docker, .NET Core and Redshift Drivers

Just a quick post to share how to include the Redshift ODBC driver in your .NET Core Docker container. This is from the CdcTools.CdcToRedshift application in my CDC Tools repo.

Jack Vanlightly

April 21, 2018

Programming

Processing Pipelines Series - Reactive Extensions (Rx.NET)

Whereas TPL Dataflow is all about passing messages between blocks, Reactive Extensions is about sequences. With sequences we can create projections, transformations and filters. We can combine multiple sequences into a single one. It is a very flexible and powerful paradigm, but with such power comes extra complexity. I find TPL Dataflow easier to reason about due to its simple model. Reactive Extensions can get pretty complex and is not always intuitive, but you can create some elegant solutions with it. It will require some investment in time and tinkering to get a reasonable understanding of it.
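
To make the idea of sequences concrete, here is a minimal sketch of a filter, a projection and a combination of two sequences. It assumes the System.Reactive NuGet package is referenced; the class name RxSketch and the sample values are purely illustrative and are not taken from the post.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

public static class RxSketch
{
    // Filters, projects and combines two sequences, then gathers the result.
    public static IList<string> Run()
    {
        IObservable<int> first = Observable.Range(1, 5);    // 1, 2, 3, 4, 5
        IObservable<int> second = Observable.Range(100, 2); // 100, 101

        return first
            .Where(n => n % 2 == 1)          // filter: keep the odd values
            .Select(n => n * 10)             // projection: scale each value
            .Concat(second)                  // combine: append the second sequence
            .Select(n => $"reading:{n}")     // transformation: int to string
            .ToList()                        // collect everything into one list
            .Wait();                         // block until the sequence completes
    }
}
```

Calling RxSketch.Run() returns five strings: reading:10, reading:30, reading:50, reading:100 and reading:101. Concat is used here rather than Merge so that the ordering is easy to follow.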

Jack Vanlightly

April 19, 2018

Programming

Processing Pipelines Series - TPL Dataflow - Alternate Scenario

In the last post we built a TPL Dataflow pipeline based on the scenario from our first post in the series. Today we'll build another pipeline very similar to the first but with different requirements around latency and data loss.

In the first scenario we could not slow down the producer as slowing it down would cause data loss (it read from a bus that would not wait if you weren't there to consume the data). We also cared a lot about ensuring the first stage kept up with the producer and successfully wrote every message to disk. The rest was best effort, and we performed load-shedding so as not to slow down the producer.
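
As a rough illustration of load-shedding, one option in TPL Dataflow is to give a slow stage a bounded capacity and drop any message it declines. This is a sketch of that one technique, not necessarily the approach taken in the post; LoadSheddingSketch, the capacity of 2 and the 50 ms delay are assumptions chosen for the example.

```csharp
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class LoadSheddingSketch
{
    // Offers messages to a slow, bounded stage without ever waiting for it.
    public static async Task<(int Accepted, int Dropped)> RunAsync(int messageCount)
    {
        int accepted = 0, dropped = 0;

        // A deliberately slow stage that buffers at most two messages.
        var slowStage = new ActionBlock<int>(
            async _ => await Task.Delay(50),
            new ExecutionDataflowBlockOptions { BoundedCapacity = 2 });

        for (int i = 0; i < messageCount; i++)
        {
            // Post never blocks: it returns false when the block is full,
            // so the producer carries on and the message is shed.
            if (slowStage.Post(i)) accepted++;
            else dropped++;
        }

        slowStage.Complete();
        await slowStage.Completion;
        return (accepted, dropped);
    }
}
```

The exact split between accepted and dropped messages depends on timing, but the producer loop is never slowed down by the stage.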

Jack Vanlightly

April 18, 2018

Programming

Processing Pipelines Series - TPL Dataflow

TPL Dataflow is a data processing library from Microsoft that came out years ago. It consists of different "blocks" that you compose together to make a pipeline. Blocks correspond to stages in your pipeline. If you haven't read the first post in the series, it is worth doing so before you read on.
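
A minimal sketch of composing blocks into a two-stage pipeline might look like the following. It assumes the System.Threading.Tasks.Dataflow library is available; DataflowSketch and the parse-then-double stages are illustrative and are not the scenario built in the series.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowSketch
{
    // Builds a two-stage pipeline: parse text to int, then collect the doubled value.
    public static async Task<List<int>> RunAsync(IEnumerable<string> inputs)
    {
        var results = new List<int>();

        // Stage 1: a TransformBlock maps each input message to an output message.
        var parse = new TransformBlock<string, int>(text => int.Parse(text));

        // Stage 2: an ActionBlock is a terminal stage that consumes messages.
        // By default it handles one message at a time, so the list needs no lock.
        var collect = new ActionBlock<int>(value => results.Add(value * 2));

        // Link the stages and let completion flow downstream.
        parse.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var input in inputs)
        {
            await parse.SendAsync(input);
        }

        parse.Complete();          // signal that there is no more input
        await collect.Completion;  // wait for the last stage to drain
        return results;
    }
}
```

Passing the inputs "1", "2" and "3" to RunAsync produces the list 2, 4, 6.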

Jack Vanlightly

April 17, 2018

Programming

Processing Pipelines Series - Concepts

In this series we'll look at a few different .NET technologies we can use to process streams of data in processing pipelines and directed acyclic graphs (DAGs). This is not about distributed data platforms for big data but real-time processing and computation running on a single machine. We'll take a single scenario and build it out multiple times, each with a different technology. Each application will be built as a console application with .NET Core.

Jack Vanlightly

June 10, 2017

Programming

DSL Parser - Sample Code

A while back I wrote a blog series about DSLs, grammars, tokenizers, parsers and a SQL generator. The idea was that you could write a DSL query to mine your error log data and the code would generate SQL. The series can be found here: http://jack-vanlightly.com/blog/2016/2/3/how-to-create-a-query-language-dsl.

The tokenizer, while simple, was very inefficient, so I wrote a better one, which you can find here: http://jack-vanlightly.com/blog/2016/2/24/a-more-efficient-regex-tokenizer

I have just published working code based on this series and the better Regex tokenizer on GitHub here: https://github.com/Vanlightly/DslParser.

Jack Vanlightly

November 15, 2016

Programming

new SqlConnection - The requested Performance Counter is not a custom counter

"The requested Performance Counter is not a custom counter, it has to be initialized as ReadOnly."

This error suddenly started happening today when debugging in Visual Studio 2015. I don't know why, but it seems the counter related to the connection pool disappeared from my Windows machine.

Jack Vanlightly

November 9, 2016

Programming

Generating SQL from a Data Structure

We will continue with the same intermediate representation of our DSL and generate the artefacts required to perform a query: the query text and a collection of parameters. The query that will be generated is for a ranking query. It will list the top X errors between two dates.
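
To give a feel for the two artefacts, here is a small sketch that turns a data structure into query text plus a parameter collection for a "top X errors between two dates" query. The RankingQuery type and the table and column names (ErrorLog, Fingerprint, OccurredAt) are hypothetical stand-ins; the intermediate representation used in the series is richer than this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified intermediate representation of a ranking query.
public sealed class RankingQuery
{
    public int Limit { get; set; }
    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

public static class SqlGeneratorSketch
{
    // Returns the query text and its parameters; values are never
    // concatenated into the SQL string.
    public static (string Sql, Dictionary<string, object> Parameters) Generate(RankingQuery query)
    {
        var parameters = new Dictionary<string, object>
        {
            ["@limit"] = query.Limit,
            ["@from"] = query.From,
            ["@to"] = query.To
        };

        // T-SQL flavoured text; table and column names are assumptions.
        const string sql =
            "SELECT TOP (@limit) Fingerprint, COUNT(*) AS Occurrences " +
            "FROM ErrorLog " +
            "WHERE OccurredAt >= @from AND OccurredAt < @to " +
            "GROUP BY Fingerprint " +
            "ORDER BY Occurrences DESC";

        return (sql, parameters);
    }
}
```

Keeping the values in a separate parameter collection means the generated text can be passed to a parameterised command without any string escaping.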

Jack Vanlightly

November 8, 2016

Programming

How to Kill a Keep Alive with a Weak Reference (C#)

Taskling.NET uses a keep alive or heartbeat to signal that it is still running. This is useful because when running batch jobs in unstable hosts like IIS, the process can be killed off with a ThreadAbortException and the job isn't always able to log its demise. With a keep alive we know that the job really died if a few minutes pass without a keep alive and the status of the job is "In Progress".

But one problem is how do you reliably kill a keep alive?
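
One way to picture the weak reference idea from the title is a heartbeat loop that only holds its owner weakly, so the loop ends once the owner has been garbage collected. This is a sketch of the general technique and not Taskling.NET's actual implementation; KeepAliveSketch and its members are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public sealed class KeepAliveSketch
{
    // Holding the owner weakly means this object does not keep it alive.
    private readonly WeakReference<object> _owner;
    private readonly Action _sendHeartbeat;

    // Note: sendHeartbeat must not capture the owner, otherwise the
    // delegate itself would hold a strong reference to it.
    public KeepAliveSketch(object owner, Action sendHeartbeat)
    {
        _owner = new WeakReference<object>(owner);
        _sendHeartbeat = sendHeartbeat;
    }

    // Sends one heartbeat if the owner still exists.
    // Returns false once the owner has been collected.
    public bool TryBeat()
    {
        if (_owner.TryGetTarget(out _))
        {
            _sendHeartbeat();
            return true;
        }
        return false;
    }

    // Keeps beating at the given interval until the owner is gone.
    public async Task RunAsync(TimeSpan interval)
    {
        while (TryBeat())
        {
            await Task.Delay(interval);
        }
    }
}
```

When the owner is collected depends on the garbage collector, so the loop stops eventually rather than immediately after the last strong reference is dropped.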

Jack Vanlightly

November 7, 2016

Programming

How Row Locking Makes Taskling Concurrency Controls Possible

Jack Vanlightly

Your Taskling jobs can be configured with concurrency limits, so that a job never has more than the configured number of executions running at any time.

Some batch and micro-batch jobs need to be singletons, with only one execution running at any point in time. This may be to avoid data consistency issues when persisting results, or because only a single session can be opened to a third-party service. Other batch processes need more than one execution running at the same time in order to cope with the data volume, but have a concurrency limit so as not to overwhelm downstream systems or third-party services.