Friday, August 17, 2012

pvmanager: a framework to deal with live data (part 2)

I'll continue the overview of the design, focusing on the fluent API.

We left off with a basic spec for the objects that do the live processing of the data. A user will need to create a whole set of them for each pipeline, plus the set of instructions to give to the data sources (what data to read, where to put it and whom to notify). We need to construct an expression language that:

  • defines each part of the pipeline
  • allows constructing an expression by reusing, mixing and matching different parts
  • allows different users to add their own pieces without requiring modification of the framework

Expression classes

At the end of the day, we'll want something like:

PVReader reader = PVManager.read(<read expression>) 

For example:

// Read the channel named "my channel" at maximum 10 Hz (i.e. will skip the extra)
PVReader<Object> reader = PVManager.read(channel("my channel")).maxRate(ofHertz(10));

// Queue and read all the new values for channel "my channel" at maximum 1 Hz
// (i.e. all notifications will arrive, for batch processing)
PVReader<List<Object>> reader = PVManager
        .read(newValuesOf(channel("my channel")))
        .maxRate(ofHertz(1));

// Queue and read all the new values for three channels, collected in a map, at maximum 1 Hz
PVReader<Map<String, List<Object>>> reader = PVManager.read(
        mapOf(newValuesOf(channels("channel1", "channel2", "channel3")))).maxRate(ofHertz(1));

In other words, we need to say "read this", and we need a way to say what "this" is. We may be tempted to have a single Expression class. But if you look closely, some expressions (like channel) are processed at the data source rate, others (like mapOf) are processed at the desired rate, and some (like newValuesOf) convert a source rate expression into a desired rate expression (i.e. they implement one of those Collectors: queues, caches and the like).

So we really need at least two kinds: a DesiredRateExpression and a SourceRateExpression. We can then define, for example:

public static SourceRateExpression<Object>
channel(String name) {...}

public static <T> DesiredRateExpression<List<T>>
newValuesOf(SourceRateExpression<T> exp) {...}
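
To make the rate conversion more concrete, here is a minimal sketch of the kind of queue a newValuesOf-style expression could sit on top of. The names CollectorSketch, QueueCollector, update and drain are all hypothetical and are not taken from pvmanager; the only point is that one side is written at the source rate and the other side is read at the desired rate.

```java
import java.util.ArrayList;
import java.util.List;

final class CollectorSketch {

    // Hypothetical queue collector (not the pvmanager class):
    // filled at the source rate, emptied at the desired rate.
    static final class QueueCollector<T> {
        private final List<T> buffer = new ArrayList<>();

        // Called for every new value coming from the data source.
        synchronized void update(T value) {
            buffer.add(value);
        }

        // Called by the reader: returns everything queued since the
        // previous call and empties the queue.
        synchronized List<T> drain() {
            List<T> values = new ArrayList<>(buffer);
            buffer.clear();
            return values;
        }
    }

    public static void main(String[] args) {
        QueueCollector<Integer> queue = new QueueCollector<>();
        queue.update(1);
        queue.update(2);
        queue.update(3);
        System.out.println(queue.drain()); // prints [1, 2, 3]
        System.out.println(queue.drain()); // prints []
    }
}
```

A cache would look the same except that it would keep only the latest value, which is roughly what "will skip the extra" means in the first example.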

But if you look closely again, some expressions (like channels, note the plural) actually give us a list of expressions. So we need two more: DesiredRateExpressionList and SourceRateExpressionList. We can then define:

public static SourceRateExpressionList<Object>
channels(String... names) {...}

public static <T> DesiredRateExpression<Map<String, T>>
mapOf(DesiredRateExpressionList<T> expressions) {...}

And so on. If we also want to support writable expressions, we will need a WriteExpression and a WriteExpressionList. If we want the same expression to be both readable and writable, we will need a DesiredRateReadWriteExpression and a SourceRateReadWriteExpression. So that makes 5 single expressions (source, desired, write, plus the 2 read/write combinations) and 5 lists (one for each).

This is where Java is a pain, though: since we don't have multiple inheritance, these all need to be interfaces. But since we still need to provide implementations, we need to provide an extra 10 classes. Moreover, interfaces cannot have package private members, which means we may have to expose more things than we strictly need. So the org.epics.pvmanager.expression package has 10 interfaces plus 10 implementation classes. Kind of a pain. Fortunately, this is only a burden when you are implementing a new expression, not when you are using the API.
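
As a rough illustration of why the combinations force interfaces, the sketch below has one class that is both a source rate read expression and a write expression. The interface bodies (a single getName method), the ChannelExpression class and the describe helpers are simplified, hypothetical stand-ins and not the actual contents of the package.

```java
final class ReadWriteSketch {

    // Hypothetical, stripped-down interfaces: here they only carry a name.
    interface SourceRateExpression<R> {
        String getName();
    }

    interface WriteExpression<W> {
        String getName();
    }

    // The combination has to be an interface, because a Java class
    // cannot extend two classes.
    interface SourceRateReadWriteExpression<R, W>
            extends SourceRateExpression<R>, WriteExpression<W> {
    }

    // One implementation class backs the combined interface.
    static final class ChannelExpression
            implements SourceRateReadWriteExpression<Object, Object> {
        private final String name;

        ChannelExpression(String name) {
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }
    }

    // Operators declare only the capability they need.
    static String describeRead(SourceRateExpression<?> exp) {
        return "read " + exp.getName();
    }

    static String describeWrite(WriteExpression<?> exp) {
        return "write " + exp.getName();
    }

    public static void main(String[] args) {
        ChannelExpression exp = new ChannelExpression("my channel");
        System.out.println(describeRead(exp));  // prints read my channel
        System.out.println(describeWrite(exp)); // prints write my channel
    }
}
```

Note that getName ends up public on the implementation class whether we want it or not, which is the exposure problem mentioned above.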

When you are using it, you get a fully type-checked expression: if an operator needs an expression read at the source rate, that's the only thing it can get; if it requires a list, it can get either a list or a single expression (which is a list of one); if it needs a number, it gets a number, and so on. Any user can create their own operators and mix them with the others; the power of the API comes from combining all the different operators.
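
One way the "list of one" rule could be obtained is by having the single expression interface extend the list interface. The sketch below assumes that design; the getExpressions and getName methods, the namesOf operator and the ListOfOneSketch wrapper are hypothetical and only serve to show the compiler accepting both forms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ListOfOneSketch {

    // Hypothetical list interface: exposes the single expressions it contains.
    interface SourceRateExpressionList<T> {
        List<SourceRateExpression<T>> getExpressions();
    }

    // A single expression is itself a list with exactly one element.
    interface SourceRateExpression<T> extends SourceRateExpressionList<T> {
        String getName();

        @Override
        default List<SourceRateExpression<T>> getExpressions() {
            return Collections.singletonList(this);
        }
    }

    static SourceRateExpression<Object> channel(String name) {
        return () -> name;
    }

    static SourceRateExpressionList<Object> channels(String... names) {
        List<SourceRateExpression<Object>> list = new ArrayList<>();
        for (String name : names) {
            list.add(channel(name));
        }
        return () -> list;
    }

    // An operator that asks for a list accepts a list or a single expression.
    static <T> List<String> namesOf(SourceRateExpressionList<T> expressions) {
        List<String> names = new ArrayList<>();
        for (SourceRateExpression<T> exp : expressions.getExpressions()) {
            names.add(exp.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesOf(channels("channel1", "channel2"))); // prints [channel1, channel2]
        System.out.println(namesOf(channel("my channel")));            // prints [my channel]
    }
}
```

The reverse does not compile: a method declared to take a SourceRateExpression rejects the result of channels(...), so the mistake is caught before anything runs.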
