LynxKite Architecture 101

09 Jun 2022 — Daniel Darabos, Chief Software Architect

I’m writing this blog post to help our new engineers starting in the summer. It can also be helpful for anyone interested in learning how LynxKite works. Every large application is like a thousand-year-old city: full of clever ideas, some of which became solid pillars while others got covered up with new clever ideas. Put on your archeologist’s hat and let’s dive in!

Arriving at the dig site

Let’s get, build, and run LynxKite:

git clone https://github.com/lynxkite/lynxkite.git
cd lynxkite
conda env create --name lk --file conda-env.yml
conda activate lk
cp conf/kiterc_template ~/.kiterc
./run.sh

If everything went well, you can open http://localhost:2200/ in your browser. Take the Flight Routes tutorial for a spin. (Or another tutorial.)

You can see that workspaces are built from boxes. Where are these boxes in the code? Let’s look at PageRank.

Dusting off the surface

A box on the UI is called a “frontend operation” in the code. PageRank in particular is found in GraphComputationOperations.scala:

register("Compute PageRank")(new ProjectTransformation(_) {
  params ++= List(
    Param("name", "Attribute name", defaultValue = "page_rank"),
    Choice("weights", "Weight attribute", options = FEOption.noWeight +: project.edgeAttrList[Double]),
    NonNegInt("iterations", "Number of iterations", default = 5),
    Ratio("damping", "Damping factor", defaultValue = "0.85"),
    Choice("direction", "Direction", options = Direction.attrOptionsWithDefault("outgoing edges")),
  )
  def enabled = project.hasEdgeBundle
  def apply() = {
    assert(params("name").nonEmpty, "Please set an attribute name.")
    val op = graph_operations.PageRank(params("damping").toDouble, params("iterations").toInt)
    val weightsName = params("weights")
    val direction = Direction(
      params("direction"),
      project.edgeBundle,
      reversed = true)
    val es = direction.edgeBundle
    val weights =
      if (weightsName == FEOption.noWeight.id) es.const(1.0)
      else direction.pull(project.edgeAttributes(params("weights"))).runtimeSafeCast[Double]
    project.newVertexAttribute(
      params("name"),
      op(op.es, es)(op.weights, weights).result.pagerank,
      help)
  }
})

It manages the UI of the box, translates the frontend operation into a series of backend operations, and updates the “project” state. (The box output.) Looking at these one by one:

params is an instance of ParameterHolder and makes it easy to define and access parameters. Parameters have a “kind” field which controls how they are displayed on the UI. Some “kinds” are simple text boxes, others are complex JavaScript monsters. They all live in harmony in app/scripts/operation. Changing yes/no parameters into checkboxes by adding a new “kind” could be a great starter task.
Backend operations are the stars of the show! They actually implement algorithms. We explicitly create op, an instance of graph_operations.PageRank. The op(op.es, es)(op.weights, weights).result.pagerank expression at the end fills in the inputs of the operation. op.es and op.weights are the input “slots” and es and weights are the entities that go into them. The lines of code that apply the direction and generate unit weights also create backend operations indirectly. No actual computation is performed at this point, but new “metagraph entities” are created.
Lastly, project.newVertexAttribute() mutates the “project” by adding the new attribute that we just computed. The “project” is the graph state that many boxes take as input/output. It is a bundle of metagraph entities (vertices, edges, attributes) structured to be presentable to the user. Because this box is a subclass of ProjectTransformation it has a project variable initialized to its input state and will use it to create its output.

A box can be re-executed many times. Before #220 we executed all of them on every little change. It’s cheap because they don’t touch the data. They work at the “metagraph” level. Plus it’s not like creating many backend operations will cause a pile-up of PageRanks. If the backend operation has already existed then we are basically getting a new reference to it.

Digging into the next layer

Time to look at a backend operation! Let’s head to PageRank.scala.

Adding new algorithms to LynxKite almost always means adding new backend operations. The inputs of frontend operations (boxes) are “states”, such as a project (a graph) or a table. The inputs of backend operations on the other hand are metagraph entities with carefully defined relationships. The same goes for the outputs.

Because we had to define inputs so many times, we made it very comfortable and type-safe. On the other hand a lot of Scala wizardry is taking place behind the scenes, which can obfuscate things a bit. I find it a good trade in the end, because we write operation signatures all the time, but basically never have to touch the “magic” code.

For PageRank the signature is:

class Input extends MagicInputSignature {
  val (vs, es) = graph
  val weights = edgeAttribute[Double](es)
}
class Output(implicit instance: MetaGraphOperationInstance, inputs: Input) extends MagicOutput(instance) {
  val pagerank = vertexAttribute[Double](inputs.vs.entity)
}

These es and weights identifiers are exactly the ones used in the frontend operation to specify the inputs (op.es and op.weights). We didn’t have to pass in vs because passing in es already unambiguously determined its value. In fact we could omit es too, since weights has to be an attribute on es.

Likewise pagerank is used to access the output of the operation.

In the PageRank.execute() method we finally arrive at the implementation of an algorithm. It is only here that we finally get to touch data. This is a Spark operation, so the data appears as RDDs.

Looking back at the history in Git shows that the code used to be fairly simple in 2014 but picked up a series of optimizations over time.

First we added our own implementations of some Spark operations that take advantage of how we store graph data. (Partitioned by the hash of the ID, but then sorted by ID within each partition.) These are the reduceBySortedKey() and sortedLeftOuterJoin() methods.

Later HybridRDDs were added to handle skewed datasets. (Graphs with a small number of very-high-degree vertices.) See HybridRDD.scala for how this works.

But if you look at the for (i <- 0 until iterations) loop, that is still readable as a PageRank implementation.

Where the tunnels lead

Scrolling down in GraphComputationOperations.scala, the next box under Compute PageRank is Find Steiner tree. It’s pretty much the same, but it uses graph_operations.Dapcstp instead of graph_operations.PageRank as the backend operation. When we look inside Dapcstp.scala, it has a signature just like PageRank, but no execute method!

LynxKite has a concept of execution domains. We have a Spark domain for running operations implemented with Spark and a Sphynx domain for operations that are implemented on our high-performance single-machine backend written in Go. (In reality we have half a dozen domains, see BigGraphEnvironment.scala for details.)

All domains can advertise that they have the data for a metagraph entity, that they can compute an operation, and how data can be relocated to them. They have a compute() method that has to make the operation happen. The Spark domain’s compute will call the execute() method on an operation, while the Sphynx domain’s compute sends an RPC to Sphynx.

We find the implementation of Dapcstp in dapcstp.go.

If you look around the sphynx directory, you will also find some operations implemented in Python and others used from the NetworKit library through a SWIG wrapper for its C++ API. Sphynx has proven to be a great way to integrate single-machine algorithms!

The pillars underneath

By now you know everything needed to start creating your own boxes with new backend operations. To finish this post I will point you to the parts of LynxKite that are propping up the whole operation.

MetaGraph.scala defines the metagraph entities, the operation base classes, and all the utilities for creating operation signatures. VertexSet, EdgeBundle, Attribute[T], Scalar, and Table are simple and very widely used classes defined here.
MetaGraphManager.scala is making the operations persistent. The MetaGraphManager is passed around a lot, often as an implicit parameter. It is needed anywhere where we want to be able to create backend operations.
DataManager.scala orchestrates the work of the domains. This is the class that calls a domain’s compute() to get something computed.
Workspace.scala is the internal representation of workspaces and boxes and it also has the code for executing a workspace. (Not in the execute() sense, just creating the output states.)
Project.scala is the code for the graph state (“project”) that boxes input and output. Unexpectedly, it also has the code for the directory structure in which workspaces, wizards, and snapshots are stored.
graph_api/Scripting.scala has the implicit classes that are responsible for the op(op.es, es).result form of invoking operations.
graph_util/Scripting.scala is less important, but has a few convenience methods that are often used in frontend operations. For example the reverse method on EdgeBundles comes from here.

Hopefully this is useful for someone diving into LynxKite. Unlike an archeological site, LynxKite cannot be cordoned off. There is a risk that in a few months time there will be new clever ideas plastered all over this. See What’s new in LynxKite? or watch the LynxKite repository on GitHub to keep on top of it:

Watch