What’s new in LynxKite?

5.4.1

Reduced the size of the released Docker image. #418

5.4.0

Changed the license from AGPL to Apache 2.0. #416
Added an “Embed String attribute” box that can use OpenAI or local models to create vector embeddings. #414 The embeddings can then be used in any machine learning box.
KITE_DRAWING_OVERALL now properly controls the limits on the size of visualizations. #410

5.3.1

Added a database parameter for Neo4j import and export boxes. #404
Fixed an issue with S3 access. #402

5.3.0

Upgraded to Apache Spark 3.3.2. #369
Added an “Ask OpenAI” built-in custom box. #353 It can answer natural language questions about the input graph in the form of a table. (For example, you could ask for “cities in the same time zone as Paris”.) Set the OPENAI_API_KEY environment variable before using it.
“Compute in Python” can now output tables when its input is a graph. #353

DataFrames can now be directly passed between PySpark and LynxKite, if LynxKite is running in a user-provided SparkSession. #327, #328 Example usage:

import lynx.kite
# Start LynxKite in this SparkSession.
lk = lynx.kite.LynxKite(spark=spark)
# Turn a LynxKite table into a Spark DataFrame.
df = lk.createExampleGraph().sql('select name, age from vertices').spark()
df = df.filter('age < 30')
# Turn a Spark DataFrame into a LynxKite table.
g = lk.from_spark(df).useTableAsVertices()

Custom boxes can now have parameters that choose from a fixed set of options. (#371)
Removed the SQL interface in the directory browser. (#332) It only worked with snapshots and the results could only be saved to a few file formats. Loading the data in a workspace is a much more powerful alternative.
Switched the frontend build from Gulp to Vite. #356
Switched the frontend tests from Protractor to Playwright. #238

5.2.0

The Python API can now be used without a running LynxKite instance. If you pass in a SparkSession to LynxKite (lk = lynx.kite.LynxKite(spark=spark)), LynxKite will run in that SparkSession. #294 This is useful if you want to run LynxKite as part of a pipeline, rather than as a permanent fixture.
The LynxKite() constructor in the Python API now defaults to connecting to http://localhost:2200. #291
Added “Compute in R” and “Create graph in R” boxes that behave the same as their Python counterparts, but let you use R. #292
Set up an Earthly build. #296 This should make builds very reliable for everyone.
“Compute in Python” boxes can now output plots. Just set the output to matplotlib or html. #297

5.1.0

Upgraded to Apache Spark 3.3.0. #272
LynxKite is now started more simply, with spark-submit. #269 This makes deployment much simpler in Hadoop environments.
The new box “Import from Neo4j files” can be used to import Neo4j data directly from files instead of reading from a running Neo4j instance. This can reduce the memory requirements from terabytes to gigabytes on large datasets. #268
Added two new “Import from BigQuery” boxes. #245
Changed the font styling on legends to make them more readable over maps. #267
The “Import from Parquet” box now has an option for using the source files directly instead of pulling the data into LynxKite. #261 This avoids an unnecessary copy and is more convenient to use through the Python API.
The “Weighted aggregate on neighbors” box now supports weighting by edge attributes. #257
The “Add rank attribute” box now supports ranking edges by edge attributes. #255

5.0.0

Added GPU implementations of several algorithms using RAPIDS cuGraph. #241 Enable GPU usage by setting KITE_ENABLE_CUDA=yes in .kiterc. The list of algorithms includes PageRank, connected components, betweenness and Katz centrality, the Louvain method, k-core decomposition, and ForceAtlas2, a new option in “Place vertices with edge lengths”.
Switched the internal storage of graph entities from custom SequenceFiles to Parquet. #237 This is an incompatible change, but the migration is simple: delete $KITE_DATA/partitioned. Everything will be recomputed when accessed, and will be stored in the new format.
Added methods in the Python API for conversion between PySpark DataFrames and LynxKite tables. #240
Domain preference is now configurable. #236 This is useful if you want the distributed Spark backend to take precedence over the local Sphynx backend.
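
The storage migration described above amounts to removing one directory. Below is a minimal sketch, assuming $KITE_DATA points at a local filesystem path (for HDFS or object storage you would use that system's own tooling instead). The helper name remove_partitioned is hypothetical and not part of LynxKite.

```python
import os
import shutil


def remove_partitioned(kite_data: str) -> bool:
    """Delete <kite_data>/partitioned if it exists.

    Returns True if a directory was removed, False if there was nothing to do.
    Illustrative helper only; assumes kite_data is a local path.
    """
    target = os.path.join(kite_data, 'partitioned')
    if not os.path.isdir(target):
        return False
    shutil.rmtree(target)
    return True
```

It could be called as remove_partitioned(os.environ['KITE_DATA']). Only the partitioned subdirectory is touched; everything else under the data directory is left in place.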

4.4.0

Upgraded to PyTorch Geometric (PyG) 2.0.1. #206
Upgraded to NetworKit 10.0. #234
The workspace interface is much faster now. #220
Now using Conda for managing all dependencies. #209
Fixed an issue with Python boxes returning errors unnecessarily. #225
Fixed an issue with GCS. #224
Fixed CUDA issues with GCN and Node2vec boxes. #234

4.3.0

Upgraded to Apache Spark 3.1.2. This also brought us up to Scala 2.12, Java 11, Play Framework 2.8.7, and new versions of some other dependencies. #178 #184
The “Custom plot” box now lets you use the latest version of Vega-Lite by directly writing JSON instead of going through the Vegas Scala DSL.
Logistic regression models can now be configured to use elastic net regularization.
Boxes used as steps in a wizard are highlighted in the workspace view by a faint glow. #183
“Compute in Python” boxes can be used on tables. #160
Added a “Draw ROC curve” built-in custom box. #197
Performance and compatibility improvements. #188 #194
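
As an illustration of the JSON-based “Custom plot” configuration, the sketch below builds a small Vega-Lite bar chart specification in Python and prints it as JSON. The field names name and age are borrowed from the example graph used elsewhere in these notes. Leaving out the data property assumes that the box supplies the input table itself, which is an assumption rather than something stated here.

```python
import json

# A minimal Vega-Lite specification: one bar per "name", bar height from "age".
# The "data" property is deliberately omitted (assumption: the box provides it).
spec = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "name", "type": "nominal"},
        "y": {"field": "age", "type": "quantitative"},
    },
}

# Serialize to the JSON text that would be pasted into the box.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```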

4.2.2

Fix for attributes becoming undefined. #176

4.2.1

Fix for Chrome 90. #162
Fixed a few other UI bugs. #164
Reduced memory use in Sphynx. #141

4.2.0

42 algorithms from NetworKit have been integrated into LynxKite. They include new centrality measures, random graph generators, community detection methods, graph metrics (diameter, effective diameter, assortativity), optimal spanning trees and more. (#102, #106, #111, #123)
Users can now opt in to sharing anonymous usage statistics with the LynxKite team. (#128)
Environment variables can be used to override .kiterc settings. (#110)
Added a built-in for parametric parameters (workspaceName) that can be used to force recomputation in wizards. (#131)
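
The override rule for .kiterc settings can be pictured as a simple precedence merge. The sketch below is only a model of the behavior described above, not LynxKite's actual startup code. The function name effective_settings is hypothetical, and the keys and values are sample data.

```python
import os
from typing import Dict, Mapping


def effective_settings(kiterc: Mapping[str, str],
                       environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the kiterc settings, with same-named environment variables winning."""
    return {key: environ.get(key, value) for key, value in kiterc.items()}


# Sample values for illustration only.
kiterc = {'KITE_ENABLE_CUDA': 'no', 'KITE_DATA': '/tmp/kite_data'}
env = {'KITE_ENABLE_CUDA': 'yes', 'UNRELATED': '1'}
print(effective_settings(kiterc, env))
# prints {'KITE_ENABLE_CUDA': 'yes', 'KITE_DATA': '/tmp/kite_data'}
```

Variables that do not correspond to a setting (UNRELATED here) are ignored in this model.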

4.1.0

Neo4j 4.x support.
Revamped Neo4j import. Instead of importing tables, you can now import a whole graph. (#90)
Added Neo4j export. You can export vertex or edge attributes, or the whole graph. (#91)
AVRO and Delta Lake import and export. (#63, #86)
Added the “Filter with SQL” box as a more flexible alternative to “Filter by attributes”.
Visualization option to not display edges. Great in large geographic datasets.
“Use table as vertex/edge attributes” boxes are more friendly and handle name conflicts better now.
Added aggregation support for Vector attributes. (Elementwise average, sum, etc.)
Added an option to disable generated suffixes for aggregated variables.
Fix for edge coloring. (#84)

4.0.1

Fixed issue with interactive tutorials. (#30)
Fixed issue with graph attributes in “Create graph in Python”. (#25)
Fixed issue with non-String attributes in “Use table as graph”. (#26)
Replaced trademarked box icons (it was an accident!) with free ones. Also switched to FontAwesome 5 everywhere to get a better selection of icons.
Improved the User Guide. (#38, #39)

4.0.0

We’ve open-sourced LynxKite!

We took this opportunity to make many changes that break compatibility with the LynxKite 3.x series. We can help migrate existing workspaces to LynxKite 4.0 if necessary.

Replaced the separate Long, Int, Double attribute types with number.
Instead of the (Double, Double) attribute type, 2D positions are now represented as Vector[number]. This type is widely supported and more flexible. Use “Bundle vertex attributes into a Vector” instead of “Convert vertex attributes to position”, which is now gone.
Renamed “scalars” to “graph attributes”. Renamed “projects” to “graphs”. These mysterious names were largely used for historical reasons.
Removed the “Predict with a graph neural network” operation. (It was an early prototype, long since succeeded by the “Predict with GCN” box.)
Removed the “Predict attribute by viral modeling” box. It is more flexible to do the same thing through a series of more elemental boxes. A built-in box (“Predict from communities”) has been added to serve as a starting point.
Made it easier to use graph convolutional boxes: added “Bundle vertex attributes into a Vector” and “One-hot encode attribute” boxes.
Replaced the “Reduce vertex attributes to two dimensions” and “Embed with t-SNE” boxes with the new “Reduce attribute dimensions” box which offers both PCA and t-SNE.
“Compute in Python” boxes now support Vector[Double] attributes.
“Create Graph in Python” box added.
Inputs and outputs for “Compute in Python” can now be inferred from the code.

3.2.1

More accurate progress indicators for box outputs.
Visualizations can now render edges as undirected straight lines.
Hover and progress animations for boxes.

3.2.0

Implemented a lot of common operations on Sphynx, speeding up many workspaces significantly.
Added graph convolutional network operations: “Train a GCN regressor”, “Train a GCN classifier”, and “Predict with GCN”.
Added “Compute in Python” box.
Advanced settings in some boxes are hidden behind a click.
Long legends on visualizations can be scrolled.
Wizards can be maximized.
Revamped the “Visualize as slider” feature. The slider now appears on the visualization instead of appearing in the configuration. The slider can affect either the color or the visibility of vertices.
The currently viewed folder is now stored in the URL, so you can send a link to a specific folder. The default folder after logging in is your user folder.
Small improvements, like better defaults for graph visualizations and a nicer trigger button on import boxes.

3.1.1

Hotfix for a Sphynx bug. (#9053)

3.1.0

Sphynx is now enabled by default. A couple of operations run on Sphynx.
Wizards have been added as an experimental feature. Use the anchor box to turn a workspace into a wizard.
If you copied some boxes into a YAML file, you can now drag-and-drop this file to a workspace to insert those boxes.
New kiterc configuration options to allow public access to LynxKite instances.
Added “Embed vertices”, “Embed with t-SNE”, and “Import well-known graph dataset” operations. They require PyTorch Geometric to be installed.
Smaller UI improvements and performance improvements.

3.0.2

Retry backend requests on 504 Gateway Timeout errors. This allows running without timeout errors even behind proxies that we cannot reconfigure, such as in client deployments or on demo.lynxkite.com.
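
The retry behavior can be sketched as follows. This is an illustrative Python version, not the actual LynxKite client code. The function name, the attempt limit, and the delay are all assumptions.

```python
import time
from typing import Callable, Tuple

GATEWAY_TIMEOUT = 504


def request_with_retry(send: Callable[[], Tuple[int, str]],
                       max_attempts: int = 5,
                       delay_seconds: float = 1.0) -> Tuple[int, str]:
    """Call send() again while it returns 504, up to max_attempts calls in total.

    send is any zero-argument callable returning (status_code, body).
    The last response is returned even if it is still a 504.
    """
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    status, body = send()
    attempts = 1
    while status == GATEWAY_TIMEOUT and attempts < max_attempts:
        time.sleep(delay_seconds)  # Give the backend time to make progress.
        status, body = send()
        attempts += 1
    return status, body
```

Only 504 responses trigger a retry in this sketch. Any other status, including other errors, is returned to the caller immediately.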

3.0.1

Minor UI fixes and graph visualization improvements.

3.0.0

New multi-domain backend. This is the foundation for the high-performance Sphynx backend. (But Sphynx is not included yet.)
Users are welcomed by an interactive tutorial.
A choice of color maps for visualizing qualitative attributes.
Instruments (such as graph visualization and SQL) can be used on snapshots.
“Go to root folder” button and “Copy path to clipboard” option in directory browser.
Bug fixes and improvements, such as eliminating flicker on UI elements.

2.8.5

Click a vertex in a visualization to open a context menu for interactive graph navigation.

2.8.4

Popup box improvements: Parameters are full-width. Popups avoid overlapping. Popups reopen at previous position with previous dimensions.

2.8.3

Added new operation: Export to Hive.

2.8.2

Remote API bugfixes.

2.8.1

For consistency, project tables such as vertices and edges can be accessed as input.vertices and input.edges now.
Bugfixes for HDFS use under Kerberos, and minor fixes and improvements in LynxKite.

2.8.0

Upgraded to Spark 2.4.3.

2.7.0

The first public evaluation release.