What’s new in LynxKite?
- The Python API can now be used without a running LynxKite instance. If you pass in a SparkSession
to LynxKite (lk = lynx.kite.LynxKite(spark=spark)), LynxKite will run in that SparkSession.
This is useful if you want to run LynxKite as part of a pipeline, rather than as a permanent fixture.
- LynxKite() constructor in the Python API now defaults to connecting to
- Added “Compute in R” and “Create graph in R” boxes that behave the same as their Python
counterparts, but let you use R. #292
- Set up an Earthly build.
#296 This should make builds very reliable for
- “Compute in Python” boxes can now output plots. Just set the output to
- Upgraded to Apache Spark 3.3.0. #272
- LynxKite is now started more simply, with
#269 This makes deployment much simpler
in Hadoop environments.
- The new box “Import from Neo4j files” can be used to import Neo4j data directly from files
instead of reading from a running Neo4j instance. This can reduce the memory requirements
from terabytes to gigabytes on large datasets. #268
- Added two new “Import from BigQuery” boxes. #245
- Changed the font styling on legends to make them more readable over maps.
- The “Import from Parquet” box now has an option for using the source files directly
instead of pulling the data into LynxKite. #261
This avoids an unnecessary copy and is more convenient to use through the Python API.
- The “Weighted aggregate on neighbors” box now supports weighting by edge attributes.
- The “Add rank attribute” box now supports ranking edges by edge attributes.
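
To illustrate what weighting by an edge attribute means for a neighbor aggregation, here is a minimal plain-Python sketch. It is not LynxKite's implementation: the function name, the (src, dst, weight) edge layout, and the choice of a weighted average over outgoing neighbors are all assumptions made for the example.

```python
from collections import defaultdict


def weighted_neighbor_average(edges, vertex_attr):
    """Weighted average of a neighbor attribute, weighted per edge.

    edges: iterable of (src, dst, weight) tuples (hypothetical layout).
    vertex_attr: dict mapping vertex id to a numeric attribute value.
    Returns a dict mapping each source vertex to the weighted average of
    the attribute over its outgoing neighbors.
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for src, dst, weight in edges:
        if dst not in vertex_attr:
            continue  # neighbors without the attribute are skipped
        weighted_sum[src] += weight * vertex_attr[dst]
        total_weight[src] += weight
    # Vertices whose weights sum to zero get no value.
    return {
        v: weighted_sum[v] / total_weight[v]
        for v in weighted_sum
        if total_weight[v] != 0
    }


# Illustrative data: the edge weight plays the role of an edge attribute.
edges = [("a", "b", 1.0), ("a", "c", 3.0), ("b", "c", 2.0)]
age = {"a": 30.0, "b": 20.0, "c": 40.0}
result = weighted_neighbor_average(edges, age)
```

Here vertex "a" leans toward the value of "c" because the edge to "c" carries three times the weight of the edge to "b".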
- Added GPU implementations of several algorithms using RAPIDS cuGraph.
Enable GPU usage by setting
The list of algorithms includes PageRank, connected components, betweenness and Katz centrality,
the Louvain method, k-core decomposition, and ForceAtlas2, a new option in
Place vertices with edge lengths.
- Switched the internal storage of graph entities from custom SequenceFiles to Parquet.
This is an incompatible change, but the migration is simple: delete
Everything will be recomputed when accessed, and will be stored in the new format.
- Added methods in the Python API for conversion between PySpark DataFrames and LynxKite tables.
- Domain preference is now configurable. #236
This is useful if you want the distributed Spark backend to take precedence over the
local Sphynx backend.
- Upgraded to PyTorch Geometric (PyG) 2.0.1. #206
- Upgraded to NetworKit 10.0. #234
- The workspace interface is much faster now. #220
- Now using Conda for managing all dependencies. #209
- Fixed an issue with Python boxes returning errors unnecessarily. #225
- Fixed an issue with GCS. #224
- Fixed CUDA issues with GCN and Node2vec boxes. #234
- Upgraded to Apache Spark 3.1.2. This also brought us up to Scala 2.12, Java 11,
Play Framework 2.8.7, and new versions of some other dependencies.
- The “Custom plot” box now lets you use the latest version of Vega-Lite
by directly writing JSON instead of going through the Vegas Scala DSL.
- Logistic regression models can now be configured to use elastic net regularization.
- Boxes used as steps in a wizard are highlighted in the workspace view by a faint glow.
- “Compute in Python” boxes can be used on tables. #160
- Added a “Draw ROC curve” built-in custom box. #197
- Performance and compatibility improvements.
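
As a rough idea of what writing Vega-Lite JSON directly looks like, the sketch below builds a small bar chart specification in Python and serializes it. The field names and inline data are made up for illustration, the v5 schema URL is an assumption about which Vega-Lite version is in use, and how the “Custom plot” box binds its input table to the specification is not shown here.

```python
import json

# A minimal Vega-Lite bar chart specification.
# The data is inlined so the example is self-contained; in practice the
# data would come from the box's input (binding not shown, assumption).
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "description": "Vertex count per degree bucket (illustrative data).",
    "data": {
        "values": [
            {"bucket": "1-10", "count": 120},
            {"bucket": "11-100", "count": 45},
            {"bucket": "101+", "count": 7},
        ]
    },
    "mark": "bar",
    "encoding": {
        # "sort": None serializes to null, which keeps the data order.
        "x": {"field": "bucket", "type": "ordinal", "sort": None},
        "y": {"field": "count", "type": "quantitative"},
    },
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The resulting JSON text is what would be entered as the plot definition.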
- Fix for attributes becoming undefined. #176
- Fix for Chrome 90. #162
- Fixed a few other UI bugs. #164
- Reduced memory use in Sphynx. #141
- 42 algorithms from NetworKit have been integrated
into LynxKite. They include new centrality measures, random graph generators,
community detection methods, graph metrics (diameter, effective diameter, assortativity),
optimal spanning trees and more.
- Users can now opt in to sharing anonymous usage statistics with the LynxKite team.
- Environment variables can be used to override
- Added a built-in for parametric parameters (
workspaceName) that can be used to
force recomputation in wizards. (#131)
- Neo4j 4.x support.
- Revamped Neo4j import. Instead of importing tables, you can now import a whole graph.
- Added Neo4j export. You can export vertex or edge attributes, or the whole graph.
- AVRO and Delta Lake import and export.
- Added the “Filter with SQL” box as a more flexible alternative to “Filter by attributes”.
- Visualization option to not display edges. Great for large geographic datasets.
- “Use table as vertex/edge attributes” boxes are more friendly and handle name conflicts better.
- Added aggregation support for Vector attributes. (Elementwise average, sum, etc.)
- Added an option to disable generated suffixes for aggregated variables.
- Fix for edge coloring. (#84)
- Fixed issue with interactive tutorials. (#30)
- Fixed issue with graph attributes in “Create graph in Python”. (#25)
- Fixed issue with non-String attributes in “Use table as graph”. (#26)
- Replaced trademarked box icons (it was an accident!) with free ones.
Also switched to FontAwesome 5 everywhere to get a better selection of icons.
- Improved the User Guide. (#38, #39)
We’ve open-sourced LynxKite!
We took this opportunity to make many changes that break compatibility with the LynxKite 3.x series.
We can help migrate existing workspaces to LynxKite 4.0 if necessary.
- Replaced the separate
Double attribute types with
- Instead of the
(Double, Double) attribute type, 2D positions are now represented as
Vector[number]. This type is widely supported and more flexible.
Use “Bundle vertex attributes into a Vector” instead of “Convert vertex attributes to
position”, which is now gone.
- Renamed “scalars” to “graph attributes”. Renamed “projects” to “graphs”. These mysterious names
were largely used for historical reasons.
- Removed “Predict with a graph neural network” operation.
(It was an early prototype, long since succeeded by the “Predict with GCN” box.)
- Removed “Predict attribute by viral modeling” box. It is more flexible to do the same
thing through a series of more elemental boxes.
A built-in box (“Predict from communities”) has been added to serve as a starting point.
- Made it easier to use graph convolutional boxes: added “Bundle vertex attributes into a Vector”
and “One-hot encode attribute” boxes.
- Replaced the “Reduce vertex attributes to two dimensions” and “Embed with t-SNE” boxes with
the new “Reduce attribute dimensions” box which offers both PCA and t-SNE.
- “Compute in Python” boxes now support
- “Create Graph in Python” box added.
- Inputs and outputs for “Compute in Python” can now be inferred from the code.
- More accurate progress indicators for box outputs.
- Visualizations can now render edges as undirected straight lines.
- Hover and progress animations for boxes.
- Implemented a lot of common operations on Sphynx, speeding up many workspaces significantly.
- Added graph convolutional network operations: “Train a GCN regressor”,
“Train a GCN classifier”, and “Predict with GCN”.
- Added “Compute in Python” box.
- Advanced settings in some boxes are hidden behind a click.
- Long legends on visualizations can be scrolled.
- Wizards can be maximized.
- Revamped the “Visualize as slider” feature. The slider now appears on the visualization
instead of appearing in the configuration. The slider can affect either the color or the visibility
- The currently viewed folder is now stored in the URL, so you can send a link to a specific folder.
The default folder after logging in is your user folder.
- Small improvements, like better defaults for graph visualizations and nicer trigger button on
- Hotfix for a Sphynx bug. (#9053)
- Sphynx is now enabled by default. A couple of operations run on Sphynx.
- Wizards have been added as an experimental feature. Use the anchor box to turn a workspace
into a wizard.
- If you copied some boxes into a YAML file, you can now drag and drop this file
onto a workspace to insert those boxes.
kiterc configuration options to allow public access LynxKite instances.
- Added “Embed vertices”, “Embed with t-SNE”, and “Import well-known graph dataset”
operations. They require PyTorch Geometric to be installed.
- Smaller UI improvements and performance improvements.
- Retry backend requests on 504 Gateway Timeout errors. This allows running without timeout errors
even behind proxies that we cannot reconfigure, such as in client deployments or on
- Minor UI fixes and graph visualization improvements.
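
The retry on 504 Gateway Timeout errors mentioned above can be pictured with the following sketch. It is a generic Python illustration, not the LynxKite client code: the function name, the attempt limit, and the exponential backoff are assumptions.

```python
import time


def retry_on_504(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a status other than 504.

    send: zero-argument callable returning a (status, body) tuple
          (hypothetical interface for this sketch).
    The delay doubles after each failed attempt. If every attempt
    returns 504, the last 504 response is returned to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 504:
            return status, body
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    return status, body


# Usage with a fake backend that times out twice before succeeding.
# Delays are recorded instead of slept so the example runs instantly.
responses = iter([(504, ""), (504, ""), (200, "ok")])
delays = []
result = retry_on_504(lambda: next(responses), sleep=delays.append)
```

Retrying only on 504 keeps other errors visible immediately, while a slow computation behind a proxy with a fixed timeout still gets a chance to finish.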
- New multi-domain backend. This is the foundation for the high-performance Sphynx backend.
(But Sphynx is not included yet.)
- Users are welcomed by an interactive tutorial.
- A choice of color maps for visualizing qualitative attributes.
- Instruments (such as graph visualization and SQL) can be used on snapshots.
- “Go to root folder” button and “Copy path to clipboard” option in directory browser.
- Bug fixes and improvements, such as eliminating flicker on UI elements.
- Click a vertex in a visualization to open a context menu for interactive graph navigation.
- Popup box improvements: Parameters are full-width. Popups avoid overlapping. Popups reopen
at previous position with previous dimensions.
- Added new operation: Export to Hive.
- For consistency, project tables such as
edges can be accessed as
- Bug fixes for HDFS use under Kerberos, and minor fixes and improvements in LynxKite.
- The first public evaluation release.